IRJET- Hand Talk- Assistant Technology for Deaf and Dumb (IRJET Journal)
This document describes a smart glove system that translates sign language gestures into speech to help deaf and mute people communicate. The glove uses flex sensors on each finger to detect finger bending motions. An Arduino microcontroller processes the sensor data and sends it wirelessly via Bluetooth to an Android app. The app displays the sign language gesture and converts it to speech output. The goal is to help deaf and mute individuals communicate with hearing people by interpreting their sign language gestures into audible speech in real-time. The system is intended to bridge communication between those who understand sign language and those who do not.
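As a rough illustration of the data path, here is a minimal Python sketch of the receiving side, assuming the Arduino streams one comma-separated line of flex readings per frame over a Bluetooth serial port; the port name, threshold, and gesture table are illustrative placeholders rather than details from the paper (requires the pyserial package).

```python
# Minimal sketch of the receiving side, assuming the Arduino streams
# one comma-separated line of five flex readings per frame over a
# Bluetooth serial port. Port, baud rate, and gestures are illustrative.
import serial

GESTURES = {
    (1, 1, 1, 1, 1): "HELLO",   # all fingers bent
    (0, 1, 1, 1, 1): "YES",     # thumb straight, others bent
    (0, 0, 0, 0, 0): "REST",    # open hand
}

def binarize(readings, threshold=512):
    """Map raw 10-bit ADC values to bent (1) / straight (0)."""
    return tuple(1 if r > threshold else 0 for r in readings)

with serial.Serial("/dev/rfcomm0", 9600, timeout=1) as port:
    while True:
        line = port.readline().decode(errors="ignore").strip()
        if not line:
            continue
        readings = [int(v) for v in line.split(",")]
        print(GESTURES.get(binarize(readings), "UNKNOWN"))
```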
Touchscreen Typing Accessibility for the Blind in India (Adit Gupta)
A capstone research project covering current HCI technologies such as BrailleTouch and the Eyes-Free TalkingDialer project, interviews with blind professionals, and a plan for future work.
Communication between non-disabled people and people who are deaf, dumb, or blind has always been a challenging task. Our approach matters for the decision-making and Human-Computer Interaction (HCI) of deaf, dumb, and blind persons, and it is also useful for anyone who would like to learn the language. The invention aims to assist such users by means of a glove-based deaf, dumb, and blind communication interpreter system. The glove is internally equipped with flex sensors; for each specific gesture, a flex sensor produces a change in resistance proportional to the bending of the finger. These hand gestures are processed in a microcontroller. In addition, the system includes a text-to-speech conversion block that translates the matched gestures, i.e. text, into voice output, which helps a blind person during communication.
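To make the resistance-to-bend relation concrete, here is a small sketch of the voltage-divider arithmetic typically used with flex sensors; the supply voltage, fixed resistor, and resistance range are assumed values, not figures from this paper.

```python
# Voltage-divider arithmetic behind a flex sensor, assuming a 5 V
# supply, a 10 kOhm fixed resistor on the low side, and a 10-bit ADC.
# The flat/bent resistance endpoints are illustrative assumptions.
VCC, R_FIXED, ADC_MAX = 5.0, 10_000.0, 1023

def flex_resistance(adc_value):
    """Recover the sensor resistance from the ADC reading."""
    v_out = VCC * adc_value / ADC_MAX
    return R_FIXED * (VCC - v_out) / v_out

def bend_fraction(adc_value, r_flat=25_000.0, r_bent=100_000.0):
    """Interpolate 0.0 (flat) .. 1.0 (fully bent) between endpoints."""
    r = flex_resistance(adc_value)
    return min(1.0, max(0.0, (r - r_flat) / (r_bent - r_flat)))

print(bend_fraction(300))
```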
Communication is the process of sending and receiving messages between parties. It involves senders and receivers, messages and channels, encoding and decoding of information, and feedback. Effective communication in organizations requires adapting messages to the purpose and audience through appropriate language, format and style. Both formal and informal networks exist within organizations to facilitate communication flows both vertically and horizontally. Grapevine networks also develop to share information through unofficial channels.
This document summarizes research on speaker recognition in noisy environments. It begins with an introduction discussing the goals of speaker identification and verification and their applications. It then provides details on the basic components of a speaker recognition system, including feature extraction and classification. The document focuses on methods for modeling noise, including generating multiple noisy training conditions and focusing matching on unaffected features. Experimental results are shown through snapshots of a prototype system interface that allows adding and recognizing speakers based on voice samples. The system is able to identify speakers in the presence of noise by comparing features to stored codebooks generated during training.
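The codebook matching this summary describes is essentially vector quantization; the sketch below is a minimal Python rendition using k-means codebooks over feature vectors (NumPy and scikit-learn assumed), not the authors' actual implementation.

```python
# Minimal vector-quantization codebook matching: each enrolled speaker
# gets a k-means codebook; an utterance is attributed to the speaker
# whose codebook yields the lowest average distortion. Feature
# extraction is out of scope; arrays stand in for MFCC frames.
import numpy as np
from sklearn.cluster import KMeans

def train_codebook(features, size=16):
    """Cluster one speaker's training frames into a codebook."""
    return KMeans(n_clusters=size, n_init=10).fit(features).cluster_centers_

def distortion(features, codebook):
    """Average distance from each frame to its nearest codeword."""
    d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return d.min(axis=1).mean()

def identify(features, codebooks):
    """Pick the enrolled speaker whose codebook fits best."""
    return min(codebooks, key=lambda name: distortion(features, codebooks[name]))
```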
IRJET- Communication Aid for Deaf and Dumb People (IRJET Journal)
This document describes a communication aid system for deaf and mute people that translates sign language gestures to text and speech. The system uses a glove with flex sensors that detect hand gestures. When a gesture is made, the sensors produce a signal that is matched to stored gesture inputs to translate letters, words, and sentences to speech and text. This helps remove communication barriers for the deaf by allowing them to convey meanings through gestures that are automatically translated. The system aims to bridge the gap between those who can hear and those with speech and hearing impairments.
IRJET- Hand Gesture based Recognition using CNN Methodology (IRJET Journal)
This document summarizes a research paper on hand gesture recognition using convolutional neural networks (CNN). The paper aims to develop a system to recognize American Sign Language (ASL) to help facilitate communication for deaf individuals. The system would capture hand gestures via video and translate them into text. The researchers conducted a literature review on previous work using CNNs and 3D convolutional models for sign language recognition. They intend to implement a 3D CNN model on ASL data and analyze the results to improve recognition accuracy for communicating via sign language.
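The paper only outlines the intended 3D CNN; the following Keras sketch shows what such a model could look like, with input shape and layer sizes chosen for illustration rather than taken from the paper.

```python
# Illustrative 3D CNN over short video clips (16 frames of 64x64 RGB)
# classifying the 26 ASL alphabet letters. Shapes and sizes are
# placeholder choices, not the paper's architecture.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(16, 64, 64, 3)),
    tf.keras.layers.Conv3D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling3D(2),
    tf.keras.layers.Conv3D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling3D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(26, activation="softmax"),  # A..Z
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```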
A Translation Device for the Vision Based Sign Language (ijsrd.com)
Sign language is very important for people with hearing and speech impairments, generally called deaf and mute. It is their only mode of communication for conveying messages, so it becomes very important for others to understand their language. This paper proposes a method, or algorithm, for an application that helps recognize the different signs of Indian Sign Language. The images are of the palm side of the right and left hand and are loaded at runtime. The method has been developed for a single user. Real-time images are captured first and stored in a directory; feature extraction is then performed on the most recently captured image to identify which sign the user articulated, using the SIFT (Scale-Invariant Feature Transform) algorithm. The comparison is performed afterwards, and the result is produced from the key points matched between the input image and the image already stored in the directory or database for a specific letter; the outputs can be seen in the sections below. There are 26 signs in Indian Sign Language, one for each alphabet letter, of which the proposed algorithm gave 95% accurate results for 9 letters, with their images captured at every possible angle and distance.
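A minimal OpenCV sketch of the keypoint-matching step the abstract outlines, assuming opencv-python 4.4+ (where SIFT lives in the main package); the file paths and ratio threshold are placeholders.

```python
# Compare a captured image against a stored template and count good
# SIFT matches; the stored letter whose template scores highest is
# the recognized sign. Paths are placeholders.
import cv2

sift = cv2.SIFT_create()
bf = cv2.BFMatcher()

def good_matches(query_path, template_path, ratio=0.75):
    q = cv2.imread(query_path, cv2.IMREAD_GRAYSCALE)
    t = cv2.imread(template_path, cv2.IMREAD_GRAYSCALE)
    _, q_desc = sift.detectAndCompute(q, None)
    _, t_desc = sift.detectAndCompute(t, None)
    # Lowe's ratio test keeps only distinctive correspondences.
    pairs = bf.knnMatch(q_desc, t_desc, k=2)
    return sum(1 for m, n in pairs if m.distance < ratio * n.distance)
```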
Speech Recognition: Transcription and transformation of human speech (SubmissionResearchpa)
Speech recognition is a subfield linked to both computational linguistics and computer science. As an interdisciplinary area it has generated new technologies and methodologies for recognizing and translating words that have been spoken. In recent times the field has drawn positive attention through deep learning for voice recognition, and there is growing market demand for applications built on it. Deployments of speech recognition systems and their analysis methods can help shape how individuals will interact with computers in the future; the computer plays a central role in this process, since everything translated from speech can also be acknowledged as text. Vishal Dineshkumar Soni 2019. Speech Recognition: Transcription and transformation of human speech. International Journal on Integrated Education. 2, 6 (Dec. 2019), 257-262. DOI: https://doi.org/10.31149/ijie.v2i6.497. PDF URL: https://journals.researchparks.org/index.php/IJIE/article/view/497/478. Paper URL: https://journals.researchparks.org/index.php/IJIE/article/view/497
IRJET - Gesture based Communication Recognition System (IRJET Journal)
This document describes a proposed gesture-based communication recognition system that aims to translate between finger spelling and speech to help facilitate communication between deaf and hearing individuals. It discusses using techniques like mel frequency cepstral coefficients (MFCCs) to extract features from speech for recognition purposes. The system architecture involves preprocessing and modeling input signals, extracting features, and performing feature matching. Challenges with vision-based hand motion recognition are also presented, and the motivation for the project is to help reduce dependence on sign language interpreters for deaf individuals.
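As a concrete example of the feature-extraction stage, here is a short MFCC sketch assuming the librosa package and a placeholder recording; the matching function is a simple illustrative choice, not the system's actual matcher.

```python
# Extract MFCC features from an utterance (placeholder file name,
# 16 kHz mono assumed) and compare two utterances by the distance
# between their mean feature vectors.
import librosa
import numpy as np

audio, sr = librosa.load("utterance.wav", sr=16000)
mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)  # shape (13, frames)

def euclidean_score(mfcc_a, mfcc_b):
    """Smaller means more similar; a crude stand-in for feature matching."""
    return float(np.linalg.norm(mfcc_a.mean(axis=1) - mfcc_b.mean(axis=1)))
```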
Speech recognition is an advanced technology that lets equipment and services be controlled by voice, without touching the screen of a smartphone. In the current century there has been much research on speech recognition on mobile devices. In this system, mobile phone users can issue voice commands to easily make phone calls. Google's Cloud Speech API is used to recognize the incoming user voice. The Speech API recognizes over 120 languages but still cannot correctly handle Myanmar. The system classifies the Myanmar proper name recognized by Google's Speech API to obtain the correct name with the help of a Naïve Bayesian classifier. The contact name classified by Naïve Bayes can only match the user's desired one when it is written in English script; it cannot handle names written in Myanmar script. The system therefore uses a hybrid transliteration approach to resolve contact names recorded in Myanmar script, so it can place calls to contacts typed in either English or Myanmar script. The system applies the Jaro-Winkler distance measure to improve the accuracy of the output, and success rate is used to measure the performance of each process in the system. The system is implemented in the Android programming language. Aye Thida | Yee Wai Khaing, "Voice Command Mobile Phone Dialer", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd26814.pdf Paper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/26814/voice-command-mobile-phone-dialer/aye-thida
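A small sketch of the final name-matching step, assuming the jellyfish package (0.8+) for Jaro-Winkler similarity; the contact list and cutoff are illustrative.

```python
# Pick the contact whose name is closest to the recognized string by
# Jaro-Winkler similarity; return None if nothing clears the cutoff.
import jellyfish

contacts = ["Aye Thida", "Yee Wai Khaing", "Mya Mya"]  # illustrative

def best_contact(recognized, names, cutoff=0.8):
    scored = [(jellyfish.jaro_winkler_similarity(recognized.lower(), n.lower()), n)
              for n in names]
    score, name = max(scored)
    return name if score >= cutoff else None

print(best_contact("aye thidar", contacts))
```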
This document is a project report submitted by three students - Ashwani Kumar, Ankit Raj, and Anand Abhishek - to Cochin University of Science & Technology in partial fulfillment of their Bachelor of Technology degree in Information Technology. The report describes a voice recognition mobile application called HandOVRS designed for physically handicapped users that can recognize common sounds in the home like doorbells, phones, and alarms and allow the user to select notification options like sending text messages.
A communication and translation device for deaf-blind persons. The glove translates the hand-touch alphabet "Lorm", a common form of communication used by people with both hearing and sight impairments, into text and vice versa. The Mobile Lorm Glove enables a deaf-blind person to compose messages and even browse the internet and read e-books. The glove is made up of capacitive touch sensors: pressing one generates the corresponding letter, and on reception vibrators render incoming text.
Development of a Novel Conversational Calculator Based on Remote Online Compu... (toukaigi)
This document describes the development of a novel conversational calculator that uses speech recognition and remote online computation. The researchers conducted a Wizard of Oz experiment with 100 people to collect natural language queries about calculations. They used this data to build a language model specific to conversations with a calculator. This improved the speech recognition accuracy compared to general purpose systems. The conversational calculator allows for multi-step calculations, currency conversions, and relies on the Wolfram Alpha computational engine instead of local processing.
This document describes a smart glove system that translates sign language gestures into speech and text to help deaf and mute people communicate. The system uses flex sensors on a glove to detect hand gestures, which are processed by an Arduino microcontroller. The Arduino identifies letters and words from the gestures and outputs them as speech from a connected speaker and as text on an Android phone app. The goal is to help deaf-mute individuals effectively convey information to people without sign language training by translating their gestures into audio and text in real-time.
Lights and Drums: an Unplugged-style Activity (George Boukeas)
Summary: In this unplugged-style activity, each group will transmit a message to the other groups, using a different code (known to the recipients) and a different communication medium. The aim is to familiarize the participants with the concept of information representation and especially digital binary representation. At the same time, the part of the activity pertaining to message transmission introduces important issues regarding the role of symbols in each code, as well as the role of the physical carrier used for transmitting or storing information.
Goals: Associate representations that employ the symbols 0 and 1 with other, more familiar, binary representations, thus establishing a view of the former as yet another binary representation. Highlight the physical form of information in the systems that transmit, store or process this information.
Duration: at least 30 minutes
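As a tiny illustration of the activity's point, the same message can be rendered as one binary representation and then carried over any physical medium, lights or drums alike:

```python
# Encode a short message as 8-bit binary: each bit could be a light
# flash or a drum beat; the representation is the same either way.
message = "HI"
bits = " ".join(format(ord(c), "08b") for c in message)
print(bits)  # 01001000 01001001
```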
A man-machine interaction project is described which aims to establish an automated voice-to-sign-language translator for communication with the deaf using integrated open technologies. The first prototype consists of a robotic hand designed with OpenSCAD and manufactured with a low-cost 3D printer, which smoothly reproduces the alphabet of the sign language under voice control alone. The core automation comprises an Arduino UNO controller used to activate a set of servo motors that follow instructions from a Raspberry Pi mini-computer on which the open-source speech recognition engine Julius is installed. We discuss its features, limitations and possible future developments.
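The glue between the recognizer and the hand could look like the following Python sketch running on the Raspberry Pi: each recognized letter is forwarded to the Arduino as a one-byte command, with servo angles left to the firmware. The port name is a placeholder, pyserial is assumed, and Julius itself is not driven from this snippet.

```python
# Forward recognized letters to the Arduino one byte at a time;
# the firmware maps each letter to a set of servo angles.
import serial

def send_letter(port, letter):
    if len(letter) == 1 and letter.isalpha():
        port.write(letter.upper().encode("ascii"))

with serial.Serial("/dev/ttyACM0", 9600, timeout=1) as arduino:
    for ch in "HELLO":   # stand-in for Julius output
        send_letter(arduino, ch)
```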
The document introduces Telescript technology, which allows software agents to transport themselves across networks. It describes the electronic marketplace enabled by Telescript, where agents of consumers and providers can interact. The Telescript language allows agents to travel between "places" using a single "go" instruction and meet other agents using a "meet" instruction. Examples show how a warehouse and shopper agents use these instructions to browse and purchase products in the electronic marketplace.
The document discusses mobile agents and how they enable new types of communicating applications. Mobile agents are programs that can move from computer to computer in a network to perform tasks. They allow applications to distribute parts of themselves across multiple computers. This overcomes barriers that exist with today's networks that do not allow third-party software developers to distribute applications across user and server computers in the network. The document introduces the concept of mobile agents and how Telescript technology implements this through a new programming language and model called remote programming. Mobile agents will allow new powerful applications to be developed that can operate across networks in more efficient and timely ways than is possible today.
The document discusses voice browsers, which allow users to access websites and information using voice commands rather than a graphical user interface. It describes key components of voice browsers like VoiceXML for creating voice interfaces, speech recognition, text-to-speech synthesis, and call control. The document also outlines possible applications of voice browsers and standards developed by the W3C to make voice interfaces compatible across platforms.
Our speech to text conversion project aims to help the nearly 20% of people worldwide with disabilities by allowing them to control their computer and share information using only their voice. The system uses acoustic and language models with a speech engine to recognize speech and convert it to text. It can perform operations like opening calculator and wordpad. Speech recognition has applications in areas like cars, healthcare, education and daily life. Accuracy depends on factors like vocabulary size, speaker dependence, and speech type (isolated, continuous). The system aims to improve accessibility while reducing costs.
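A minimal sketch of the command loop described, assuming the SpeechRecognition and PyAudio packages on a Windows host (calc.exe and wordpad.exe are Windows program names); this is an illustration, not the project's code.

```python
# Listen once, transcribe with Google's free recognizer, and launch
# the application whose keyword appears in the transcript.
import subprocess
import speech_recognition as sr

COMMANDS = {"calculator": ["calc.exe"], "wordpad": ["wordpad.exe"]}

recognizer = sr.Recognizer()
with sr.Microphone() as mic:
    recognizer.adjust_for_ambient_noise(mic)
    audio = recognizer.listen(mic)

text = recognizer.recognize_google(audio).lower()
for keyword, cmd in COMMANDS.items():
    if keyword in text:
        subprocess.Popen(cmd)
```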
IRJET- A Smart Glove for the Dumb and Deaf (IRJET Journal)
1) The document describes a smart glove that can translate sign language gestures into speech to help deaf people communicate.
2) The glove uses flex sensors to detect finger movements, and an accelerometer and gyroscope to detect hand movements.
3) The sensors' data is processed by a Raspberry Pi microprocessor which analyzes the gestures and outputs text on a screen and speech through a speaker to translate the sign language into a form hearing people can understand.
The document summarizes a presentation on automatic speech recognition systems. It includes an introduction defining ASR as the transcription of spoken language into text in real time. It shows the basic block diagram of an ASR system and explains how it works similarly to the human process of hearing, transmitting signals to the brain, and understanding. Some key uses of ASR are in smartphones, AI robots, home automation, and computers. The benefits mentioned are hands-free use, aiding reading and spelling, and easy operation.
The document discusses various terms used to describe visual impairments including partially sighted, low vision, legally blind, and totally blind. It then provides information on organizations, products, and services that support those with visual impairments including screen readers, braille devices, magnification software, and global positioning systems.
This report provides an overview of speech recognition technology, including how speech recognition systems work, common applications, and future uses. It discusses key concepts such as utterances, pronunciation, grammar, accuracy, and training. The report also examines speech recognition software and provides examples of free and commercial speech recognition programs. Overall, the report finds that speech recognition has various applications in fields like education, healthcare, gaming, and more, and the technology is expected to continue advancing to support additional future applications.
GGULIVRR: Touching Mobile and Contextual Learning (eLearning Papers)
1) Project GGULIVRR explores using mobile technologies like NFC tags and QR codes to link physical objects and locations to digital educational games.
2) The project aims to develop 21st century skills through creating and playing contextual mobile games on topics like a city's underground infrastructure.
3) Games are built in a generic framework that allows non-technical users to author new games by combining multimedia content and scripted gameplay rules.
IRJET- Smart Speaking Glove for Speech Impaired People (IRJET Journal)
This document describes a smart speaking glove system for speech impaired people that uses flex sensors on a glove to detect gestures and convert them to synthesized speech output. The flex sensors detect finger bending and send signals to a microcontroller. The microcontroller matches the signals to predefined gestures and messages stored in its database and outputs the corresponding message to an LCD display and speaker. It also includes an emergency function using GPS and GSM modules to track the user's location and send a message if they activate a panic switch.
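The gesture-to-message matching could be as simple as a nearest-prototype lookup, sketched below with entirely illustrative sensor values and messages.

```python
# Compare the live flex-sensor vector to stored gesture prototypes
# and emit the closest message, or None if nothing is close enough.
import math

DATABASE = {
    (820, 800, 790, 810, 805): "I need water",
    (300, 310, 820, 815, 800): "Please help me",
    (310, 300, 305, 295, 300): "Thank you",
}

def closest_message(reading, max_distance=150):
    def dist(proto):
        return math.dist(reading, proto)
    proto = min(DATABASE, key=dist)
    return DATABASE[proto] if dist(proto) <= max_distance else None

print(closest_message((815, 805, 795, 800, 810)))
```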
THE PSYCHOLOGY OF MICRO-INTERACTIONS: How to make users love, not like, your app (Sarah Eva Monroe)
Above and beyond feedback, micro-interactions can also add a human element to your interface. Do you have an opinion on the heart v. star interaction on the Twitter app? If so, you're tuned into micro-interactions and their power.
As described in Dan Saffer's book Microinteractions, these tiny details typically serve a range of essential functions in your design, from communicating feedback and demonstrating the accomplishment of an individual task to helping users visualize the results of their actions and prevent errors.
In this session, we'll workshop your own interfaces and look for opportunities to improve, add, or subtract micro-interactions.
Key Takeaways
1. Understand the dos and don'ts of micro-interactions
2. Learn about why micro-interactions make us feel so good
3. Understand how colors and icons support micro-interactions
**note that this presentation has many animations that don't come through in a static format**
My Oral Village is a social enterprise dedicated to achieving financial inclusion among nearly 1 billion illiterate adults world-wide by designing retail financial systems that they can understand and use. Similar to the deployment of Braille on ATMs, our human-centred design solutions address the cognitive needs of a large, vulnerable population without disadvantaging other users, opening up a large potential market segment for participation in the cash economy and the financial system.
Anand Kumar is a mathematician from Patna, Bihar who developed a love of mathematics and exceptional mathematical abilities. He dreamed of attending Cambridge University but could not afford it after his father died. He started tutoring students and opened his own mathematics institute with 500 students. In 2003, he started "Super 30" which provides free coaching to 30 underprivileged students each year to help them get into IIT. Over 270 students have been coached, with over 90% gaining admission to IIT. Anand Kumar has received much praise and awards for his work but refuses financial assistance, wanting to sustain Super 30 through his own efforts.
The document describes research into designing a mobile phone application to help illiterate users send and receive SMS messages. It discusses previous related work, interviews with potential users, and the development and testing of a voice-assisted SMS app called Easy Texting. The app uses text-to-speech, icons, sound effects, and tap gestures to allow composing and reading SMS without reading skills. A lab study with 9 participants found that after 10 minutes of training, 2 out of 3 users could successfully use the karaoke feature to read messages, but some had difficulties with simple tap versus double tap gestures and screen navigation.
The document is a summer training report submitted by Naresh Kumar to Pacific Business School in partial fulfillment of an MBA program. It discusses a study of the strategy and functioning of field forces at Bajaj Allianz General Insurance. The report includes an introduction, certificate from the sales manager, acknowledgements, index, and initial sections on introduction and the insurance organization.
Group Fun is a Facebook application that allows friends to collaboratively create music playlists for events. The author redeveloped the application, learning human-computer interaction principles along the way. They created prototypes, organized the codebase using the MVC pattern, and used technologies like PHP, HTML and JavaScript. Screenshots show interfaces for creating groups, inviting friends, and uploading and listening to music. The author learned about challenges beyond coding, such as user experience design.
This document is a summer training report submitted by Naresh Kumar to Pacific Business School about a study of the strategy and functioning of field forces at Bajaj Allianz General Insurance. It includes an introduction, acknowledgements, index, and sections on the insurance need and introduction to the organization. The report was submitted in partial fulfillment of an MBA program and provides an overview of Bajaj Allianz General Insurance and the insurance sector in India.
Review Paper on Two Way Communication System with Binary Code Medium for Peop... (IRJET Journal)
1) The document discusses a proposed system to aid communication between blind, deaf, and mute individuals using binary code.
2) It reviews existing research on communication systems using Morse code and tactile methods.
3) The proposed system would convert speech to visual contexts and vibrations, and vice versa, using a multimodal approach to allow communication across disabilities.
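One direction of that multimodal channel, text rendered as Morse on/off timings for a vibration motor, might be sketched like this; the timing unit and the excerpt of the code table are illustrative.

```python
# Render text as Morse on/off durations that could drive a vibrator:
# dot = 1 unit on, dash = 3 units on, 1 unit between symbols,
# 3 units between letters.
MORSE = {"S": "...", "O": "---", "E": "."}  # excerpt of the code table

def vibration_pattern(text, unit=0.2):
    """Return (on_seconds, off_seconds) pairs for a vibration motor."""
    pattern = []
    for ch in text.upper():
        for symbol in MORSE.get(ch, ""):
            on = unit if symbol == "." else 3 * unit
            pattern.append((on, unit))
        pattern.append((0.0, 3 * unit))  # gap between letters
    return pattern

print(vibration_pattern("SOS"))
```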
This document presents two studies that investigate whether mobile phone-based access to complex financial services can reach the unbanked, and if so, what type of user interface is best. The first study was an ethnographic exploration involving 90 subjects across 4 countries that examined how non-literate and semi-literate populations currently use existing mobile payment systems. The second study was a formal usability test with 58 subjects in India that compared text-based, spoken dialog, and rich multimedia interfaces for a mobile banking system. The results showed that non-text designs were preferred over text-based designs, and while task completion rates were better for rich multimedia, the spoken dialog interface was faster and required less assistance.
SMARCOS Abstract Paper submitted to ICCHP 2012 (Smarcos Eu)
This study is part of the European project "Smarcos" (http://www.smarcos-project.eu/) that includes among its goals the development of services which are specifically designed and accessible for blind users.
In this paper we present the prototype application designed to make the main phone features available in a way which is accessible for a blind user. The prototype has been developed to firstly evaluate the interaction modalities based on gestures, audio and vibro-tactile feedback.
SignConnect is a software tool, SignSense, that uses artificial intelligence to interpret sign language gestures in real time, with the goal of improving communication for deaf individuals. It supports multiple sign languages and has an intuitive interface. By accurately translating sign language to text or speech, SignSense aims to break down barriers and allow deaf people to communicate seamlessly and participate more in society. The software uses Python, TensorFlow, and OpenCV for its core functions, machine learning models, and the computer vision capabilities needed to recognize sign language gestures.
While a hearing-impaired individual depends on sign language and gestures, a non-hearing-impaired person uses verbal language. Thus, there is a need for a means of arbitration to forestall situations in which a non-hearing-impaired individual who does not understand sign language wants to communicate with a hearing-impaired person. This paper is concerned with the development of a PC-based sign language translator to facilitate effective communication between hearing-impaired and non-hearing-impaired persons. A database of hand gestures in American Sign Language (ASL) is created using Python scripts. TensorFlow (TF) is used to create a pipeline configuration model for machine learning of the annotated gesture images in the database against real-time gestures. The implementation is done in a Python software environment and runs on a PC equipped with a web camera to capture real-time gestures for comparison and interpretation. The developed sign language translator is able to translate ASL gestures to written text along with corresponding audio renderings in an average duration of about one second. In addition, the translator is able to match real-time gestures with the equivalent gesture images stored in the database even at 44% similarity.
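A rough sketch of the real-time loop such a translator runs: grab webcam frames with OpenCV and accept a prediction when the model's confidence clears a threshold (loosely echoing the paper's 44% figure, which there refers to image similarity rather than softmax confidence). The model path and label set are placeholders.

```python
# Webcam loop: resize each frame, run the classifier, and print the
# predicted letter when confidence clears the threshold. Esc quits.
import cv2
import tensorflow as tf

model = tf.keras.models.load_model("asl_model")  # placeholder path
LABELS = [chr(c) for c in range(ord("A"), ord("Z") + 1)]

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    x = cv2.resize(frame, (64, 64)).astype("float32")[None] / 255.0
    probs = model.predict(x, verbose=0)[0]
    if probs.max() >= 0.44:
        print(LABELS[int(probs.argmax())])
    if cv2.waitKey(1) == 27:
        break
cap.release()
```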
Mobile devices are increasingly becoming part of everyday life for many different uses. These devices are mainly based on touch-screens, which is challenging for people with disabilities. For visually-impaired people, interacting with touch-screens can be very complex because of the lack of hardware keys or tactile references. Thus it is necessary to investigate how to design applications, accessibility supports (e.g. screen readers) and operating systems for mobile accessibility. Our aim is to investigate interaction modalities so that even those who have sight problems can successfully interact with touch-screens. A crucial issue concerns the lack of hardware buttons on the numpad; herein we propose a possible solution to overcome this factor. In this work we present the results of evaluating a prototype developed for the Android platform used on mobile devices. 20 blind users were involved in the study. The results have shown a positive response, especially with regard to users who had never interacted with touch-screens.
The current presentation presents the approach followed for the derivation of the Use Cases developed in the context of ÆGIS Integrating Project (Grant Agreement: 224348) of the 7th Framework Programme, which constitute the core outputs of the user needs phase of the project, together with the Personas and the conceptual models, setting the basis for the upcoming development and evaluation phases of the project. ÆGIS aims to embed support for accessibility through the development of an Open Accessibility Framework (OAF), upon which, open source accessibility interfaces and applications for the users as well as accessibility toolkits for the developers will be built. Within ÆGIS, three mainstream markets are targeted, namely the desktop, rich Internet applications and mobile devices/applications market segments. The Use Cases developed address all three application areas, targeted by ÆGIS. The User Centred Design (UCD) plan defined from the early beginning of the initiative constituted the cornerstone of the work for the Use Cases, Personas and conceptual models. Following this plan, the project Use Cases have built on the outcomes of the field trials and the workshops (national and Pan-European), where a representative sample of all ÆGIS targeted user groups has participated and the valuable expertise of the ÆGIS Consortium members. Following a concrete development methodology, 36 Use Cases (12 for desktop, 15 for mobile and 9 for rich Internet applications), accompanied by Unified Modeling Language (UML) diagrams, 17 Personas covering all major target groups of the project and 13 conceptual models, which are mapped to the Use Cases, have emerged. The Use Cases will be further elaborated to specific application scenarios that will orient the evaluation to take place in ÆGIS in three iterative phases and across 4 Pilot sites (Belgium, Spain, Sweden and in the UK). The Use Cases are seen as working document, which may be subject to updates and revisions throughout the project, following and keeping up with the progress noticed in ÆGIS project and the overall open source accessibility community.
The document describes a hand gesture recognition system for deaf persons to communicate their thoughts to others. It aims to bridge the communication gap between deaf-mute people and the general public by converting gestures captured in real-time via camera, which are trained using a convolutional neural network (CNN), into text output. The system allows deaf-mute users to interact with computer applications using gestures detected by their webcam without needing to install additional applications. It discusses the background and relevance of the project, as well as objectives like designing the gesture training, extracting features from images, and recognizing gestures to translate them to text.
The document describes a proposed voice-based email system for blind users. The system would use speech recognition to allow users to compose and send emails solely through voice commands. It would also use text-to-speech to read incoming emails aloud. The system aims to make email more accessible for blind and visually impaired users by eliminating the need to use keyboards. It could also help illiterate users. The document outlines the objectives, modules, algorithms, and technologies used in the proposed system, such as speech-to-text, text-to-speech, and interactive voice response.
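The two accessibility halves the document outlines, reading mail aloud and sending a dictated message, might be sketched as follows; the SMTP server, credentials, and addresses are placeholders, and pyttsx3 is an assumed TTS package.

```python
# Speak an incoming message aloud (text-to-speech) and send a
# dictated one (speech-to-text output assumed already available).
import smtplib
from email.message import EmailMessage
import pyttsx3

def read_aloud(subject, body):
    engine = pyttsx3.init()
    engine.say(f"New mail. Subject: {subject}. {body}")
    engine.runAndWait()

def send_dictated(dictated_text):
    msg = EmailMessage()
    msg["From"], msg["To"] = "user@example.com", "friend@example.com"
    msg["Subject"] = "Voice-composed message"
    msg.set_content(dictated_text)
    with smtplib.SMTP_SSL("smtp.example.com") as server:  # placeholder host
        server.login("user@example.com", "app-password")  # placeholders
        server.send_message(msg)
```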
Recently more and more hearing-impaired people have started using sign language. There are about 70 million people in the whole world who are not able to speak (dumb). A dumb person communicates with other people using hand motions or expressions. Sign language helps dumb people communicate like other people. The sign language translator that has already been developed uses a glove fitted with sensors that can interpret 16 English letters in American Sign Language (ASL). Accelerometers and flex sensors are used in that system, which increases its overall cost. We propose a prototype called the "smart glove for speech-impaired people", which translates sign language into text. It will help dumb and deaf people express their thoughts in a more convenient way. As the sign language we use traditional finger movements, with contact switches wrapped around the user's fingers. An IR transmitter-receiver pair, HT12E and HT12D ICs, and an Arduino (microcontroller) board help transmit the data to a PC. Moreover, the use of contact switches reduces the system's overall cost.
Keywords: Arduino, HT12E & HT12D ICs, IR transmitter-receiver, contact switch.
AGE BASED USER INTERFACE IN MOBILE OPERATING SYSTEM (IJCSEA Journal)
Mobile phones are now becoming an irreplaceable utility in every household. A phone serves as a wall clock, alarm clock, calculator, calendar, timer and much more, but this multi-functionality has overloaded the interface of the new generation of mobile phones. The youth have adapted well to these multi-functional graphical user interfaces, but the interface is now daunting for two age groups, the elderly and kids, and may end up making new-generation mobiles useless or of very little use to them. This points to the need for an age-based user interface in the mobile operating system, consisting of an interface-selection home screen that directs users to age-oriented interfaces.
Mobile speech and advanced natural language solutions (Springer)
This document discusses two frameworks for semantic interpretation in natural language technology for mobile devices: a rule-based framework and a statistical framework. The rule-based framework draws from expert systems and uses production rules and ontologies. The statistical framework uses data-driven methods. Both frameworks have advantages and drawbacks, and the document speculates that future systems may combine aspects of both frameworks to better understand user intent and resolve ambiguities.
Efficiency or Quality of Experience: A Laboratory Study of Three Eyes- Free T...Vladimir Kulyukin
A growing number of individuals who are blind or visually impaired are using smartphones in their daily activities. The touchscreen is a standard component of smartphones. While it benefits people with low vision by enhancing control of text style and color and the size of images and text, the touchscreen has a downside for visually impaired users: physical buttons for command selection and text entry are replaced with the touchscreen’s soft buttons. To overcome this limitation, we are investigating eyes-free approaches to using the smartphone’s touchscreen for information browsing. In this article, we present a laboratory study of three eyes-free touchscreen user interfaces for browsing menu hierarchies. Our findings indicate that quality of experience and familiarity may be as important as the time efficiency of completing tasks.
The document discusses the rise of conversational interfaces and artificial intelligence. It notes that with billions of users on messaging platforms, businesses can no longer ignore the potential of conversational bots. New methods have produced unprecedented accuracy in natural language understanding. The document then provides a brief history of conversational agents since 1966 and discusses how conversational interfaces are evolving to become more integrated, ubiquitous and driven by artificial intelligence. It emphasizes selecting a strong platform like Chatlayer to build versatile, intelligent conversational solutions.
Interactive speech based games for autistic children with asperger syndromeAmal Abduallah
This document describes an interactive speech-based game system for autistic children with Asperger Syndrome. The system aims to help with communication challenges for these children by allowing interaction through speech, text, or pointing. It includes a web application and two desktop games that incorporate Microsoft's speech recognition and synthesis technologies. The goals are to create an intuitive interface using speech and to provide support for those with disabilities.
Technology Infrastructure For The Pervasive Vision, Does It Exist Yet?Olivia Moran
This document explores the technologies used for pervasiveness in an attempt to determine whether the technology infrastructure needed to implement the pervasive vision exists yet. The different hardware and software used by professionals to create pervasive solutions are examined.
It focuses on the limitations of mobile devices, the operating systems they use, the Wireless Application Protocol (WAP), and the Transmission Control Protocol and Internet Protocol (TCP/IP). The overuse of ad hoc solutions is also examined, along with wireless networks and protocols and the software used for pervasive application development.
It illustrates how seamless communication occurs and the role that network operators and the handover process play in achieving this goal. It also considers how a lack of standards is affecting the success and growth of the pervasive industry, as well as the issue of user acceptance.
2. BACKGROUND
Literacy can be defined in many ways. The U.N. defines a literate person as someone who can “…with understanding, both read and write a short simple statement in his or her everyday life” [21]. Semi-literates represent another large group struggling with reading simple text passages [7]. Many illiterate people have basic numeracy skills, i.e., they can to some degree understand, read and write numbers. Huenerfauth distinguished between technological illiteracy and written language illiteracy [4].

A large number of development initiatives now evolve around the use of mobile phones, but so far most of the commercial offerings (see [16] for an overview of mobile information services for agriculture) require their users to be literate. A number of ICTD projects have aimed at improving rural communication and knowledge building for illiterate users, e.g. through audio wikis [8], discussion forums that extend existing mass media coverage such as community radios [14], and spoken web interfaces for user-generated content [9]. However, to the best of our knowledge no previous work has tried to empower illiterate users to use text-based communication through mobile phones. Possibly the closest to our idea is Shankar’s work on speech writing, or spriting [17].

As a hardware platform for ICT the mobile phone poses various challenges to illiterate users. According to Chipchase’s work, turning on the mobile phone and accepting incoming calls were the most successfully completed tasks by illiterate users of low-end or feature phones [2]. Dialing numbers for an outgoing call already proved more difficult. More complicated features such as contact management or asynchronous text messaging were outside their current reach. To make mobile phones accessible to illiterate users he proposed a design not recognizable as targeting illiterate people, due to the associated stigma, and a minimal feature set supporting only incoming and outgoing calls and a simplified way to store contacts through call logs [2]. The Motorola Motofone F3 ticked many boxes for a phone designed for poor, illiterate people. It was light, very rugged, and provided audio feedback for its functions from power-on throughout its main (minimal) menu. Its e-ink screen could easily be read in bright sunlight, it had a phenomenal battery life (nominally 30 days on standby) useful in rural areas with long power cuts and, at around 20 USD, it was affordable. However, it was not a success. According to an unnamed Motorola source the company had underestimated the aspirational aspects of the device. Given that many people see mobile phones as extensions of themselves, they did not want to be seen with a cheap phone.

Most studies that we surveyed employed illiterate users who were numerate, but all agreed that they could not understand text-based UIs. Researchers advocated minimal use of text and some even text-free UIs [13]. In the greater socio-technical context, however, many illiterate people rely on proxy-literacy and seek out literate helpers to mitigate encounters with text. These helpers can benefit from the existence of text, making their involvement less onerous in comparison to text-free UIs. For example, Chipchase argued for the value of textual descriptions to accompany icons and deemed icon-only interfaces inferior for use by illiterate users.

The value of icons in UIs for illiterate users has been acknowledged and demonstrated in many studies, e.g. [2], [12], [10], [3], but they are not universally recognizable and need to be adapted culturally. For instance, participants understood the “house” icon as a village hut and mistook musical notes for birds [10]. Hand-drawn icons were preferred to realistic photos in studies by Medhi et al., and she noted that icons which indicate an action may require visual cues for indicating motion [12]. Otherwise users might think the icons represent locations or objects, for instance a kitchen instead of cooking [13]. Prasad et al. found that the metaphor of a postcard symbol worked well in a video mail application and helped users overcome their difficulties in understanding the notion of asynchronicity [15]. Bhamidipaty & Deepak improved contact management for illiterates by adding symbols to the phone’s physical number keypad, which allowed users to filter contacts through combinations of these symbols [1]. The meaning of an icon represents a learnt concept, some of which can be more easily understood and recalled than others. However, if additional modalities are available in the UI to explain their meaning, designers often discount concerns about their lack of appropriateness or intuitiveness.

Audio feedback and voice annotation support represent two such modalities, which are used in many designs for illiterate users, e.g. [2], [11], [12], [18], [13], [14]. Audible instructions given to illiterate users needed to be short and simple, and instructions containing multiple steps have to be avoided [18], [10], [15]. When given audio instructions with multiple steps, illiterate users usually performed only the first or the last one. Findlater et al. reported that the combination of text and audio disturbed illiterate users but that semi-literates - with rudimentary reading skills - benefitted from it in transitions from audio+text to text-only interactions [3]. To illiterate users, text in the UI simply represented visual noise. In the search for optimal audio-visual representations for illiterate users of health kiosks, Medhi et al. created visual representations of health symptoms. Voice annotations helped the users in speed of comprehension and increased correct responses. If not told, however, some participants did not understand that the voice annotations were meant to explain the visuals [12].

As an input method, voice has yet to overcome some hurdles. During a longitudinal field trial of Avaaj Otalo - an interactive voice forum for small farmers accessed through voice calls - the users could choose between voice commands and touchtone as input methods to navigate menus. Touchtone input was preferred in the large majority of cases over voice, and users unanimously preferred touchtone navigation. Users found voice input more error prone. However, this could have been due to the low accuracy of the speech recognizer, which was trained on American English and often faced inputs with noisy backgrounds [14].

Common UI conventions and elements presented problems for lingually and technologically illiterate people. Chipchase cautioned against the use of soft-buttons and suggested that each hardware button on mobile phones should map to one task only. Prasad et al. found that users were confused when faced with modes, e.g., when creating a mail required them to choose from video, audio, drawn images and text as input methods [15]. Lalji & Good found the use of lists far more effective than a hierarchical classification. According to this study, participants remembered that they could use the ‘up’ and ‘down’ buttons but easily forgot how to access features when presented with a menu-based interface accessible through soft keys [10]. They warned that color-coding was insufficient if users’ instructions were based on identifying different colored buttons. In their lab studies, users often pressed the green button when instructed to press the blue, and vice versa [10]. In a study by Prasad et al., participants were likely to click on anything green when asked to click on a green arrow [15]. Medhi et al. mentioned that scrollbars were not initially understood, in the sense that subjects did not realize that there were functions displayed below the fold. These users coincided with the ones who had mobile phones restricted to making voice calls [11].
Screen navigation was an issue frequently quoted in previous work [11], [10], [15]. To curb confusion from abrupt screen changes, Prasad et al. proposed that navigation employ animation to transition from one screen to the next [15] - now supported and common in e.g. iPhone and Android UIs. Katre argued for focusing on thumb-based interaction in the design of applications for semi- and illiterate users on smart phones with touch screens [5]. He claimed that this user group lacked fine motor skills due to non-practice in writing, which made stylus and index finger based input slower compared with literates.

Methodologically, involving illiterate users in HCI studies is more challenging than with literate users in advanced economies. Participants in previous studies typically had no faith in technology [13], had difficulties understanding abstract questions and were not used to being tested [18], lacked self-confidence, felt they were not clever enough to use technology and wanted to observe and be taught [10]. Sherwani et al. proposed incremental tutorials for participants before the study in order to better prepare them to use a UI [18]. Prasad et al. reported that congratulatory audio messages after users performed a task seemed to produce encouragement and excitement, leading them to continue navigating the application with more self-confidence - for instance, after successfully logging in to an application, an audio message informing the users that they had successfully entered their inbox and that they could now retrieve their mails [15].

In summary, previous research on designing for illiterate users has produced many recommendations, which, however, often remind us of the problems faced by all novice users of computers, such as the conventions of UIs and the affordances controls have. The recommendations were often derived from usability studies in which people encounter systems for the very first time and outside the context in which they typically discover and learn a new technology. Most of the previous work focused on mobile phones with keypads, which soon might become obsolete. In particular, some of the described hurdles in basic and feature phones, e.g. the problems with soft keys, could be overcome if illiterate users were able to find out what effect a button press would have, analogously to a mouse-over help text, i.e. without the need for pressing the button and carrying out its action. In order to look at requirements we wanted to find out more about illiterate people's actual use, specifically with respect to text messaging, and possibly on more advanced mobile phones than those in use six years ago when Chipchase conducted his seminal work on illiterate people’s mobile phone use.

3. STUDY 1
We conducted interviews with illiterate immigrants in Switzerland to study their use of mobile phones. We got access to them through schools in Switzerland that taught adults how to read and write French. We told the school directors that we were interested in illiterate peoples’ coping strategies as well as their use of mobile phones and the tricks they employ to overcome their inability to read and write. Coming from a scientifically reputable school helped only to some extent, as the teachers and directors of the schools found it hard to understand what would come out of this study, whether their students would be treated with respect, whether anonymity would be guaranteed, and overall what benefit the students and the school would enjoy in return. Some of the teachers were not even sure that many of their students were using mobile phones. A retired researcher who had worked extensively with illiterate people became involved, and her involvement eventually proved helpful in establishing a trusted connection with the schools.

3.1 Participants
We carried out semi-structured interviews (60-90 minutes in duration) in cafés or, if they felt comfortable with it, in the participants’ homes. All of the 9 participants (7f, 2m) living in Switzerland had immigration backgrounds from Africa and Brazil and had only very recently started a course to learn how to read and write. Most of them did not currently hold a job and were supported by either partners or the state. Except for one retired woman, all participants had enrolled in the school to be able to find jobs. They received 20 CHF/h as compensation for their time.

The interview script included the description of a typical day in their life, problems or inconveniences faced, technology used in the home, their use of means of communication, interacting with necessary machinery, e.g. automated teller machines (ATMs), and a special focus on the use of their mobile phones, including receiving and placing calls, SMS, managing contacts and other functionality used.

3.2 Results
In this paper we focus on the use of and coping mechanisms for text messaging. The broader results from the interviews are reported in [1]. Living in a foreign country, our Swiss participants needed to stay in touch with family and friends in their home countries. Calling abroad was expensive and they often used internet cafés to make calls through special operators or VoIP, which required synchronizing with the called party to be at a place at a certain time. Many regarded asynchronous communication such as SMS as a convenient and cost-efficient alternative to stay in touch. As one woman whose daughter was living in Morocco stated:

“I would love to send an SMS to my daughter such as ‘I’m thinking of you’, but unfortunately, this is far too complicated for me.”

Moreover, some had been asked by others to send texts rather than call. However, its reliance on literacy seemed an insurmountable barrier to using SMS to contact people. All of them had received text messages, often unsolicited. Dealing with received text messages varied and depended to some degree on the content. Three of our interviewees had stored SMS that contained telephone numbers for months as another way of looking up contacts.

“I know X sent me this text message that has the telephone number from a friend of mine in Togo. So I go back here [to the inbox of his messages] and need to find his message. Here this is it, he wrote this text in front of the number – my wife read it to me. It’s the name of the friend.”

Some had developed simple heuristics for detecting unsolicited SMS through the length of the sending telephone numbers and the fact that they contained lots of text. Mostly, interviewees responded to an incoming SMS by calling the sender – either they had memorized how to do this through the context menu or they noted down the number and typed it into the phone again. Some interviewees treated all text messages as spam and had learned how to either exit the mode into which the phone switched on reception or how to quickly delete them without checking the content or their origin. Others asked for help with the content of the text messages. None of the interviewees felt bad about being read to, but one of them, who was in a new relationship, found asking close friends to read SMS with romantic content exciting at first but recently increasingly annoying. One participant wondered whether it would be possible to forward an SMS to a service and listen to its content on the phone through a human or machine voice.
The signing of receipts for the payment turned out to be a problem for three of our participants. They never signed any documents unless a trusted person was present to make sure they were not being taken advantage of.

4. EasyTexting
Inspired by the findings around illiterate people’s use and non-use of SMS and their interest in this form of communication, we developed a prototype for a voice-assisted SMS application dubbed EasyTexting. The idea was born during a design course, which evolved around four expert reviews. The first author of this paper developed the conceptual idea iteratively during this course and obtained feedback from interaction designers and researchers who had published in the area of ICT4D.

First explorations of the concept were carried out with paper and post-it notes to simulate screen navigation, and later with Powerpoint slides to simulate interactivity with audio. These early evaluations were based on the idea of sending an SMS through icons only.

4.1 First prototype
In order to be able to test the application with users, we developed an interactive prototype (see Figure 1) with Microsoft Expression Blend on an LG Optimus 7 - a WVGA (800x480 at 246ppi) multi-touch screen phone running the Windows Phone 7 (WP7) operating system. It allowed users to ‘read’ SMS through text-to-speech audio rendition and to compose SMS through a range of icons the user could drag into the message editor and by re-using words from previous messages. The icons represented common text messages such as ‘Yes’ and ‘No’. We used generic face drawings to represent contacts in the phone.

This prototype was composed of five main screens: the thread overview or Inbox screen (entry point), the individual thread or Conversation screen (see Figure 1, left), the Quick sender screen, the Multi-sentence icon or Customize screen, and the Message reviewing screen (see Figure 1, right).

(1) The Inbox screen enabled users to see an overview of message threads by contact.
(2) The Conversation screen allowed users to see all the messages with a contact chronologically ordered. It also allowed users to read incoming messages and to create a new message by: selecting icons for most frequently used messages (in the Quick sender screen), selecting multi-sentence icons by category (Customize screen) and turning into an edit mode to reuse words from previous messages.
(3) The Quick sender screen gave access to icons for the most frequently sent messages.
(4) The Message reviewing screen helped users review the content of messages during their creation.
(5) Apart from selecting multi-sentence icons by category, the Customize screen enabled users to create new entries for multi-sentence icons (to extend the repertoire of sentences available through icons).

The Inbox screen contained all the inbox messages displayed as a vertical list. It contained an iconic picture of a person, their first name and envelopes that represented the new messages (closed) and read messages (open). Tapping on a thread in the list brought up the Conversation screen (Figure 1, left).

The Conversation screen represented the history of all the messages the user had exchanged with a particular person. Each message could be listened to and could visually contain a combination of icons and text (cf. Figure 1, left).

To compose a new message, the user could re-use words from previous messages or rely on icons. In the former case, when the user tapped on the “Edit” button (see Figure 2, left), the whole message was added to the New Message editor area. By tapping on the pencil button, the application switched into the edit mode and the user could select only some parts of the previous messages. In the latter case, by tapping on the smiley icon, the user could navigate to two selection screens:

1. The Quick sender screen contained nine icons representing the most frequent messages sent, such as “ok”, “no”, “I miss you”. The icons did not have text labels; however, each of them had sound support. When the user selected one icon to be added to the message, the icon itself was added to the message editor (see the question mark icon “Why” in Figure 1, left).

2. The Customize screen contained multi-sentence icons arranged by topics such as “Places and activities” and “Feelings”. Each icon had multiple meanings: for instance, the skyscraper icon had three associated sentences: “1: I am at work. 2: I cannot answer, I am busy. 3: I am doing some shopping in the city”. By long-tapping on this icon the user could listen to all the sentences associated with it. If the user wanted to use the second sentence, he had to tap the icon twice. There was no visual/audio feedback on how many times he had already tapped. Voice prompts read out the content but did not provide any action cues. There was the possibility to “add a new entry” to extend the repertoire and add sentences from previous messages to some of the existing icons. For instance, after reception of a message such as “I really miss you today”, the user could add this sentence to his repertoire in the Feelings section.

Figure 1: First prototype Conversation screen (left) and Message reviewing screen (right)

After composing a new message, the user could navigate to the Message reviewing screen and listen to its audio rendition. The icons were transformed into text and read out through text-to-speech. Each group of words was played and its corresponding icon was highlighted. For instance, in Figure 1 (right), the skyscraper icon and the sentence “I am at work” are highlighted while the phone plays this group of words.
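To make the multi-sentence icon behavior concrete, here is a minimal, platform-independent Kotlin sketch of the selection logic described above. This is our illustration, not code from the prototype (which was built in Expression Blend on WP7); the class and function names are hypothetical, and audio playback is stubbed out with a callback.

```kotlin
// Sketch (ours, not the original WP7 code) of a multi-sentence icon:
// a long tap pre-listens to all associated sentences in a row, and
// tapping the icon i times selects sentence number i.
class MultiSentenceIcon(
    private val sentences: List<String>,
    private val speak: (String) -> Unit // stand-in for the TTS engine
) {
    private var tapCount = 0

    // Long tap: play all sentences, prefixed by their ordinal, as the
    // first prototype did ("One: ..., Two: ..., Three: ...").
    fun onLongTap() {
        sentences.forEachIndexed { i, s -> speak("${i + 1}: $s") }
        tapCount = 0
    }

    // Each short tap advances the selection: the i-th tap picks the
    // i-th sentence. As a simplification, the current selection is
    // returned on every tap. The real prototype gave no feedback on
    // the tap count, which study 2 showed to be a problem.
    fun onTap(): String {
        tapCount = (tapCount % sentences.size) + 1
        return sentences[tapCount - 1]
    }
}

fun main() {
    val skyscraper = MultiSentenceIcon(
        listOf("I am at work.", "I cannot answer, I am busy.",
               "I am doing some shopping in the city."),
        speak = { println("TTS: $it") }
    )
    skyscraper.onLongTap()      // pre-listen to all three sentences
    skyscraper.onTap()          // first tap
    println(skyscraper.onTap()) // second tap -> "I cannot answer, I am busy."
}
```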
We carried out walk-throughs with four experts, brainstorming sessions and corridor testing with students to improve the design. We did not test this prototype with any illiterate user. The biggest concern was that there were too many screens to go through; for instance, the Conversation screen could be combined with the Message reviewing screen. The edit mode was deemed too complicated, especially for illiterate users, and gestures available on touch screens, such as scrolling and drag and drop, could reduce the number of taps required.
4.2 Second prototype
The second prototype was simplified yet contained additional information. It included the same main screens as the first prototype - Inbox screen, Conversation screen, Quick sender, and Customize screen - but we removed the Message reviewing screen. The Inbox screen included the date of the last message received and the telephone number of the contact. Tapping on a thread brought up the Conversation screen listing all SMS with a given contact. We removed the need for an editing mode by turning every word into a button. This allowed the users to single-tap on a word to listen to its spoken form and to reuse it by dragging it into the New message editor area that was fixed at the bottom of the Conversation screen (see Figure 2). During the audio playback of a read-out word, it was visually highlighted in synch. We will refer to this assistive function as karaoke from hereon. This represented a new way of reviewing the composed message and, due to its fixed placement, allowed for removing the Message reviewing screen.
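As an illustration only - the prototype was built on WP7, and the authors plan an Android port - the karaoke behavior could be sketched on Android roughly as follows, assuming API level 26+ where the TextToSpeech engine reports the character range currently being spoken. All names here are ours, not the paper's.

```kotlin
// Hedged sketch of karaoke-style highlighting on Android (not the
// original WP7 implementation). TextToSpeech.speak() synthesizes the
// message, and UtteranceProgressListener.onRangeStart() (API 26+)
// reports the character range being spoken, so the UI can highlight
// the corresponding word in synch.
import android.graphics.Color
import android.speech.tts.TextToSpeech
import android.speech.tts.UtteranceProgressListener
import android.text.Spannable
import android.text.SpannableString
import android.text.style.BackgroundColorSpan
import android.widget.TextView

fun playWithKaraoke(tts: TextToSpeech, view: TextView, message: String) {
    val spannable = SpannableString(message)
    tts.setOnUtteranceProgressListener(object : UtteranceProgressListener() {
        override fun onRangeStart(id: String, start: Int, end: Int, frame: Int) {
            // Highlight the word currently being spoken, in red as in
            // the second prototype. A full implementation would also
            // clear earlier highlight spans.
            view.post {
                spannable.setSpan(BackgroundColorSpan(Color.RED), start, end,
                    Spannable.SPAN_EXCLUSIVE_EXCLUSIVE)
                view.text = spannable
            }
        }
        override fun onStart(id: String) {}
        override fun onDone(id: String) {}
        override fun onError(id: String) {}
    })
    tts.speak(message, TextToSpeech.QUEUE_FLUSH, null, "karaoke-msg")
}
```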
Figure 2: Second prototype: Message details screen

To select icons, the user could horizontally scroll the top part of the Conversation screen to bring up the screens containing icons: Quick sender, Feelings, and Places and activities. While the top part of the screen was horizontally scrollable, the New message editor area remained fixed at the bottom of the screen. To select an icon, the user simply had to drag it into the New message editor, and to listen to its meaning he had to single-tap on it. As in the previous prototype, icons themselves were appended to the message editor.

We introduced a second approach for multi-sentence icons. It required the user to tap and hold a multi-sentence icon to open a pop-up on the right-hand side of the icon with all associated sentences. A small play button at the end of each sentence allowed for playing it. To append a sentence, the user had to tap on it in the pop-up. We tried both this pop-up based version and the pre-listen version described in section 4.1 in study 2.

5. STUDY 2

5.1 Participants
We conducted exploratory lab-based tests with three paid (20 CHF/h) participants who had taken part in study 1. All of them spoke French as their second language and had recently started a course to learn how to read and write in French. All three of them were from a beginners' course at one of the aforementioned language schools. According to them, none knew how to read or write in their mother tongue either. A 40-year-old woman from Angola was married and her husband was literate; she had a feature phone without a touch-screen. A 35-year-old Moroccan widow (her late husband was a researcher) and mother of a three-year-old had had an iPhone, which broke after being thrown in the toilet by her toddler. She was planning on buying a newer iPhone model with an Internet subscription in the near future but for the transition was using a Nokia feature phone. A 35-year-old Senegalese father of a six-month-old, married to a literate nurse, had an iPhone with an international subscription.

5.2 Method
We invited the participants for lunch before their session to make them feel at ease. We started by introducing the motivation and the purpose of the application. We guaranteed anonymity and explained that our goal was not to test them but to obtain their feedback as illiterate users. To boost their confidence, we stressed that they were ‘the experts’ who tested applications designed by students. For data collection we used note taking by the experimenter in-situ while the participants were performing the tasks. With their permission we video recorded the interaction of their hands with the UI of the phone along with the soundtrack. Each session lasted about 40 minutes and consisted of four parts:
1. a socio-demographic questionnaire,
2. a semi-structured interview,
3. a usability test of the application including a participatory element around the design of the employed icons, and
4. a debrief interview.

Before starting the questionnaire we introduced ourselves and tried to establish some common ground with the participants. The teachers assured us that the participants did not know how to read or write simple sentences in French. Hence we did not perform additional literacy tests. Since we tried to establish a setting in which the participants were encouraged to provide feedback in a confident way, we deemed literacy tests counterproductive to this end.

The semi-structured interview focused on their use of mobile phones and SMS in their everyday lives. We asked them to show us their mobile phones and the main functions they used. Specifically we probed how they checked call logs, whether they stored contacts on their phones and how they interacted with SMS. We used note taking by the experimenter in-situ and a video camera that recorded the participant’s actions on the mobile phone and the discussions we had with them.

We started the usability test by demonstrating the application, the content of which was entirely in French. We demonstrated navigating through the different screens to check for new messages, listen to a new message and reply to a message by double-tapping on icons and re-using existing words from previous messages. Before having them listen to the meaning of selected icons, we asked them what their meaning might be. When they could not infer the meaning of an icon we used for a particular phrase, we asked them to sketch or to explain to us how they would represent this idea visually. We then demonstrated how the audio counterpart of an icon was invoked by tapping on it. After this demonstration we asked our participants to repeat the same actions and encouraged them to take the phone to scroll, tap and double-tap to get familiar with the touch screen UI. This watch-and-repeat approach was supposed to emulate their learning strategy when confronted with new technology with a literate helper, as mentioned in study 1.
Throughout the session we tried using simple, non-technical language for all explanations. For example, the participants were not familiar with terms like 'application' and 'icons'; pictograms and icons we referred to as ‘little pictures’, for example. We encouraged them to talk out loud, especially about any problems they encountered or parts they found unclear. We stressed that if they did not understand the application or parts of it, it was not their fault but the programmers'. Once we felt that they were confident and understood the main features of the application, we started the usability test, which focused on multi-sentence icons, message composition and reading, and specifically the karaoke feature.

We started with the two different versions of the multi-sentence icons, both of which were available from different icons on the same screen. We compared the pre-listen version we introduced in the first prototype with the pop-up based one introduced in the second prototype. Recall that in the pre-listen version users had to tap and hold on the icon to listen to all the sentences associated with it in a row. Then, to select sentence number i, users had to tap on the icon i times. In the pop-up version users were required to tap and hold on the icon, and a little pop-up appeared on the screen with all the associated sentences. To select a sentence, users simply had to tap on it. For both versions we asked them to long-tap on the multi-sentence icon and queried whether they had an idea of how to append one of the offered sentences to the message editor.

For the composition, we situated them in the following scenario: “Let’s suppose you received an SMS from Amisha, a friend of yours. You can see you have a new message from Amisha in your INBOX [participants are in the thread screen]. Now, you can tap on this message to see why Amisha is sending you this SMS [participants navigate to the message details screen (see Figure 1)].” After they had navigated to the Conversation screen, we made them listen to what Amisha had sent by tapping on the play button next to the text message “Cinema tonight?” that was on top of the list. To double-check that they had understood the audio message, we asked them to explain why Amisha had sent them an SMS. After their explanation, we asked them to reply that they were not free tonight with “Tonight, no.” To make it easier, we broke this task into two subtasks through which we walked the participants:
(1) We asked them to reuse the word “tonight” from the previous message by first finding it in the previous message and appending it to the message editor. When necessary we reminded them to use a double tap on the word.
(2) We asked them to find the icon “No” in the list of icons in the Quick sender screen and to append it to the editor.
Before sending, they had to review the composed message by tapping on the “play” button.

For reading, we tested what happened if we removed the karaoke function (the words currently played were highlighted in red). We had two versions of the application: one with karaoke support and another in which the whole sentence was played out but with no visual feedback in the UI. We tested two sentences in French: “When do you come back?” (“Tu rentres quand?”) and “Cinema tonight?” (“Cine ce soir?”). First we asked the participants to play out the sentences and identify as many words as they could in the karaoke version. Then we asked them to repeat this with the same sentences in the version without the karaoke. They could listen to the message as many times as they wanted. At the end of each completed sub-task we provided congratulatory or encouraging feedback.

For the usability test, the video recordings were our first source of data collection. We reviewed the recordings and for each performed task reviewed the kinds of errors participants made and on which screen they occurred. Due to the low number of participants we did not conduct any statistical tests, however. During the debrief interviews, we elicited whether they found the application easy to use and whether it would be useful for them in their everyday lives.

5.3 Results
All three participants were comfortable with their own phones. They navigated very quickly on them and used several functionalities, such as the radio and photo camera, apart from making and receiving calls. The man from Senegal even used a football app on his iPhone to check the outcome of football matches and who scored, since he understood both the number format of scores and the roster, which featured players’ head shots along with icons for goals scored. In terms of SMS, all of them knew how to handle and open incoming SMS and used literate helpers for the content. The three of them were numerate and knew how to read date and time, but they found the latter easier from a digital than from an analog clock with hands. When asked if she knew how to search for her messages, one participant proudly showed how quick she was at searching for new SMS. She knew how to create a new SMS but could not compose text in it. She used SMS very often with the help of literate friends and had 256 SMS in her inbox. “I know how to check the call logs, how to delete, how to do almost everything on my cell phone; the only problem I have is reading and writing SMS.” The Senegalese man never used SMS, since it was too long and too complicated for him to try composing one, but he had a number of SMS in his inbox, which mostly contained telephone numbers of people along with their names. His wife had read the SMS to him and he consulted them when he needed the phone number of a contact.

The two iPhone owners succeeded in sending the SMS “Tonight, no”. The third participant seemed not as confident. She hardly touched the phone during the whole interview even though we encouraged her several times to do so. Worrying that this might embarrass or stress her too much, we refrained from pushing her further through the scenario. This participant found it difficult to come up with possible meanings of the icons and struggled with the concept of text being associated with the icons. For her, icons represented or were related to actions: “this [pointing to the smiling emoticon] means I am talking with someone and this [pointing to the sad emoticon] represents the person I am talking with”. The two iPhone owners roughly understood the meaning of the icons but were not entirely sure. Asked about the meaning of the call icon (depicting a receiver), one said: “This might mean ‘Call me’ or maybe ‘I will call you later’”. Hearing the audio counterpart removed any doubts for them.

The idea of having multiple meanings for an icon and making them available (in both versions) through multiple taps was challenging for all participants. None of them succeeded in appending a sentence to the editor, and all asked for help on what they had to do. In the pre-listen version, the length of the entire prompt - “One: sentence 1, Two: sentence 2, Three: sentence 3” - was too long, and at the end the participants could not remember the first sentence anymore. In the version with the pop-up, they were surprised by it and did not know where to tap to listen to the several associated sentences. The corresponding play buttons at the end of each sentence in the pop-up were relatively small but clearly visible.

We tested playing back a message with and without the karaoke. With the karaoke, all of them succeeded in matching some words to the played sound.
While the karaoke was playing, the woman from Morocco remarked: “Oh, yes, cinema, this word is cinema… Ci ne ma”; she pointed at the word and tapped on it to check she was right. Without the karaoke, the participants did not even realize there was a link between what they were hearing and the sentence played by the phone.

None of our participants seemed uncomfortable with being tested, but their confidence varied. The woman from Angola often asked “Am I right? Am I saying the right thing?” while the other two were more self-confident. The man from Senegal immediately wanted to touch the phone, play the messages, drag some icons into the message editor and scroll to go through all the screens. When he and the other iPhone user succeeded in sending the SMS, they asked: “That's it? Is my message really sent?” They seemed surprised by the simplicity.

From the beginning, the woman from Morocco was excited about the application: “This could be wonderful for people like me, is it possible to get the application on my mobile phone today?” The other iPhone owner called us one hour after the interview to thank us for dedicating our time to help “people like him” and to express his interest in obtaining the application.

At the end of the test, they seemed proud of helping us and of being useful to researchers from a respected university. The feedback we obtained from the teachers of the school was very positive and conveyed that the man from Senegal was “transformed” after the session and for the first time had learned his lesson for the next day.

6. DISCUSSION
Although our two studies were based on a small number of users, we consistently found how proficient illiterate users were in navigating and using their mobile phones - be it low-end, feature or smart phones. Since two out of three users in study 2 had iPhones already, our results were biased compared to users who had never used smartphones before, but they added to the existing evidence that using a smartphone proficiently is not a cognitive matter but a matter of habits. Illiterate users who are used to smartphones can be as proficient as literate users in using their mobile phones, at least for the functions that are important to them.

Similar to Lalji & Good’s finding in which participants were uncomfortable with touch screens, our feature phone owner from Angola, despite encouragement, was disinclined to touch the phone. In the few cases when she did, her styled, long and curved fingernails made interactions with the touch screen seem a little awkward, because they would hit the screen first and the angle for touching was quite low. But both iPhone users had no problem whatsoever interacting with our touch screen based application and, with initial explanations, managed to compose messages successfully. Like Medhi et al., we do not believe this to be a cognitive issue, since the other two participants were confident using the touch screen even with an application that they had no previous experience with. Before testing touch-screen applications, users need to be taught the basics of touch screen interaction. In contrast to Katre’s study, our participants had no problems using their index fingers for interacting with the touch screen, although some of the icons were relatively small. We are aware that these differences from Katre’s observations might be due to the difference between our users (Swiss immigrants from developing countries) and the rural farmers he studied.

As can be expected, the icons we used - although carefully chosen - were not self-explanatory. Each participant had his or her own representation of an idea. Audio support for icons was helpful to avoid misinterpretations, especially when seeing them for the first time. Additionally, any mistakes could easily be corrected by deleting erroneously added words from the editor. The corpus of icons was limited, but we hope that it will provide an initial entry point for illiterate users to create words through which they can express themselves. Obviously a speech recognition facility could be more versatile and powerful.

The participants struggled with the concept of multi-sentence icons. For the pre-listen version, none of our participants understood that the numbers “1, 2, 3” corresponded to the number of times they had to tap on the icon to add the sentence to the message editor. Thus, after long-tapping on the icons and listening to the three options, the users did not know what to do, since the voice prompts did not provide any action cues. Instead of giving the rather abstract guideline “One: I will be late. Two: …” we should have given an action cue such as “Tap this icon once for I will be late, tap it twice for…”. The combined prompts were too long and in hindsight reminded us of Medhi et al.’s recommendations about short and simple audio instructions [12]. Our participants did not succeed in memorizing the three different meanings of a single icon. Once reminded of the sentences and of the fact that the numbers corresponded to how many times they had to tap on the icon, their main problem was that there was no feedback on how many times they had already tapped on the icon. This behavior was also inconsistent with how words and regular icons responded to taps.
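A small sketch of the fix suggested above - our illustration, not code from the prototype: the pre-listen prompt can be generated with explicit action cues instead of abstract ordinals.

```kotlin
// Sketch (ours) of building a pre-listen prompt with action cues
// ("tap once for ...") instead of abstract ordinals ("One: ..."),
// as the discussion above suggests.
fun actionCuePrompt(sentences: List<String>): String =
    sentences.mapIndexed { i, s ->
        val times = when (i) {
            0 -> "once"
            1 -> "twice"
            else -> "${i + 1} times"
        }
        "Tap this icon $times for: $s"
    }.joinToString(" ")

fun main() {
    println(actionCuePrompt(listOf("I will be late.", "I am busy.")))
    // -> Tap this icon once for: I will be late. Tap this icon twice for: I am busy.
}
```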
Compared to voice mail or voice-based SMS services (e.g. India’s VoiceSMS), our application offers additional value. With voice mail, users need to have network access to compose a new message; with our application, however, users can review and compose their SMS offline. Standard SMS are cheap or even free (e.g. a hundred SMS per day) as part of certain prepaid contracts. Most importantly, voice mails offer no potential for learning, whereas our application provides an audio-visual matching between text and audio, which can represent a source of learning for users. According to Srivastava [20], an Indian NGO has started encouraging women to buy mobile phones because of the potential to learn various alphabets, including English, through them. We do not want to claim that illiterate users will learn how to read and write with this application alone. But we see potential for it in providing additional encounters with text with concrete short-term goals, providing reading practice and thereby incentivizing and catalyzing literacy acquisition. Particularly the fact that our participants were not able to identify words after removing the karaoke function convinces us that illiterate, neo-literate and semi-literate users will find this application helpful. Everyday exposure to text in conjunction with audio in same-language subtitles of movie content was also shown to improve reading and writing skills in neo-literates [7]. Semi-literates in Findlater et al.’s study benefited from the combination of text and audio and had superior word recognition at the end of each session after the second day of use [3].

Chipchase recommended that phones for illiterates should not be recognizable as such because of the associated stigma [2]. The only thing that might reveal a user’s illiteracy to bystanders while using our application is the sound played when tapping on words and icons. This can be mitigated by headphone use. Moreover, all the SMS sent from our application are regular SMS. If an EasyTexting user sends an SMS, there will be no way for the recipient to know that it was written with an application for illiterates.

Recruiting and running studies with illiterate users in Western countries is a challenge, since they are not numerous and since they usually try to hide their illiteracy.
Our way to get in contact with them was via schools. Establishing initial contacts with the schools and gaining the trust of the staff and teachers took time. Partly, they wanted to make sure we were going to treat their students with respect and without a patronizing attitude. Despite the testimonials of teachers, only some of the students volunteered to participate, despite remuneration. Almost all of our participants were financially reliant on their partners.

7. FINAL DESIGN
The entry screen of the application depicted in Figure 3 is the Inbox screen, which contains all the threads of received messages. Each thread item contains the picture, phone number and name of the contact and one line of the last exchanged message. The last three digits of the phone number are highlighted by setting them apart from the rest to aid recognition of contacts by phone numbers, as mentioned in [6]. Tapping on a list item brings up all the messages exchanged with this particular contact.

Figure 3: Third prototype: Inbox screen

Contrary to the other prototypes, we added a text label underneath each icon (cf. Figure 4, right). For the composition of a message the user can rely on icons, on re-use of words, or on both. Double-tapping on a word in a previous message results in appending the word to the message editor (the grey speech bubbles in Figure 4). Icons only have one meaning, and the user has to scroll horizontally to the Quick sender screen as illustrated in Figure 4. Analogously to words, a single tap on an icon plays the sound of the word or sentence associated with it. Contrary to the previous prototypes, in which double tapping on an icon placed the icon itself in the message editor, double taps on icons place their corresponding words in the message editor. As with received messages, each word of the sentence under composition is a button and on tap delivers its audio. Double tapping on a word in the message editor results in its deletion. We enlarged the word borders to improve tapping on single-letter words and punctuation.

Initially we had experimented with single taps on icons and words to add them to the editor and long taps for the audio. But after some corridor tests, long taps proved to be too time-consuming. Since the equivalent of a mouse-over event does not exist on touch screens, we needed to find a way to provide an element's audio rendition without triggering another action. We settled on single taps to play the sound and double taps to add the icon's associated text to the message editor. Thus, a single tap was used to represent a mouse-over event. This is akin to the iPhone's accessibility approach for blind users. We used monochrome simple icons to mimic the WP7 metro design's look and feel.
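The single-tap-as-hover rule can be captured in a few lines. The following Kotlin sketch is our illustration of the dispatch logic only; the names and the 300 ms double-tap window are assumptions, not values from the paper.

```kotlin
// Sketch (ours) of the final design's tap semantics: a single tap acts
// like a mouse-over and plays the element's audio; a double tap commits
// the action (append a word or an icon's text, or delete a word in the
// editor). The single-tap action is deferred until the double-tap
// window expires, so a double tap never also triggers playback.
import java.util.concurrent.Executors
import java.util.concurrent.ScheduledFuture
import java.util.concurrent.TimeUnit

class TapInterpreter(
    private val playAudio: () -> Unit,   // "hover": speak the element
    private val commit: () -> Unit,      // append to / delete from editor
    private val doubleTapWindowMs: Long = 300 // assumed, not from the paper
) {
    private val scheduler = Executors.newSingleThreadScheduledExecutor()
    private var pending: ScheduledFuture<*>? = null

    fun onTap() {
        val p = pending
        if (p != null && p.cancel(false)) {
            pending = null
            commit()  // second tap arrived within the window: double tap
        } else {
            // First tap: wait for a possible second tap before speaking.
            pending = scheduler.schedule(
                Runnable { pending = null; playAudio() },
                doubleTapWindowMs, TimeUnit.MILLISECONDS)
        }
    }

    fun shutdown() = scheduler.shutdown()
}

fun main() {
    val word = TapInterpreter(
        playAudio = { println("TTS: \"tonight\"") },
        commit = { println("appended \"tonight\" to the editor") })
    word.onTap(); word.onTap() // double tap -> append
    Thread.sleep(500)          // let any pending single-tap action settle
    word.shutdown()
}
```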
Figure 4: Third prototype: Conversation screen (left) and Quick sender sub-screen (right)

The final design of the EasyTexting application is composed of two main screens: the Inbox and the Conversation, the latter of which extends to the sub-screens providing access to icons (Quick sender, Feelings, Places and activities).

The Inbox screen is similar to the existing SMS composition tool on WP7, except for the added contact picture and the last three digits of the phone numbers being visually separated.

The Conversation screen, with the history of all the previous messages exchanged with someone, differs in various points from its WP7 counterpart. The picture, phone number and name of the contact users are exchanging SMS with are displayed at the top of the page. While the middle of the screen is scrollable, this part at the top is fixed. We followed the Windows convention and displayed SMS in speech bubbles. However, each word is a button the user can reuse in a new message. This removes the need for the copy and paste functionality of the regular SMS version. But words can only be added sequentially to the end of the message editor. The current application does not allow users to use the keyboard, attach a picture (MMS), or save a message as a draft.

Compared to the standard SMS application on WP7, our application includes sound support. Each word is a playable button and each SMS can be played with karaoke support. Without this feature, users could not “read” or understand the content of an SMS by themselves and could not make use of text messaging.

From the Conversation screen, users can directly access the icon dictionary screens by scrolling horizontally. Each icon is playable and has a predefined sentence associated with it.

8. CONCLUSION
Through this research we discovered that illiterate people used their mobile phones a lot but were unable to use text-based applications. Managing their contacts and dealing with SMS were the two things they struggled with most or could not do at all. When it came to SMS, they used tricks such as asking their relatives to read the SMS or calling back the senders. In our prototype, we kept many UI conventions that we had found usable for illiterate users, such as the threaded view of SMS and the main presentation of the inbox screen.
No previous applications on touch-screen phones for illiterate users had been developed before. Our findings from two studies add to the evidence that using touch-screen phones does not represent a cognitive problem for illiterate users but only a problem in terms of lacking confidence or technological literacy. We found promising first evidence that illiterate users can use text messaging in conjunction with audio, text and visuals when initial training is provided. Overall, the users we interviewed were interested in making use of text messaging and some of them wanted to take the application home with them. From our findings we argue that ICTD research should not reduce mobile phones to mere telephones with simplified storage for contacts. This restrictive approach would most likely fail in the marketplace because it denies illiterates the enjoyment of other functions such as entertainment through music, pictures and video. Touch screen phones with on-demand voice feedback can enable illiterate users to use potentially important information services by leveraging the affordances of multimedia UIs on touch screen phones. Chipchase concluded that to improve literacy skills the best solution would be a phone. We aimed at this by providing an application that allows illiterates to compose and listen to SMS. We combined icons, audio, text and in-synch highlighting of read-out words to aid recognition and possibly reading acquisition. In our application, words are objects that react to taps and reveal their meaning in audio form. Initial tests with touch-screen-experienced participants showed potential for this approach.

9. FUTURE WORK
We plan to further add to this application by improving the input of text a) through keyboard entries, e.g. for numbers, b) through speech recognition, c) by reusing words from previous SMS from all threads, d) by providing tactile feedback when words are added to the message composer, and e) by providing a movable insertion point. We would like to improve the contact manager for illiterate users, both for picking contacts and for the management itself. Searching through a long list of contacts is time consuming for illiterate users, since the search is based on alphabetic order. Moreover, creating a new entry can be difficult when written names are required for a contact – see [6] for more details.

We plan to port the application to the Android platform, extend it with speech recognition for the composition of messages, and carry out field studies. We would like to evaluate the application with illiterate and semi-literate users as well as the elderly.
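As a rough illustration of the planned speech-recognition input on Android - our sketch, not the authors' implementation - the platform's built-in recognizer could be invoked along these lines:

```kotlin
// Sketch (ours) of speech input for message composition on Android,
// using the platform's RecognizerIntent. The activity launches the
// system recognizer and appends the best transcription to the editor.
import android.app.Activity
import android.content.Intent
import android.speech.RecognizerIntent

const val REQ_SPEECH = 1 // arbitrary request code

fun Activity.startSpeechInput(languageTag: String = "fr-FR") {
    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
        putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                 RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
        putExtra(RecognizerIntent.EXTRA_LANGUAGE, languageTag)
    }
    startActivityForResult(intent, REQ_SPEECH)
}

// Called from the activity's onActivityResult: the recognizer returns
// a list of candidate transcriptions; we keep the best one.
fun handleSpeechResult(requestCode: Int, resultCode: Int, data: Intent?,
                       appendToEditor: (String) -> Unit) {
    if (requestCode == REQ_SPEECH && resultCode == Activity.RESULT_OK) {
        data?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
            ?.firstOrNull()
            ?.let(appendToEditor)
    }
}
```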
10. ACKNOWLEDGMENTS
We would like to express our gratitude to Oscar Bolanos and Lukas Frelich for helping with the implementation; Anne Marquis, Catherine Wick, Annick Mello Spano, the teachers from Lire-et-écrire and Français-en-jeu, and all interviewees for their time; and Jeffrey Huang, Jan Blom, Florian Egger, Mairi Willis, Daniel Keller, Gunnar Harboe and Saket Sathe for providing valuable feedback and guidance. This research has been funded by the Swiss Development Council in collaboration with cooperation@EPFL.

11. REFERENCES
1. Bhamidipaty, A. Symab: Symbol-based address book for the semi-literate mobile user. Human-Computer Interaction – INTERACT 2007, (2007), 389–392.
2. Chipchase, J. Understanding non-literacy as a barrier to mobile phone communication. http://research.nokia.com/bluesky/non-literacy-001-2005/index.html.
3. Findlater, L., Balakrishnan, R., and Toyama, K. Comparing semiliterate and illiterate users’ ability to transition from audio+text to text-only interaction. Proceedings of the 27th international conference on Human factors in computing systems, ACM (2009), 1751–1760.
4. Huenerfauth, M.P. Design approaches for developing user-interfaces accessible to illiterate users. University College Dublin, Ireland, (2002).
5. Katre, D. One-handed thumb use on smart phones by semi-literate and illiterate users in India: A usability report with design improvements for precision and ease. Proceedings of Workshop on Cultural Usability and Human Work Interaction Design, NordiCHI Conference, Lund, Sweden, (2008).
6. Knoche, H. and Huang, J. Text is not the enemy - How illiterates use their mobile phones. NUIs for New Worlds: New Interaction Forms and Interfaces for Mobile Applications in Developing Countries - CHI 2012 workshop, (2012).
7. Kothari, B., Takeda, J., Joshi, A., and Pandey, A. Same language subtitling: a butterfly for literacy? International Journal of Lifelong Education 21, 1 (2002), 55–66.
8. Kotkar, P., Thies, W., and Amarasinghe, S. An audio wiki for publishing user-generated content in the developing world. HCI for Community and International Development (Workshop at CHI 2008), Florence, Italy, (2008).
9. Kumar, A., Agarwal, S.K., and Manwani, P. The spoken web application framework: user generated content and service creation through low-end mobiles. Proceedings of the 2010 International Cross Disciplinary Conference on Web Accessibility (W4A), ACM (2010), 1–10.
10. Lalji, Z. and Good, J. Designing new technologies for illiterate populations: A study in mobile phone interface design. Interacting with Computers 20, 6 (2008), 574–586.
11. Medhi, I., Gautama, S.N., and Toyama, K. A comparison of mobile money-transfer UIs for non-literate and semi-literate users. Proceedings of the 27th international conference on Human factors in computing systems, (2009), 1741–1750.
12. Medhi, I., Prasad, A., and Toyama, K. Optimal audio-visual representations for illiterate users of computers. Proceedings of the 16th international conference on World Wide Web, (2007), 882.
13. Medhi, I., Sagar, A., and Toyama, K. Text-free user interfaces for illiterate and semiliterate users. Information Technologies and International Development 4, 1 (2007), 37–50.
14. Patel, N., Chittamuru, D., Jain, A., Dave, P., and Parikh, T.S. Avaaj Otalo - A field study of an interactive voice forum for small farmers in rural India. Proceedings of the 28th international conference on Human factors in computing systems (Atlanta, GA, USA, 2010), ACM, (2010).
15. Prasad, A., Medhi, I., Toyama, K., and Balakrishnan, R. Exploring the feasibility of video mail for illiterate users. Proceedings of the working conference on Advanced visual interfaces, (2008), 103–110.
16. Rao, K.V. and Sonar, R.M. M4D applications in agriculture: Some developments and perspectives in India. Defining the “D” in ICT4D, (2009), 104–111.
17. Shankar, T.M.R. Speaking on the Record. 2004.
18. Sherwani, J., Palijo, S., Mirza, S., Ahmed, T., Ali, N., and Rosenfeld, R. Speech vs. touch-tone: Telephony interfaces for information access by low literate users. Proc. IEEE/ACM Int’l Conference on Information and Communication Technologies and Development, (2009).
19. Smyth, T.N., Kumar, S., Medhi, I., and Toyama, K. Where there’s a will there’s a way: mobile media sharing in urban India. Proceedings of the 28th international conference on Human factors in computing systems, (2010), 753–762.
20. Srivastava, K. Indian women learn alphabets on handsets. Mobiledia. http://www.mobiledia.com/news/122456.html
21. UNESCO. Gender and Education for All: The Leap to Equality. 2003. http://www.unesco.org/new/en/education/themes/leading-the-international-agenda/efareport/reports/20034-gender/.