SlideShare a Scribd company logo
1 of 28
Human Computer Interaction
1 - Lecture
Chapter 10: Universal Design and User Support
Universal Design Principles
Speech Recognition and Speech Synthesis
Recognitions, Gestures, Users with Disabilities
Objectives
At the end of the lesson, the students are
able to:
• Identify the tools that can be used by
users with disabilities
• Classify the multimodal system used in
a system and the different user support
tools
• Design a system and create a user’s
manual so that they can be used by
anyone in any circumstances
Universal Design Principles - NCSW
• equitable use
• flexibility in use
• simple and intuitive to use
• perceptible information
• tolerance for error
• low physical effort
• size and space for approach and use
Multi-Sensory Systems
• More than one sensory channel in
interaction
– e.g. sounds, text, hypertext, animation,
video, gestures, vision
• Used in a range of applications:
– particularly good for users with special
needs, and virtual reality
• Will cover
– general terminology
– speech
– non-speech sounds
Usable Senses
The 5 senses (sight, sound, touch, taste
and smell) are used by us every day
– each is important on its own
– together, they provide a fuller interaction with the
natural world
Computers rarely offer such a rich
interaction
Can we use all the available senses?
– ideally, yes
– practically – no
We can use • sight • sound • touch
Multi-modal vs. Multi-media
• Multi-modal systems
– use more than one sense (or mode ) of
interaction
e.g. visual and aural senses: a text processor may speak the
words as well as echoing them to the screen
• Multi-media systems
– use a number of different media to
communicate information
e.g. a computer-based teaching system:may use video,
animation, text and still images: different media all using
the visual mode of interaction; may also use sounds, both
speech and non-speech: two more media, now using a different
Speech
Human beings have a great and
natural mastery of speech
– makes it difficult to appreciate the
complexities
but
– it’s an easy medium for communication
Structure of Speech
phonemes
– 40 of them
– basic atomic units
– sound slightly different depending on the
context they are in, these larger units are …
allophones
– all the sounds in the language
– between 120 and 130 of them
– these are formed into …
morphemes
– smallest unit of language that has meaning.
Speech (cont’d)
Other terminology:
• prosody
– alteration in tone and quality
– variations in emphasis, stress, pauses and
pitch
– impart more meaning to sentences.
• co-articulation
– the effect of context on the sound
– transforms the phonemes into allophones
• syntax – structure of sentences
• semantics – meaning of sentences
Speech Recognition Problems
• Different people speak differently:
– accent, intonation, stress, idiom, volume, etc.
• The syntax of semantically similar sentences
may vary.
• Background noises can interfere.
• People often “ummm.....” and “errr.....”
• Words not enough - semantics needed as well
– requires intelligence to understand a sentence
– context of the utterance often has to be known
– also information about the subject and speaker
e.g. even if “Errr.... I, um, don’t like this” is recognised,
it is a fairly useless piece of information on it’s own
The Phonetic Typewriter
• Developed for Finnish (a phonetic language,
written as it is said)
• Trained on one speaker, will generalise
to others.
• A neural network is trained to cluster
together similar sounds, which are then
labelled with the corresponding
character.
• When recognising speech, the sounds
uttered are allocated to the closest
corresponding output, and the character
The Phonetic Typewriter (ctd)
Speech Recognition: useful?
• Single user or limited vocabulary
systems
e.g. computer dictation
• Open use, limited vocabulary systems can
work satisfactorily
e.g. some voice activated telephone systems
• general user, wide vocabulary systems …
… still a problem
• Great potential, however
– when users hands are already occupied
e.g. driving, manufacturing
– for users with physical disabilities
Speech Synthesis
The generation of speech
Useful
– natural and familiar way of receiving
information
Problems
– similar to recognition: prosody particularly
Additional problems
– intrusive - needs headphones, or creates
noise in the workplace
– transient - harder to review and browse
Speech Synthesis: useful?
Successful in certain constrained applications
when the user:
– is particularly motivated to overcome problems
– has few alternatives
Examples:
• screen readers
– read the textual display to the user
utilised by visually impaired people
• warning signals
– spoken information sometimes presented to pilots
whose visual and haptic skills are already fully
occupied
Non-Speech Sounds
boings, bangs, squeaks, clicks etc.
• commonly used for warnings and
alarms
• Evidence to show they are useful
– fewer typing mistakes with key clicks
– video games harder without sound
• Language/culture independent, unlike
Non-Speech Sounds: useful?
• Dual mode displays:
– information presented along two different
sensory channels
– redundant presentation of information
– resolution of ambiguity in one mode through
information in another
• Sound good for
– transient information
– background status information
e.g. Sound can be used as a redundant mode in
the Apple Macintosh; almost any user action
(file selection, window active, disk insert,
Auditory Icons
• Use natural sounds to represent
different types of object or action
• Natural sounds have associated semantics
which can be mapped onto similar
meanings in the interaction
e.g. throwing something away
~ the sound of smashing glass
• Problem: not all things have associated
meanings
• Additional information can also be
presented:
SonicFinder for the Macintosh
• items and actions on the desktop
have associated sounds
• folders have a papery noise
• moving files – dragging sound
• copying – a problem …
sound of a liquid being poured into a
receptacle
rising pitch indicates the progress of
the copy
Earcons
• Synthetic sounds used to convey information
• Structured combinations of notes (motives )
represent actions and objects
• Motives combined to provide rich information
– compound earcons
– multiple motives combined to make one more
complicated earcon
Earcons (ctd)
• family earcons
similar types of earcons represent similar classes
of action or similar objects: the family of “errors”
would contain syntax and operating system errors
• Earcons easily grouped and refined due to
compositional and hierarchical nature
• Harder to associate with the interface task
since there is no natural mapping
touch
• haptic interaction
– cutaneous perception
• tactile sensation; vibrations on the skin
– kinesthetics
• movement and position; force feedback
• information on shape, texture,
resistance, temperature, comparative
spatial factors
• example technologies
– electronic braille displays
– force feedback devices e.g. Phantom
• resistance, texture
Handwriting recognition
Handwriting is another communication mechanism
which we are used to in day-to-day life
• Technology
– Handwriting consists of complex strokes and
spaces
– Captured by digitising tablet
• strokes transformed to sequence of dots
– large tablets available
• suitable for digitising maps and technical
drawings
– smaller devices, some incorporating thin
screens to display the information
• PDAs such as Palm Pilot
Handwriting recognition (ctd)
• Problems
– personal differences in letter formation
– co-articulation effects
• Breakthroughs:
– stroke not just bitmap
– special ‘alphabet’ – Graffeti on PalmOS
• Current state:
– usable – even without training
– but many prefer keyboards!
gesture
• applications
– gestural input - e.g. “put that there”
– sign language
• technology
– data glove
– position sensing devices e.g MIT Media Room
• benefits
– natural form of interaction - pointing
– enhance communication between signing and non-
signing users
• problems
– user dependent, variable and issues of
Users with disabilities
• visual impairment
– screen readers, SonicFinder
• hearing impairment
– text communication, gesture, captions
• physical impairment
– speech I/O, eyegaze, gesture, predictive systems
(e.g. Reactive keyboard)
• speech impairment
– speech synthesis, text communication
• dyslexia
– speech input, output
• autism
– communication, education
…plus…
• age groups
– older people e.g. disability aids, memory
aids, communication tools to prevent social
isolation
– children e.g. appropriate input/output
devices, involvement in design process
• cultural differences
– influence of nationality, generation, gender,
race, sexuality, class, religion, political
persuasion etc. on interpretation of
interface features
– e.g. interpretation and acceptability of
language, cultural symbols, gesture and
Reference/s
Dix, A., Finlay, J., Abowd,
G.D., & Beale, R. (2004).
Human computer interaction
(3rd ed.). Prentice Hall.
ISBN 0-13-046109-1.
Preece, J., Rogers, Y., &
Sharp, H. (2015).
Interaction design: Beyond
human-computer interaction
(4th ed.) John Wiley & Sons
Ltd. ISBN 978-1-119-02075-2.
Interaction Design: Beyond

More Related Content

Similar to Chapter 10 - Universal Design and User Support.pptx

Speechrecognition 100423091251-phpapp01
Speechrecognition 100423091251-phpapp01Speechrecognition 100423091251-phpapp01
Speechrecognition 100423091251-phpapp01girishjoshi1234
 
Human Factors and User Interface Design
Human Factors and User Interface DesignHuman Factors and User Interface Design
Human Factors and User Interface DesignSaggitariusArrow
 
DTUI6_chap09_accessiblePPT.pptx
DTUI6_chap09_accessiblePPT.pptxDTUI6_chap09_accessiblePPT.pptx
DTUI6_chap09_accessiblePPT.pptxHetaSuto
 
UXD Multimedia output
UXD Multimedia outputUXD Multimedia output
UXD Multimedia outputrolfHRO
 
Assistive technology
Assistive technology  Assistive technology
Assistive technology janiceaughey
 
Inclusive E-Learning good practice becoming general practice
Inclusive E-Learning good practice becoming general practiceInclusive E-Learning good practice becoming general practice
Inclusive E-Learning good practice becoming general practiceStaff Development
 
NLP,expert,robotics.pptx
NLP,expert,robotics.pptxNLP,expert,robotics.pptx
NLP,expert,robotics.pptxAmanBadesra1
 
Assistive Technology - Module 4 gwen kenny
Assistive Technology - Module 4 gwen kennyAssistive Technology - Module 4 gwen kenny
Assistive Technology - Module 4 gwen kennygkenny1
 
Tutorial: Text Analytics for Security
Tutorial: Text Analytics for SecurityTutorial: Text Analytics for Security
Tutorial: Text Analytics for SecurityTao Xie
 
Automatic Speech Recognition.ppt
Automatic Speech Recognition.pptAutomatic Speech Recognition.ppt
Automatic Speech Recognition.pptRudraSaraswat3
 
Automatic speech recognition
Automatic speech recognitionAutomatic speech recognition
Automatic speech recognitionboddu syamprasad
 
Digital speech processing lecture1
Digital speech processing lecture1Digital speech processing lecture1
Digital speech processing lecture1Samiul Parag
 

Similar to Chapter 10 - Universal Design and User Support.pptx (20)

Speechrecognition 100423091251-phpapp01
Speechrecognition 100423091251-phpapp01Speechrecognition 100423091251-phpapp01
Speechrecognition 100423091251-phpapp01
 
Human Factors and User Interface Design
Human Factors and User Interface DesignHuman Factors and User Interface Design
Human Factors and User Interface Design
 
ICS3211_lecture 08_2023.pdf
ICS3211_lecture 08_2023.pdfICS3211_lecture 08_2023.pdf
ICS3211_lecture 08_2023.pdf
 
Amadou
AmadouAmadou
Amadou
 
DTUI6_chap09_accessiblePPT.pptx
DTUI6_chap09_accessiblePPT.pptxDTUI6_chap09_accessiblePPT.pptx
DTUI6_chap09_accessiblePPT.pptx
 
ICS3211 lecture 07
ICS3211 lecture 07ICS3211 lecture 07
ICS3211 lecture 07
 
UXD Multimedia output
UXD Multimedia outputUXD Multimedia output
UXD Multimedia output
 
Assistive technology
Assistive technology  Assistive technology
Assistive technology
 
Inclusive E-Learning good practice becoming general practice
Inclusive E-Learning good practice becoming general practiceInclusive E-Learning good practice becoming general practice
Inclusive E-Learning good practice becoming general practice
 
NLP,expert,robotics.pptx
NLP,expert,robotics.pptxNLP,expert,robotics.pptx
NLP,expert,robotics.pptx
 
Assistive Technology - Module 4 gwen kenny
Assistive Technology - Module 4 gwen kennyAssistive Technology - Module 4 gwen kenny
Assistive Technology - Module 4 gwen kenny
 
Proposal presentation.pptx
Proposal presentation.pptxProposal presentation.pptx
Proposal presentation.pptx
 
Tutorial: Text Analytics for Security
Tutorial: Text Analytics for SecurityTutorial: Text Analytics for Security
Tutorial: Text Analytics for Security
 
Automatic Speech Recognition.ppt
Automatic Speech Recognition.pptAutomatic Speech Recognition.ppt
Automatic Speech Recognition.ppt
 
Automatic speech recognition
Automatic speech recognitionAutomatic speech recognition
Automatic speech recognition
 
Digital speech processing lecture1
Digital speech processing lecture1Digital speech processing lecture1
Digital speech processing lecture1
 
L1 nlp intro
L1 nlp introL1 nlp intro
L1 nlp intro
 
ICS3211 Lecture 07
ICS3211 Lecture 07 ICS3211 Lecture 07
ICS3211 Lecture 07
 
Assistive Media.pptx
Assistive Media.pptxAssistive Media.pptx
Assistive Media.pptx
 
Chapter 4
Chapter 4 Chapter 4
Chapter 4
 

Recently uploaded

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 

Recently uploaded (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 

Chapter 10 - Universal Design and User Support.pptx

  • 1. Human Computer Interaction 1 - Lecture Chapter 10: Universal Design and User Support Universal Design Principles Speech Recognition and Speech Synthesis Recognitions, Gestures, Users with Disabilities
  • 2. Objectives At the end of the lesson, the students are able to: • Identify the tools that can be used by users with disabilities • Classify the multimodal system used in a system and the different user support tools • Design a system and create a user’s manual so that they can be used by anyone in any circumstances
  • 3. Universal Design Principles - NCSW • equitable use • flexibility in use • simple and intuitive to use • perceptible information • tolerance for error • low physical effort • size and space for approach and use
  • 4. Multi-Sensory Systems • More than one sensory channel in interaction – e.g. sounds, text, hypertext, animation, video, gestures, vision • Used in a range of applications: – particularly good for users with special needs, and virtual reality • Will cover – general terminology – speech – non-speech sounds
  • 5. Usable Senses The 5 senses (sight, sound, touch, taste and smell) are used by us every day – each is important on its own – together, they provide a fuller interaction with the natural world Computers rarely offer such a rich interaction Can we use all the available senses? – ideally, yes – practically – no We can use • sight • sound • touch
  • 6. Multi-modal vs. Multi-media • Multi-modal systems – use more than one sense (or mode ) of interaction e.g. visual and aural senses: a text processor may speak the words as well as echoing them to the screen • Multi-media systems – use a number of different media to communicate information e.g. a computer-based teaching system:may use video, animation, text and still images: different media all using the visual mode of interaction; may also use sounds, both speech and non-speech: two more media, now using a different
  • 7. Speech Human beings have a great and natural mastery of speech – makes it difficult to appreciate the complexities but – it’s an easy medium for communication
  • 8. Structure of Speech phonemes – 40 of them – basic atomic units – sound slightly different depending on the context they are in, these larger units are … allophones – all the sounds in the language – between 120 and 130 of them – these are formed into … morphemes – smallest unit of language that has meaning.
  • 9. Speech (cont’d) Other terminology: • prosody – alteration in tone and quality – variations in emphasis, stress, pauses and pitch – impart more meaning to sentences. • co-articulation – the effect of context on the sound – transforms the phonemes into allophones • syntax – structure of sentences • semantics – meaning of sentences
  • 10. Speech Recognition Problems • Different people speak differently: – accent, intonation, stress, idiom, volume, etc. • The syntax of semantically similar sentences may vary. • Background noises can interfere. • People often “ummm.....” and “errr.....” • Words not enough - semantics needed as well – requires intelligence to understand a sentence – context of the utterance often has to be known – also information about the subject and speaker e.g. even if “Errr.... I, um, don’t like this” is recognised, it is a fairly useless piece of information on it’s own
  • 11. The Phonetic Typewriter • Developed for Finnish (a phonetic language, written as it is said) • Trained on one speaker, will generalise to others. • A neural network is trained to cluster together similar sounds, which are then labelled with the corresponding character. • When recognising speech, the sounds uttered are allocated to the closest corresponding output, and the character
  • 13. Speech Recognition: useful? • Single user or limited vocabulary systems e.g. computer dictation • Open use, limited vocabulary systems can work satisfactorily e.g. some voice activated telephone systems • general user, wide vocabulary systems … … still a problem • Great potential, however – when users hands are already occupied e.g. driving, manufacturing – for users with physical disabilities
  • 14. Speech Synthesis The generation of speech Useful – natural and familiar way of receiving information Problems – similar to recognition: prosody particularly Additional problems – intrusive - needs headphones, or creates noise in the workplace – transient - harder to review and browse
  • 15. Speech Synthesis: useful? Successful in certain constrained applications when the user: – is particularly motivated to overcome problems – has few alternatives Examples: • screen readers – read the textual display to the user utilised by visually impaired people • warning signals – spoken information sometimes presented to pilots whose visual and haptic skills are already fully occupied
  • 16. Non-Speech Sounds boings, bangs, squeaks, clicks etc. • commonly used for warnings and alarms • Evidence to show they are useful – fewer typing mistakes with key clicks – video games harder without sound • Language/culture independent, unlike
  • 17. Non-Speech Sounds: useful? • Dual mode displays: – information presented along two different sensory channels – redundant presentation of information – resolution of ambiguity in one mode through information in another • Sound good for – transient information – background status information e.g. Sound can be used as a redundant mode in the Apple Macintosh; almost any user action (file selection, window active, disk insert,
  • 18. Auditory Icons • Use natural sounds to represent different types of object or action • Natural sounds have associated semantics which can be mapped onto similar meanings in the interaction e.g. throwing something away ~ the sound of smashing glass • Problem: not all things have associated meanings • Additional information can also be presented:
  • 19. SonicFinder for the Macintosh • items and actions on the desktop have associated sounds • folders have a papery noise • moving files – dragging sound • copying – a problem … sound of a liquid being poured into a receptacle rising pitch indicates the progress of the copy
  • 20. Earcons • Synthetic sounds used to convey information • Structured combinations of notes (motives ) represent actions and objects • Motives combined to provide rich information – compound earcons – multiple motives combined to make one more complicated earcon
  • 21. Earcons (ctd) • family earcons similar types of earcons represent similar classes of action or similar objects: the family of “errors” would contain syntax and operating system errors • Earcons easily grouped and refined due to compositional and hierarchical nature • Harder to associate with the interface task since there is no natural mapping
  • 22. touch • haptic interaction – cutaneous perception • tactile sensation; vibrations on the skin – kinesthetics • movement and position; force feedback • information on shape, texture, resistance, temperature, comparative spatial factors • example technologies – electronic braille displays – force feedback devices e.g. Phantom • resistance, texture
  • 23. Handwriting recognition Handwriting is another communication mechanism which we are used to in day-to-day life • Technology – Handwriting consists of complex strokes and spaces – Captured by digitising tablet • strokes transformed to sequence of dots – large tablets available • suitable for digitising maps and technical drawings – smaller devices, some incorporating thin screens to display the information • PDAs such as Palm Pilot
  • 24. Handwriting recognition (ctd) • Problems – personal differences in letter formation – co-articulation effects • Breakthroughs: – stroke not just bitmap – special ‘alphabet’ – Graffeti on PalmOS • Current state: – usable – even without training – but many prefer keyboards!
  • 25. gesture • applications – gestural input - e.g. “put that there” – sign language • technology – data glove – position sensing devices e.g MIT Media Room • benefits – natural form of interaction - pointing – enhance communication between signing and non- signing users • problems – user dependent, variable and issues of
  • 26. Users with disabilities • visual impairment – screen readers, SonicFinder • hearing impairment – text communication, gesture, captions • physical impairment – speech I/O, eyegaze, gesture, predictive systems (e.g. Reactive keyboard) • speech impairment – speech synthesis, text communication • dyslexia – speech input, output • autism – communication, education
  • 27. …plus… • age groups – older people e.g. disability aids, memory aids, communication tools to prevent social isolation – children e.g. appropriate input/output devices, involvement in design process • cultural differences – influence of nationality, generation, gender, race, sexuality, class, religion, political persuasion etc. on interpretation of interface features – e.g. interpretation and acceptability of language, cultural symbols, gesture and
  • 28. Reference/s Dix, A., Finlay, J., Abowd, G.D., & Beale, R. (2004). Human computer interaction (3rd ed.). Prentice Hall. ISBN 0-13-046109-1. Preece, J., Rogers, Y., & Sharp, H. (2015). Interaction design: Beyond human-computer interaction (4th ed.) John Wiley & Sons Ltd. ISBN 978-1-119-02075-2. Interaction Design: Beyond