Natural User Interface Design for Smartphones


The smartphone presents a set of usability challenges that can be solved only with a combination of all input and output modalities available to the user. In this workshop, we review some basic principles for building highly usable, multimodal applications. The principles will be illustrated through concrete implementation examples.

Published in: Technology, Business
  • Naturalness is NOT inherent in an interface: it is a function of the familiarity of the interface and the suitability of the interface to the circumstances.

    1. 1. STKU-6 – Natural User Interface Design for Smartphones. Ahmed Bouzid, Head of Product, Angel
    2. 2. Natural User Interface
    3. 3. Natural User Interface
        Is based on natural elements
        • Not natural: Type / Select from a drop-down / Click on a check box, using mouse / keyboard / stylus
        • Natural: Point / Touch / Drag / Speech / Motion, using finger, voice, body
        Invisible
        • Focus is on the task at hand, not on the mediating interface
    4. 4. Natural vs Familiar Interfaces
        • Yet naturalness does not mean ease of use for everyone
        • Familiarity with a UI can render the UI invisible
        • Naturalness is crucial in new adoption
    5. 5. Natural User Interface: Smartphones
        • Key is to enable users to interact with the device effortlessly
        • Everywhere mobility
        • All-the-time mobility
        • Hence the need for multimodality: different ways to interact depending on context
    6. 6. Smartphone: Strengths
        • Mobility: I can take it with me and use it virtually anywhere.
        • Size: It fits in my pocket. I can have it with me anytime.
        • Multi-purpose: phone, email, texting, photos, contacts, calendar, etc.
        • Identity: It's tied to me personally. It is not tied to a location (as in landline) or to a family (desktop).
        • Personalization: I load up my music, I take my photos, link to my friends, etc.
        • The iPhone is an extension of myself.
        • Opt-in automation: When I fire up an application, I chose to fire it up: I chose to self-serve using the application.
    7. 7. Smartphone: Weaknesses
        • Interactional real estate: forces multi-step interactions
        • Informational real estate: the user gets only a small amount of information at a time before needing to touch the screen to get more (breaks reading/concentration flow)
        • Typing is difficult: typing on a flat surface is a challenge, especially if the surface gets dirty or ages
        • Power: need to charge the device periodically
    8. 8. VUI Weaknesses
        • Time linearity: unlike graphical interfaces, voice interfaces are linearly coupled with time.
        • Uni-directionality: When you hear something, you can’t easily go back and listen to it again. Contrast that with reading a piece of text, where you can go back and forth at will.
        • Invisibility: In a voice interface, no easy markers exist that the user can check when they feel lost.
        • Imposed automation: When people call into a toll-free number, they are usually not calling to use an automated system but rather to talk to a person.
        • Listening/Speaking: not always the best mode of communication.
    9. 9. VUI Strengths
        • In the Cloud: all IVRs are in the cloud.
        • Easy to start: All they need to do is to call a phone number.
        • Universally accessible: They can call the IVR from any phone.
        • Easy to use: All they need to do is listen to instructions and provide input when asked for it.
        • Uniform deployment: because the IVR is in the cloud, users are always running the same version of the application.
    10. 10. Available Modes in Smartphone
        Input
        • Touch/Swipe
        • Shaking
        • Biometrics
        • Speech
        • Typing
        Output
        • Images/Videos
        • Text
        • Audio
        • Vibration
    11. 11. Smartphone Contexts = All Contexts
        • Noisy environment: can’t hear/can’t be heard
        • Quiet environment: can’t speak/can’t make noise
        • Private information: don’t want to share information
        • Hands busy: assembling a chair, can’t touch, can’t type
        • Eyes busy: driving, can’t read
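The context list on slide 11 can be read as a filter over the modes listed on slide 10. The following is a minimal sketch of that reading, not the presenter's implementation: the `Context` fields, the mode names, and the `usable_modes` function are all hypothetical, and the treatment of private information and of images while the eyes are busy is an assumption that goes slightly beyond what the slide states.

```python
from dataclasses import dataclass

# Mode names are illustrative labels for the modes listed on slide 10.
INPUT_MODES = {"touch", "shake", "biometrics", "speech", "typing"}
OUTPUT_MODES = {"image_video", "text", "audio", "vibration"}


@dataclass(frozen=True)
class Context:
    """Hypothetical flags, one per context named on slide 11."""
    noisy: bool = False
    quiet: bool = False
    private_info: bool = False
    hands_busy: bool = False
    eyes_busy: bool = False


def usable_modes(ctx):
    """Return (input modes, output modes) still usable in this context."""
    inputs = set(INPUT_MODES)
    outputs = set(OUTPUT_MODES)
    # Noisy: can't hear / can't be heard. Quiet: can't speak / can't make noise.
    # Private: assumed here to rule out speaking and loudspeaker audio.
    if ctx.noisy or ctx.quiet or ctx.private_info:
        inputs.discard("speech")
        outputs.discard("audio")
    # Hands busy: can't touch, can't type.
    if ctx.hands_busy:
        inputs -= {"touch", "typing"}
    # Eyes busy: can't read; dropping images/videos too is an assumption.
    if ctx.eyes_busy:
        outputs -= {"text", "image_video"}
    return inputs, outputs
```

For example, `usable_modes(Context(hands_busy=True))` keeps speech as an input mode while dropping touch and typing, which is the kind of context-driven choice the multimodality argument relies on.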
    12. 12. Context Dimensions
        • Environment
        • Content
        • User State
        • Medium
    13. 13. UI Actions
        Input
        • Type full text
        • Touch/Pick/Swipe
        • Speak full phrases/sentences
        • Speak partially (give short answers: yes/no)
        Output
        • Read full text
        • Read short text (pick list)
        • See (but not read, e.g., colors/shapes)
        • Hear language
        • Hear beeps/sounds
    14. 14. Keys to Effective Smartphone NUI
        Key is to enable the user to interact with the device the way the user chooses to
        1. User has at their disposal several modes of interaction
        2. User is never forced to use any one mode at any time: user chooses what mode to use
        3. User can complete any task purely using a single mode
        4. User can turn off any given mode at any time and can switch it back on at any time
        5. Flow progress is not penalized because the user switched modes (i.e., the user does not have to redo steps already done or start over)
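One way to satisfy points 4 and 5 on slide 14 is to keep the collected information separate from the mode that supplied it. The sketch below assumes this design; the `Session` class, its method names, and the slot names are hypothetical and chosen only to illustrate the idea.

```python
class Session:
    """Tracks task progress independently of the interaction mode."""

    def __init__(self, required_slots, modes):
        self.required = list(required_slots)  # pieces of info the task needs
        self.filled = {}                      # slot -> value, shared by all modes
        self.enabled = set(modes)             # modes currently switched on

    def set_mode(self, mode, on):
        """Turn a mode on or off; collected values are never touched."""
        if on:
            self.enabled.add(mode)
        else:
            self.enabled.discard(mode)

    def provide(self, slot, value, mode):
        """Record a value supplied through any enabled mode."""
        if mode not in self.enabled:
            raise ValueError(f"mode {mode!r} is turned off")
        self.filled[slot] = value

    def next_slot(self):
        """Return the next missing slot, or None when the task is complete."""
        for slot in self.required:
            if slot not in self.filled:
                return slot
        return None
```

In this sketch a user could give the broker by voice, turn speech off, and supply the account by touch; `next_slot()` simply continues from the first unfilled slot, so nothing is redone.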
    15. 15. Our Focus: Transactional Interactions
        Multi-step interactions aimed at solving a problem/accomplishing something
        User: What is Chipotle trading at?
        App: Chipotle Mexican Grill is at $321.56. Up just a tad.
        User: What’s the highest it has been in the last three months?
        App: July 10 was highest in the last 3 months, trading at $344.21.
        User: Buy 100 shares.
        App: You have Schwab and Fidelity. Which would you like?
        User: Schwab.
        App: Got it. I see you have an account ending in 2234. Use that account?
        User: Yes.
        App: OK. 100 shares at Market or at a Specific Price?
        User: Market.
        App: Got it. That trade has been placed for 100 shares at market. I will send you an email confirmation when the shares are purchased.
    16. 16. Why Spoken Conversation?
        • Speech is natural
        • Conversation is natural
        • Speech is efficient: speaking requires less effort than typing
        Use cases
        • Dictation
        • When searching is easier than selecting
        • Several interactions that require simple responses
        • Hands are busy
        • Eyes are busy
        • Short questions from device
        • Short responses from user
        • Sharing a spoken joke with friends
    17. 17. Examples of Smartphone Conversations
        • Book flight
        • Order a book
        • Hotel reservations
        • Order flowers
        • Order pizza
        • Banking
        • Movie tickets
        • Restaurant reservations
    18. 18. Conversational NUI
        • Transaction requires multiple pieces of information
        • Complex requests that can be efficiently formulated in a sentence: “What’s the highest it has been in the last three months?”
        • Short responses from user: “Schwab,” “Yes,” “Market.”
        • Short commands from user: “Buy 100 shares.”
    19. 19. Why would you want to use voice?
        • We speak faster than we type
        • We hear faster than we read
        • Sound is public: its value is the existence of distance between the source and the destination (but could use earphones)
        Use cases
        • Dictation
        • When searching is easier than selecting
        • Several interactions that require simple responses
        • Hands are busy
        • Eyes are busy
        • Short questions from device
        • Short responses from user
        • Sharing a spoken joke with friends
    20. 20. Why would you NOT want to use voice?
        • Sound is public: its value is the existence of distance between the source and the destination (but could use earphones)
        Use cases
        • Privacy: sharing personal info, credit card info
        • Noisy environment: can’t hear or can’t be heard
    21. 21. When Visual
        • Privacy
        • Accuracy
        • Pictures
        • Videos
        • Long text
    22. 22. When Visual is not Optimal
        • Input
          • Screen small: typing, picking
          • Can’t write: small child
          • Hands busy
        • Output
          • Screen small, bad lighting
          • Can’t read: small child
          • Eyes busy
    23. 23. How Visual helps Audio
        • Redundancy
        • Visual confirmation
        • No-match issues: present a menu to select an option, or give a keyboard to type
        • Help: visual help is more effective than spoken help
        • Complementary info: show bill/show device
        • When visual is needed: location in bill
        • Summary of info collected
        • Enable user to quickly correct info provided earlier
    24. 24. The Elements of Conversation
        Actions
        • Start/Initiate
        • Take turn/give turn
        • Interrupting
        • Pausing
        • Resuming
        • Repeating
        • Starting over
        • Ending/Terminating
        States
        • Speaking
        • Listening
        • Paused
        • Processing/Thinking
        Context
        • Point in conversation
        • Information
    25. 25. States and Interaction Flows
    26. 26. Conversation Signaling
        A crucial part of communication is signaling states and state transitions
        • States
          • Initial
          • Paused
          • Processing/Thinking
          • Speaking
        • Transition between states
    27. 27. State Signaling
        State        Visual   Audio
        Initial      YES      NO
        Paused       YES      NO
        Processing   YES      YES
        Listening    YES      NO
        Speaking     YES      YES
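The state signaling table on slide 27 maps directly onto a lookup structure. The sketch below encodes the YES/NO values from the slide as data; the `State` enum, the `SIGNALING` dictionary, and the `signals_for` helper are hypothetical names introduced for illustration.

```python
from enum import Enum


class State(Enum):
    INITIAL = "initial"
    PAUSED = "paused"
    PROCESSING = "processing"
    LISTENING = "listening"
    SPEAKING = "speaking"


# (visual signal?, audio signal?) per state, copied from the slide 27 table.
SIGNALING = {
    State.INITIAL: (True, False),
    State.PAUSED: (True, False),
    State.PROCESSING: (True, True),
    State.LISTENING: (True, False),
    State.SPEAKING: (True, True),
}


def signals_for(state):
    """Return the list of channels used to signal the given state."""
    visual, audio = SIGNALING[state]
    channels = []
    if visual:
        channels.append("visual")
    if audio:
        channels.append("audio")
    return channels
```

Keeping the table as data rather than as branching logic makes it easy to see at a glance that every state has a visual signal, while audio is reserved for processing and speaking.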
    28. 28. Initial State
    29. 29. Speaking State
    30. 30. Listening State
    31. 31. Processing State
    32. 32. Speaking State
    33. 33. Paused State
    34. 34. State Transition Signaling
        • User wishes to speak: user taps
        • User finished speaking
          • User stops talking, or
          • User taps
        • Lexee is listening: Lexee makes the start-listening sound mark and changes the state visual
        • Lexee is finished listening: Lexee makes the finished-listening sound mark and changes the state visual
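The transitions on slide 34 can be sketched as a small state machine that records a sound mark and a visual change on each transition. This is an illustrative sketch only: the `Conversation` class, the event and signal names, and the choice to ignore taps during processing are assumptions, not a description of how Lexee works.

```python
class Conversation:
    """Minimal state machine for the tap/listen transitions on slide 34."""

    def __init__(self):
        self.state = "initial"
        self.signals = []  # log of (channel, detail) signals emitted so far

    def _enter(self, new_state, sound=None):
        # Every transition changes the state visual; some also play a sound mark.
        if sound is not None:
            self.signals.append(("sound", sound))
        self.signals.append(("visual", new_state))
        self.state = new_state

    def tap(self):
        """User taps: start speaking, or signal that speaking is finished."""
        if self.state == "listening":
            self._enter("processing", sound="finished_listening")
        elif self.state in ("initial", "paused", "speaking"):
            # Allowing a tap while the app is speaking (interrupting) is an assumption.
            self._enter("listening", sound="start_listening")
        # Taps while processing are ignored in this sketch (assumption).

    def silence_detected(self):
        """User stops talking: treated the same as a tap while listening."""
        if self.state == "listening":
            self._enter("processing", sound="finished_listening")
```

A tap from the initial state yields the start-listening sound followed by the visual change, and either a second tap or detected silence yields the finished-listening sound and moves to processing.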
    35. 35. Initiating Conversation
        • When the user starts the conversation, should the application say something?
        • Should it say nothing and only show something?
    36. 36. Pausing
        Explicit pausing
        • User says “Pause”
        • User swipes
        Implicit pausing
        • User doesn’t respond
        • User says the wrong thing several times in a row
        • User minimizes the app
        When should the user be allowed to pause?
        • Anytime?
        • How about when the app is processing a transaction?
    37. 37. Resuming
        When resuming
        • Should it pick up where it left off in the prompt?
        • Should it play the prompt again?
        • Should it take the length of the pause into account to determine which one?
          • What if the pause lasted a few seconds: retrieving an address
          • What if the pause lasted a few days: given up on ordering
        • If we want to pick up from where we left off:
          • How long is the context to be remembered?
          • Give the user a summary of what has happened so far?
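Slide 37 poses these as open questions. One possible answer, sketched below purely for illustration, is to choose the resume behavior from the length of the pause. The thresholds `SHORT_PAUSE` and `CONTEXT_TTL`, the action names, and the `resume_action` function are hypothetical; the slide does not prescribe any of them.

```python
from datetime import timedelta

# Hypothetical thresholds; real values would need user testing.
SHORT_PAUSE = timedelta(seconds=30)   # e.g. user went to retrieve an address
CONTEXT_TTL = timedelta(hours=24)     # how long the context is remembered


def resume_action(pause_length):
    """Pick a resume behavior from how long the conversation was paused."""
    if pause_length <= SHORT_PAUSE:
        # Short pause: pick up where the prompt left off.
        return "continue_prompt"
    if pause_length <= CONTEXT_TTL:
        # Longer pause: summarize what was collected, then replay the prompt.
        return "summarize_then_replay_prompt"
    # Context has expired (e.g. a pause of days): treat the task as abandoned.
    return "start_over"
```

Under these assumed thresholds, a five-second pause continues the prompt, a ten-minute pause gets a summary first, and a pause of several days starts over.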
    38. 38. Interactional Investment
        Look ahead
        • Don’t waste the user’s time (service down/account suspended)
        Provide GUI for reviewing collected input
        Provide GUI for changing collected input
        • Enable the user to change collected values via the GUI
    39. 39. Multi-Modality
        Reinforcing
        • Audio and visual match
        Clashing
        • Audio input and visual input don’t match
          • Should the interface pick the first that came in?
          • Should it privilege one over the other: e.g., assume that audio is a misrecognition and go with visual?
          • Should it pick up both and signal ambiguity: ask the user to resolve?
        Alternative
        • Audio OR visual
        Complementary
        • “Take me here.”
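The three options raised on slide 39 for clashing input can be written as interchangeable policies. The sketch below is one hedged way to express them; `ModalInput`, `resolve`, and the policy names `first_in`, `prefer_visual`, and `ask_user` are hypothetical, and the slide does not say which policy is preferable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModalInput:
    mode: str         # "audio" or "visual"
    value: str        # what the user said or selected
    timestamp: float  # arrival time in seconds


def resolve(audio, visual, policy="ask_user"):
    """Resolve an audio input and a visual input for the same field.

    Returns ("accept", value) or ("ask", (audio_value, visual_value)).
    """
    if audio.value == visual.value:
        return ("accept", audio.value)  # reinforcing: both modes agree
    if policy == "first_in":
        first = min((audio, visual), key=lambda item: item.timestamp)
        return ("accept", first.value)
    if policy == "prefer_visual":
        # Assume the audio was misrecognized and trust the visual input.
        return ("accept", visual.value)
    if policy == "ask_user":
        # Signal the ambiguity and let the user choose.
        return ("ask", (audio.value, visual.value))
    raise ValueError(f"unknown policy: {policy!r}")
```

Making the policy a parameter keeps the question open, as the slide does, while letting each option be tried against the same pair of inputs.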
    40. 40. Coming Soon: Lexee
        Visit: http://www.lexee.com
        Twitter: @officiallexee