Tests
Carlos Matos
Lecture 8
PC3001, PC4001, PC5001, CS2847
Goals of
evaluation
Assess extent of system functionality
Assess effect of an interface on users
Identify specific problems
Evaluation
techniques
Evaluation
• tests usability and functionality of system
• occurs in laboratory, field and/or in collaboration
with users
• evaluates both design and implementation
• should be considered at all stages in the design life
cycle
Evaluation: tests
Evaluating designs
Evaluating
designs
Cognitive walkthrough (task-specific)
Heuristic evaluation (holistic)
Review-based evaluation (holistic)
Cognitive
walkthrough
(task-specific)
Based on the concept that users prefer to learn
by doing rather than read manuals
Evaluates design on how well it supports users
in learning specific tasks
Performed by a group that includes the UI
designers and developers, led by experts in
cognitive psychology
The group ‘walks through’ the design to identify
potential problems using psychological
principles
Forms are used to guide the analysis on each
step of the tasks
Heuristic
evaluation
(holistic)
Proposed by Nielsen and Molich
Compares the UI design against
accepted usability principles
Identify usability criteria (heuristics)
Experts check that the design meets the
criteria
Example heuristics
• Consistency and standards
• Match between system and the real world
• Error prevention
Heuristic evaluation ‘debugs’ design
Review-based
evaluation
(holistic)
Expert-based evaluation method
Draws on experimental results and empirical
evidence from the literature to validate
design methods
Care needed to ensure results are
transferable to new design
Model-based evaluation
Cognitive models used to filter design
options
• e.g. GOMS¹ prediction of user performance
Design rationale can also provide useful
evaluation information
¹ GOMS: "a set of Goals, a set of Operators, a set of Methods for achieving the goals, and a set of Selections rules for choosing among competing methods for goals."
Evaluation: Tests
Evaluating Implementations
Empirical
methods in
HCI
Lab experiment
• Artificial, highly controlled by experimenter
Field study
• Occurs in the actual environment where people use
the UI, with real tasks
Survey
• Questionnaire, conducted by paper, phone, web, or
in person
Quantifying
usability
Learnability
• Easy (quick) to learn?
Efficiency
• Fast to use after learning?
Errors
• Number of errors
Satisfaction
• Degree of satisfaction reported by users
Controlled
experiment
Controlled evaluation of specific aspects of
interactive behaviour
Evaluator chooses hypothesis to be tested
Manipulate independent variables
• Different placement, font size, input
Measure dependent variables
• Times, #errors, #tasks done, satisfaction
Use statistical analysis to accept or reject the
hypothesis
• How changes in independent variables affect the
dependent variables – are those effects significant?
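One way to check whether an effect is significant is a permutation test on the difference of means. The sketch below is only an illustration: the error counts for the two font sizes are made-up numbers, and the function name `permutation_test` is an assumption, not something from the lecture.

```python
import random
from statistics import mean

def permutation_test(a, b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Returns an estimated p-value: the share of random relabellings whose
    mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical data: errors per session for two levels of font size.
errors_small_font = [7, 9, 8, 10, 9, 11, 8, 10]
errors_large_font = [4, 5, 3, 6, 5, 4, 6, 5]

p_value = permutation_test(errors_small_font, errors_large_font)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value suggests the null hypothesis (no difference between conditions) is unlikely to explain the data; the threshold for "significant" (often 0.05) is a convention chosen before running the experiment.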
Designing the
experiment
Process
Y = f(x) + ε
• x: independent variables
• Y: dependent variables
• ε: unknown/uncontrolled variables
Designing the
experiment
Subjects/users: who – representative, sufficient
sample
Implementation: real environment, artificial
variations
Tasks
• Real tasks: word processing, e-mail, web browsing
• Artificial: users focus on a simple subset of tasks
Measuring: how to count time, #clicks, #errors
Ordering: of conditions and tasks
Hardware: physical conditions of the test,
available inputs
Hypothesis
Prediction of outcome
• Framed in terms of independent and dependent
variables
e.g. “error rate will increase as font size decreases”
Null hypothesis
• States no difference between conditions
• The aim is to disprove this
e.g. null hypothesis = “no change of error rate with
font size”
A/B Testing
Experiment based on two alternative
interfaces
• Normally A is the control and B is the variation
In Web design, this is normally used to
identify improvements that can maximise a
certain outcome of interest
Normally the current version of the interface
is associated with the null hypothesis
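As a rough sketch of how an A/B test might be analysed when the outcome of interest is a conversion rate, the following uses a pooled two-proportion z-test. The test choice, the function name, and the visitor and conversion counts are all assumptions made for illustration.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    The null hypothesis is that A (control) and B (variation) share the
    same underlying conversion rate.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 200 of 2000 visitors convert on A, 250 of 2000 on B.
z, p = two_proportion_z_test(200, 2000, 250, 2000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The normal approximation behind this test is only reasonable with fairly large samples; with small counts an exact test would be a safer choice.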
Concerns
Internal validity
Are observed results actually caused by the
independent variables?
External validity
Can observed results be generalised to the world
outside the lab?
Reliability
Will consistent results be obtained by repeating
the experiment?
Threats to
Internal
Validity
Ordering effects
• People learn, and people get tired
• Randomise or counterbalance ordering
Selection effects
• Avoid pre-existing groups (unless the group is an
independent variable)
• Randomly assign users to conditions (levels of the independent variables)
Experimenter bias
• Experimenters may prefer a hypothesis to be
proven valid
• Double blind experiment is quite hard for HCI
• Control protocol
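The random assignment recommended against selection effects can be sketched in a few lines. The helper name `assign_conditions` and the participant and condition labels are hypothetical; shuffling first and then dealing out conditions in turn keeps group sizes as equal as possible.

```python
import random

def assign_conditions(participants, conditions, seed=None):
    """Randomly assign each participant to one condition.

    Participants are shuffled, then conditions are dealt out round-robin,
    so group sizes differ by at most one.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {p: conditions[i % len(conditions)] for i, p in enumerate(shuffled)}

# Hypothetical study: 12 participants, three font-size conditions.
participants = [f"P{i:02d}" for i in range(1, 13)]
assignment = assign_conditions(participants, ["small", "medium", "large"], seed=42)
for person in sorted(assignment):
    print(person, assignment[person])
```

Fixing the seed makes the assignment reproducible for the study record; it should be chosen before anyone sees the participant list.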
Threats to
External
Validity
Population
• Draw a random sample from the real target
population
Ecological
• Make lab conditions as realistic as possible
Training
• Training should mimic how the real interface would
be encountered and learned
Task
• Tasks for testing should be based on task analysis
Threats to
Reliability
Uncontrolled variation
• User differences
• Task design
• Measurement error
Solutions
• Eliminate uncontrolled variation
Select users by experience
Give consistent training
Measure dependent variables precisely
• Repetition, repetition
Many users, many trials
Standard deviation of the mean shrinks in proportion to
1/√N (i.e. quadrupling #users makes the mean
twice as accurate)
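The square-root relationship can be checked with a short calculation. This is a minimal sketch: the function name and the sample standard deviation of 4.0 seconds are illustrative assumptions.

```python
import math

def standard_error_of_mean(sample_sd, n):
    """Standard deviation of the sample mean for n independent measurements."""
    return sample_sd / math.sqrt(n)

# Hypothetical task times with a standard deviation of 4.0 seconds.
for n in (16, 64):
    print(n, standard_error_of_mean(4.0, n))
```

Going from 16 to 64 users halves the standard error (1.0 s to 0.5 s), which is why further precision gets expensive quickly.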
Blocking
Divide samples into subsets that are more
homogeneous than the whole set
• Example: testing wear rate of different shoe sole
material
Lots of variation between feet of different people, but
the feet on the same person are more homogeneous
Apply all conditions within each block
• Test material A on one foot, material B on the other
Measure difference within block
• Wear(A) – Wear(B)
Randomise within the block to eliminate validity
threats
• Randomly put A on left or right foot
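The shoe-sole example can be sketched as follows, assuming each person is one block. The function names and the wear values are hypothetical; the point is that material placement is randomised within each block and the analysis uses the per-block difference Wear(A) - Wear(B).

```python
import random

def assign_within_blocks(blocks, seed=None):
    """For each block (person), randomly choose which foot gets material A."""
    rng = random.Random(seed)
    plan = {}
    for person in blocks:
        feet = ["left", "right"]
        rng.shuffle(feet)
        plan[person] = {"A": feet[0], "B": feet[1]}
    return plan

def within_block_differences(wear_a, wear_b):
    """Wear(A) - Wear(B) for each block; the lists are aligned by person."""
    return [a - b for a, b in zip(wear_a, wear_b)]

plan = assign_within_blocks(["p1", "p2", "p3"], seed=1)
print(plan)

# Hypothetical wear measurements (arbitrary units), one pair per person.
print(within_block_differences([12, 15, 9], [10, 16, 7]))
```

Because each difference is taken within one person, variation between people cancels out of the comparison.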
Between-
subjects
experiment
Each subject performs the experiment under
only one condition
Results are compared between different
groups
• Is mean(x_i) > mean(y_j)?
No transfer of learning
More users required
Variation can bias results
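For a between-subjects comparison of two group means, one common statistic is Welch's t, which does not assume equal variances. The sketch below is an illustration with made-up task times; the lecture does not prescribe a particular test.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic and approximate degrees of freedom.

    x and y hold measurements from two independent groups of users.
    """
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

# Hypothetical task times (seconds) from two separate groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]
t, df = welch_t(group_a, group_b)
print(f"t = {t:.2f}, df = {df:.1f}")
```

The t value is then compared against a t distribution with `df` degrees of freedom to obtain a p-value.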
Within-
subjects
experiment
Each subject performs the experiment under
each condition
Results are compared within each user
• For user i compute x_i - y_i
• Is mean(x_i - y_i) > 0?
Transfer of learning possible
Less costly and less likely
to suffer from user variation
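The within-subjects comparison can be sketched as a paired t statistic computed on the per-user differences. The function name and the timing data are assumptions for illustration only.

```python
import math
from statistics import mean, stdev

def paired_t(x, y):
    """Paired t statistic on per-user differences x_i - y_i.

    x[i] and y[i] must come from the same user under the two conditions.
    Returns (t, degrees of freedom).
    """
    diffs = [xi - yi for xi, yi in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical task times (seconds) for four users under two conditions.
times_x = [10, 12, 14, 16]
times_y = [9, 10, 11, 12]
t, df = paired_t(times_x, times_y)
print(f"t = {t:.3f}, df = {df}")
```

Only the spread of the differences enters the statistic, so consistent differences between fast and slow users do not inflate the error term.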
Counterbalancing
Defeats ordering effects by varying order of conditions
systematically (not randomly)
Latin Square designs
• Randomly assign subjects to equal-size groups
• A, B, C, … are the experimental conditions
• Latin Square ensures that each condition occurs in every
position in the ordering for an equal number of users
• Balanced Latin Squares: http://www.yorku.ca/mack/RN-Counterbalancing.html
G1 G2
A B
B A
G1 G2 G3
A C B
B A C
C B A
G1 G2 G3 G4
A D C B
B A D C
C B A D
D C B A
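The squares above follow a cyclic pattern: each group's ordering is the previous group's ordering rotated by one position. A minimal sketch that generates such a square for any number of conditions is given below; the function name is an assumption, and this simple construction is not a balanced Latin square (it does not control which condition immediately precedes which).

```python
def cyclic_latin_square(conditions):
    """Return one ordering of conditions per group (G1, G2, ...).

    Group g receives the base ordering rotated right by g positions, so
    every condition appears exactly once in every position across groups.
    """
    n = len(conditions)
    return [[conditions[(pos - g) % n] for pos in range(n)] for g in range(n)]

for group, order in enumerate(cyclic_latin_square(["A", "B", "C", "D"]), start=1):
    print(f"G{group}: {' '.join(order)}")
```

Each printed row is the order in which that group meets the conditions, which corresponds to reading one column of the tables above from top to bottom.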
Kinds of
measures
Self-report
• E.g. satisfaction
Observation
• Visible vs. hidden observer
• Hawthorne effect¹
Archival records
• Public vs. private
Trace
• Subjects normally unaware (e.g. testing for book
read wear)
¹ Hawthorne effect: the alteration of behaviour by the subjects of a study due to their awareness of being observed.
Evaluation: tests
Query techniques
Interviews
Analyst questions user on a one-to-one basis,
usually based on prepared questions
Informal, subjective and relatively cheap
Advantages
• Can be varied to suit context
• Issues can be explored more fully
• Can elicit user views and identify unanticipated
problems
Disadvantages
• Very subjective
• Time consuming
Questionnaires
Set of fixed questions given to users
Advantages
• Quick and reaches large user group
• Can be analysed more rigorously
Disadvantages
• Less flexible
• Less probing
Questionnaires
Need careful design
• What information is required?
• How are answers to be analysed?
Styles of question
• General
• Open-ended
• Scalar
• Multi-choice
• Ranked
Evaluation: tests
Physiological methods
Eye tracking
Head- or desk-mounted equipment tracks eye
position
Eye movement reflects the amount of
cognitive processing a display requires;
measurements include
• fixations: eye maintains stable position. Number
and duration indicate level of difficulty with display
• saccades: rapid eye movement between points of
interest
• scan paths: moving straight to a target with a short
fixation at the target is optimal
Physiological
measurements
Emotional response linked to physical
changes
These may help determine a user’s reaction
to an interface; measurements include:
• heart activity, including blood pressure, volume and pulse.
• activity of sweat glands: Galvanic Skin Response (GSR)
• electrical activity in muscle: electromyogram (EMG)
• electrical activity in brain: electroencephalogram (EEG)
Some difficulty in interpreting these
physiological responses - more research
needed
Evaluation: tests
Applicability
Choosing an
evaluation
method
Question               Decision
When in process        design vs. implementation
Style of evaluation    laboratory vs. field
Level of objectivity   subjective vs. objective
Type of measures       qualitative vs. quantitative
Level of information   high level vs. low level
Level of interference  obtrusive vs. unobtrusive
Available resources    time, subjects, equipment, expertise
