Chapter 4
Performance Metrics


          Presenter: 00335011 魏傳諺
Agenda


• Preface
• Task Success
• Time-on-Task
• Errors
• Efficiency
• Learnability
Preface of Performance Metrics

•   Based on specific user behaviors
     – Observed user behaviors, not opinions
     – The use of scenarios or tasks
•   How well users are actually using a product
•   Useful to estimate the magnitude of a specific usability issue
     – How many people are likely to encounter the same issue after the product is
       released?
     – How many users are able to successfully complete a core set of tasks using
       a product?
•   Not the magical elixir for every situation
     – Requires an adequate sample size
     – Takes time and money to collect
     – Tells the “what” very effectively, but not the “why”
Five Basic Types

Task Success   • The most widely used performance metric
               • How effectively users are able to complete a given set of tasks

Time-on-Task   • How much time is required to complete a task

Errors         • Reflect the mistakes made during a task

Efficiency     • The amount of effort a user expends to complete a task

Learnability   • How performance changes over time
TASK SUCCESS
Task Success

• The most common usability metric
• As long as the user has a well-defined task, you can measure
  success
Collecting Any Type of Success Metric

• Each task must have a clear end-state
    – Define the success criteria → data collection
        • Find the current price for a share of Google stock (clear end-state)
        • Research ways to save for your retirement (not a clear end-state)

• Way to collect success data
    – Verbally articulate the answer after completing the task
    – Provide their answers in a more structured way
        • Try to avoid write-in answers if possible

• In some cases the correct solution to a task may not be verifiable
    – It depends on the user's specific situation
    – testing is not being performed in person
Binary Success

•   Either participants complete a task successfully or they don't
•   How to Collect and Measure
     – Score each task as 1 (success) or 0 (failure)
•   How to Analyze and Present
     – By individual task
     – By user or type of user
         • Frequency of use
         • Previous experience using the product
         • Domain expertise
         • Age group
         • Can calculate the percentage of tasks that each user successfully completed
              – Binary data → continuous data

•   Calculating Confidence Intervals (see the sketch below)
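Since a binary success rate is a simple proportion, a confidence interval can be computed directly. Below is a minimal Python sketch using the adjusted Wald interval, which is often recommended for the small samples typical of usability tests; the data and the helper name adjusted_wald_ci are hypothetical.

    import math

    def adjusted_wald_ci(successes, trials, z=1.96):
        """Adjusted Wald confidence interval for a binary success rate."""
        # Adjust the counts before computing the proportion.
        n_adj = trials + z**2
        p_adj = (successes + z**2 / 2) / n_adj
        margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
        return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

    # Example: 8 of 10 participants completed the task.
    low, high = adjusted_wald_ci(8, 10)
    print(f"Observed 80%; 95% CI roughly {low:.0%} to {high:.0%}")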
Levels of Success

• Partially completing a task?
   – coming close to fully completing a task may provide value to the
     participant
   – Helpful for you to know
       • Why some participants failed to complete a task
       • With which particular tasks they needed help
Levels of Success (cont’d)

• How to Collect and Measure
   – Must define the various levels
   – Based on the extent or degree to which a participant completed the task
       • Complete Success, Partial Success, and Failure
       • What constitutes “giving assistance” to the participant
       • Assign a numeric value for each level
       • Does not differentiate between different types of failure
   – Based on the experience in completing a task
       • No Problem, Minor Problem, Major Problem, and Failure/Gave up
       • Ordinal data → an average score is not meaningful; report frequencies
   – Based on the participant accomplishing the task in different ways
       • Depends on the quality of the answer (a numeric score is not needed)
Levels of Success (cont’d)

• How to Analyze and Present
   – Create a stacked bar chart showing the percentage at each level
   – Report a “usability score” (see the sketch below)
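A stacked bar chart is just the per-task percentage of participants at each level. A minimal sketch of tallying those percentages, with hypothetical level labels and data:

    from collections import Counter

    # Hypothetical results: one outcome per participant, per task.
    results = {
        "Task 1": ["complete", "complete", "partial", "failure", "complete"],
        "Task 2": ["partial", "failure", "failure", "complete", "partial"],
    }
    LEVELS = ["complete", "partial", "failure"]

    for task, outcomes in results.items():
        counts = Counter(outcomes)
        n = len(outcomes)
        # Percentages per level -- the segments of one stacked bar.
        print(task, {lvl: f"{counts.get(lvl, 0) / n:.0%}" for lvl in LEVELS})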
Issues in Measuring Success

• How to define whether a task was successful?
   – When unexpected situations arise
       • Make note of them
       • Afterward try to reach a consensus

• How or when to end a task
   – Stopping rule
       • Complete task / Reach the point at which they would give up or seek
         assistance
       • “Three strikes and you're out”
       • Set a time limit
   – If the participant is becoming particularly frustrated or agitated
TIME-ON-TASK
Time-on-Task

• A way to measure the efficiency of any product
    – The faster a participant can complete a task, the better the experience
• Exceptions to the assumption that faster is better
    – Games (engagement may be the goal)
    – Learning applications (time spent may be the point)
Importance of Measuring Time-on-Task

• Particularly important for products
    – where tasks are performed repeatedly by the user
• The side benefits of measuring time-on-task
    – Increased efficiency → cost savings → actual ROI
How to Collect and Measure Time-on-Task

•   The time elapsed between the start of a task and the end of a task
     – In minutes
     – In seconds
•   Measure by any time-keeping device
     – Start time & End time
     – Two people record the times
•   Automated Tools for Measuring Time-on-Task
     – less error-prone
     – Much less obtrusive
•   Turning on and off the Clock (see the timing sketch below)
     – Rules about how to measure time
          • Start the clock as soon as the participant finishes reading the task
          • End the timing when the participant hits the “answer” button
          • Stop timing when the participant has stopped interacting with the product
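When no automated logging tool is available, a simple script can stand in for a stopwatch. A minimal sketch, assuming the moderator presses Enter at the start and end of each task; the helper name time_task is hypothetical:

    import time

    def time_task(task_name):
        """Record elapsed seconds between a start and an end keypress."""
        input(f"[{task_name}] Press Enter when the participant starts...")
        start = time.monotonic()
        input(f"[{task_name}] Press Enter when the participant finishes...")
        return time.monotonic() - start

    elapsed = time_task("Find the current price of Google stock")
    print(f"Time-on-task: {elapsed:.1f} s")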
How to Collect and Measure Time-on-Task (cont’d)

• Tabulating Time Data
Analyzing and Presenting Time-on-Task Data

•   Ways to present
     – Mean
     – Median
     – Geometric mean
•   Ranges
     – Time interval
•   Thresholds
     – Whether users can complete certain tasks within an acceptable amount of
       time
•   Distributions and Outliers (see the sketch below)
     – Exclude outliers (e.g., times more than 3 SD above the mean)
     – Set up thresholds
     – Determine the fastest possible time, to flag implausibly short times
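A minimal sketch of the analysis above, with hypothetical times in seconds; it drops outliers beyond 3 SD above the mean and reports the median and geometric mean, which are less sensitive to skew than the arithmetic mean:

    import statistics

    # Hypothetical time-on-task values (seconds) for one task.
    times = [34, 42, 29, 38, 33, 46, 31, 39, 36, 44, 30, 600]

    mean = statistics.mean(times)
    sd = statistics.stdev(times)

    # Exclude outliers more than 3 SD above the mean.
    kept = [t for t in times if t <= mean + 3 * sd]

    print(f"kept {len(kept)}/{len(times)} times")
    print(f"median = {statistics.median(kept):.1f} s")
    print(f"geometric mean = {statistics.geometric_mean(kept):.1f} s")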
Issues to Consider When Using Time Data

•   Only Successful Tasks or All Tasks?
     – Advantage of only including successful tasks
          • A cleaner measure of efficiency
     – Advantage of including all tasks
          • A more accurate reflection of the overall user experience
          • An independent measure in relation to the task success data
     – If the moderator always decided when to end a task → include all times
     – If participants sometimes decided when to give up → include only successful tasks
•   Using a Think-Aloud Protocol?
     – Thinking aloud yields important insight, but it inflates time-on-task data
     – Alternative: a retrospective probing technique (ask questions after the task)
•   Should You Tell the Participants about the Time Measurement?
     – If so, ask them to perform the tasks as quickly and accurately as possible
ERRORS
Errors

• Usability issue vs. Error
    – A usability issue is the underlying cause of a problem
    – One or more errors are a possible outcome
• Errors
    – incorrect actions that may lead to task failure
When to Measure Errors

• When you want to understand the specific action or set of actions
  that may result in task failure
• Errors can tell
    – How many mistakes were made
    – Where they were made within the product
    – How various designs produce different frequencies and types of errors
    – How usable something really is
• Three general situations where measuring errors might be useful
    – When an error will result in a significant loss in efficiency
    – When an error will result in significant costs
    – When an error will result in task failure
What Constitutes an Error?

• No widely accepted definition of what constitutes an error
• Based on many different types of incorrect actions by the user
    – Entering incorrect data into a form field
    – Making the wrong choice in a menu or drop-down list
    – Taking an incorrect sequence of actions
    – Failing to take a key action
• Determine what constitutes an error
    – Make a list of all the possible actions
    – Define many of the different types of errors that can be made
Collecting and Measuring Errors

• Not always easy
   – Need to know what the correct (set of) action(s) should be
• Consideration
   – Only a single error opportunity
   – Multiple error opportunities
• Way of organizing error data
   – Record the number of errors for each task and each user
    – Each count ranges from 0 to the number of error opportunities
Analyzing and Presenting Errors

•   Tasks with a Single Error Opportunity
     – Look at the frequency of the error for each task
          • Frequency of errors
          • Percentage of participants who made an error for each task
     – From an aggregate perspective
          • Average the error rates for each task into a single error rate
          • Take an average of all the tasks that had a certain number of errors
          • Establish maximum acceptable error rates for each task
•   Tasks with Multiple Error Opportunities (see the sketch below)
     – Look at the frequency of errors for each task → an error rate
     – The average number of errors made by each participant for each task
     – Which tasks fall above or below a threshold
     – Weight each type of error with a different value and then calculate an “error score”
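A minimal sketch of the multiple-opportunity case, with hypothetical counts; the error rate divides total errors observed by total error opportunities:

    # Hypothetical data: errors per participant for one task
    # that has 5 identified error opportunities.
    errors = [0, 2, 1, 0, 3, 1]
    opportunities = 5

    avg_errors = sum(errors) / len(errors)
    error_rate = sum(errors) / (opportunities * len(errors))
    print(f"avg errors/participant = {avg_errors:.2f}")
    print(f"error rate = {error_rate:.1%}")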
Issues to Consider When Using Error Metrics

• Make sure you are not double-counting errors
• Need to know
    – An error rate, and
    – Why different errors are occurring
• When an error is the same as failing to complete a task
    – Report it simply as task failure (avoid reporting it twice)
EFFICIENCY
Efficiency

• Time-on-task is one common measure of efficiency
• Another is to look at the amount of effort required to complete a task
   – In most products, the goal is to minimize the amount of effort
   – two types of effort
       • Cognitive
           – Finding the right place to perform an action
           – Deciding what action is necessary
           – Interpreting the results of the action
       • Physical
           – The physical activity required to take action
Collecting and Measuring Efficiency

• Identify the action(s) to be measured
• Define the start and end of an action
• Count the actions
• Actions must be meaningful
    – Incremental increase in cognitive effort
    – Incremental increase in physical effort
• Look only at successful tasks (see the sketch below)
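A minimal sketch of counting meaningful actions on successful tasks only; the data are hypothetical, and comparing against a minimal-path action count is one optional way to normalize, not the only one:

    # Hypothetical data: (succeeded?, number of meaningful actions)
    # for one task; assume the task can be done in as few as 6 actions.
    attempts = [(True, 9), (True, 6), (False, 14), (True, 11), (True, 7)]
    min_actions = 6

    successful = [n for ok, n in attempts if ok]
    mean_actions = sum(successful) / len(successful)
    # Optional: efficiency relative to the minimal path.
    relative = min_actions / mean_actions
    print(f"mean actions on success = {mean_actions:.1f}")
    print(f"relative efficiency vs. minimal path = {relative:.0%}")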
Analyzing and Presenting Efficiency Data
Efficiency as a Combination of Task Success and Time


• Task Success + Time-on-Task
• Core measure of efficiency (see the sketch below)
   – The ratio of the task completion rate to the mean time per task
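A minimal sketch of that ratio, with hypothetical per-participant results; the result reads as "percent successful per minute":

    # Hypothetical results for one task: (completed?, time in minutes)
    results = [(True, 2.4), (True, 3.1), (False, 5.0), (True, 2.8), (True, 3.5)]

    completion_rate = sum(ok for ok, _ in results) / len(results)
    mean_time = sum(t for _, t in results) / len(results)

    # Ratio of completion rate to mean time per task.
    efficiency = (completion_rate * 100) / mean_time
    print(f"{efficiency:.1f}% successful per minute")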
LEARNABILITY
Learnability

•   Most products, especially new ones, require some amount of learning
•   Experience
     – Based on the amount of time spent using a product
     – Based on the variety of tasks performed
•   Learning
     – Sometimes quick and painless
     – At other times quite arduous and time consuming
•   Learnability
     – The extent to which something can be learned
     – How much time and effort are required to become proficient
     – When learning happens over a short period → aim to maximize efficiency
     – When learning happens over a longer period → memory plays a much greater role
Collecting and Measuring Learnability Data

• Basically the same as they are for the other performance metrics
• Collect the data at multiple times
    – Based on expected frequency of use
• Decide which metrics to use → then decide how much time to allow
  between trials
• Alternatives
    – Trials within the same session
    – Trials within the same session but with breaks between tasks
    – Trials between sessions
Analyzing and Presenting Learnability Data

• By examining a specific performance metric
• Interpret the chart
    – Notice the slope of the line(s)
    – Notice the point of asymptote, or essentially where the line starts to
      flatten out
    – Look at the difference between the highest and lowest values on the y-axis
• Compare learnability across different conditions (see the sketch below)
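A minimal sketch of summarizing a learning curve numerically, with hypothetical per-trial mean times for two design conditions; the slope and range correspond to the chart features noted above:

    # Hypothetical mean time-on-task (seconds) across three trials.
    curves = {"Design A": [95, 60, 52], "Design B": [80, 72, 70]}

    for name, curve in curves.items():
        slope = (curve[-1] - curve[0]) / (len(curve) - 1)  # avg change per trial
        span = max(curve) - min(curve)                     # highest minus lowest
        print(f"{name}: slope {slope:+.1f} s/trial, range {span} s")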
Issues to Consider When Measuring Learnability


• What Is a Trial?
   – When learning is continuous, without breaks in time:
       • Memory is much less a factor in this situation
       • Learning is more about developing and modifying strategies to complete a set
         of tasks
       • Take measurements at specified time intervals

• Number of Trials
   – There must be at least two
   – In most cases there should be at least three or four
   – You should err on the side of more trials than you think you might need
     to reach stable performance.
Thanks for listening!
