2. Background
There is often debate about how to evaluate a display: some simply call it "NICE", "MEDIUM", or "CLEAR".
There is a common assumption that as long as software can be used, it is good enough.
Evaluating software or a display is often avoided because it increases development time and cost.
Yet evaluation is very important, because it is how designers find out whether their work is useful to and desired by users.
3. Background (cont.)
Evaluation is a process that systematically collects data about the opinions of a person or group of users on their experience using a product for a particular task in a particular environment.
Users want a system that is easy to learn and whose use is as effective, efficient, safe, and satisfying as possible. In addition, as far as possible, it should be fun, attractive, challenging, and so on.
4. Why Evaluation Is Needed
Designers cannot assume that other people are like themselves, nor that following design guidelines guarantees that their work is good.
Evaluation is needed to check whether users can actually use the product and whether they like it.
Satisfaction with a product can be evaluated using a questionnaire and/or an interview.
5. When to Evaluate
Evaluation can be done:
• During the process of making the product, so that it stays aligned with what users request or need. This is usually called formative evaluation.
• When the product has been built, for example through a prototype.
• When the product has been marketed. If deficiencies appear or user requirements change, an updated version can be released (for example, new versions of programs such as Windows or Winamp). This is usually called summative evaluation.
6. When to Evaluate (cont.)
Product evaluation can also be done through market research, with either individual users or groups of users.
8. “Quick and Dirty” Evaluation
This is feedback about users' or consultants' wishes and preferences, delivered informally to designers about the products they make.
It can be carried out at any stage of product development, and the emphasis is on getting input as quickly as possible rather than on carefully documented findings.
9. Usability Testing
This approach was quite dominant in the 1980s.
It involves measuring user performance on carefully prepared tasks, and the results feed back into the design of the system.
User performance is generally measured by the number of errors made and the time needed to complete the task.
Questionnaires and interviews are also used to ask users about their satisfaction with the system.
Testing is usually done in a laboratory, where the user may be given a particular treatment (e.g., lighting, sound, color) or no treatment at all.
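To make these measures concrete, here is a minimal Python sketch of summarizing user-performance data; the trial records and field names are hypothetical illustrations, not from the slides:

```python
# Summarize user performance from a usability test: errors made and
# time needed to complete the task (hypothetical data).
trials = [
    {"participant": "P1", "seconds": 74.2, "errors": 2},
    {"participant": "P2", "seconds": 61.5, "errors": 0},
    {"participant": "P3", "seconds": 88.9, "errors": 3},
    {"participant": "P4", "seconds": 70.1, "errors": 1},
]

mean_time = sum(t["seconds"] for t in trials) / len(trials)
mean_errors = sum(t["errors"] for t in trials) / len(trials)

print(f"Mean completion time: {mean_time:.1f} s")         # 73.7 s
print(f"Mean errors per participant: {mean_errors:.2f}")  # 1.50
```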
11. Field Studies
In contrast to usability testing, this evaluation is carried out in the original environment where users work. It aims to improve understanding of how users work naturally and how technology affects that work.
This evaluation can be used to:
• Help identify opportunities for new technologies.
• Determine requirements for a design.
• Facilitate the introduction of a technology.
• Evaluate a technology.
12. Field Studies (cont.)
Techniques that can be used:
• Interviews.
• Observation (observations made only by the designers).
• Participatory design (users are involved in designing).
• Ethnography (assessment of the users' culture).
From the data obtained, the designer can evaluate the product both quantitatively and qualitatively.
13. Predictive Evaluation
This evaluation is based on an expert's experience in dealing with users, which is usually used as a benchmark for predicting the problems users will have with a product.
The advantages of this evaluation:
• The intended users do not need to be present.
• The process is relatively fast and inexpensive, and is therefore quite popular with companies.
In recent years, this evaluation has become quite popular.
14. Evaluation Techniques
• Observing users.
• Asking users for their opinions.
• Asking experts for their opinions.
• Testing users' performance.
• Modeling users' task performance to predict the efficacy of a user interface.
15. Relationship Between Paradigms and Evaluation Techniques
How each technique is used in the "Quick and Dirty" paradigm:
• Observing users: important for seeing how users behave in their natural environment.
• Asking users: discussions with users and potential users, individually or in focus groups.
• Asking experts: to obtain criticism of the usability of a prototype.
• User testing: not used.
• Modeling users' task performance: not used.
16. Relationship Between Paradigms and Evaluation Techniques (cont.)
How each technique is used in the Usability Testing paradigm:
• Observing users: videos and notes are analyzed to identify errors, investigate how the software is used, or calculate time performance.
• Asking users: user opinions are collected with a satisfaction questionnaire; interviews are sometimes used to obtain more detailed opinions.
• Asking experts: not used.
• User testing: carried out in a laboratory.
• Modeling users' task performance: not used.
17. Relationship Between Paradigms and Evaluation Techniques (cont.)
How each technique is used in the Field Studies paradigm:
• Observing users: carried out in any location; in ethnographic studies, the evaluators participate in the users' environment.
• Asking users: evaluators can interview participants or discuss what they observe with them.
• Asking experts: not used.
• User testing: not used.
• Modeling users' task performance: not used.
18. Relationship Between Paradigms and Evaluation Techniques (cont.)
How each technique is used in the Predictive paradigm:
• Observing users: not used.
• Asking users: not used.
• Asking experts: an expert applies their benchmarks to a design to predict the efficacy of an interface.
• User testing: not used.
• Modeling users' task performance: a model is used to predict the efficacy of an interface or to compare the time performance of different versions.
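As an illustration of such performance modeling, one well-known approach is the Keystroke-Level Model (the slides do not name a specific model, so this choice is an assumption). A minimal Python sketch comparing the predicted times of two hypothetical versions of a task:

```python
# Keystroke-Level-Model style estimate (illustrative assumption; the
# slides do not name a specific model). Standard operator times:
# K = keystroke, P = point with mouse, H = move hand between devices,
# M = mental preparation.
OPERATOR_SECONDS = {"K": 0.2, "P": 1.1, "H": 0.4, "M": 1.35}

def predict_time(operators: str) -> float:
    """Sum the standard times for a sequence of KLM operators."""
    return sum(OPERATOR_SECONDS[op] for op in operators)

# Two hypothetical versions of a "save file" task:
menu_version = "MHPKPK"   # think, reach for mouse, point to menu, click, point to Save, click
shortcut_version = "MKK"  # think, press the two keys of a keyboard shortcut

print(f"Menu version:     {predict_time(menu_version):.2f} s")      # 4.35 s
print(f"Shortcut version: {predict_time(shortcut_version):.2f} s")  # 1.75 s
```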
19. Likert Scale
The Likert scale is widely used for conducting evaluations.
Scale sizes range from 4 to 7 points.
4-point scale: 1 = very bad, 2 = bad, 3 = good, 4 = very good.
5-point scale: 1 = very bad, 2 = bad, 3 = neutral, 4 = good, 5 = very good.
20. Likert Scale (cont.)
7-point scale: 1 = very bad, 2 = bad, 3 = rather bad, 4 = neutral, 5 = rather good, 6 = good, 7 = very good.
Research generally uses the 5-point scale.
21. Example Evaluation
Imagine the Amikom web site, then give it an assessment:

Criteria                                 Evaluator            Average
                                         1    2    3    4    5
Layout                                   5    4    4    3    4    4
Access speed                             3    4    3    3    4    3.4
Access procedures (e.g., KHS, KRS)       4    4    5    3    4    4
Color combination                        4    4    2    4    2    3.2
Information that is always up to date    5    4    3    4    4    4.2
Average                                                           3.76
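The averages in this table can be reproduced with a short computation. A minimal Python sketch (the variable names and data structure are illustrative):

```python
# Per-criterion and overall averages for the five evaluators' scores
# from the table above.
scores = {
    "Layout":                                [5, 4, 4, 3, 4],
    "Access speed":                          [3, 4, 3, 3, 4],
    "Access procedures (e.g., KHS, KRS)":    [4, 4, 5, 3, 4],
    "Color combination":                     [4, 4, 2, 4, 2],
    "Information that is always up to date": [5, 4, 3, 4, 4],
}

criterion_means = {c: sum(v) / len(v) for c, v in scores.items()}
overall = sum(criterion_means.values()) / len(criterion_means)

for criterion, mean in criterion_means.items():
    print(f"{criterion}: {mean:.1f}")
print(f"Overall average: {overall:.2f}")  # 3.76
```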
22. Example Evaluation (cont.)
From these results, the evaluators' overall opinion falls between neutral and good, since the average value is 3.76 on a 5-point scale.
The best-rated criterion is information that is always up to date (4.2), while the criterion that needs the most attention is the color combination (3.2).