cs3240 09 Qualitative Evaluation

This lecture discusses qualitative evaluation methods in usability engineering, emphasizing the importance of real user feedback and naturalistic observation. It covers various techniques such as heuristic evaluation, think-aloud methods, and constructive interaction, highlighting their benefits and challenges. The session concludes with practical steps for conducting user tests effectively.

Lecture 9: Qualitative Evaluation
Shengdong Zhao

Acknowledgement:
Some material in this lecture is from Saul Greenberg, Maneesh Agrawala, Scott Klemmer, Richard Davis, etc.
Used with permission.

1
The Design Process: Design → Prototype → Evaluate

[Koberg & Bagnall] Design Thinking Workshop

5
Naturalistic approach
•Observation occurs in realistic setting
– real life

•Problems
– hard to arrange and do
– time consuming
– may not generalize

6
Usability engineering approach
Is the test result relevant to the usability of real products in real use outside of the lab?
Problems
– non-typical users
– non-typical tasks
– different physical environment
– different social context
• experimenter vs. boss

•Partial Solution
– use real users
– task-centered system design tasks
– environment similar to real situation

7
Discount usability evaluation
•Low cost methods to gather usability problems
– approximate: capture most large and many minor
problems

•How?
– Qualitative:
• observe interactions
• gather explanations
• produces description
• anecdotes, transcripts, problem areas, critical incidents…

– Quantitative*
• count, log, measure user actions
• speed, error rate, counts of activities

8
Qualitative vs. Quantitative

This week: qualitative (words). Next week: quantitative (numbers).

9
Discount usability evaluation
•Methods
– inspection
– extracting the conceptual model
– direct observation
• think-aloud
• constructive interaction
• retrospective think-aloud
– query techniques (interviews and questionnaires)
– continuous evaluation (user feedback and field studies)
10
Inspection
• Designer tries the system (or prototype)
– does the system “feel right”?
– benefits
• catch major problems early

– problems
• not reliable
• not valid
• intuitions can be wrong

• Inspection methods
– task centered walkthroughs
– heuristic evaluation

11
Heuristic Evaluation
Usability Heuristics
“Rules of thumb” describing features of usable systems
– Can be used as design principles
– Can be used to evaluate a design
Example: Minimize users’ memory load
Pros and cons
– Easy and inexpensive
• No need for users
• Catch many design flaws
– More difficult than it seems
• Not a simple checklist
• Cannot assess how well the interface will address user goals
Heuristic Evaluation
Developed by Jakob Nielsen and Rolf Molich (1990); revised by Nielsen (1994)
Original Heuristics
H1-1: Simple and natural dialog
H1-2: Speak the users’ language
H1-3: Minimize users’ memory load
H1-4: Consistency
H1-5: Feedback
H1-6: Clearly marked exits
H1-7: Shortcuts
H1-8: Precise & constructive error messages
H1-9: Prevent errors
H1-10: Help and documentation
Revised Heuristics
Also developed by Nielsen.
– Based on factor analysis of 249 usability problems
– A prioritized, independent set of heuristics
Revised Heuristics
H2-1: Visibility of system status
H2-2: Match system and real world
H2-3: User control and freedom
H2-4: Consistency and standards
H2-5: Error prevention
H2-6: Recognition rather than recall
H2-7: Flexibility and efficiency of use
H2-8: Aesthetic and minimalist design
H2-9: Help users recognize, diagnose and recover from errors
H2-10: Help and documentation
Heuristic: Visibility (Feedback)

searching database for matches

H2-1: Visibility of system status


Heuristic: Visibility (Feedback)
Users should always be aware!

Feedback: Toolbar, cursor, ink


Heuristics (H2-2): Match System & World
Heuristics: Match System & World
Speak users’ language

•Withdrawing money at ATM

•Use meaningful mnemonics, icons and abbreviations


Heuristics (H2-3): Control & Freedom

“Exits” for mistaken choices, undo, redo


Don’t force down fixed paths …
Heuristics: Control & Freedom
• Mark exits: Users don’t like to be trapped!

• Strategies
– Cancel button (or Esc key) for dialog
• Make the cancel button responsive!
– Universal undo
Heuristics: Consistency

H2-4: Consistency and standards


Heuristics: Errors and Memory

H2-5: Error prevention

H2-6: Recognition rather than recall


– Make objects, actions, options, & directions visible or easily retrievable
Heuristic: Errors and Memory
• Promote recognition over recall
– Recognition is easier than recall

• Describe expected input clearly


– Don’t allow for incorrect input
Heuristics: Flexibility
Edit
Cut ctrl-X
Copy ctrl-C
Paste ctrl-V

H2-7: Flexibility and efficiency of use


– Accelerators for experts (e.g., gestures, shortcuts)
– Allow users to tailor frequent actions (e.g., macros)
Heuristics: Aesthetics

H2-8: Aesthetic and minimalist design


– No irrelevant information in dialogues
Heuristics: Help Users

H2-9: Help users recognize, diagnose, and recover from errors
Good Error Messages

From Cooper’s “About Face 2.0”


Heuristics: Docs
H2-10: Help and documentation
– Easy to search
– Focused on the user’s task
– List concrete steps to carry out
– Not too long
Heuristics: Docs

32
The Process of Heuristic Evaluation
Phases of Heuristic Eval. (1-2)
1) Pre-evaluation training

2) Evaluation
– Individuals evaluate interface then aggregate results

– Work in 2 passes
• Overview -> Details

– Each evaluator produces list of problems


Phases of Heuristic Eval. (3-4)
3) Severity rating
– Cosmetic << minor << major << catastrophic

4) Debriefing
– Discuss outcome
– Suggest solutions
– Assess difficulty to fix
Examples
Can’t copy info from one window to another
– Violates “User control and freedom” (H2-3)
– Violates “Recognition rather than recall” (H2-6)
– Violates “Flexibility and efficiency of use” (H2-7)
– Fix: allow copying

Typography uses mix of upper/lower case formats and fonts


– Violates “Consistency and standards” (H2-4)
– Slows users down
– Fix: pick a single format for entire interface

– Probably wouldn’t be found by user testing


Severity Rating
Used to allocate resources to fix problems

Estimates of need for more usability efforts

Combination of
– Frequency
– Impact
– Persistence (one time or repeating)

Should be calculated after all evaluations are in

Should be done independently by all judges


Levels of Severity
0 - don’t agree that this is a usability problem

1 - cosmetic problem

2 - minor usability problem

3 - major usability problem; important to fix

4 - usability catastrophe; imperative to fix
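Since judges rate independently, the individual ratings are typically combined afterwards, e.g. by averaging per problem and sorting by severity. A minimal Python sketch of that aggregation (the problem names and scores below are made-up illustrations, not data from a real evaluation):

```python
# Combine independent 0-4 severity ratings by averaging per problem,
# then list the most severe problems first so fixes can be prioritized.
# All problems and scores here are illustrative.
ratings = {
    "inconsistent Save / Write file labels": [3, 4, 3],
    "cannot copy between windows": [3, 3, 2],
    "mixed fonts in dialogs": [1, 2, 1],
}

# Mean severity per problem across all judges.
mean_severity = {p: sum(s) / len(s) for p, s in ratings.items()}

# Print in descending order of severity.
for problem, sev in sorted(mean_severity.items(), key=lambda kv: -kv[1]):
    print(f"{sev:.2f}  {problem}")
```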


Severity Ratings Example
1. [H2-4 Consistency] [Severity 3] [Fix 0]
The interface used the string "Save" on the first screen for saving the user's file, but used the string "Write file" on the second screen. Users may be confused by this different terminology for the same function.
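A report in this bracketed style can be generated from structured fields; here is a small sketch (the `Finding` class and its field names are my own invention, chosen to mirror the slide's layout, not a standard API):

```python
from dataclasses import dataclass

# A hypothetical record for one heuristic-evaluation finding, printed in
# the "[Heuristic] [Severity s] [Fix f]" style shown on the slide above.
@dataclass
class Finding:
    heuristic: str       # e.g. "H2-4 Consistency"
    severity: int        # 0-4 severity scale
    fix_difficulty: int  # 0 = easy to fix
    description: str

    def report(self) -> str:
        return (f"[{self.heuristic}] [Severity {self.severity}] "
                f"[Fix {self.fix_difficulty}]\n{self.description}")

f = Finding("H2-4 Consistency", 3, 0,
            'The interface used "Save" on the first screen but '
            '"Write file" on the second for the same function.')
print(f.report())
```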
Debriefing
• Conduct with evaluators, observers, and development
team members

• Discuss general characteristics of UI

• Suggest improvements to address major usability problems

• Development team rates how hard things are to fix

• Make it a brainstorming session


– Little criticism until end of session
Number of Evaluators
Single evaluator achieves poor results
– Only finds 35% of usability problems
– 5 evaluators find ~ 75% of usability problems
– Why not more evaluators???? 10? 20?
• Adding evaluators costs more
• Many evaluators won’t find many more problems

But always depends on market for product:
– popular products → high support cost for small bugs
Decreasing Returns

[Figure: problems found, and benefits/cost, as a function of the number of evaluators]

Caveat: Graphs are for a specific example
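The shape of these curves is commonly modeled with Nielsen and Landauer's problem-discovery formula, found(n) = N·(1 − (1 − λ)^n), where N is the total number of problems and λ is the probability that a single evaluator finds any given problem. A quick sketch (λ = 0.31 is a commonly cited value from their data; treat both parameters as illustrative assumptions):

```python
# Expected number of problems found by n independent evaluators, assuming
# each evaluator finds any given problem with probability lam:
#   found(n) = N * (1 - (1 - lam)**n)
def problems_found(n, total_problems=100, lam=0.31):
    return total_problems * (1 - (1 - lam) ** n)

# Diminishing returns: each extra evaluator adds less than the last.
for n in (1, 3, 5, 10):
    print(n, round(problems_found(n), 1))
```

With these illustrative values, around five evaluators already uncover most of the problems, which is consistent with the 3–5 evaluator recommendation above.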


Summary
• Heuristic evaluation is a discount method

• Have evaluators go through the UI twice

• Have evaluators independently rate severity

• Combine the findings from 3 to 5 evaluators

• Discuss problems with design team

• Cheaper alternative to user testing


– Finds different problems, so good to alternate
In-class Exercise

44
Discount usability evaluation
•Methods
– inspection
– extracting the conceptual model
– direct observation
• think-aloud
• constructive interaction
• retrospective think-aloud
– query techniques (interviews and questionnaires)
– continuous evaluation (user feedback and field studies)
45
Conceptual model extraction
•How?
– show the user static images of
• the prototype or screens
– ask the user to explain
• the function of each screen element
• how they would perform a particular task
•What?
– Initial conceptual model (first time)
– Formative conceptual model (later)

•Value?
– good for eliciting people’s understanding before & after
use
– poor for examining system exploration and learning

46
Direct observations
•Evaluator observes users interacting with system
– in lab:
• user asked to complete a set of pre-determined tasks
– in field:
• user goes through normal duties

•Value?
– excellent at identifying gross design/interface problems
– validity depends on how controlled/contrived the situation
is

47
Simple observation method
•User is given the task
•Evaluator just watches the
user

•Problem
– does not give insight into the
user’s decision process or
attitude

48
Think aloud method
Users speak their thoughts while doing the task
(“Hmm, what does this do? I’ll try it… Ooops, now what happened?”)

•gives insight into what the user is thinking
•most widely used evaluation method in industry

•However:
– unnatural (awkward and uncomfortable)
– hard to talk if they are concentrating
– may alter the way users do the task

49
Initial Conceptual Model and
Think Aloud Exercise

Techniques used:
• Conceptual Model
• Think Aloud

50
51
52
Problems of Think aloud method
However:
– unnatural (awkward and uncomfortable)
– hard to talk if they are concentrating
– may alter the way users do the task
(“Hmm, what does this do? I’ll try it… Ooops, now what happened?”)

54
Constructive interaction method
•Two people work together on a task
– monitor their normal conversations
(“Oh, I think you clicked on the wrong icon” “Now, why did it do that?”)

Co-discovery learning
– use semi-knowledgeable “coach” and novice
– only novice uses the interface
• novice asks questions
• coach responds
– gives insights into two user groups
55
RTA – Retrospective Think Aloud
•Users first complete the task and verbalize after
•Process is observed and recorded with notes

•Benefits
–Verbalizing on a higher level
–More relaxed
–Fabrication not a problem

Ref: Zhiwei Guan, Shirley Lee, Elisabeth Cuddihy, Judith Ramey

57
Comparing Eye-tracking Patterns

58
What you have learned
• Why do we need evaluation?
• Different stages where usability evaluation applies
• What does a usability room look like?
• A number of discount usability evaluation methods
– Inspection
– Initial conceptual model
– Direct observation
• Think aloud
• Constructive interaction
• RTA (Retrospective Think Aloud)

59
Steps to Prepare and Conduct a User Study

60
Preparing for a User Test
• Objective: narrow or broad?
• Design the tasks
• Decide on whether to use video/audio
• Choose the setting
• Representative users

61
User Test
• Roles:
– Greeter
– Facilitator: Help users
to think aloud…
– Observers: record
“critical incidents”

62
Critical Incidents
• Critical incidents are unusual or interesting
events during the study.
• Most of them are usability problems.
• They may also be moments when the user:
– got stuck, or
– suddenly understood something
– said “that’s cool” etc.

63
The User Test
• The actual user test will look something like
this:
– Greet the user
– Explain the test
– Collect Demographic Information
– Get user’s signed consent
– Demo the system
– Run the test (maybe ½ hour)
– Post-Interview & Questionnaire
– Debrief

64
10 Steps to better evaluation

65
10 steps to better evaluation
1. Introduce yourself
– some background will
help relax the subject.

66
10 steps
2. Describe the purpose of the observation (in
general terms), and set the participant at ease
– You're helping us by trying out this product in its
early stages.
– If you have trouble with some of the tasks, it's the
product's fault, not yours. Don't feel bad; that's
exactly what we're looking for.

67
10 steps (contd.)
3. Tell the participant that it's okay to quit at
any time, e.g.:
– Although I don't know of any reason for this to happen,
if you should become uncomfortable or find this test
objectionable in any way, you are free to quit at any
time.

68
10 steps (contd.)
4. Talk about the equipment in the room.
– Explain the purpose of each piece of equipment
(hardware, software, video camera, microphones, etc.)
and how it is used in the test.

69
10 steps (contd.)

5. Explain how to “think aloud.”


– Explain why you want participants to think aloud, and
demonstrate how to do it. E.g.:
– We have found that we get a great deal of information
from these informal tests if we ask people to think aloud.
Would you like me to demonstrate?

70
10 steps (contd.)

6. Explain that you cannot provide help.

71
10 steps (contd.)
7. Describe the tasks and introduce the
product.
– Explain what the participant should do and in
what order. Give the participant written
instructions for the tasks.
– However, don’t demonstrate what
you’re trying to test.

72
10 steps (contd.)
8. Ask if there are any questions before you
start; then begin the observation.

73
10 steps (contd.)

9. Conclude the observation. When the test is over:
– Explain what you were trying to find.
– Answer any remaining questions.
– Discuss any interesting behaviors you would like the
participant to explain.

74
10 steps (contd.)
10. Use the results.
– When you see participants making mistakes, you
should attribute the difficulties to faulty product
design, not to the participant.

75
Using the Results
• Update task analysis and rethink design
– Rate severity & ease of fixing problems
– Fix both severe problems & make the easy fixes
• Will thinking aloud give the right answers?
– Not always
– If you ask a question, people will always give an answer, even if it has nothing to do with the facts
– Try to avoid leading questions

76
Questions?
High-order summary:
• Follow a loose master-apprentice model
• Observe, but help the user describe what they’re
doing
• Keep the user at ease

77
How many users should you observe?
• Problems
– observing many users is expensive
– but individual differences matter
• best user 10x faster than slowest
• best 25% of users ~2x faster than slowest 25%

• Partial solution
– reasonable number of users
with reasonable range
– big problems usually detected with 3-5 users
– small problems / fine measures need many users

78
In-class Exercise

79
In-class Exercise
• Procedure
– Greet the user
– Explain the test
• (How to be consistent?)
– Collect demographic information (how?)
– Get user’s signed consent
– Demo the system
• (how to be consistent in demoing the system to different users?)
– Run the test (maybe ½ hour)
• (What test and how to conduct it?)
– Post-study questionnaire
• (What type of questions you want to ask?)
– Debrief
• Work in pairs, and describe briefly what you will do in each step

80
What have we prepared?

81
Pre-Study Questionnaire

82
Example Questionnaires
http://oldwww.acm.org/perlman/question.html

83
A Simple Questionnaire

84
A More Comprehensive One

85
Even More Comprehensive

86
Next Time
• Quantitative Evaluation

87
