Usability Evaluation
Evaluation falls into four types, according to its purpose:
Exploratory - how is it (or will it be) used?
Predictive - how good will it be?
Formative - how can it be made better?
Summative - how good is it?
Exploratory Evaluation
Explores current usage and the potential design space for new designs.
Done before interface development.
Learn which software is used, how often, and what for.
Collect usage data – statistical summaries and observations of usage.
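As a rough illustration of the statistical summaries mentioned above, the following sketch aggregates a small usage log into per-application session counts and mean session lengths. The log format, the participant and application names, and the function summarize_usage are all hypothetical and chosen only for this example.

```python
from collections import Counter
from statistics import mean

# Hypothetical usage log: (participant, application, minutes used in one session).
usage_log = [
    ("p1", "word processor", 30),
    ("p1", "spreadsheet", 10),
    ("p2", "word processor", 50),
    ("p2", "email client", 20),
    ("p3", "email client", 40),
]

def summarize_usage(log):
    """Return, per application, the number of sessions and the mean session length."""
    sessions = Counter(app for _, app, _ in log)
    summary = {}
    for app, count in sessions.items():
        durations = [minutes for _, a, minutes in log if a == app]
        summary[app] = {"sessions": count, "mean_minutes": mean(durations)}
    return summary

summary = summarize_usage(usage_log)
for app, stats in summary.items():
    print(app, stats)
```

A summary like this answers only "which software, how often"; the "what for" part still has to come from observing or interviewing users.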
Predictive Evaluation
Estimates the overall quality of an interface (like a summative evaluation, but a prediction made in
advance). Done once a design is complete, but before implementation begins.
Formative Evaluation
In system testing, development costs can be minimized by finding bugs as early as possible in the
software development cycle. The same applies to usability ‘bugs’ – it is far more useful to identify
potential problems before the system is built than after it. Formative evaluation describes studies that are
carried out as part of the design process. To some extent, formative evaluation can be carried out simply
by inviting usability experts, or representative users, to review product plans and specifications, and offer
their opinions. A more formalized approach to soliciting user opinions is participatory design, in which
representative users take part in design activities. These activities can be structured so that users do
not have to learn too much technical jargon, but can concentrate on the way they are likely to interact
with the user interface.
Formative Evaluation using Cognitive Walkthroughs
The Cognitive Walkthrough method is a structured analytic approach to assessing usability early
in the project. Such evaluations are usually based on the following model of user behavior.
Behavior model
The model of a user carrying out a task through exploratory learning involves four basic phases:
1. The model describes how a notional user sets a goal to be accomplished with the system.
A typical goal will be expressed in terms of the expected capabilities of the system, such
as “check spelling of this document”.
2. The model describes how the notional user searches the interface for currently available
actions. The availability of actions may be observable as the presence of menu items, of
buttons, of available command-line inputs, etc.
3. The model describes how the notional user selects the action that seems likely to make
progress toward the goal.
4. The model describes how the notional user performs the selected action and evaluates the
system's feedback for evidence that progress is being made toward the current goal.
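The phases of this model can be sketched as a small simulation. This is only a toy, not the Cognitive Walkthrough method itself: the INTERFACE dictionary, its screen and action names, and the word-overlap test in label_matches_goal are all assumptions made for illustration. In a real walkthrough, an analyst judges whether a label suggests the goal; here a crude shared-word check stands in for that judgement.

```python
# Hypothetical interface: each screen maps visible action labels to
# (next screen, feedback message). All names are illustrative only.
INTERFACE = {
    "editor": {
        "File": ("file menu", "File menu opened"),
        "Tools": ("tools menu", "Tools menu opened"),
    },
    "tools menu": {
        "Check spelling": ("spell checker", "Spelling check started"),
        "Word count": ("editor", "Document has 120 words"),
    },
    "file menu": {
        "Save": ("editor", "Document saved"),
    },
}

def label_matches_goal(label, goal):
    """Crude stand-in for analyst judgement: does the label share a word with the goal?"""
    return bool(set(label.lower().split()) & set(goal.lower().split()))

def walk_through(goal, start, intended_actions):
    """Follow the designer's intended action sequence and list likely problems."""
    problems = []
    screen = start
    for action in intended_actions:
        available = INTERFACE[screen]      # search: which actions are visible now?
        if action not in available:
            problems.append(f"'{action}' is not visible on '{screen}'")
            break
        if not label_matches_goal(action, goal):   # select: does it look promising?
            problems.append(f"'{action}' does not obviously relate to the goal")
        screen, feedback = available[action]       # perform, then evaluate feedback
        if not feedback:
            problems.append(f"no feedback after '{action}'")
    return problems

goal = "check spelling of this document"
print(walk_through(goal, "editor", ["Tools", "Check spelling"]))
```

In this made-up interface the walkthrough flags the "Tools" menu: nothing in its label suggests spell checking, so a user exploring the interface might not open it.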
Summative Evaluation
Summative evaluation, often performed under the umbrella of usability testing, is carried out at
the end of a project after the system has been built, to assess whether it meets its specification, or
whether a project was successful. This is in contrast to formative evaluation, where the main
objective is to contribute to the design of the product, by assessing specifications or prototypes
before the system has been built. Formative evaluation is often analytic (it proceeds by reasoning
about the design), while summative evaluation is often empirical (it proceeds by making
observations or measurements). Summative evaluation is also used frequently in research
situations, where the performance of a new interaction technique is assessed for scientific
publication. However, summative evaluation is less common in commercial settings than in
academic ones.
Bad Evaluation Techniques
Some user interface developers use evaluation techniques that are practically useless.
Unfortunately, these techniques can even be found in some published research in computer
science.
When users are shown a shiny new interface next to a tatty old one, they will often say
that they like the new one better, regardless of its usability. There are many circumstances in
which a person's introspective feelings about their mental performance are not a good predictor of
actual performance, so such self-reports are unreliable as well as open to bias.
There is a great deal of variation between different people in their ability to use different
interfaces. This may result from different mental models, different cognitive skills, different
social contexts and many other factors. Any conclusions drawn from observing only one person
must therefore be very suspect. Unfortunately, many user interfaces are developed based on
observations of a single person: the programmer. The developer's introspection about his or her
own performance is seldom relevant to users.
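To make the point about variation between people concrete, the sketch below summarizes task completion times across several participants. The participant labels and times are invented for illustration, and describe_times is a hypothetical helper, not part of any evaluation toolkit.

```python
from statistics import mean, stdev

# Hypothetical completion times (seconds) for the same task, one per participant.
completion_times = {"p1": 42.0, "p2": 95.0, "p3": 61.0, "p4": 130.0, "p5": 58.0}

def describe_times(times):
    """Summarize the spread of task times across participants."""
    values = list(times.values())
    return {
        "participants": len(values),
        "mean": mean(values),
        "stdev": stdev(values),  # sample standard deviation
        "fastest": min(values),
        "slowest": max(values),
    }

summary = describe_times(completion_times)
print(summary)
```

With these made-up numbers the slowest participant takes about three times as long as the fastest, so a conclusion based on any one of them alone could be badly off.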
Modified Soup Analogy
When the cook tastes other cooks’ soups, that’s exploratory.
When the cook assesses a certain recipe, that’s predictive.
When the cook tastes the soup while making it, that’s formative.
When the guests (or food critics) taste the soup, that’s summative.