Evaluation of usability can be done both during design and afterwards. The usefulness of all the guidelines, heuristics and other aids is related to the kind of evaluation that is being conducted. In terms of our layered usability model, evaluating with users should be done by looking at the usage indicators. When evaluating during design without users, the usage indicators cannot provide any data, and designers can only look at how the means are used and estimate their impact.

Figure 2.4: Usage indicators, means and software quality
2.8.1 Measuring usability
Evaluating with users is a good method for obtaining data about the actual usage. Using scenarios and other techniques, data about the number of errors or the speed of performance can be obtained, which should provide a good indication of the usability of the product. The scenarios should be explicitly related to the usability goals for the system.
The actual measuring activities should be done in the actual context of use for which the system is designed. When this is not possible, usability labs may serve as an inferior surrogate. Measuring the usage indicators is not always easy: task performance times are easy to measure, but satisfaction and memorability are harder to quantify. Using questionnaires such as QUIS and SUMI, a more standardized measurement can be obtained, although these are only valid for certain classes of systems. Usage indicators can be measured using usability metrics; lists of such metrics are given in Whiteside et al. (1988) and Nielsen (1993). Table 2.5 summarizes the given metrics.
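Several of the metrics in Table 2.5 can be computed mechanically from logged user sessions. The sketch below assumes a hypothetical event-log format; the event names and helper functions are invented for illustration only.

```python
from datetime import datetime

# Hypothetical session log: (timestamp, event) pairs as a usability lab
# tool might record them; the event names are assumptions for illustration.
log = [
    ("09:00:00", "task_start"),
    ("09:00:12", "command"),
    ("09:00:30", "error"),
    ("09:00:41", "command"),
    ("09:01:05", "help_opened"),
    ("09:01:20", "help_closed"),
    ("09:01:45", "task_complete"),
]

def parse(t):
    return datetime.strptime(t, "%H:%M:%S")

def task_time(log):
    """Performance Time: seconds from task_start to task_complete."""
    start = next(parse(t) for t, e in log if e == "task_start")
    end = next(parse(t) for t, e in log if e == "task_complete")
    return (end - start).total_seconds()

def error_count(log):
    """Errors: number of error events."""
    return sum(1 for _, e in log if e == "error")

def help_time(log):
    """Learnability: seconds spent in help or documentation."""
    opened = [parse(t) for t, e in log if e == "help_opened"]
    closed = [parse(t) for t, e in log if e == "help_closed"]
    return sum((c - o).total_seconds() for o, c in zip(opened, closed))

print(task_time(log), error_count(log), help_time(log))  # 105.0 1 15.0
```

Subjective indicators such as satisfaction cannot be derived from such logs, which is why questionnaires like QUIS and SUMI remain necessary.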
Evaluation during the design process is more problematic than evaluating with users.
Although mockups and paper prototypes can be tested with users, the usage indicators cannot be evaluated directly. What can be done is to look at the means that influence the usage indicators. Using walkthroughs and scenarios, each of the means can be evaluated by looking at the way it is present in the design and by estimating its positive or negative impact on the usage indicators. For instance, it can be checked whether a design is consistent with a platform’s style guide or whether sufficient warnings are given. This is where the heuristics and ergonomic criteria of Scapin & Bastien (1997) and Nielsen (1993) are very useful. This kind of early evaluation does not replace the need for late evaluation with users, but it can contribute to making good choices of means.

Table 2.5: Usability Metrics

Performance Time:
• Time to complete a specific task
• Number of commands used
• Percent of task completed per unit time
• Relative time spent in physical actions
• Relative time spent in mental actions
• Time spent waiting for the system to respond
• Number of tasks that can be completed within a given time limit

Memorability:
• Number of regressive behaviors
• Number of system features users can remember afterwards

Errors:
• Time spent in errors
• Percent or number of errors
• Number of repetitions of failed commands
• Number of immediately subsequent erroneous actions

Learnability:
• Time spent using help or documentation
• Frequency of help and documentation use
• Ratio of users using an effective vs. an ineffective strategy

Satisfaction:
• Number of good and bad features recalled by users
• Percent of favorable/unfavorable user comments
• Number of users preferring your system
• Number of times user expresses frustration or satisfaction

Task Completion:
• Number of times the interface misleads the user
• Percent of task completed
• Number of available commands not invoked
• Ratio of successes to failures
• Number of runs of successes and of failures
• Number of times users need to work around a problem
• Number of times the user is disrupted from a work task
• Number of times the user loses control of the system
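One way to make such an early evaluation concrete is to record each walkthrough finding together with the means involved and its estimated impact on a usage indicator. The sketch below is a minimal illustration; the record fields and the example findings are assumptions, not part of the model.

```python
from dataclasses import dataclass

# A minimal record for walkthrough findings; the field names and the
# example findings are hypothetical, made up for illustration.
@dataclass
class Finding:
    means: str       # e.g. "consistency", "warnings", "feedback"
    indicator: str   # usage indicator it is estimated to affect
    impact: int      # estimated impact: +1 positive, -1 negative
    note: str

findings = [
    Finding("consistency", "Learnability", -1,
            "Dialog buttons deviate from the platform style guide"),
    Finding("warnings", "Errors", -1,
            "No warning before overwriting an existing file"),
    Finding("feedback", "Satisfaction", +1,
            "Progress bar shown during long operations"),
]

# Aggregate the estimated impact per usage indicator to see where
# the design is most likely to cause usability problems.
totals = {}
for f in findings:
    totals[f.indicator] = totals.get(f.indicator, 0) + f.impact
print(totals)  # {'Learnability': -1, 'Errors': -1, 'Satisfaction': 1}
```

The aggregated estimates only indicate where problems are likely; they are no substitute for the measured usage indicators obtained from evaluation with users.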
Another way of ensuring usability during the design process is by using formal design models. Many models and techniques exist for describing designs using formal notations. State charts, GOMS (Card et al. 1983), ConcurTaskTrees (Palanque & Paternò 1997) and similar notations are used to describe designs. These kinds of notations are usually strong in describing structural aspects of a design (the dialog structure) and very weak at describing presentational aspects. In (Payne & Green 1989) Payne says, “as far as TAG is concerned, the screen could be turned off”. Although ETAG (Tauber 1990) does not consider presentational aspects either, it does deal with functionality.
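As a small illustration of what such a formal model can and cannot capture, the Keystroke-Level Model variant of GOMS predicts expert, error-free task times from a sequence of primitive operators. The operator times below are the commonly cited averages from the literature and should be treated as rough approximations; the example task is invented.

```python
# A sketch of a Keystroke-Level Model (KLM) estimate in the spirit of
# GOMS (Card et al. 1983). The operator times are commonly cited
# averages; treat them as approximations, not measurements.
OPERATOR_TIME = {
    "K": 0.2,   # keystroke (skilled typist)
    "P": 1.1,   # point at a target with a mouse
    "H": 0.4,   # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
    "B": 0.1,   # mouse button press or release
}

def klm_estimate(operators):
    """Predicted expert task time in seconds for a sequence of operators."""
    return sum(OPERATOR_TIME[op] for op in operators)

# Hypothetical task: think, point at a menu, click, think, type "ok".
sequence = ["M", "H", "P", "B", "B", "M", "H", "K", "K"]
print(round(klm_estimate(sequence), 2))  # 5.2
```

Note that the prediction says nothing about presentation, errors or warnings, which illustrates exactly the limitation of formal models discussed in the text.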
Reisner’s Action Language (Reisner 1983) also allows knowledge sources to be modeled, e.g. it is possible to model whether a user can get needed information from the screen or from working memory.
In relation to the means of our model, this is already a big limitation, since many means such as consistency, warnings or feedback are strongly related to presentational aspects. A heuristic that says “speak the user’s language” is difficult to deal with using formal models. Another factor is that most formal models are built from the viewpoint of describing “correct” use of the application and therefore do not describe error handling or the issuing of warnings. For formal models to be really valuable, they should include the context of use as well and relate properties of the system model to the context models.
2.8.2 Improving usability
When an evaluation shows that usability needs to be improved, the problem is to find out which means need to be changed and how. As was mentioned earlier, a means may have a positive effect on one usage indicator while having a negative effect on another. In some cases it may be obvious how to improve usability, but problems of a more structural kind may not be so simple to solve. In that case, the designer has to take a step back and look at the knowledge domains again. The knowledge domains are the only sources for judging why and how a means is to be changed. For instance, when task conformance is seen as a problem, the task model can give the designer information about what is wrong with the task conformance. Similarly, the user model may give information about memory limitations, which may require the design to have more or better feedback on user actions. Unfortunately, the knowledge domains are not always available or written down in a way that makes them easy to use in practice. Task models may not contain the right information, or the available guidelines may not say anything about a particular problem.
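One way to structure this decision is to tabulate, for each candidate change of a means, its estimated effect on the usage indicators, and then look for the change that best serves the indicator causing problems. The sketch below is purely illustrative; the candidate changes and scores are invented assumptions, not data.

```python
# A hypothetical trade-off matrix: the estimated effect of each candidate
# change on the usage indicators (+1 helps, -1 hurts; omitted = neutral).
# The candidate changes and numbers are illustrative assumptions only.
effects = {
    "add warnings":   {"Performance Time": -1, "Errors": +1},
    "more feedback":  {"Performance Time": -1, "Errors": +1, "Satisfaction": +1},
    "fewer commands": {"Errors": +1, "Learnability": +1},
}

def best_change(effects, indicator):
    """Pick the candidate change with the best estimated effect on an indicator."""
    return max(effects, key=lambda change: effects[change].get(indicator, 0))

print(best_change(effects, "Satisfaction"))  # more feedback
print(best_change(effects, "Learnability"))  # fewer commands
```

Such a matrix makes the trade-offs explicit, but the scores themselves still have to be justified from the knowledge domains, as argued above.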
2.8.3 Usability process improvement
Designing for usability should be a focus in the system development process. However, most organizations have not sufficiently recognized the importance of usability and have not incorporated it in their current design methods. Within the field of software engineering the concept of maturity is often used to indicate how good the process is in a certain area. To indicate how well an organization is dealing with usability, a usability maturity scale was developed in the INUSE project (Earthy 1999). The levels are:
• X: Ignorance - “We don’t have problems with usability”, Usability is not discussed as an issue.
• A: Uncertainty - “We don’t know why we have problems with usability”, User-Centred processes are not implemented, or fail to achieve their purpose
• B: Awakening - “Is it absolutely necessary to always have problems with usability?”, User-Centred processes are implemented but are performed by inappropriate staff using sub-optimal methods
• C: Enlightenment - “Through management commitment and improvement of human-centered processes we are identifying and resolving our problems”, User-Centred processes are implemented and produce results, but these results do not always give the expected benefits to the software development process
• D: Wisdom - “Usability defect prevention is a routine part of our operation”, User-Centred processes are integrated into the software life cycle and used to improve all work products
• E: Certainty - “We know why we do not have problems with usability”, The culture of the organization is user-centered
The experience of the INUSE project was that much of European industry is at level X, A or sometimes B on this scale. This shows that there is still a strong need for methods that incorporate usability aspects into current design practice. Obviously, most companies still need to reach the awakening stage before the necessary changes can occur.