will peak with many more participants than three to five.
You have to settle for a sensible approach with a practical outcome
Okay, so what is the answer? What is the best number of participants to use in testing? Our friend Jim Foley's answer to any HCI question was never
more appropriate than it is here: It depends! It takes as many participants as it takes for your situation, your application, your design, your resources, and your goals.
You have to decide for yourself, every time you do UX testing, how many participants you can afford. You cannot compute and graph all the curves we have been talking about because you will never know how many UX problems exist in the design, so you will never know what percentage of the existing problems you have found. You do not need those curves, anyway, to have a pretty good intuitive feeling for whether you are still detecting useful new problems with each new participant. Look at the results from each participant and each iteration and ask whether those results were worth it and whether it is worth investing a little more.
Based on how much time and money you can afford and how easily you can recruit participants, use your UX practitioner thinking skills and ask yourself, do you feel lucky? Huh? Do you think there are more reasonably significant UX problems still out there that you can find and fix and thereby improve the product? Then go ahead; make the UX practitioner's day: iterate.
Rigorous Empirical Evaluation: Running the Session
15.1 INTRODUCTION
15.1.1 You Are Here
We begin each process chapter with a "you are here" picture of the chapter topic in the context of the overall Wheel lifecycle template; see Figure 15-1. This chapter is about running a lab-based evaluation, a step of rigorous UX evaluation.
15.2 PRELIMINARIES WITH PARTICIPANTS
15.2.1 Introduce Yourself and the Lab: Be Sure Participants Know What to Expect
In this chapter, our story opens with the arrival of participants at your UX lab. If you have a separate reception room in your UX facility, this is where you meet your participants before getting down to business with evaluation. Greet and welcome each participant and thank them for helping. Bring them in and show them around.
Figure 15-1
You are here, at running the session, within the evaluation activity in the context of the overall Wheel lifecycle template.
Introduce them to the setup and show them the lab. If you have one-way glass, explain it and how it will be used, and show them the other side, what happens "behind the curtain." Openly declare any video recording you will do (which should have been explained in the consent form, too). Make participants feel that they are partners in their role.
Tell your participants all about the design being evaluated and about the process in which they are participating. For example, you might say "We have early screen designs for our product in the form of a low-fidelity prototype of a new system for ____." Tell them how they can help and what you want them to do.
Do your best to relieve anxiety and satisfy curiosity. Be sure that your participants have all their questions about the process answered before you proceed into evaluation. Make it very clear that they are helping you evaluate and that you are not evaluating them in any way. For example, "You are here to evaluate a new design for ____; you will be asked to try some tasks using the computer, to help us find places where the design is not supportive enough for you."
15.2.2 Paperwork
While still in the reception room, or as soon as the participant has entered the participant room:
• Have each participant read the general instructions and explain anything verbally, as needed.
• Have the participant read the institutional review board consent form (Chapter 14) and explain the consent form verbally as well.
• Have the participant sign the consent form (two copies); it must be signed "without duress." You keep one signed copy and give the participant the other signed copy. Your copy must be retained for at least 3 years (the period may vary by organization).
• Have the participant sign a non-disclosure form, if needed.
• Have the participant fill out any demographic survey you have prepared.
A short written demographic survey can be used to confirm that each participant does, indeed, meet the requirements of your intended work activity role and corresponding user class characteristics.
15.2.3 The Session Begins
Give the participants a few minutes to familiarize themselves with the system unless walk-up-and-use is a goal. If you are using benchmark tasks, after the preliminaries and when you both are ready to start, have the participant read the first benchmark or other task description and ask if there are any questions. If you are taking timing data, do not include the benchmark task reading time as part of the task.
Once the evaluation session is under way, interesting things can start happening quickly. Data you need to collect may start arriving in a flood. It can be overwhelming, but, by being prepared, you can make it easy and fun, especially if you know what kinds of data to collect. We will look at the possible kinds of data and the methods for generating and collecting them. But first, we need to get some protocol issues out of the way.
15.3 PROTOCOL ISSUES
Session protocol covers the mechanical details of session setup, your relationship with participants, and how you handle them throughout each session.
15.3.1 Attitude toward UX Problems and toward Participants
Before you actually do evaluation, it is easy to agree that this UX testing is a positive thing and we are all working together to improve the design. However, once you start hearing about problems participants are having with the design, it can trigger unhelpful reactions in your ego, instincts, and pride. Proceed in your testing with a positive attitude and it will pay off.
15.3.2 Cultivating a Partnership with Participants
Take the time to build rapport with your participants. More important to the success of your UX evaluation sessions than facilities and equipment is the rapport you establish with participants, as partners in helping you evaluate and improve the product design. Once in the participant room, the facilitator should take a little time to "socialize" with the participant. If you have taken the participant on a "tour" of your facilities, that will have been
a good start.
If you are using co-discovery techniques (Chapter 12), allow some time for co-discovery partners to get to know each other and do a little bonding, perhaps while you are setting things up. Starting the session as total strangers can make them feel awkward and can interfere with their performance.
15.3.3 Interaction with Participant during the Session
So far, you have done the necessary preparation for your evaluation, including preparation of benchmark task descriptions, procedures, and consent forms, as well as participant preparation. It is finally time to get an evaluation session underway. The facilitator helps ensure that the session runs smoothly and efficiently.
It is generally the job of the facilitator to listen and not talk. But at key junctures you might elicit important data, if it does not interfere with task timing or if you are focusing on qualitative data. You can ask brief questions, such as "What are you trying to do?" "What did you expect to happen when you clicked on the such-and-such icon?" "What made you think that approach would work?"
If you are focusing on qualitative data, the evaluator may also ask leading questions, such as "How would you like to perform that task?" "What would make that icon easier to recognize?" If you are using the "think-aloud" technique for qualitative data gathering, encourage the participant by prompting occasionally: "Remember to tell us what you are thinking as you go."
Do not "blow off" problems perceived by the participant as just, for example, an anomaly in the prototype. If you think that an issue pointed out as a UX problem by a participant is actually not a genuine issue, write it down as a problem for the moment, anyway. Otherwise you will discourage them from mentioning problems that might not be as real as they seem.
If participants show signs of stress or fatigue, give them a break. Let them leave the participant room, walk around, and/or have some refreshments.
Do not be too uptight about the session schedule. It is almost impossible to set a time schedule for tasks and steps for the participant in a session. It is better to present a list of objectives and let the participant know where you both are, as a team, in working toward those goals. To the extent possible, let the participant decide when to take breaks and when to stop for lunch.
15.3.4 To Help the Participant or Not to Help the Participant?
Give hints if necessary, but direct help almost always works against the goals of the session. Sometimes when participants are not making progress, they can benefit from a hint to get them back on track so that their session again becomes useful. You want to see whether the participant can determine how to perform the task. You should not give them information about how to complete a task. So, if participants ask for help, how can you let them know you are there for them without doing some coaching? Often the best way is to lead them to answer their own questions.
For example, do not answer questions such as "Is this right?" directly; instead, respond with your own questions, directing participants to think it through for themselves. With experience, evaluators become very creative at being appropriately evasive while still helping a participant out of a problem without adversely affecting the data collected. Sometimes it helps to tell the participant upfront that you will decline to answer design-related questions so that you can see how the participant interacts with the system. Make note of those questions and answer them at the end of the session.
15.3.5 Keeping Your Participant at Ease
Remind yourself and your whole team that you should never, never laugh at anything during a UX evaluation session. You may be in the control room and think you have a sound-proof setup but laughter has a way of piercing the glass. Because participants cannot see people behind the glass, it is easy for participants to assume that someone is laughing at them.
If participants become visibly flustered, frustrated, "zoned out," or blame themselves continually for problems in task performance, they may be suffering from stress and you should intervene. Take a short break and reassure and calm them. Remind them that "you are evaluating the design; we are not evaluating you." If participants become so discouraged that they want to quit the entire session, there is little you can or should do but thank them, pay them, and let them go.
15.3.6 Protocols for Evaluating with Low-Fidelity Prototypes
Have your paper prototype laid out and ready to go. Well before starting a session with a paper prototype, the team should prepare by assembling all the parts and pieces of the prototype. To prevent the easel (Chapter 11) from moving during the action, consider taping it to the working table between the participant, the facilitator, and the executor, as shown in Figure 15-2.
Figure 15-2
Typical setup at the end of a table for evaluation with a paper prototype.
"Boot up" the prototype by putting the initial "screen" on the easel, having each moving part ready at hand and convenient to find and grab to enter it into the action.
Make sure that any changes made to data and prototype internal states by previous participants are reset to original values for the current participant.
Before each participant enters, the "executor" should arrange everything
necessary for running the prototype, including stacks of prototype parts and other equipment (e.g., marking pens, extra paper or plastic, tape).
Have the whole evaluation team ready to assume their roles and be ready to carry them out in the session.
• Evaluation facilitator, to keep the session moving, to interact with participants, and to take notes on critical incidents (pick a person who has leadership abilities and "people" skills).
• Prototype executor, to move transparencies and provide "computer" responses to user actions (pick a person who knows the design well and is good at the cold and consistent logic of "playing" computer).
• User performance timer, to time participants performing tasks and/or count errors (to collect quantitative data); the timer person may want to practice with a stopwatch a bit before getting into a real session.
• Critical incident note takers (for spotting and recording critical incidents and UX problems).
Review your own in-session protocol. Some of the "rules" we suggest include:
• Team members must not coach participants as they perform tasks.
• The executor must not anticipate user actions and especially must not give the correct computer response for a wrong user action! The person playing computer must respond only to what the user actually does!
• The person playing computer may not speak, make gestures, etc.
• You may not change the design on the fly, unless that is a declared part of your process.
Figure 15-3 shows another view of a participant with a paper prototype.
15.4 GENERATING AND COLLECTING QUANTITATIVE UX DATA
If your evaluation plan calls for taking quantitative data, participants perform prescribed benchmark tasks during a session and evaluators take numeric data. For example, an evaluator may measure the time it takes the participant to perform a task, count the number of errors a participant makes while performing a task, count the number of tasks a participant can perform within a given time period, and so on, depending on the measures established in your UX targets.
15.4.1 Objective Quantitative Data
The main idea is to use participant performance of benchmark tasks as a source of objective (observed) quantitative UX data.
Timing task performance
By far the simplest way to measure time on task is by using a stopwatch manually. It is really the only sensible way for low-fidelity, especially paper, prototypes.
Timing with a stopwatch is also acceptable for software prototypes and is still the de facto standard way, sufficing for all situations except those demanding the most precision. If timing manually, you usually start the timer when the participant has finished reading the benchmark task description, has no questions, and you say "please start."
Try not to use very short benchmark tasks. It can be difficult to get accurate timings with a stopwatch on very short tasks. Something on the order of 5 minutes or more is easy to time.
If precise timing measurements are required, it is possible to embed software timers to instrument the software internally. These routines keep time stamps denoting when execution of the application software enters and exits key software modules. Software timers also free up one data collector for other jobs, such as observing critical incidents, but they do require more post-session work to compile timing data.
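As a minimal sketch of what such instrumentation might look like (in Python, with a hypothetical task label and log file name; the text does not prescribe any particular implementation), a context manager can write enter and exit time stamps to a log that is compiled into per-task times after the session:

```python
import csv
import time
from contextlib import contextmanager

TIMESTAMP_LOG = []  # (label, event, seconds since epoch), collected during the session

@contextmanager
def timed(label):
    """Record time stamps when execution enters and exits an instrumented block."""
    TIMESTAMP_LOG.append((label, "enter", time.time()))
    try:
        yield
    finally:
        TIMESTAMP_LOG.append((label, "exit", time.time()))

def save_log(path="timing_log.csv"):
    """Dump raw time stamps for post-session compilation into per-task times."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(TIMESTAMP_LOG)

# Hypothetical use inside the prototype's task flow:
with timed("BT1: buy special event ticket"):
    time.sleep(0.1)  # stands in for the instrumented interaction code
save_log()
```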
Figure 15-3
Participant pondering a paper prototype during formative evaluation.
Counting user errors
The key to counting errors correctly is in knowing what constitutes an error. Not everything that goes wrong, not even everything a user does wrong, during task performance should be counted as a user error. So what are we looking for? A user error is usually considered to have occurred when the participant takes any action that does not lead to progress in performing the desired task. The main idea is to catch the times that a participant takes a "wrong turn" along the expected path of task performance, especially including cases where the design was inadequate to direct the interaction properly.
Wrong turns include choosing an incorrect item from a menu or selecting the wrong button, choices that do not lead to progress in performing the desired task. Other examples include selecting the wrong menu, button, or icon when the user thought it was the right one, double-clicking when a single click is needed, and vice versa.
If a participant takes a wrong turn but is able to back up and recover, an error has still occurred. In addition, it is important to note the circumstances under which the participant attempted to back up and whether the participant was successful in figuring out what was wrong. These occasions can provide qualitative data on user error recovery strategies that you might not be able to get in any other way.
The simplest way to count user errors during task performance is to use a manual event counter such as a handheld "clicker" for counting people coming through a gate for an event. Manual counters are perfect for low-fidelity, especially paper, prototypes. For software prototypes and operational software applications, if you use video to capture the interactions, you can tag the video stream with error points and the video analysis software can tally the count easily.
What generally does not count as a user error?
Typically, we do not count accessing online help or other documentation as an error. As a practical matter, we also want to exclude any random act of curiosity or exploration that might be interjected by the user (e.g., "I know this is not right, but I am curious what will happen if I click this"). Also a different successful path "invented" by the user is not really an error, but still could be noted as an important observation.
And we do not usually include "oops" errors, what Norman (1990, p. 105) calls "slips." These are errors that users make by accident when, in fact, they know better. For example, the user knows the right button to click but clicks the wrong one, perhaps through a slip of the hand, a brain burp, or being too hasty.
However, we do note an oops error and watch for it again. If it recurs, it may not be random and you should look for a way the design might have caused it.
Finally, we do not usually include typing errors, unless their cause could somehow be traced to a problem in the design or unless the application is about typing.
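A hedged sketch of how an evaluator's logging tool could apply these counting rules: every observed event is recorded with a category, but only events outside the excluded categories (help access, exploration, slips, typing mistakes, alternate successful paths) are tallied as errors. The category names and events are illustrative, not taken from the text:

```python
from collections import Counter

# Event categories that, per the discussion above, do not count as user errors
EXCLUDED = {"help_access", "exploration", "slip", "typo", "alternate_path"}

# Illustrative event log an evaluator (or tagged video) might produce for one task
observed_events = [
    {"task": "BT1", "category": "wrong_menu",   "note": "opened Events instead of Tickets"},
    {"task": "BT1", "category": "slip",         "note": "double-clicked by accident, knew better"},
    {"task": "BT1", "category": "help_access",  "note": "opened online help"},
    {"task": "BT1", "category": "wrong_button", "note": "pressed Back expecting a review step"},
]

def count_errors(events):
    """Tally genuine user errors per task; excluded events stay in the log as qualitative notes."""
    errors = Counter()
    for event in events:
        if event["category"] not in EXCLUDED:
            errors[event["task"]] += 1
    return errors

print(count_errors(observed_events))  # Counter({'BT1': 2})
```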
15.4.2 Subjective Quantitative Data: Administering Questionnaires
If you are using questionnaires, the data collection activity is when you administer them. Have the participant fill out the questionnaires you have chosen per the timing you have decided, such as after a task or at the end of a session.
15.5 GENERATING AND COLLECTING QUALITATIVE UX DATA
In Chapter 12 we discussed how the goal of qualitative data collection is to identify usability problems and their causes within the design. In lab-based testing with participants, this goal is achieved primarily through observation and recording of critical incidents, often with the help of the think-aloud technique.
15.5.1 Collecting Critical Incident Information
The way that you collect and tag critical incident data as you go will have much to say about the accuracy and efficiency of your subsequent data analysis. Get your detailed critical incident and UX problem information recorded as clearly, precisely, and completely as you can in real time during data collection.
Do not count on going back and reviewing video recordings to get the essence of the problems. In the raw data stream, there are huge amounts of "noise," data not relevant to UX analysis. Important events, such as critical incident occurrences, are embedded in and obscured by irrelevant data. The wheat is still within the chaff.
Finding critical incidents within this event stream by viewing the video is laborious and time-consuming, which is one important reason for using direct (real-time) critical incident observation by an evaluator as a primary data collection technique. Do as much filtering as possible at the moment of data capture.
As events unfold, it is the job of the skilled evaluator to capture as much information about each critical incident as possible, as they happen in real time.
In early stages or when you do not have any software tool support, you can just take notes on the critical incidents that you spot.
In Chapter 16 we discuss UX problem instance records. It is helpful to use that as a template to support completeness in collecting critical incident data.
15.5.2 Critical Incident Data Collection Mechanisms
Video recording for critical incident data collection. In some labs, video recording is used routinely to capture all user and screen actions and facilitator and participant comments. Once you get into video recording, it is probably best to use a software support tool to control the video equipment for recording, review, and later analysis and for tagging the video with evaluator comments.
If you use video recording, the minimal video to capture is the real-time sequencing of screen action, showing both user inputs and system reactions and displays. Software support tools such as Morae™ or OVO™ can capture a full-resolution video stream of screen action automatically. This is adequate for a large portion of the UX evaluation sessions we do. However, if participant physical actions, gestures, and/or facial expressions are important, as they might well be to evaluate emotional impact, digital video streams from cameras can be added, and most capture software will synchronize all the video inputs automatically.
For some purposes, you can use one video camera aimed at the participant's hands to see details of physical actions and gestures and another at the participant's face to see expressions and, if useful, you can even have a third camera to capture a wide-angle overview of evaluator, participant, the computer, or other device being evaluated-the full context of the user experience.
Critical incident markers: Filtering raw data. Each critical incident marker created by the evaluator points to a particular place in the raw video stream, tagging the wheat as it still resides within the chaff. This real-time tagging enables evaluators to ignore the chaff in any further analysis, boosting the efficiency of the data analysis process enormously.
Tagging critical incidents is somewhat analogous to a similar kind of data tagging in crime scene data collection and analysis: little flags or tags are arranged in proximity to items that are identified as evidence so that the crime scene analysts can focus on the important items easily. Each "start" and "stop" tag (Figure 15-4) denotes a video clip, which constitutes filtered raw data representing a critical incident.
Critical incident comments: Interpretive data. In addition to marking critical incidents, evaluators sometimes want to make comments explaining critical incidents. The comments are linked to the corresponding video clips so that analysts subsequently can view the related clips as they read the comments.
The comments are an interpretive "coating" on filtered raw data.
Figure 15-4 illustrates the video stream tagged with critical incident markers and associated evaluator comments.
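As a sketch of the underlying idea (not the actual data format of Morae, OVO, or any specific tool), each marker can be a small record holding start and stop offsets into the session video plus the evaluator's interpretive comment, so analysts can jump straight to the tagged clips:

```python
from dataclasses import dataclass

@dataclass
class CriticalIncidentMarker:
    """A tag into the raw session video: a filtered clip plus an interpretive comment."""
    session_id: str
    start_seconds: float  # where the clip starts in the session recording
    stop_seconds: float   # where the clip ends
    comment: str          # evaluator's interpretation, linked to this clip

markers = [
    CriticalIncidentMarker(
        session_id="P03",
        start_seconds=512.0,
        stop_seconds=540.5,
        comment="Could not find the completion button; it was below the fold.",
    ),
]

def clip_bounds(marker):
    """Return the (start, stop) pair an analyst uses to review just this clip."""
    return marker.start_seconds, marker.stop_seconds
```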
Working without a net: Not recording raw data. It is more efficient if you do not have to use video recording.
Although video recording of interaction even with paper prototypes can be appropriate and useful, evaluators certainly do not always record video of the interaction events, especially for evaluations of early prototypes where the fast pace of iterative design changes calls for lightweight data collection and analysis.
Operating without the safety net of a video recording results in immediate loss of raw data.
Evaluators depend solely on the comments, and the analysis process begins with just the interpretive accounts of data, but this is often fully adequate and appropriate for early versions of a design or when rapid methods are required for all versions.
Manual note taking for critical incident data collection. Manual note taking is the most basic mechanism and is still a useful and efficient approach in many UX labs. Evaluators take comprehensive, real-time raw critical incident notes with a laptop or with pencil and paper. When thoughts come faster than they can write, they might make audio notes on a handheld digital voice recorder-anything to capture raw data while it is still fresh.
Even if you are also making audio or video recordings, you should take notes as though you are not. It is a mistake to rely on your video recordings as a primary method of raw data capture. It is just not practical to go back and get your raw critical incident notes by reviewing the entire video. The video, however, can be a good backup, something you can look at for missing data, to resolve a question, or to clear up a misunderstanding.
Figure 15-4
Overview of raw data filtering by tagging critical incidents and adding interpretive comments.
15.5.3 Think-Aloud Data Collection
Although there are some points to watch for, in its simplest form this technique could not be easier. It simply entails having participants think out loud and share their thoughts verbally while they perform tasks or otherwise interact with a product or system you want to evaluate. Within the UX evaluation method you are using:
• At the beginning, explain the concept of thinking aloud and explain that you want to use this technique with participants.
• Explain that this means you will expect them to talk while they work and think, sharing their thoughts by verbalizing them to you.
• You might start with a little "exercise" to get warmed up and to get participants acclimated to thinking aloud.
• Depending on your main UX evaluation method, you may capture think-aloud data by audio recording, video recording, and/or written or typed notes.
• Among the thoughts you should encourage participants to express are descriptions of their intentions, what they are doing or are trying to do, and their motivations, the reasons why they are doing any particular actions.
• You especially want them to speak out when they get confused, frustrated, or blocked.
Depending on the individual, thinking aloud usually comes quite naturally; it does not take much practice. Occasionally you might have to encourage or remind the participant to keep up the flow of thinking aloud.
15.6 GENERATING AND COLLECTING EMOTIONAL IMPACT DATA
Collecting emotional impact data depends on observing and measuring indicators of emotional response through verbal communication, facial expressions, body language, behaviors, and physiological changes.
15.6.1 Applying Self-Reporting Verbal Techniques for Collecting Emotional Impact Data
Applying the think-aloud technique to evaluate emotional impact
We have already talked about using the think-aloud technique for capturing the participant's view of interaction, critical incidents, and UX problems. The think-aloud technique is also excellent for obtaining a window into the mind of the user with respect to emotional feelings as they occur.
Depending on the nature of the interaction, emotional impact indicators may be infrequent in the flow of task performance user actions, and you may see them mainly as a by-product of your hunt for usability problem indicators. So, when you do encounter an emotional impact indicator during observation in task performance, you certainly should make a note of it. You can also make emotional impact factors the primary focus during the think-aloud technique.
• When you explain the concept of thinking aloud, be sure participants understand that you want to use this technique to explore emotional feelings resulting from interaction and usage.
• Explain that this means you will expect them to share their emotions and feelings while they work and think by talking about them to you.
• As you did when you used the think-aloud technique to capture qualitative UX data, you may wish to begin with a little "exercise" to be sure participants are on the same page about the technique.
• As before, you can capture think-aloud data by audio recording, video recording, and/or written or typed notes.
• Also as before, you may have to remind participants occasionally to keep the thinking aloud flowing.
During the flow of interaction:
• You can direct participants to focus their thinking aloud on comments about joy of use, aesthetics, fun, and so on.
• You should observe and note the more obvious manifestations of emotional impact, such as expressions of delight ("I love this," "this is really cool," "wow") as well as annoyance or irritation.
• You also need to watch out for the more subtle expressions that can provide insights into the user experience, such as a slight intake of breath.
• As a practitioner, you must also be sensitive to detecting when emotional impact goes flat, when there is no real joy of use. Ask participants about it, its causes, and how it could be improved.
Finally, a caution about cultural dependency. Most emotions themselves are pretty much the same across cultures, and non-verbal expressions of emotion, such as facial expressions and gestures, are fairly universal. But cultural and social factors can govern an individual's willingness to communicate about emotions. Different cultures may also have different vocabularies and different perspectives on the meaning of emotions and the appropriateness of sharing and revealing them to others.
Applying questionnaires to evaluate emotional impact
Based on your project context and the type of application, use the discussion of questionnaires in Chapter 12 to select and apply a questionnaire to evaluate emotional impact.
15.6.2 Applying Direct Non-Verbal Techniques for Collecting Emotional Impact Data
Using non-verbal techniques for collecting emotional impact usually means deploying probes and instrumentation; see Chapter 12 for a discussion of physiological measurements.
15.7 GENERATING AND COLLECTING PHENOMENOLOGICAL EVALUATION DATA
Get ready for a study of emotional impact situated in the real activities of users over time, if possible from the earliest thinking about the product to adoption into their lifestyles. You will need to choose your data collection techniques to compensate for not being able to be with your participants all the time, which means including self-reporting.
We encourage you to improvise a self-reporting technique yourself, but you should definitely consider a diary-based technique, in which each participant maintains a "diary," documenting problems, experiences, and
phenomenological occurrences within long-term usage. Diaries can be kept via paper and pencil notes, online reports, cell-phone messages, or voice recorders.
For a diary-based technique to be effective, participants must be primed in advance:
• Give your users a list of the kinds of things to report.
• Give them some practice exercises in identifying relevant situations and reporting on them.
• Get them to internalize the need to post a report whenever they confront a usage problem, use a new feature, or encounter anything interesting or fun within usage.
To encourage participants to use voice-mail for reporting, consider paying them a per-call monetary compensation (in addition to whatever payment you give them for participating in the study). In their study, Palen and Salzman found that a per-call payment encouraged participants to make calls. There is a possibility that this incentive might bias participants into making some unnecessary calls, but that did not seem to happen in their study.
To perhaps get more representative data, you might choose to trigger reporting so that you control the timing (Chapter 12, under data collection techniques for phenomenological aspects), rather than letting your participant report only when it is convenient or when things are going well with the product. For example, you can give your participant a dedicated pager (Buchenau & Suri, 2000) to signal randomly timed "events," at which times the participant is asked to report on their usage and context.
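A minimal sketch of how such randomly timed signals could be scheduled in software; the waking-hours window and the number of prompts per day are assumptions for illustration, not values from the cited study:

```python
import random
from datetime import date, datetime, timedelta

def random_prompt_times(day, n_prompts=4, start_hour=9, end_hour=21):
    """Pick n random moments within an assumed waking-hours window for signaling the participant."""
    window_start = datetime.combine(day, datetime.min.time()) + timedelta(hours=start_hour)
    window_seconds = (end_hour - start_hour) * 3600
    offsets = sorted(random.uniform(0, window_seconds) for _ in range(n_prompts))
    return [window_start + timedelta(seconds=offset) for offset in offsets]

for t in random_prompt_times(date.today()):
    print("Signal the participant to report on usage and context at", t.strftime("%H:%M"))
```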
Another way you could choose to sample phenomenological usage is by periodic questionnaires over time. You can use a series of such questionnaires to elicit understanding of changes in usage over those time periods.
You can also choose to do direct observation and interviews in simulated real usage situations (Chapter 12, under data collection technique for phenomenological aspects). You will need to create conditions to encourage episodes of phenomenological activity to occur during these observational periods.
As an example of using this technique, Petersen, Madsen, and Kjaer (2002) conducted a longitudinal study of the use of a TV and video recorder by
two families in their own homes. During the time of usage, periodic interviews were scheduled in the analysts' office, except in cases where users had difficulty in getting there and, then, the interviews were conducted in the users' homes.
During the interviews, the evaluators posed numerous usage scenarios and had the participants do their best to enact the usage while giving their feedback, especially about emotional impact. All interviews were videotaped. The idea
is to set up conditions so that you can capture the essence of real usage and reflect real usage in a tractable time frame.
Here are some tips for success:
• Establish the interview schedule to take into account learning through usage by implementing a sequence of sessions longitudinally over time.
• As in contextual inquiry, it is necessary to observe user activities rather than just to ask about them. As we know, the way people talk about what they do is often not the same as what they actually do.
• Be cautious and discreet with videotaping in the more private settings, such as the participant's home, usually found in this kind of usage context.
As you collect data, you will be looking for indicators of all the different ways your users involve the product in their lives, the high points of joy in use, how the basic mode of usage changes or evolves over time, and especially how usage adapts and emerges as new and unusual kinds of usage. As said in Chapter 12, you want to be able to tell stories of usage and good emotional impact over time.
15.8 WRAPPING UP AN EVALUATION SESSION
Returning to the more traditional lab-based or similar evaluation methods, we now need to do several things to wrap up an evaluation session.
15.8.1 Post-Session Probing Via Interviews and Questionnaires
Immediately after the session, while your participant is still present, is the best opportunity to ask probing questions to clear up any confusion you have about critical incidents or UX problems. Conduct post-session interviews and administer questionnaires to capture user thoughts and feelings while they are fresh.
Clarify ambiguities about the nature of any problems. Be sure you understand the real problems and their causes. Interact with your participant as a doctor does in diagnosing a patient. If you wait until the participant is gone, you lose the opportunity to ask further questions to disambiguate the diagnoses and causes.
Facilitators often start with some kind of standard structured interview, asking a series of preplanned questions aimed at probing the participant's thoughts about the product and the user experience. A typical post-session interview might include, for example, the following general questions: "What did you like best about the interface?" "What did you like least?" "How would you change so-and-so?" An interesting question to ask is "What are the three most important pieces of information that you must know to make the best use of this interface?"
For example, in one design, some of the results of a database query were presented graphically to the user as a data plot, the data points of which were displayed as small circles. Because most users did not at first realize that they could get more information about a particular data point if they clicked on the corresponding circle, one very important piece of information users needed to know about the design was that they should treat a circle as an icon and that they could manipulate it accordingly.
It can be even more effective to follow up with unstructured opportunistic questioning. Find out why certain things happened. Be sure to ask about any
critical incidents that you are not sure about or potential UX problems for which you do not yet completely understand the causes.
15.8.2 Reset for the Next Participant
After running an evaluation session with one participant, you should clean up everything to be ready for the next participant. This means removing any effects from the previous session that might affect the participant or task
performance in the next session. Often you will have to do this kind of cleanup even between tasks for the same participant.
If you are using a paper prototype, you still must reset internal states and data back to initial conditions needed by the first task using the prototype. For example, if previous participants made changes to the prototype, such as filling in information on a paper or plastic form, provide a fresh clean form. If the user made changes to a "database," recorded anywhere that will be visible to the next participant, these have to be reset for a fresh participant.
For Web-based evaluation, clear out the browser history and browser cache, delete temporary files, remove any saved passwords, and so on. For a software prototype, save and back up any data you want to keep. Then reset the prototype state and remove any artifacts introduced in the previous session. Delete any temporary files or other saved settings. Reset any user-created content on the prototype, such as saved appointments, contacts, or files. Reset any other tools, such as Web-based surveys or questionnaires, to make sure one participant's answers are not visible to the next one.
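For the software side of this reset, the checklist can be encoded as a small script so that no step is forgotten between participants. The directory and file names below are hypothetical placeholders for whatever your own prototype uses:

```python
import shutil
from pathlib import Path

# Hypothetical locations used by a software prototype under evaluation
TEMP_DIR = Path("prototype/tmp")                       # temporary files and saved settings
USER_CONTENT_DIR = Path("prototype/user_content")      # saved appointments, contacts, files
STATE_FILE = Path("prototype/state.json")              # internal prototype state
PRISTINE_STATE = Path("prototype/state.initial.json")  # initial conditions for the first task
BACKUP_DIR = Path("session_backups")

def reset_for_next_participant(session_id):
    """Back up anything worth keeping, then restore the prototype to its initial conditions."""
    backup = BACKUP_DIR / session_id
    backup.mkdir(parents=True, exist_ok=True)
    if USER_CONTENT_DIR.exists():
        shutil.copytree(USER_CONTENT_DIR, backup / "user_content", dirs_exist_ok=True)
        shutil.rmtree(USER_CONTENT_DIR)   # remove artifacts introduced in the previous session
    USER_CONTENT_DIR.mkdir(parents=True, exist_ok=True)
    if TEMP_DIR.exists():
        shutil.rmtree(TEMP_DIR)           # delete temporary files and other saved settings
    TEMP_DIR.mkdir(parents=True, exist_ok=True)
    if PRISTINE_STATE.exists():
        shutil.copyfile(PRISTINE_STATE, STATE_FILE)  # reset internal state and data

reset_for_next_participant("P04")
```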
Finally, give the participant(s) their pay, gifts, and/or premiums, thank them, and send them on their way.
15.9 THE HUMAINE PROJECT
The European community project HUMAINE (Human-Machine Interaction Network on Emotions) issued a technical report detailing a taxonomy of affective measurement techniques (Westerman, Gardner, & Sutherland, 2006). They point out that there is a history of physiological and psychophysiological measurement in human factors practice since the late 1970s to detect, for example, stress due to operator overload, and an even longer history of this kind of measurement in psychological research.
In the HUMAINE report, the authors discuss the role of medicine in physiological measurement, including electroencephalograms and event-related potentials, measured with electroencephalography, a technique that detects and measures electrical activity of the brain through the skull and scalp. Event-related potentials can be roughly correlated to cognitive functions involving memory and attention and changes in mental state.
As the authors say, these physiological measurements have the advantage over self-reporting methods in that they can monitor continuously, require no conscious user actions, and do not interrupt task performance or usage activity. To be meaningful, however, such physiological measurements have to be associated with time stamps on a video of user activity.
A major disadvantage, ruling the approach out for most routine UX evaluation, is the requirement for attached sensors. New, less intrusive instrumentation is being developed. For example, Kapoor, Picard, and Ivanov (2004) report being able to detect changes in user posture, for example, due to fidgeting, with pressure sensors attached to a chair.
Rigorous Empirical Evaluation: Analysis
If it ain't broke, it probably doesn't have enough features.
- Anonymous
16.1 INTRODUCTION
16.1.1 You Are Here
We begin each process chapter with a "you are here" picture of the chapter topic in the context of the overall Wheel lifecycle template; see Figure 16-1. This chapter is about analyzing data collected during evaluation.
The focus of research and practice has slowly been shifting away from methods for usability data collection and comparisons of those data collection methods to issues about how best to use data generated or collected by these methods (Howarth, Andre, & Hartson, 2007). So, now that we have data, what's next?
Figure 16-1
You are here; at data analysis, within the evaluation activity in the context of the overall Wheel lifecycle template.
16.2 INFORMAL SUMMATIVE (QUANTITATIVE) DATA ANALYSIS
As we have said, the quantitative data analysis for informal summative evaluation associated with formative evaluation does not include inferential statistical analyses, such as analyses of variance (ANOVAs), t tests, or F tests. Rather, they use simple "descriptive" statistics (such as mean and standard deviation) to make an engineering determination as to whether the interaction design has met the UX target levels. If the design has not yet met those targets, qualitative analysis will indicate how to modify the design to improve the UX ratings and help converge toward those goals in subsequent cycles of formative evaluation.
Iteration can seem to some like a process going around in circles, which can be scary to managers. As we will see later, your informal summative analysis, coupled with your UX targets and metrics, is a control mechanism to help managers and other project roles know whether the iterative process is converging toward a usable interaction design and when to stop iterating.
16.2.1 Lining up Your Quantitative Ducks
The first step in analyzing quantitative data is to compute simple descriptive statistics (e.g., averages) for timing, error counts, questionnaire ratings, and so on, as stated in the UX targets. Be careful about computing only mean values, though, because the mean is not resistant to outliers and, therefore, can be a misleading indicator. Because we are not doing formal quantitative analysis, the small number of participants typical in formative evaluation can lead to a mean value that meets a reasonable UX target and you can still have serious UX problems.
It may help to include standard deviation values, for example, to indicate something about the rough level of confidence you should have in data. For example, if three participants are all very close in performance times for a particular task, the numbers should give you pretty good confidence, and the average is more meaningful. If there is a big spread, the average is not very meaningful and you should find out why there is such a variance (e.g., one user spent a huge amount of time in an error situation). Sometimes it can mean that you should try to run a few more participants.
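A minimal sketch of this first step, using made-up task times: compute the mean and standard deviation for each measure, and let a large spread flag results that deserve a closer look before you trust the average:

```python
from statistics import mean, stdev

# Hypothetical time-on-task observations (in minutes) for benchmark task BT1, three participants
bt1_times = [2.5, 3.0, 5.0]

avg = mean(bt1_times)
spread = stdev(bt1_times)  # sample standard deviation; needs at least two observations
print(f"BT1 average time on task: {avg:.2f} min (standard deviation {spread:.2f})")

# A big spread relative to the mean is a cue to ask why (e.g., one participant stuck
# in an error situation) rather than to trust the average; the threshold here is arbitrary.
if spread > 0.25 * avg:
    print("Large variance: look at individual sessions before trusting this mean.")
```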
After you compute summary statistics of quantitative data, you add them to the "Observed Results" column at the end of the UX target table. As an example, partial results from a hypothetical evaluation of the Ticket Kiosk System are shown in Table 16-1 using some of the UX targets established in Table 10-10.
Table 16-1
Example of partial informal quantitative testing results for the Ticket Kiosk System

Work Role: User Class | UX Goal | UX Measure | Measuring Instrument | UX Metric | Baseline Level | Target Level | Observed Results | Meet Target?
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use | Initial user performance | BT1: Buy special event ticket | Average time on task | 3 min as measured at the kiosk | 2.5 min | 3.5 min | No
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use | Initial user performance | BT2: Buy movie ticket for new user | Average number of errors | <1 | <1 | 2 | No
Ticket buyer: Casual new user, for occasional personal use | Initial customer satisfaction | First impression | Questions Q1-Q10 in questionnaire XYZ | Average rating across users and across questions | 7.5/10 | 8/10 | 7.5 | No
Next, by directly comparing the observed results with the specified UX goals, you can tell immediately which UX targets have been met, and which have not, during this cycle of formative evaluation. It is useful to add, as we have done in Table 16-1, yet one more column to the right-hand end of the UX target table, for "Did you meet UX target?" Entries can be Yes, No, or Almost.
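A small sketch of how that last column could be filled in mechanically; the "Almost" band (within 10% of the target level) is our own illustrative assumption, not a rule from the UX target table:

```python
def meets_target(observed, target, lower_is_better=True, almost_margin=0.10):
    """Return 'Yes', 'Almost', or 'No' for one numeric UX target."""
    if lower_is_better:
        if observed <= target:
            return "Yes"
        return "Almost" if observed <= target * (1 + almost_margin) else "No"
    if observed >= target:
        return "Yes"
    return "Almost" if observed >= target * (1 - almost_margin) else "No"

print(meets_target(3.5, 2.5))   # time on task from Table 16-1: 'No'
print(meets_target(2.6, 2.5))   # a hypothetical near miss: 'Almost'
```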
In looking at the example observed results in Table 16-1, you can see that our example users did not meet any of the UX target levels. This is not unusual for an early evaluation of a still-evolving design. Again, we stress that this was only informal summative analysis-it cannot be used anywhere for claims and cannot be used for any kind of reporting outside the UX group or, at most, the project team. It is only for managing the iterative UX engineering process internally. If you want results from which you can make claims or that you can make public, you need to do (and pay for) a formal summative evaluation.
Not All Errors Are Created Equal
Andrew Sears, Professor and Dean, B. Thomas Golisano
College of Computing and Information Sciences, Rochester Institute of Technology
As we design and evaluate user interfaces, we must decide what metrics will be used to assess the efficacy of our design. Perhaps the most common choices are the speed-error-satisfaction triad that we see used not only in usability studies but also in more formal evaluations such as those that appear in scholarly journals and conference proceedings. After all, a fast system that leads to many errors is not particularly useful, and even a fast and error-free system is not useful if nobody will use it (unless you are dealing with one of those less common situations where users do not have a choice).
If we accept that each of these aspects of usability should be considered in some form and define how each will be assessed, we can begin to evaluate how well a system performs along each dimension, we can introduce changes and evaluate the impact of these changes, and we can fine-tune our designs. There are many techniques that help usability engineers identify problems, including approaches that result in problems being classified in one way or another in an effort to help prioritize the process of addressing the resulting collection of problems. Prioritization is important because there may not be time to fix every problem given the time pressures often experienced as new products are developed. Prioritization is even more important when one considers the variable nature of the "problems" that can be identified when using different evaluation techniques.
While the severity of a problem may be considered when deciding which problems to fix, the focus is typically on eliminating errors. However, through our research with several error-prone technologies, it has become clear that focusing exclusively on eliminating errors may lead to less than optimal outcomes. These technologies, including
speech and gesture recognition, can produce unpredictable errors that can result in dramatically different consequences for the user of the system. It was in this context that we began to rethink the need to eliminate errors. This shift in focus was motivated, in part, by the fact that we could not necessarily eliminate all errors but we noticed that there were opportunities to change what happened when errors did occur. It was also motivated by the observation that people are really quite good at processing input where some details are missing or inaccurate.
Perhaps the simplest example would be when you are participating in a conversation, but the person you are talking to mumbles or background noise masks a few words. Often, it is possible to fill in the gaps and reconstruct what was missed originally.
Two specific examples may be useful in seeing how these ideas can be applied when designing or redesigning information technologies. The first example considers what happens when an individual is interacting with a speech-based system. At times, the speech recognition engine will misinterpret what was said. When using speech to navigate within a text document, such errors can result in a variety of consequences, including ignoring what was said, moving the cursor to the wrong location, or even inserting extra text. Recovering from each of these consequences involves a different set of actions, which require different amounts of work. By recognizing that an error has occurred, and using this information to change the consequences that an individual must overcome, we can improve the usability of a system even without eliminating the error. At times, a slightly higher error rate may be desirable if this allows the severity of the consequences to be reduced sufficiently. This example is explored in more depth by Feng and Sears (2009). This article discusses the issue of designing for error-prone technologies and the importance of considering not only the number of errors users encounter but the severity of the consequences associated with those errors.
A second example looks at the issue of taking notes using mobile technologies. The process of entering text on mobile devices is notorious for being slow and error prone. If someone tries to record a brief note while correcting all errors, the process tends to be sufficiently slow to discourage many individuals. At the same time, people tend to be quite good at dealing with many different types of errors. Because these brief notes tend to be used to remind the user of important details, having an error-free note may not be that important as long as the erroneous note is sufficient to remind the user of the information, event, or activity that inspired them originally to record the note. Our studies found that a note-taking mechanism that did not allow users to review and correct their notes could allow users to recall important details just as effectively as error-free notes while significantly reducing the time they spent recording the note. Dai et al. (2009) explore this example in more detail, showing how users can overcome errors.
Errors are inevitable, but not all errors result in the same consequences for the user. Some errors introduce significant burdens, creating problems that the user must then fix before they can continue with their original task. Other errors may be irritating, requiring users to repeat their actions, but do not introduce new problems. Still other errors may be annoying but may not prevent the user from accomplishing his or her task. Understanding how an error affects the user and when there are opportunities to reduce the consequences of errors (sometimes this can involve increasing the number of errors but still results in an overall improvement in the usability of the system) can allow for more effective systems to be designed even when errors cannot be prevented.
16.2.2 The Big Decision: Can We Stop Iterating?
Now it is time for a major project management decision: Should you continue to iterate? This decision should be a team affair and made at a global level, not just considering quantitative data. Here are some questions to consider:
• Did you simultaneously meet all your target-level goals?
• What is your general team feeling about the conceptual design, the overall interaction design, the metaphor, and the user experiences they have observed?
If you can answer these questions positively, you can accept the design as is and stop iterating. Resource limitations also can force you to stop iterating and get on with pushing this version out in the hope of fixing known flaws in the next version. If and when you do decide to stop iterating, do not throw your qualitative data away, though; you paid to get it, so keep this round of problem data for next time.
If your UX targets were not met-the most likely situation after the first cycle(s) of testing-and resources permit (e.g., you are not out of time or money), you need to iterate. This means analyzing the UX problems and finding a way to solve them in order of their cost and effect on the user experience.
Convergence toward a quality user experience
Following our recurring theme of using your own thinking and experience in addition to following a process, we point out that this is a good place to use your intuition. As you iterate, you should keep an eye on the quantitative results over multiple iterations: Is your design at least moving in the right direction?
It is always possible for UX levels to get worse with any round of design changes. If you are not converging toward improvement, why not? Are UX problem fixes uncovering problems that existed but could not be seen before or are UX problem fixes causing new problems?
16.3 ANALYSIS OF SUBJECTIVE QUESTIONNAIRE DATA
Depending on which questionnaire you used, apply the appropriate calculations for the final scores (Chapter 12).
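For example, if the questionnaire you chose happened to be the System Usability Scale (SUS) (an assumption for illustration; your project may use a different instrument from Chapter 12), the final 0-100 score is computed by rescaling each of the ten 1-5 responses and summing:

```python
def sus_score(responses):
    """Compute a 0-100 SUS score from ten responses on a 1-5 scale (item 1 first)."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = 0
    for item, response in enumerate(responses, start=1):
        # Odd-numbered (positively worded) items score response - 1;
        # even-numbered (negatively worded) items score 5 - response.
        total += (response - 1) if item % 2 == 1 else (5 - response)
    return total * 2.5

# One participant's hypothetical responses
print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # 85.0
```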
16.4 FORMATIVE (QUALITATIVE) DATA ANALYSIS
Our friend Whitney Quesenbery gave us this nutshell digest of her approach to usability problem analysis, which she in turn adapted from someone else:
The team usually includes all the stakeholders, not just UX folks, and we rarely have much time. First, we agree on what we saw. No interpretation, just observation. This gets us all on the same page. Then we brainstorm until we agree on "what it means." Then we brainstorm design solutions.
16.4.1 Introduction
Formative analysis of qualitative data is the bread and butter of UX evaluation. The goal of formative data analysis is to identify UX problems and causes (design flaws) so that they can be fixed, thereby improving product user experience. The process of determining how to convert collected data into scheduled design and implementation solutions is essentially one of negotiation in which, at various times, all members of the project team are involved. In the first part of qualitative analysis you should have all your qualitative data represented as a set of UX problem instances so that you can proceed with diagnosis and problem solutions.
Did not find many UX problems? Better look again at your data
collection process. We seldom, if ever, see an interaction design for which UX testing does not reveal lots of UX problems. Absence of evidence is not evidence of absence.
Figure 16-2 illustrates the steps of qualitative data analysis:
consolidating large sets of raw critical incident comments into UX problem instances, merging UX problem instances into UX problem records, and grouping of UX problem records so that we can fix related problems together.
For practical purposes we have to separate our material into chapters. In practice, early analysis, especially for qualitative data, overlaps with the data collection process. Because evaluator comments are interpretive, we have already begun to overlap analysis of qualitative data with their capture. The earlier you think about UX problems and their causes, the better chance you have at getting all the information you will need for problem diagnosis. In this chapter, we move from this overlap with data collection into the whole story of qualitative data analysis.
Figure 16-2
Consolidating, merging, and grouping of UX problem data.
16.4.2 Get an Early Jump on Problem Analysis
Keep a participant around to help with early analysis
In a typical way of doing things, data collection is "completed," the participant is dismissed, and the team does high fives and cracks open the bubbly before turning its attention to data analysis. But this Dilbertian HFTAWR (high-frivolity-to-actual-work ratio) approach puts the problem analyst at a disadvantage when
the need inevitably arises to ask the participant questions and resolve ambiguities. The analyst can sometimes ask the facilitator or others who collected data, but often at significant communication effort.
Neither the facilitator nor the analyst now has access to the participant.
Too often the problem analyst can only try to interpret and reconstruct missing UX data. The resulting completeness and accuracy become highly dependent on the knowledge and experience of the problem analyst.
We suggest bringing in the problem analyst as early as possible, especially if the analyst is not on the data collection team. And, to the extent it is practical, start analyzing qualitative data while a participant is still present to fill in missing data, clarify ambiguous issues, and answer questions.
Early UX problem data records
If data collectors used software support tools, the critical incident notes may already be in rudimentary problem records, possibly with links to tagged video sequences. With even minimal support from some kind of database tool, evaluators can get a leg up on the process yet to come by entering their critical incident descriptions directly into data records rather than, say, just a word processor or spreadsheet. The earlier you can get your raw critical incident notes packaged as data records, the more expedient the transition to subsequent data analysis.
Clean up your raw data before your memory fades
However you get data, you still have mostly raw qualitative data at this point. Many of the critical incident notes are likely to be terse observational comments that will be difficult to integrate in subsequent analysis, particularly if the person performing UX problem analysis is not the same person who observed the incidents and recorded the comments.
Therefore, it is essential for data collectors to clean up the raw data as soon after data collection as time and evaluator skills permit to capture as complete a record of each critical incident as possible while perishable detailed data are still fresh. In this transition from data collection to data analysis, experienced data collectors will anticipate the need for certain kinds of content later in problem analysis.
Clarify and amplify your emotional impact data
UX problems involving emotional impact are, by nature, usually broader in scope and less about details than usability problems. Therefore, for UX problems about emotional impact, it is important to get at the underlying
essence of the observations while the explanatory context is still fresh. Otherwise, in our experience, you may end up with a vague problem description of some symptoms too nebulous to use.
16.4.3 Sources of Raw Qualitative Data
We are talking primarily about data from lab-based UX testing here, but critical incident data can come from other sources such as expert UX inspections. It is our job to sort through these, often unstructured, data and extract the essential critical incident and UX problem information. Regardless of the source of the raw data, much of the data analysis we do in this chapter is essentially
the same.
Some sources are less detailed and some are more synoptic. For example, evaluator problem notes from a session without video recording tend to be more summarized, and summary problem descriptions can come from a UX inspection, in which there are no real critical incidents because there are no real users.
For these less detailed data, inputs to data analysis are often in the form of narratives about perceived UX-related situations and you might have to work a bit harder to extract the essence. These reports often roll up more than one problem in one description and you need to unpack individual UX issues into UX problem instances, as discussed next.
16.4.4 Isolate Individual Critical Incident Descriptions
On occasion, participants can experience more than one distinct UX problem at the same time, and a single critical incident comment can refer to all of these problems. The first step in refining raw data into UX problem reports is to scan the raw critical incident notes, looking for notes that describe more than one UX problem, and separate them into multiple critical incident notes, each about a single UX problem.
Here is an example from one of our UX evaluation sessions for a companion Website for the Ticket Kiosk. The participant was in the middle of a benchmark task that required her to order three tickets to a Three Tenors concert. As she proceeded through the task, at one point she could not locate the button (which was below the "fold") to complete the transaction.
When she finally scrolled down and saw the button, the button label said "Submit." At this point she remarked "I am not sure if clicking on this button will let me review my order or just send it in immediately." This is an example of a critical incident that indicates more than one UX problem: the button is located where it is not immediately visible and the label is not clear enough to help the user make a confident decision.
16.4.5 Consolidating Raw Critical Incident Notes into UX Problem Instances
The UX problem instance concept
Howarth et al. (Howarth, Andre, & Hartson, 2007; Howarth, Smith-Jackson, & Hartson, 2009) introduced the concept of UX problem instances to serve as
a bridge between raw critical incident descriptions and UX problem records. A UX problem instance is a single occurrence of a particular problem experienced by a single participant.
The same UX problem may be encountered and recorded in multiple instances-occurring in the same or different sessions, observed by the same or different evaluators, experienced by the same or different participants, within the context of the same or of a different task. These are not each separate problems, but several instances of the same problem.
Critical incidents vs. UX problem instances
We have been using the term "critical incident" for some time and now we are introducing the idea of a UX problem instance. These two concepts are very similar and, if used loosely, can be thought of as referring to more or less the same thing. The difference rests on a bit of a nuance: A critical incident is an observable event (that happens over time) made up of user actions and system reactions, possibly accompanied by evaluator notes or comments, that indicates a UX problem instance.
Critical incident data are raw data: just a record of what happened, not yet interpreted in terms of a problem or cause and, therefore, not in a form easily used in the analysis that follows. The UX problem instance is a more "processed" or more abstract (in the sense of capturing the essence) notion that we do use in the analysis.
Gathering up parts of data for a critical incident
Raw data for a single critical incident can appear in parts in the video/data stream interspersed with unrelated material. These data representing parts of a critical incident may not necessarily be contiguous in a real-time stream because the participant, for example, may be multitasking or interrupting the train of thought.
To build a corresponding UX problem instance, we need to consolidate all data (e.g., raw notes, video and audio sequences) about each critical incident. The second column from the left in Figure 16-2 shows sets of clips related to the same critical incident being extracted from the video stream. This step pulls out all the
pieces of one single critical incident that then indicates (makes up) the UX problem instance.
This extraction of related parts of a single critical incident description will be fairly straightforward. Raw data related to a given critical incident instance, if not contiguous in the stream of events, will usually be in close proximity.
Figure 16-3
Example "transcript" of raw data stream showing multiple critical incident notes pertaining to a single UX problem instance.
Example: Consolidating Critical Incident Parts of a Single UX Problem Instance
These abstract ideas are best conveyed by a simple example, one we borrow from Howarth, Andre, and Hartson (2007), based on real raw data taken from a UX evaluation of a photo management application. Using this application, users can organize photographs, whether already on a PC or received via email, into albums on their computers.
For this example, our user is trying to upload a picture to put in an album.
The transcript of raw data (video stream plus evaluator comments) goes something like what you see in Figure 16-3.
In consolidating the raw critical incident notes (and associated video and audio clips) relating to this one UX problem instance, the practitioner would include both parts of the transcript (the two parts next to the curly braces, for CI-1, part 1 and CI-1, part 2) but would not include the intervening words and actions of the participant not related to this UX problem instance.
Putting it into a UX problem instance
This is a good time for a reality check on the real value of this critical incident to your iterative process. Using good UX engineering judgment, the practitioner keeps only those that represent "real" UX problem instances.
As you put the critical incident pieces into one UX problem instance, you abstract out details of the data (the he-said/she-said details) and convert the wording of observed interaction event(s) and think-aloud comments into the wording of an interpretation as a problem/cause.
16.4.6 A Photo Album Example
Using the Howarth et al. (2007) example of a photo album management application, consider the task of uploading a photo from email and importing it into the album application.
16.4.7 UX Problem Instances
UX problem instance content
To begin with, in whatever scheme you use for maintaining UX data, each UX problem instance should be linked back to its constituent critical incident data parts, including evaluator comments and video clips, in order to retain full details of the UX problem instance origins, if needed for future reference.
The next thing to do is to give the problem a name so people can refer to this problem in discussions. Next we want to include enough information to make the UX problem instance as useful as possible for data analysis. Much has been said in the literature about what information to include in a UX problem record, but common sense can guide you. You need enough information to accomplish the main goals:
• understand each problem
• glean insight into its causes and possible solutions
• be conscious of relationships among similar problems
You can support these goals with problem record fields containing the following kinds of information.
Problem statement: A summary statement of the problem as an effect or outcome experienced by the user, but not as a suggested solution. You want to keep your options flexible when you do get to considering solutions.
User goals and task information: This information provides problem context to know what the user was trying to do when the problem was encountered. In the photo album example, the user has photos on a computer and had the goal of uploading
another picture from email to put in an album contained in the photo management application.
Immediate intention: One of the most important pieces of information to include is the user's immediate intention (at a very detailed level) exactly when the problem was encountered, for example, the user was trying to see a button label in a small font or the user was trying to understand the wording of a label.
More detailed problem description: Here is where you record details that can harbor clues about the nature of the problem, including a description of user actions and system reactions, the interaction events that occurred.
Designer knowledge (interpretation and explanation of events): Another very important piece of information is an "outside the user" explanation of what might have happened in this problem encounter. This is usually based on what we call "designer knowledge." If the participant was proceeding on an incorrect assumption about the design or a misunderstanding of how the design works, the correct interpretation (the designer knowledge) can shed a lot of light on what the participant should have done and maybe how the design could be changed to lead to that outcome.
Designer knowledge is a kind of meta comment because it is not based on observed actions. It is based on knowledge about the design possessed by the evaluator, but not the participant, of why the system did not work the way the user expected and how it does work in this situation. We set up the evaluator team to ensure that someone with the requisite designer knowledge will be present during the evaluation session to include that information in the UX problem instance content that we now need in this transition to data analysis. Here is an example of designer knowledge, in this case about a critical incident that occurred in evaluation of the photo album application, as shown near the bottom left-hand side of Figure 16-3: "I think the participant doesn't realize she has to create or open an album first before she can upload a picture."
Causes and potential solutions: Although you may not know the problem causes or potential solutions at first, there should be a place in the problem record to record this diagnostic information eventually.
UX problem instance project context
In addition to the problem parameters and interaction event context, it can be useful to maintain links from a problem instance to its project context. Project context is a rather voluminous and largely uninteresting (at least during the session) body of data that gives a setting for UX data within administrative and project-oriented parameters.
While completely out of the way during data collection and not used for most of the usual analysis process, these project context data can be important for keeping track of when and by whom problem data were generated and collected and to which version or iteration of the design data apply. This information is linked and cross-linked so that, if you need to, you can find out which evaluators were conducting the session in which a given critical incident occurred and on what date, and using which participant (participant id, if not the name).
Project context data can include (Howarth, Andre, & Hartson, 2007):
• organization (e.g., company, department)
• project (e.g., product or system, project management, dates, budget, personnel)
• version (e.g., design/product release, version number, iteration number)
• evaluation session (e.g., date, participants, evaluators, associated UX target table)
• task run (e.g., which task, associated UX targets)
16.4.8 Merge Congruent UX Problem Instances into UX Problem Records
We use the term congruent to refer to multiple UX problem instances that represent the same underlying UX problem (not just similar problems or problems in the same category).
Find and merge multiple UX problem instances representing the same problem
In general, the evaluator or analyst cannot be expected to know about or remember previous instances of the same problem, so new critical incident descriptions (and new UX problem instances accordingly) are created each time an instance is encountered. Now you should look through your problem instances and find sets that are congruent.
How do you know if two problem descriptions are about the same underlying problem? Capra (2006, p. 41) suggests one way using a practical solution-based criterion: "Two problems, A and B, were considered the same [congruent] if
fixing problem A also fixes problem B and fixing problem B also fixes problem A." Capra's approach is based on criteria used in the analysis of
UX reports collected in CUE-4 (Molich & Dumas, 2008). The symmetric aspect of this criterion rules out a case where one problem is a subset of the other.
As an example from our Ticket Kiosk System evaluation, one UX problem instance states that the participant was confused about the button labeled "Submit" and did not know that this button should be clicked to move on in the transaction to pay for the tickets. Another (congruent) UX problem instance account of the same problem (encountered by a different participant) said that the participant complained about the wording of the button label "Submit," saying it did not help her understand where she would go if she clicked that button.
Create UX problem records
In the merging of congruent UX problem instances, the analyst creates one single UX problem record for that problem. This merging combines the descriptions of multiple instances to produce a single complete and representative UX problem description.
The resulting problem description will usually be slightly more general by virtue of filtering out irrelevant differences among instances while embracing their common defining problem characteristics. In practice, merging is done by taking the best words and ideas of each instance description to synthesize an amalgam of the essential characteristics.
As an example of merging UX problem instances in the photo album application example, we see UX problem instances UPI-1 and UPI-12 in the middle of Figure 16-2, both about trying to find the upload link to upload pictures before having created an album into which to upload. The problem in UPI-1 is stated as: "The participant does not seem to understand that she must first create an album."
The problem in UPI-12 says "User can't find link to upload pictures because the target album has not yet been created." Two users in different UX evaluation sessions encountered the same problem, reported in slightly different ways. When you merge the two descriptions, factoring out the differences, you get a slightly more general statement of the problem, seasoned with a pinch of designer knowledge, in the resulting UX problem record, UP-1: "Participants don't understand that the system model requires them to create an album before pictures can be uploaded and stored in it."
In your system for maintaining UX data (e.g., problem database), each UX problem record should be linked back to its constituent instances in order to retain full details of where merged data came from. The number of UX problem instances merged to form a given UX problem is useful information about the frequency of occurrence of that problem, which could contribute to the weight of importance to fix in the cost-importance ratings (coming up shortly).
If an instance has a particularly valuable associated video clip (linked to the instance via the video stream tag), the UX problem record should also contain a link to that video clip, as the visual documentation of an example occurrence of the problem. Some UX problems will be represented by just one UX problem instance, in which case it will just be "promoted" into a UX problem record.
From this point on, UX problem instances will be used only for occasional reference, and the UX problem records will be the basis for all further analysis.
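A sketch, in the same illustrative spirit (the record structures and field names are assumptions, not a prescribed format), of merging congruent UX problem instances into one UX problem record while keeping links back to the constituent instances and a frequency count:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class UXProblemInstance:
    instance_id: str                  # e.g., "UPI-1"
    description: str                  # problem/cause wording for one occurrence
    video_tag: Optional[str] = None   # link to a tagged video clip, if any

@dataclass
class UXProblemRecord:
    problem_id: str                   # e.g., "UP-1"
    name: str
    description: str                  # merged, slightly more general statement
    instance_ids: List[str] = field(default_factory=list)  # links back to instances
    video_tags: List[str] = field(default_factory=list)

    @property
    def frequency(self) -> int:
        """Number of merged instances: a rough frequency-of-occurrence measure."""
        return len(self.instance_ids)

def merge_instances(problem_id: str, name: str, merged_description: str,
                    instances: List[UXProblemInstance]) -> UXProblemRecord:
    """Combine congruent instances into one representative problem record."""
    return UXProblemRecord(
        problem_id=problem_id,
        name=name,
        description=merged_description,   # written by the analyst, not auto-generated
        instance_ids=[i.instance_id for i in instances],
        video_tags=[i.video_tag for i in instances if i.video_tag],
    )

# Example based on the photo album problem (UPI-1 and UPI-12):
upi1 = UXProblemInstance("UPI-1", "Participant does not seem to understand that she "
                                  "must first create an album.")
upi12 = UXProblemInstance("UPI-12", "User can't find link to upload pictures because "
                                    "the target album has not yet been created.")
up1 = merge_instances("UP-1", "Album must exist before upload",
                      "Participants don't understand that the system model requires "
                      "them to create an album before pictures can be uploaded and "
                      "stored in it.",
                      [upi1, upi12])
print(up1.frequency)   # -> 2
```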
16.4.9 Group Records of Related UX Problems for Fixing Together
UX problems can be related in many different ways that call for you to consider fixing them at the same time.
• Problems may be in physical or logical proximity (e.g., may involve objects or actions within the same dialogue box).
• Problems may involve objects or actions used in the same task.
• Problems may be in the same category of issues or design features but scattered throughout the user interaction design.
• Problems may have consistency issues that require similar treatments.
• Observed problem instances may be indirect symptoms of common, more deeply rooted UX problems. A telling indicator of such a deeply rooted problem is complexity and difficulty in its analysis.
By some means of association, for example, using an affinity diagram, group together the related records for problems that should be fixed together, as done with UP-1 and UP-7 at the right-hand side of Figure 16-2. The idea is to create a common solution that might be more general than required for a single problem, but which will be the most efficient and consistent for the whole group.
Example: Grouping Related Problems for the Ticket Kiosk System
Consider the following UX problems, adapted with permission, from a student team in one of our classes from an evaluation of the Ticket Kiosk System:
Problem 9: The participant expected a graphic of seat layout and missed seeing the button for that at first; kept missing "View Seats" button.
Problem 13: For "Selected Seats," there is no way to distinguish balcony from floor seats because they both use the same numbering scheme and the shape/layout used was not clear enough to help users disambiguate.
Problem 20: In "View Seats" view, the participant was not able to figure out which of the individual seats were already sold because the color choices were confusing.
Problem 25: The participant did not understand that blue seat sections are clickable to get detailed view of available seats. She commented that there was not enough information about which seats are available.
Problem 26: Color-coding scheme used to distinguish availability of seats was problematic. On detailed seat view, purple was not noticeable as a color due to bad contrast. Also, the text labels were not readable because of contrast.
Suggested individual solutions were:
Problem 9 Solution: Create an icon or graphic to supplement "View Seats" option. Also show this in the previous screen where the user selects the number of seats.
Problem 13 Solution: Distinguish balcony seats and floor seats with a different numbering scheme and use better visual treatment to show these two as different.
Problem 20 Solution: Use different icons and colors to make the distinction between sold and available seats clearer. Also add a legend to indicate what those icons/colors mean.
Problem 25 Solution: Make the blue seat clickable with enhanced physical affordances, and when one is clicked, display detailed seat information, such as location, price, and so on.
Problem 26 Solution: Change the colors; find a better combination that can distinguish the availability clearly. Consider using different fills instead of just colors. Probably should have thicker font for labels (maybe bold would do it).
These problems may be indicative of a much broader design problem: a lack of effective visual design elements in the seat selection part of the workflow. We can group and label all these problems into the problem group:
Group 1: Visual designs for seat selection workflow.
With a group solution of:
Group 1 Solution: Comprehensively revise all visual design elements for seat selection workflow. Update style guide accordingly.
Higher level common issues within groups
When UX problem data include a number of critical incidents or problems that are quite similar, you will group these instances together because they are closely related. Then you usually look for common issues among the problems in the group.
Sometimes, though, the real problem is not explicit in the commonality within the group; the problems are only symptoms of a higher level problem. You might have to deduce that this higher level problem is the real underlying cause of these common critical incidents.
For example, in one application we evaluated, users were having trouble understanding several different quirky and application-specific labels. We first tried changing the label wordings, but eventually we realized that the reason they did not "get" these labels was that they did not understand an important aspect of the conceptual design. Changing the labels without improving their understanding of the model did not solve the problem.
16.4.10 Analyze Each Problem
Terminology
To begin with, there is some simple terminology that we should use consistently. Technically, UX problems are not in the interaction design per se, but are usage difficulties faced by users. That is, users experience UX problems such as the inability to complete a task. Further, UX problems are caused by flaws in the interaction design. Symptoms are observable manifestations of problems (e.g., user agitation, expressed frustration, or anger).
Thus, the things we actually seek to fix are design flaws, the causes of UX problems. Some (but not all) problems can be observed; but causes have to be deduced with diagnosis. Solutions are the treatments (redesign changes) to fix the flaws. Further downstream evaluation is needed to confirm a "cure."
Sometimes we say "the poor design of this dialogue box is a UX problem" but, of course, that is a short-hand way of saying that the poor design can cause users to experience problems. It is okay to have this kind of informal difference in terminology (we resort to it ourselves), as long as we all understand the real meaning behind the words.
Table 16-2 lists the terminology we use and its analog in the medical domain.
Table 16-2
Analogous UX and medical terminology

Concept | Medical domain | UX domain
Problems | Illness or physical problems experienced by the patient | UX problems experienced by the user (e.g., inability to complete a task)
Symptoms | Symptoms (e.g., difficulty in walking, shortness of breath) | Symptoms (e.g., frustration, anger)
Diagnosis (causes of symptoms) | Identify the disease that causes the symptoms (e.g., obesity) | Identify the interaction design flaws that cause the UX problems
Causes of causes | Identify the cause(s) of the disease (e.g., poor lifestyle choices) | Determine the causes of the interaction design flaws (e.g., poor UX process choices)
Treatment | Medicine, dietary counseling, surgery to cure the disease | Redesign fixes/changes to the interaction design
Cure confirmation | Later observation and testing | Later evaluation
16.4.11 UX Problem Data Management
As time goes by and you proceed further into the UX process lifecycle, the full life story of each UX problem grows, entailing slow expansion of data in the UX problem record. Each UX problem record will eventually contain information about the problem: diagnosis by problem type and subtype, interaction design flaws as problem causes, cost/importance data estimating severity, management decisions to fix (or not) the problem, costs, implementation efforts, and downstream effectiveness.
Most authors mention UX problems or problem reports but do not hint at the fact that a complete problem record can be a large and complex information object. Maintaining a complete record of this unit of UX data is surely one place where some kind of tool support, such as a database management system, is warranted. As an example of how your UX problem record structure and content can grow, here are some of the kinds of information that can eventually be attached to it. These are possibilities we have encountered; pick the ones that suit you:
• Problem name
• Problem description
• Task context
• Effects on users (symptoms)
• Links to video clip(s)
• Associated designer knowledge
• Problem diagnosis (problem type and subtype and causes within the design)
• Links to constituent UX problem instances
• Links for relationships to other UX problems (e.g., in groups to be fixed together)
• Links to project context
  • Project name
  • Version/release number
  • Project personnel
• Link to evaluation session
  • Evaluation session date, location, etc.
  • Session type (e.g., lab-based testing, UX inspection, remote evaluation)
  • Links to evaluators
  • Links to participants
• Cost-importance attributes for this iteration (next section)
  • Candidate solutions
  • Estimated cost to fix
  • Importance to fix
  • Priority ratio
  • Priority ranking
  • Resolution
• Treatment history
  • Solution used
  • Dates, personnel involved in redesign, implementation
  • Actual cost to fix
  • Results (e.g., based on retesting)
For more about representation schemes for UX problem data, see Lavery and Cockton (1997).
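For teams that want tool support, here is a minimal sketch, assuming SQLite via Python's standard sqlite3 module, of how a few of these fields and links might be held in a database; the table and column names are illustrative only:

```python
import sqlite3

# In-memory database for illustration; a real project would persist this to a file.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ux_problem (
    problem_id      TEXT PRIMARY KEY,   -- e.g., 'UP-1'
    name            TEXT,
    description     TEXT,
    diagnosis       TEXT,               -- problem type/subtype and causes
    importance      TEXT,               -- 'M', or 1-5
    est_cost_hours  REAL,
    resolution      TEXT                -- e.g., 'fix now', 'table until next version'
);
CREATE TABLE ux_problem_instance (
    instance_id     TEXT PRIMARY KEY,   -- e.g., 'UPI-12'
    problem_id      TEXT REFERENCES ux_problem(problem_id),
    session_id      TEXT,               -- project context: which session
    participant_id  TEXT,
    task_id         TEXT,
    video_tag       TEXT                -- link to the tagged video clip
);
""")

con.execute("INSERT INTO ux_problem (problem_id, name, description) VALUES (?, ?, ?)",
            ("UP-1", "Album must exist before upload",
             "Participants don't understand that an album must be created before uploading."))
con.execute("INSERT INTO ux_problem_instance VALUES (?, ?, ?, ?, ?, ?)",
            ("UPI-12", "UP-1", "S-03", "P-07", "upload-photo", "session03.mp4@00:12:41"))
con.commit()

# How many instances were merged into each problem (frequency of occurrence)?
for row in con.execute("""SELECT p.problem_id, COUNT(i.instance_id)
                          FROM ux_problem p
                          LEFT JOIN ux_problem_instance i USING (problem_id)
                          GROUP BY p.problem_id"""):
    print(row)
```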
16.4.12 Abridged Qualitative Data Analysis
As an abridged approach to formative (qualitative) data analysis:
• Just take notes about UX problems in real time during the session.
• Immediately after the session, make UX problem records from the notes.
As an alternative, if you have the necessary simple tools for creating UX problem records:
• Create UX problem records as you encounter each UX problem during the session.
• Immediately after the session, expand and fill in missing information in the records.
• Analyze each problem, focusing on the real essence of the problem and noting causes (design flaws) and possible solutions.
16.5 COST-IMPORTANCE ANALYSIS: PRIORITIZING PROBLEMS TO FIX
It would be great to fix all UX problems known after each iteration of evaluation. However, because we are taking an engineering approach, we have to temper our enthusiasm for perfection with an eye toward cost-effectiveness.
So, now that we are done, at least for the moment, with individual problem analysis, we look at some aggregate problem analysis to assess priorities about what problems to fix and in what order. We call this cost-importance analysis because it is based on calculating trade-offs between the cost to fix a problem and the importance of getting it fixed. Cost-importance analysis applies to any UX problem list regardless of what evaluation method or data collection technique was used.
Although these simple calculations can be done manually, this analysis lends itself nicely to the use of spreadsheets. The basic form we will use is the cost- importance table shown in Table 16-3.
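If you keep the table in code rather than a spreadsheet, one minimal sketch is a list of row dictionaries whose keys mirror the columns of Table 16-3 (the key names are our own shorthand, not a required format):

```python
# Each row mirrors the columns of the cost-importance table.
cost_importance_table = [
    {
        "problem": 'User confused by the button label "Submit" to proceed to payment',
        "importance": None,      # 'M' or 1-5, filled in during analysis
        "solution": None,        # candidate solution description
        "cost": None,            # estimated person-hours to fix
        "priority_ratio": None,  # computed later
        "priority_rank": None,   # assigned after sorting
        "cumulative_cost": None, # running total after sorting
        "resolution": None,      # final decision for this iteration
    },
]
```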
16.5.1 Problem
Starting with the left-most column in Table 16-3, we enter a concise description of the problem. Analysts needing to review further details can consult the problem data record and even the associated video clip. We will use some sample UX problems for the Ticket Kiosk System in a running example to illustrate how we fill out the entries in the cost-importance table.
In our first example problem the user had decided on an event to buy tickets for and had established the parameters (date, venue, seats, price, etc.) but did not realize that it was then necessary to click on the "Submit" button to finish up the event-related choices and move to the screen for making payment. So we enter a brief description of this problem in the first column of Table 16-4.
Table 16-3
Basic form of the cost-importance table (columns: Problem, Imp., Solution, Cost, Prio. Ratio, Prio. Rank, Cuml. Cost, Resolution)
Table 16-4
Problem description entered into cost-importance table
Problem: User confused by the button label "Submit" to proceed to payment part of the purchasing transaction
16.5.2 Importance to Fix
The next column, labeled "Imp" in the table, is for an estimate of the importance to fix the problem, independent of cost. While importance includes severity or criticality of the problem, the attributes most commonly used by other authors, this parameter can also include other considerations. The idea is to capture the effect of a problem on user performance, user experience, and overall system integrity and consistency. Importance can also include intangibles such as management and marketing "feelings" and consideration of the cost of not fixing the problem (e.g., in terms of lower user satisfaction), as well as "impact analysis" (next section).
Because an importance rating is just an estimate, we use a simple scale for the values:
• Importance = M: Must fix, regardless
• Importance = 5: The most important problems to fix after the "Must fix" category
  • If the interaction feature involved is mission critical
  • If the UX problem has a major impact on task performance or user satisfaction (e.g., user cannot complete a key task or can do so only with great difficulty)
  • If the UX problem is expected to occur frequently or could cause costly errors
• Importance = 3: Moderate impact problems
  • If the user can complete the task, but with difficulty (e.g., it caused confusion and required extra effort)
  • If the problem was a source of moderate dissatisfaction
• Importance = 1: Low impact problems
  • If the problem did not impact task performance or dissatisfaction much (e.g., mild user confusion or irritation or a cosmetic problem), but is still worth listing
This fairly coarse gradation of values has proven to work for us; you can customize it to suit your project needs. We also need some flexibility to assign intermediate values, so we allow for importance rating adjustment factors, the primary one of which is estimated frequency of occurrence. If this problem is expected to occur very often, you might adjust your importance rating upward by one value.
Conversely, if it is not expected to occur very often, you could downgrade your rating by one or more values. As Karat, Campbell, and Fiegel (1992) relate frequency of occurrence to problem severity classification, they ask: Over all the affected user classes, how often will the user encounter this problem?
Applying this to our importance rating, we might start with a problem preventing a task from being completed, to which we would initially assign Importance = 5. But because we expect this UX problem to arise only rarely and it does not affect critical tasks, we might downgrade its importance to 4 or even 3. However, a problem with moderately significant impact might start out rated as a 3 but, because it occurs frequently, we might upgrade it to a 4.
For example, consider the Ticket Kiosk System problem about users being confused by the button label "Submit" to proceed to payment in the ticket-purchasing transaction. We rate this fairly high in importance because it is part of the basic workflow of ticket buying; users will perform this step often, and most participants were puzzled or misled by this button label.
However, it was not shown to be a show-stopper, so we initially assign it an importance of 3. But because it will be encountered by almost every user in almost every transaction, we "promoted" it to a 4, as shown in Table 16-5.
Learnability can also be an importance adjustment factor. Some problems have most of their impact on the first encounter. After that, users learn quickly to overcome (work around) the problem so it does not have much effect in subsequent usage. That could call for an importance rating reduction.
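One illustrative way to capture such adjustments in code, as a sketch only (the one-step adjustments and the 1-5 clamp are our assumptions about how you might encode the judgment, not a formula from this chapter; "M" ratings are handled separately):

```python
def adjusted_importance(base: int, frequent: bool = False,
                        rare: bool = False, first_encounter_only: bool = False) -> int:
    """Nudge an importance rating (1-5) up or down by one step per adjustment factor."""
    value = base
    if frequent:
        value += 1                 # expected to occur very often
    if rare:
        value -= 1                 # expected to occur only rarely
    if first_encounter_only:
        value -= 1                 # users quickly learn to work around it
    return max(1, min(5, value))   # keep the rating on the 1-5 scale

# The "Submit" label problem: moderate impact (3), but encountered in almost
# every transaction, so it gets promoted.
print(adjusted_importance(3, frequent=True))   # -> 4
```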
16.5.3 Solutions
The next column in the cost-importance table is for one or more candidate solutions to the problems. Solving a UX problem is redesign, a kind of design, so you should use the same approach and resources as we did for the original design, including consulting your contextual data. Other resources and activities that might help include design principles and guidelines, brainstorming, study of other similar designs, and solutions suggested by users and experts. It is almost never a good idea to think of more training or better documentation as a UX problem solution.
Table 16-5
Estimate of importance to fix entered into cost-importance table
Problem: User confused by the button label "Submit" to conclude ticket purchasing transaction
Imp.: 4
Solutions for the photo album problem example
Let us look at some solutions for a problem in the example concerning the photo album application introduced earlier in this chapter. Users experienced a problem when trying to upload photos into an album. They did not understand that they had to create an album first. This misunderstanding about the workflow model built into the application now requires us to design an alternative.
It appears that the original designer was thinking in terms of a planning model by which the user anticipates the need for an album in advance of putting pictures into it. But our users were apparently thinking of the task in linear time, assuming (probably without thinking about it) that the application would either provide an album when it was needed or let them create one. A usage-centered design to match the user's cognitive flow could start by offering an active upload link.
If the user clicks on it when there is no open album, the interaction could present an opportunity for just-in-time creation of the necessary album as part of the task flow of uploading a picture. This can be accomplished either by asking whether the user wants to open an existing album or by letting the user create a new one.
Taking a different design direction, the interaction can allow users to upload pictures onto a "work table" without the need for pictures to necessarily be in an album. This design provides more interaction flexibility and potential for better user experience. This design also allows users to place single photos in multiple albums, something that users cannot do easily in their current work domain (without making multiple copies of a photo).
Ticket Kiosk System example
Coming back to the confusing button label in the Ticket Kiosk System, one obvious and inexpensive solution is to change the label wording to better represent where the interaction will go if the user clicks on that button. Maybe "Proceed to payment" would make more sense to most users.
We wrote a concise description of our proposed fix in the Solution column in Table 16-6.
Table 16-6
Potential problem solution entered into cost-importance table
Problem: User confused by the button label "Submit" to conclude ticket purchasing transaction
Imp.: 4
Solution: Change the label wording to "Proceed to Payment"
16.5.4 Cost to Fix
Making accurate estimates of the cost to fix a given UX problem takes practice; it is an acquired engineering skill. But it is nothing new; it is part of our job to make cost estimates in all kinds of engineering and budget situations. Costs for our analysis are stated in terms of resources (e.g., time, money) needed, which almost always translates to person-hours required.
Because this is an inexact process, we usually round up fractional values just to keep it simple. When you make your cost estimates, do not make the mistake of including only the cost to implement the change; you must include the cost of redesign, including design thinking and discussion and, sometimes, even some experimentation. You might need help from your software developers to estimate implementation costs.
Because it is very easy to change label wordings in our Ticket Kiosk System, we have entered a value of just one person-hour into the Cost column in Table 16-7.
Cost values for problem groups
Table 16-8 shows an example of including a problem group in the cost-importance table.
Note that the cost for the group is higher than that of either individual problem but lower than their sum.
Table 16-7
Estimate of cost to fix entered into cost-importance table
Problem: User confused by the button label "Submit" to conclude ticket purchasing transaction
Imp.: 4
Solution: Change the label wording to "Proceed to Payment"
Cost: 1
Table 16-8
Cost entries for problem groups entered into cost-importance table
Group: Transaction flow for purchasing tickets (Imp.: 3; group cost: 5)
Group solution: Establish a comprehensive and more flexible model of transaction flow and add labeling to explain it.
• Problem 7: The user wanted to enter or choose date and venue first and then click "Purchase Tickets," but the interaction design required clicking on "Purchase Tickets" before entering specific ticket information. Individual solution: Change to allow actions in either order and label it so. (Cost: 3)
• Problem 17: The "Purchase Tickets" button took the user to a screen to select tickets and commit to them, but then users did not realize they had to continue on to another screen to pay for them. Individual solution: Provide better labeling for this flow. (Cost: 3)
Calibration feedback from down the road: Comparing actual with predicted costs
To learn more about making cost estimates and to calibrate your engineering ability to estimate costs to fix problems, we recommend that you add a column to your cost-importance table for actual cost. After you have done the redesign and implementation for your solutions, you should record the actual cost of each and compare with your predicted estimates. It can tell you how you are doing and how you can improve your estimates.
16.5.5 Priority Ratio
The next column in the cost-importance table, the priority ratio, is a metric we use to establish priorities for fixing problems. We want a metric that
will reward high importance but penalize high costs. A simple ratio of importance to cost fits this bill. Intuitively, a high importance will boost up the priority but a high cost will bring it down. Because the units of cost and importance will usually yield a fractional value for the priority ratio, we scale it up to the integer range by multiplying it by an arbitrary factor, say, 1000.
If the importance rating is "M" (for "must fix regardless"), the priority ratio is also "M." For all numerical values of importance, the priority ratio becomes:
Priority ratio = (importance / cost) × 1000
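A direct translation of this formula into code, as a sketch that treats "M" as a special value:

```python
def priority_ratio(importance, cost_hours):
    """Priority ratio = (importance / cost) * 1000; 'M' (must fix) stays 'M'."""
    if importance == "M":
        return "M"
    return round((importance / cost_hours) * 1000)

print(priority_ratio(4, 1))    # -> 4000, the "Submit" label problem
print(priority_ratio(4, 40))   # -> 100, the missing search function
print(priority_ratio("M", 2))  # -> 'M'
```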
Example: Priority Ratios for Ticket Kiosk System Problems
For our first Ticket Kiosk System problem, the priority ratio is (4 / 1) × 1000 = 4000, which we have entered into the cost-importance table in Table 16-9.
In the next part of this example, shown in Table 16-10, we have added several more Ticket Kiosk System UX problems to fill out the table a bit more realistically.
Note that although fixing the lack of a search function (the sixth row in Table 16-10) has a high importance, its high cost is keeping the priority ratio low. This is one problem to consider for an Importance = M rating in the future. At the other end of things, the last problem (about the Back button to the Welcome screen) is only Importance = 2, but the low cost boosts the priority ratio quite high. Fixing this will not cost much and will get it out of the way.
16.5.6 Priority Rankings
So far, the whole cost-importance analysis process has involved only some engineering estimates and some simple calculations, probably in a spreadsheet. Now it gets even easier. You have only to sort the cost-importance table by priority ratios to get the final priority rankings.
First, move all problems with a priority ratio value of "M" to the top of the table. These are the problems you must fix, regardless of cost. Then sort the rest of the table in descending order by priority ratio. This puts high importance, low cost problems at the top of the priority list, as shown at A in the upper left-hand quadrant of Figure 16-4. These are the problems to fix first, the fixes that will give the biggest bang for the buck.
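A sketch of that sort step, assuming each problem is a dictionary carrying its priority ratio (field names are our own):

```python
def sort_by_priority(rows):
    """'M' (must fix) problems first, then descending priority ratio."""
    must_fix = [r for r in rows if r["priority_ratio"] == "M"]
    rest = [r for r in rows if r["priority_ratio"] != "M"]
    rest.sort(key=lambda r: r["priority_ratio"], reverse=True)
    return must_fix + rest

rows = [
    {"problem": "Submit label",       "priority_ratio": 4000},
    {"problem": "Ticket counter",     "priority_ratio": "M"},
    {"problem": "No search function", "priority_ratio": 100},
    {"problem": "Theatre wording",    "priority_ratio": 3000},
]
for r in sort_by_priority(rows):
    print(r["problem"], r["priority_ratio"])
# Ticket counter M, Submit label 4000, Theatre wording 3000, No search function 100
```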
Being the realist (our nice word for cynic) that you are, you are quick to point out that, in the real world, things do not line up with high importance and low cost together in the same sentence. You pay for what you get. But, in fact, we do find a lot of problems of this kind in early iterations.
Table 16-9
Priority ratio calculation entered into cost-importance table
Problem: User confused by the button label "Submit" to conclude ticket purchasing transaction
Imp.: 4
Solution: Change the label wording to "Proceed to Payment"
Cost: 1
Prio. Ratio: 4000
Table 16-10
Priority ratios for more Ticket Kiosk System problems

Problem | Imp. | Solution | Cost | Prio. Ratio
User confused by the button label "Submit" to conclude ticket purchasing transaction | 4 | Change the label wording to "Proceed to Payment" | 1 | 4000
Did not recognize the "counter" as being for the number of tickets. As a result, user failed to even think about how many tickets he needed. | M | Move quantity information and label it | 2 | M
Unsure of current date and what date he was purchasing tickets for | 5 | Add current date field and label all dates precisely | 2 | 2500
Users were concerned about their work being left for others to see | 5 | Add a timeout feature that clears the screens | 3 | 1667
User confused about "Theatre" on the "Choose a domain" screen. Thought it meant choosing a physical theater (as a venue) rather than the category of theatre arts. | 3 | Improve the wording to "Theatre Arts" | 1 | 3000
Ability to find events hampered by lack of a search capability | 4 | Design and implement a search function | 40 | 100
Did not recognize what geographical area theater information was being displayed for | 4 | Redesign graphical representation to show search radius | 12 | 333
Did not like having a "Back" button on second screen since first screen was only a "Welcome" | 2 | Remove it | 1 | 2000
Transaction flow for purchasing tickets (group problem; see Table 16-8) | 3 | Establish a comprehensive and more flexible model of transaction flow and add labeling to explain it | 5 | 600

(The Prio. Rank, Cuml. Cost, and Resolution columns are filled in after sorting; see Tables 16-11 and 16-12.)
A good example is a badly worded button label. It can completely confuse users but usually costs almost nothing to fix. Sometimes low-importance, low- cost problems float up near the top of the priority list. You will eventually want to deal with these. Because they do not cost much, it is usually a good idea to just fix them and get them out of the way.
Figure 16-4
The relationship of importance and cost in prioritizing which problems to fix first.
The UX problems that sort to the bottom of the priority list are costly to fix with little gain in doing so. You will probably not bother to fix these problems, as shown at B in the lower right-hand quadrant of Figure 16-4.
Quadrants A and B sort out nicely in the priority rankings. Quadrants C and D, however, may require more thought. Quadrant C represents problems for which fixes are low in cost and low in importance. You will usually just go ahead and fix them to get them off your plate. The most difficult choices appear in quadrant D
because, although they are of high importance to fix, they are also the most expensive to fix.
No formula will help; you need good engineering judgment. Maybe it is time to request more resources so these important problems can be fixed. That is usually worth it in the long run.
The cost-importance table for the Ticket Kiosk System, sorted by priority ratio, is shown in Table 16-11.
16.5.7 Cumulative Cost
The next step is simple. In the column labeled "Cuml. Cost" of the cost-importance table sorted by priority ratio, for each problem enter an amount that is the cost of fixing that problem plus the cost of fixing all the problems above it in the table. See how we have done this for our example Ticket Kiosk System cost-importance table in Table 16-11.
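As a sketch, assuming the rows are already sorted by priority and carry a cost field:

```python
def add_cumulative_cost(sorted_rows):
    """Fill in the 'Cuml. Cost' column on a table already sorted by priority."""
    running_total = 0
    for row in sorted_rows:
        running_total += row["cost"]
        row["cumulative_cost"] = running_total
    return sorted_rows

rows = [{"problem": "Ticket counter",  "cost": 2},
        {"problem": "Submit label",    "cost": 1},
        {"problem": "Theatre wording", "cost": 1}]
print(add_cumulative_cost(rows))   # cumulative costs 2, 3, 4
```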
16.5.8 The Line of Affordability
Using your budget, your team leader or project manager should determine your "resource limit," in person-hours, that you can allocate to making design changes for the current cycle of iteration. For example, suppose that for the Ticket Kiosk System we have only a fairly small amount of time available in the schedule, about 16 person hours.
Draw the "line of affordability," a horizontal line in the cost-importance table just above the line in the table where the cumulative cost value first exceeds your resource limit. For the Ticket Kiosk System, the line of affordability appears just above the row in Table 16-11 where the cumulative cost hits 27.
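A sketch of finding that cutoff in code, assuming rows already carry cumulative costs:

```python
def affordable_rows(sorted_rows, resource_limit_hours):
    """Return the rows above the line of affordability: those whose cumulative
    cost stays at or below the resource limit for this iteration."""
    affordable = []
    for row in sorted_rows:
        if row["cumulative_cost"] > resource_limit_hours:
            break                      # the line of affordability falls here
        affordable.append(row)
    return affordable

rows = [{"problem": "Ticket counter",     "cumulative_cost": 2},
        {"problem": "Submit label",       "cumulative_cost": 3},
        {"problem": "Geographical area",  "cumulative_cost": 27},
        {"problem": "No search function", "cumulative_cost": 67}]
print([r["problem"] for r in affordable_rows(rows, 16)])
# -> ['Ticket counter', 'Submit label']
```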
Table 16-11
The Ticket Kiosk System cost-importance table, sorted by priority ratio, with cumulative cost values entered, and the "line of affordability" showing the cutoff for this round of problem fixing

Problem | Imp. | Solution | Cost | Prio. Ratio | Prio. Rank | Cuml. Cost
Did not recognize the "counter" as being for the number of tickets. As a result, user failed to even think about how many tickets he needed. | M | Move quantity information and label it | 2 | M | 1 | 2
User confused by the button label "Submit" to conclude ticket purchasing transaction | 4 | Change the label wording to "Proceed to Payment" | 1 | 4000 | 2 | 3
User confused about "Theatre" on the "Choose a domain" screen. Thought it meant choosing a physical theater (as a venue) rather than the category of theatre arts. | 3 | Improve the wording to "Theatre Arts" | 1 | 3000 | 3 | 4
Unsure of current date and what date he was purchasing tickets for | 5 | Add current date field and label all dates precisely | 2 | 2500 | 4 | 6
Did not like having a "Back" button on second screen since first screen was only a "Welcome" | 2 | Remove it | 1 | 2000 | 5 | 7
Users were concerned about their work being left for others to see | 5 | Add a timeout feature that clears the screens | 3 | 1667 | 6 | 10
Transaction flow for purchasing tickets (group problem; see Table 16-8) | 3 | Establish a comprehensive and more flexible model of transaction flow and add labeling to explain it | 5 | 600 | 7 | 15
---- line of affordability (16 person-hours) ----
Did not recognize what geographical area theater information was being displayed for | 4 | Redesign graphical representation to show search radius | 12 | 333 | 8 | 27
Ability to find events hampered by lack of a search capability | 4 | Design and implement a search function | 40 | 100 | 9 | 67
Just for giggles, it might be fun to graph all your problems (no, not all your problems; we mean all your cost-importance table entries) in a cost-importance space like that of Figure 16-4. Sometimes this kind of graphical representation can give insight into your process, especially if your problems tend to appear in clusters. Your line of affordability will be a vertical line that cuts the cost axis at the amount you can afford to spend on fixing all problems this iteration.
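A sketch of such a plot, assuming matplotlib is available; the importance values here treat "M" as 5 purely so it can be plotted, and the numbers come from Table 16-10:

```python
import matplotlib.pyplot as plt

costs       = [2, 1, 1, 2, 1, 3, 5, 12, 40]   # estimated person-hours to fix
importances = [5, 4, 3, 5, 2, 5, 3, 4, 4]     # 'M' plotted as 5 for convenience
labels      = ["counter", "Submit", "Theatre", "date", "Back", "timeout",
               "flow group", "geo area", "search"]

plt.scatter(costs, importances)
for x, y, text in zip(costs, importances, labels):
    plt.annotate(text, (x, y))
plt.axvline(x=16, linestyle="--")   # line of affordability on the cost axis
plt.xlabel("Estimated cost to fix (person-hours)")
plt.ylabel("Importance to fix")
plt.title("Cost-importance space")
plt.show()
```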
16.5.9 Drawing Conclusions: A Resolution for Each Problem
It's time for the payoff of your cost-importance analysis. It's time for a resolution: a decision about how each problem will be addressed.
First, you have to deal with your "Must fix" problems, the show-stoppers. If you have enough resources, that is, if all the "Must fix" problems are above the line of affordability, fix them all. If not, you already have a headache. Someone, such as the project manager, has to earn his or her pay today by making a difficult decision. The extreme cost of a "Must fix" problem could make it infeasible to fix in the current version. Exceptions will surely result in cost overruns, but might have to be dictated by corporate policy, management, marketing, etc. It is an important time to be true to your principles and to everything you have done in the process so far. Do not throw it away now because of some perceived limit on how much you are willing to put into fixing problems that you have just spent good money to find.
Sometimes you have resources to fix the "Must fix" problems, but no resources left for dealing with the other problems. Fortunately, in our example we have enough resources to fix a few more problems. Depending on their relative proximity to the line of affordability, you have to decide among these choices as a resolution for all the other problems:
• fix now
• fix, time permitting
• remand to "wait-and-see list"
• table until next version
• postpone indefinitely; probably never get to fix
In the final column of the cost-importance table, write in your resolution for each problem, as we have done for the Ticket Kiosk System in Table 16-12.
Finally, look at your table; see what is left below the line of affordability. Is it what you would expect? Can you live with not making fixes below that line?
Table 16-12
Problem resolutions for Ticket Kiosk System

Problem | Imp. | Solution | Cost | Prio. Ratio | Prio. Rank | Cuml. Cost | Resolution
Did not recognize the "counter" as being for the number of tickets. As a result, user failed to even think about how many tickets he needed. | M | Move quantity information and label it | 2 | M | 1 | 2 | Fix in this version
User confused by the button label "Submit" to conclude ticket purchasing transaction | 4 | Change the label wording to "Proceed to Payment" | 1 | 4000 | 2 | 3 | Fix in this version
User confused about "Theatre" on the "Choose a domain" screen. Thought it meant choosing a physical theater (as a venue) rather than the category of theatre arts. | 3 | Improve the wording to "Theatre Arts" | 1 | 3000 | 3 | 4 | Fix in this version
Unsure of current date and what date he was purchasing tickets for | 5 | Add current date field and label all dates precisely | 2 | 2500 | 4 | 6 | Fix in this version
Did not like having a "Back" button on second screen since first screen was only a "Welcome" | 2 | Remove it | 1 | 2000 | 5 | 7 | Fix in this version
Users were concerned about their work being left for others to see | 5 | Add a timeout feature that clears the screens | 3 | 1667 | 6 | 10 | Fix in this version
Transaction flow for purchasing tickets (group problem; see Table 16-8) | 3 | Establish a comprehensive and more flexible model of transaction flow and add labeling to explain it | 5 | 600 | 7 | 15 | Fix in this version
Ability to find events hampered by lack of a search capability | 4 | Design and implement a search function | 40 | 100 | 9 | 67 | Wait until next version, or after that

Again, this is a crossroads moment. You will find that, in reality, low-importance/high-cost problems are rarely addressed; there simply will not be time or other resources. That is okay, as our engineering approach is aiming for cost-effectiveness, not perfection. You might even have to face the fact that some important problems cannot be fixed because they are simply too costly.
However, in the end, do not just let numbers dictate your actions; think about it. Do not let a tight production schedule or budget force release of something that could embarrass your organization. Quality is remembered long after schedules are forgotten.
16.5.10 Special Cases
Tie-breakers
Sometimes you will get ties for priority rankings, entries for problems with equal priority for fixing. If they do not occur near the line of affordability, it is not necessary to do anything about them. In the rare case that they straddle the line of affordability, you can break the tie by almost any practical means, for example, your team members may have a personal preference.
In cases of more demanding target systems (e.g., an air traffic control system), where the importance of avoiding problems, especially dangerous user errors, is a bigger concern than cost, you might break priority ties by weighting importance more heavily than cost in the priority ratio formula.
Cost-importance analysis involving multiple problem solutions
Sometimes you can think of more than one solution for a problem. It is possible that, after a bit more thought, one solution will emerge as best. If, however, after careful consideration you still have multiple possibilities for a problem solution, you can keep all solutions in the running and in the analysis until you see something that helps you decide.
If all solutions have the same cost to fix, then you and your team will just have to make an engineering decision. This might be the time to implement all of them and retest, using local prototyping (Chapter 11) to evaluate alternative design solutions for just this one feature.
Usually, though, solutions are distinguished by cost and/or effectiveness. Maybe one is less expensive but some other one is more desirable or more effective; in other words, you have a cost-benefit trade-off. You will need to resolve such cost-benefit problems separately before entering the chosen solution and its cost into the cost-importance table.
Problem groups straddling the line of affordability
If you have a group of related problems right at the line of affordability, the engineering answer is to do the best you can before you run out of resources. Break the group back apart and do as many pieces as possible. Give the rest of the group a higher importance in the next iteration.
Priorities for emotional impact problems
Priorities for fixing emotional impact problems can be difficult to assess. They are often very important because they can represent problems with product or system image and reputation in the market. They can also represent high costs to fix because they often require a broader view of redesign, not just focusing on one detail of the design as you might for a usability problem.
Also, emotional impact problems are often not just redesign problems but might require more understanding of the users and work or play context, which means going all the way back in the process to contextual inquiry and contextual analysis and a new approach to the conceptual design. Because of business and marketing imperatives, you may have to move some emotional impact problems into the "Must fix" category and do what is necessary to produce an awesome user experience.
16.5.11 Abridged Cost-Importance Analysis
As an abridged version of the cost-importance analysis process:
• Put the problem list in a spreadsheet or similar document.
• Project it onto a screen in a room with pertinent team members to decide priorities for fixing the problems.
• Have a discussion about which problems to fix first based on a group feeling about the relative importance and cost to fix each problem, without assigning numeric values.
• Do a kind of group-driven "bubble sort" of problems in which problems to fix first will float toward the top of the list and problems you probably cannot fix, at least in this iteration, will sink toward the bottom of the list.
• When you are satisfied with the relative ordering of problem priorities, start fixing problems from the top of the list downward and stop when you run out of time or money.
16.6 FEEDBACK TO PROCESS
Now that you have been through an iteration of the UX process lifecycle, it is time to reflect not just on the design itself, but also on how well your process worked. If you have any suspicions after doing the testing that the quantitative criteria were not quite right, you might ask if your UX targets worked well.
For example, if all target levels were met or exceeded on the very first round of evaluation, it will almost certainly be the case that your UX targets were too lenient. Even in later iterations, if all UX targets are met but observations during evaluation sessions indicate that participants were frustrated and performed tasks poorly, your intuition will probably tell you that the design is nevertheless not acceptable in terms of its quality of user experience. Then, obviously, the UX team should revisit and adjust the UX targets or add more considerations to your criteria for evaluation success.
Next, ask yourself whether the benchmark tasks supported the evaluation process in the most effective way. Should they have been simpler or more complex, narrower or broader? Should any benchmark task description be reworded for clarification or to give less information about how to do a task?
Finally, assess how well the overall process worked for the team. You will never be in a better position to sit down, discuss it, and document possible improvements for the next time.
16.7 LESSONS FROM THE FIELD
16.7.1 Onion-Layers Effect
There are many reasons to make more than one iteration of the design- test-redesign part of the UX lifecycle. The main reason, of course, is to continue to uncover and fix UX problems until you meet your UX target values. Another reason is to be sure that your "fixes" have not caused new problems. The fixes are, after all, new and untested designs.
Also, in fixing a problem, you can uncover other UX problems lurking in the dark and inky shadows of the first problem. One problem can be obscured by another, preventing participants and evaluators from seeing the second problem, until the top layer of the onion¹ is peeled off by solving that "outer" problem.
16.7.2 UX Problem Data as Feedback to Process Improvement
In our analysis we are also always on the lookout for causes of causes. It sometimes pays off to look at your UX process to find causes of the design flaws that cause UX problems, places in your process where, if you could have done something differently, you might have avoided a particular kind of design flaw. If you suffer from an overabundance of a particular kind of UX problem and can determine how your process is letting them into the designs, maybe you can head off that kind of problem in future designs by fixing that part of the process.

¹ Thanks to Wolmet Barendregt for the onion-layer analogy.
For example, if you are finding a large number of UX problems involving confusing button or icon labels or menu choices, maybe you can address these in advance by providing a place in your design process where you look extra carefully at the precise use of words, semantics, and meanings of words. You might even consider hiring a professional writer to join the UX team. We ran into a case like this once.
For expediency, one project team had been letting their software programmers write error messages as they encountered the need for them in the code. This situation was a legacy from the days when programmers routinely did most of the user interface. As you can imagine, these error messages were not the most effective. We helped them incorporate a more structured approach to error message composition, involving UX practitioners, without unduly disrupting the rest of their process.
Similarly, large numbers of problems involving physical user actions are indicators of design problems that could be addressed by hiring an expert in ergonomics, human factors engineering, and physical device design. Finally, large numbers of problems involving visual aspects of design, such as color, shape, positioning, or gray shading, might indicate the need for hiring a graphic designer or layout artist.
Evaluation Reporting 17
Objectives
After reading this chapter, you will:
1. Know how to report informal summative evaluation results
2. Be ready to report qualitative formative evaluation results, including the influence of audience and goals on content, format and vocabulary, and tone
3. Understand influences on problem report effectiveness
17.1 INTRODUCTION
17.1.1 You Are Here
We begin each process chapter with a "you are here" picture of the chapter topic in the context of the overall Wheel lifecycle template; see Figure 17-1. Having gotten through UX evaluation preparation, data collection, and analysis, we conclude the evaluation chapters with this one: reporting your formative UX evaluation results. The reporting described in this chapter is largely aimed at rigorous empirical methods, but much applies as well to rapid methods.
17.1.2 Importance of Quality Communication and Reporting
Evaluation reports often occur as communication across discontinuities of time, location, and people. Redesign activities are often separated from UX evaluation by delays in time that can cause information loss due to human memory limitations. This is further aggravated if the people doing the redesign are not the same ones who conducted the evaluation.
Finally, evaluation and redesign can occur at different physical locations, rendering any information that is not communicated well essentially unrecoverable.
UX evaluation reports with inadequate contextual information or incomplete UX problem descriptions will be too vague for designers who were not present for the UX testing.
Figure 17-1
You are here; at reporting, within the evaluation activity in the context of the overall Wheel lifecycle template.
To the project team, the report for an evaluation within an iteration is a redesign proposal. Hornbæk and Frøkjær (2005) show the need for usability evaluation reports that summarize and convey usability information, not just lists of problem descriptions by themselves.
All the effort and cost you invested thus far in UX evaluation can be wasted at the last minute if you do not follow up now to:
• inform the team and project management about the UX problems in the current design
• persuade them of the need to invest even more in fixing those problems.
17.1.3 Participant Anonymity
We remind you, before we get into the details, that regardless of the kind of evaluation or reporting you are doing, you must preserve participant anonymity. You should have promised this on your informed consent form, and you have an ethical, and perhaps a legal, obligation to protect it
religiously thereafter. The necessity for preserving participant anonymity extends especially to evaluation reporting.
There is simply no need for anyone in your reporting audience to know the identity of any participant. This means not including any names in the report and not showing faces in video clips. This latter requirement can be met with some simple video blurring.
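If you want a starting point for the blurring itself, the following is a minimal sketch in Python using OpenCV's stock frontal-face detector. The file names, detector choice, and blur strength are illustrative assumptions, not a prescribed part of the process, and automatic detection should always be spot-checked by a human before a clip leaves your hands.

# Minimal face-blurring sketch for anonymizing session video (assumes OpenCV).
# File names, detector, and blur strength are placeholders; verify results manually.
import cv2

def blur_faces(src_path: str, dst_path: str) -> None:
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    out = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"),
                          fps, (width, height))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
            # Replace each detected face region with a heavy Gaussian blur.
            frame[y:y + h, x:x + w] = cv2.GaussianBlur(
                frame[y:y + h, x:x + w], (51, 51), 0)
        out.write(frame)
    cap.release()
    out.release()

blur_faces("session_p03_raw.mp4", "session_p03_anonymized.mp4")  # illustrative names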
Participant anonymity does not mean that you, as the evaluator or facilitator, do not know the names of participants. Somewhere along the line someone must have recruited and signed up and possibly even paid the participants.
You should keep participant identification information in just one place-on a sheet of paper or in a database mapping the names to identification codes.
Codes, never names, are used everywhere else in the evaluation process-on data collection forms, during data analysis, and in all reports.
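As a concrete illustration of the one-place mapping idea, here is a small Python sketch; the file name, fields, and code format are hypothetical, and in practice the roster file should live on protected storage accessible only to whoever handles recruiting and payment.

# Sketch: keep the only name-to-code mapping in one protected file;
# everything else (data forms, analysis, reports) sees codes only.
# Field names and file path here are illustrative, not prescribed.
import csv
import secrets

ROSTER_PATH = "participant_roster_PRIVATE.csv"  # store separately, restrict access

def assign_code(name: str, contact: str) -> str:
    """Record a new participant in the private roster and return the code
    used everywhere else in the evaluation."""
    code = "P" + secrets.token_hex(3).upper()   # e.g., 'P1A2B3C'
    with open(ROSTER_PATH, "a", newline="") as f:
        csv.writer(f).writerow([code, name, contact])
    return code

# Usage: only the code travels with the evaluation data.
code = assign_code("(participant name)", "(contact info)")
print(f"Label all data forms, logs, and report excerpts with {code}")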
17.2 REPORTING INFORMAL SUMMATIVE RESULTS
Formative evaluation, by definition, has a qualitative formative component and an optional informal summative component (Chapter 12). There are still no standards for reporting informal summative results in connection with formative evaluation (more about this in the next section).
Because product design is not research but engineering, we are not concerned with getting at scientific "truth"; ours is a more practical and less exact business. Our evaluation drives our engineering judgment, which is also based on hunches and intuition that are, in turn, based on skill and experience.
As we said in Chapter 12, the audience for your informal summative evaluation results should be strictly limited to your own project group. They are to be used only as an engineering tool within the project.
17.2.1 What if You Are Required to Produce a Formative Evaluation Report for Consumption Beyond the Team?
The meaning of "internal use" can vary some, but it usually means restricted to the project group (e.g., designers, evaluators, implementers, project manager) and definitely not for public dissemination. Formative evaluation reports must remain within a group who all understand the limitations on how data can be used. At times that could also include higher level managers and others in the larger organization.
But sometimes you do not have a choice; you can be ordered to produce a formative evaluation report for broader dissemination. Suppose the marketing people want to make claims about levels of user experience reached via
your engineering process. Our first line of advice is to follow our principle and simply not let specific informal summative evaluation results out of
the project group. Once the results are out of your hands, you lose control of what is done with them, and you could be made to share the blame for their misuse.
Your next response should be to inform. Explain the limited nature of data and the professional and ethical issues in misrepresenting the power to make claims from data. If, at the end of the day, you still have to issue a formative evaluation report, we recommend you bend over backward in labeling the report with caveats and qualifications that make it clear to all readers that the informal summative UX results are intended to be used only as a project management tool and should not be used in public claims.
17.2.2 What if You Need a Report to Convince the Team to Fix the Problems?
What good is doing the UX evaluation if no one is convinced the problems you found are "real" and, as a result, the design does not get changed? It may be part of the job of UX engineers to convince others in the project team
to take action about poor UX, as revealed by UX evaluation. This part of the role is especially important in large organizations, where the people who collect the data are not necessarily the same people as, and may not even have a close working relationship with, those who make the decisions about design changes.
It is not uncommon for project teams to request a UX testing report to see for themselves what the UX situation is and how badly the recommended changes are needed. This could just be part of the normal way your organization works or, depending on the working relationship among project team members, this situation could be indicative of a management or organizational problem rather than a technical one.
The need for the rest of your team, including management, to be convinced of the need to fix UX problems that you have identified in your UX engineering process could be considered a kind of litmus test for a lack of teamwork and trust within your organization. If everyone is working together as a team, no one should have to convince the others of the value of their efforts on the project; they all just do what the process calls for.
We do not live in a perfect world, however, and the people requesting your UX report may have some power over you. For example, your manager or the software engineers might stand between you and the changes to the software. If they require a report that "proves" the need for design changes, you must make it clear that what they are asking for is not your report but
a report of a summative study. Of course, they will have to pay for someone to perform that study, which could go a long way in heading off such requests the next time.
Sometimes the other project team members know how the UX people work and trust them. For them, if the UX practitioner says a certain design change is needed, they go with it, but they still might want an explanation to help them understand the need and to help cement their buy-in. In this case, it is not a problem to share the report, as these are your team members.
17.3 REPORTING QUALITATIVE FORMATIVE RESULTS
All UX practitioners should be able to write clear and effective reports about problems found but, in their "CUE-4" studies, Dumas, Molich, and Jeffries (2004) found that many cannot. They observed a large variation in reporting across several teams of usability specialists and found that most reports were inadequate by their standards. It is hoped that this chapter will help you communicate clearly and effectively via the planning and construction of your UX evaluation reports.
If you use rapid evaluation methods for data collection, it is especially important to communicate effectively about the analysis and results because this kind of data can otherwise be dismissed easily "as unreliable or inadequate to inform design decisions" (Nayak, Mrazek, & Smith, 1995). Even in lab-based testing, though, the primary type of data from formative evaluation is qualitative, and raw qualitative data must be skillfully distilled and interpreted to avoid the impression of being "soft" and subjective.
17.3.1 Common Industry Format (CIF) for Reporting Formal Summative UX Evaluation Results
In October 1997, the U.S. National Institute of Standards and Technology (NIST) started an effort to "increase the visibility of software usability." NIST was to be a facilitator in bringing together software vendors and consumer organizations, with the stated goal to develop and evaluate a common usability reporting format for sharing usability data with consumer organizations.
The idea was to make software product usability visible to customers and consumers through (theretofore nonexistent) standard, comparable, methods of reporting measured usability through "the Common Industry Format (CIF) for reporting usability results".
The pressure to bring products to market rapidly had affected usability adversely, as it still now does. The idea was to force software suppliers to face the fact that their customers were concerned about usability and, if usability could be made visible to consumers, it would become a competitive market
factor with software that is measurably more usable winning out. In the face of the many possible ways to report summative usability evaluation results, a common reporting format would add consistency and comparability.
Oriented toward off-the-shelf software products and Websites, the CIF provides a kind of "Consumer Reports" support for software buyers, affording a way to compare usability of competitive software products (Quesenbery, 2005, p. 452).
It is clear from the goals that this standard pertained to formal summative evaluation and not formative evaluation, although at the time it was still too early for that limitation or even the distinction to be articulated.
The CIF standard calls out requirements for reports to include:
• A description of the product
• Goals of the testing
• A description of the number and types of participants
• Tasks used in evaluation
• The experimental design of the test (very important for formal summative studies because of the need to eliminate any biases and to ensure the results do not suffer from external, internal, and other validity concerns)
• Evaluation methods used
• Usability measures and data collection methods employed
• Numerical results, including graphical methods of presentation
The American National Standards Institute (ANSI) approved this
standard for summative reporting as ANSI-NCITS 354-2001 in December 2001 and it became an international standard, ISO/IEC 25062: Software Engineering-Software Product Quality Requirements and Evaluation (SQuaRE), in May 2005.
17.3.2 Common Industry Format (CIF) for Reporting Qualitative Formative Results
Following this initial effort on a Common Industry Format for reporting formal summative evaluation results, the group, under the direction of Mary Theofanos, Whitney Quesenbery, and others, organized two workshops in 2005 (Theofanos et al., 2005), both aimed at a CIF for formative reports (Quesenbery, 2005; Theofanos & Quesenbery, 2005).
In this work they recognized that because most evaluations conducted by usability practitioners are formative, there was a need for an extension of the original CIF project to identify best practices for reporting formative results.
They concluded that requirements for content, format, presentation style, and level of detail depended heavily on the audience, the business context,
and the evaluation techniques used.
While their working definition of "formative testing" was based on having representative users, here we use the slightly broader term "formative evaluation" to include usability inspections and other methods for collecting formative usability and user experience data, not necessarily requiring representative users.
17.4 FORMATIVE REPORTING CONTENT
In this section we cover the different types of reporting content that could go into a formative evaluation report. In later sections we discuss which of these content types are suitable to different audiences.
17.4.1 Individual Problem Reporting Content
Many researchers and practitioners have suggested various content items that might prove useful for problem diagnosis and redesign. The idea is to provide all the essential facts a designer will need to understand and fix the problem.
Of course, at this point, the evaluators would have had to collect sufficient data to be able to provide all this information. The basic information needed includes:
• the problem description
• a best judgment of the causes of the problem in the design
• an estimate of its severity or impact
• suggested solutions
In the first of these items, be sure to describe each problem as a problem, not as a solution. Because the problems were experienced by users doing tasks, describe them in that context-users and tasks and the effects of the problems on users. This means saying, for example, "users could not figure out what to do next because they did not notice the buttons" instead of "we need flashing red buttons."
The second item, the engineering judgment of the causes of the problem in the interaction design, is an essential part of the diagnosis of a UX problem and perhaps the most important part of the report. Because the flaw in the design is what needs to be fixed, you should connect it with the appropriate
design guidelines and/or heuristic violations, as much as possible in terms of interaction issues and human-computer interaction principles.
Next is an estimate of severity or importance in terms of the impact on users. To be convincing, this must be well reasoned. Finally, to help designers act to fix the problems, recommend one or more possible design solutions, along with cost estimates and tradeoffs for each, especially if a solution has a downside. To justify the fixes, make compelling arguments for improved design and positive impact on users.
There are many other kinds of information that can be useful in a UX problem report, including an indication of how many times each UX problem was encountered, by each user and by all users, to help convey its importance.
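If your critical-incident records are in machine-readable form, tallying encounter counts is straightforward. The sketch below assumes records shaped as (participant code, problem ID) pairs; the codes and IDs shown are made up for illustration.

# Sketch: tally UX problem encounters per participant and overall,
# assuming critical-incident records shaped as (participant_code, problem_id).
from collections import Counter, defaultdict

encounters = [
    ("P01", "UX-012"), ("P01", "UX-007"), ("P02", "UX-012"),
    ("P03", "UX-012"), ("P03", "UX-007"), ("P03", "UX-007"),
]  # illustrative data only

per_problem = Counter(problem for _, problem in encounters)
per_participant = defaultdict(Counter)
for participant, problem in encounters:
    per_participant[participant][problem] += 1

for problem, total in per_problem.most_common():
    hit_by = sum(1 for p in per_participant if problem in per_participant[p])
    print(f"{problem}: {total} encounters across {hit_by} of "
          f"{len(per_participant)} participants")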
17.4.2 Include Video Clips Where Appropriate
Show video clips of users, made anonymous, encountering critical incidents if you are giving an oral report or include links in a written report. Use the visual power of video to share the highlights of your evaluation process, including some examples of UX problem encounters.
17.4.3 Pay Special Attention to Reporting on Emotional Impact Problems
Special discussion should be directed to reporting emotional impact problems, as those problems can be the most important for product improvement and marketing advantage, but these problems and their solutions can also be the most elusive. Emotional impact problems should be flagged as a somewhat different kind of problem with different kinds of recommendations for solutions.
Provide a holistic summary of the overall emotional impact on participants. Report specific positive and negative highlights with examples from particular episodes or incidents. If possible, try to inspire by comparing with products and systems having high emotional impact ratings.
17.4.4 Including Cost-Importance Data
Usually, cost-importance analysis is considered part of the nitty-gritty engineering details that would be beyond the interest or understanding of those outside the UX team and its process. However, cost-importance analysis, especially the prioritization process, can be of great interest to those who have to fix the problems and those who have to pay for it.
Importance ratings and supporting rationale can be helpful in convincing designers to fix at least the most urgent problems. The cost-importance table, such as Table 16-12, plus any discussion supporting the choice of table entries will tell the story.
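As a rough illustration of how such a table can be generated for the report, the Python sketch below orders problems by a simple importance-to-cost ratio; the values are invented, and the ratio shown is one plausible choice rather than the exact formula your team's cost-importance analysis (Chapter 16) may use.

# Sketch of a simple cost-importance ordering for the report appendix.
# Importance and cost values are team estimates; the ratio used here
# (importance / cost) is one plausible choice, not a prescribed formula.
problems = [
    {"id": "UX-012", "description": "Users miss the Save confirmation", "importance": 5, "cost_hours": 2},
    {"id": "UX-007", "description": "Help system hard to navigate",     "importance": 4, "cost_hours": 16},
    {"id": "UX-019", "description": "Icon labels use internal jargon",  "importance": 3, "cost_hours": 1},
]

for p in problems:
    p["priority_ratio"] = p["importance"] / p["cost_hours"]

# Highest-priority problems (biggest payoff per hour of fixing) come first.
for p in sorted(problems, key=lambda p: p["priority_ratio"], reverse=True):
    print(f'{p["id"]}  importance={p["importance"]}  '
          f'cost={p["cost_hours"]}h  ratio={p["priority_ratio"]:.2f}')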
17.5 FORMATIVE REPORTING AUDIENCE, NEEDS, GOALS, AND CONTEXT OF USE
As Theofanos and Quesenbery (2005) say, choices about content, format, vocabulary, and tone are all about the relationship between the author and the audience. Nayak et al. (1995) discuss some of the difficulties of conveying
UX information, such as explaining observation-based data and understanding the needs of the target audience. Because the needs, goals, and context of use for a given evaluation report are dependent on the audience, this section is organized on the various kinds of audiences.
The 2005 UPA Workshop Report on formative evaluation reporting (Theofanos et al., 2005) stresses different reporting requirements for different business contexts and audiences and different combinations thereof. Their view of reporting goals includes the following:
• Documenting the process: The author is usually part of the team, and the goal is to document team process and decision making. The scope of the "team" is left undefined and could be just the evaluation team or the whole project team, or perhaps even the development organization as a whole.
• Feeding the process: This is the primary context in our perspective, an integral part of feedback for iterative redesign. The goal is to inform the team about evaluation results, problems, and suggested solutions. In this report the author is considered to be related to the team but not necessarily a member of the team. However, it would seem that the person most suited as the author would usually be a UX practitioner who is part of the evaluator team and, if UX practitioners are considered part of the project team, a member of that team, too.
• Informing and persuading: The audience depends on working relationships within the organization. It could be the evaluator team informing and persuading the designers and developers (i.e., software implementers) or it could be the whole project team (or part thereof) informing and persuading management, marketing, and/or the customer or client.
17.5.1 Introducing UX Engineering to Your Audience
Because your goal is to persuade your audience of the need to invest time and cost into taking action to fix problems discovered, you must include your audience in the process and reasoning that led from raw data to conclusions so that your recommendations do not appear to be pulled out of the air. To include them in your process, you must explain the process.
Therefore, sometimes the main purpose of an evaluation report is to introduce the concepts of UX and UX engineering to an audience (your project
team, management, marketing, etc.) not yet aware. This kind of audience requires a different kind of report from all the others. It is more like an evaluation report contained within a more general presentation. First you have to establish your credentials and credibility and gain their engagement.
The goals for reporting to this kind of audience include (more or less in this order):
• Engender awareness and appreciation
• Teach concepts
• Sell buy-in
• Present results
Start on the first goal by building rapport and empathy. You are all on the same side. Help them feel "safe"; you would never try to sell them on something that was not good for them and the organization. You want to get them to appreciate the need for usability and a good user experience, and to appreciate the value of these things to them and their organization. This is basically a motivation for UX based on a business case (Chapter 24).
The next goal of your presentation is teaching. The idea is to explain terminology and concepts. You want to educate them about what UX is and how the UX engineering process works to improve UX. Explain that evaluation is not everything in the process but is a key part. Help them understand
how to view evaluation results. It is not negative stuff, it is not criticism, and, above all, it is not a personal thing. It is positive, good stuff, and is a team thing; it is an opportunity and a means to improve.
The third goal is about persuasion and selling of the concept. You want to get their buy-in to the idea of doing UX. You want them to want to include a UX component in their overall development process (and budgets and schedules).
Finally, if you have done your job with the other goals, they should be receptive-no, eager-to hear about the results of your latest evaluation and to talk about how they can make the necessary design changes and iterate to improve the current design.
17.5.2 Reporting to Inform Your Project Team
The primary audience for a report of UX problem details is your own project team-the designers and implementers who will fix the problems. Unless your project team is very small and everyone was present for evaluation data collection and analysis, you will always need to share your evaluation results with your team.
However, you do not always have to do "formal" reporting. The key goal is to convey results and product implications clearly and meaningfully to your
workmates, informing them about UX flaws in the design and/or informally measured shortcomings in user performance with the purpose of understanding what needs to be done to improve the design in the next iteration.
For the interaction designers and UX practitioners, UX problems can be presented as they are related to specific parts of the interaction dialogues (e.g., a particular dialogue box). For the software engineers, you might organize your UX problems by software module, as that is the way they think.
Suppose you know the people on the team; you, the evaluator, know the designers and/or developers. As a UX practitioner, you may even be the designer or you work very closely with the designer(s) and you also know the developers. You decide and make the design changes and then persuade the developers to implement them. In this case your report can be short and to the point, with little need for embellishments or blandishments.
Sometimes, especially in large development organizations, the people who write the evaluation reports are sending them to a development team they do not know. They may not even have a chance to meet with the designers to present the results personally or to explain points or answer questions. This case is a bit more demanding of your reporting-needing a complete and standalone document. It calls for a bit more politeness or formality, more completeness, and definitely more selling of the changes and their implementation.
Dumas, Molich, and Jeffries (2004) point out the case of a UX consultant, where there is often no opportunity to explain comments or negotiate recommendations after the report is delivered.
Start with a "boilerplate" summary of the basics, including evaluation goals, methods, and UX targets and benchmark tasks used. Screen shots and video clips illustrating actual problem encounters are always good for selling your points about problems.
Your audience will expect you to prioritize your redesign recommendations, and cost-importance analysis (Chapter 16) is a good way to do this. Assuming that your team has technical savvy, use tables to summarize your findings; do not make them plow through a lot of text for the essence. If your development schedule is short and things are already moving fast, keep the report and your problem list short.
17.5.3 Reporting to Inform and/or Influence Your Management
The team's key goal for reporting to this audience is to influence and convince them that this is part of the process and that the process is working. You want them to understand that, although there is now a working version of a prototype or product, the team is not done yet and needs to iterate.
Reports to management have to be short and sweet. Be concise and get to the point. Start with an executive summary. You usually should also very briefly explain the process, the evaluation goals, the methods, and the UX targets and benchmark tasks used. Because this can be counted as at least a partly "internal" audience, you can share high-level aspects of informal quantitative testing (e.g., user performance and satisfaction scores), but just trends observed, not numbers and no "claims," and remember not to call it a "study."
Focus on UX problems that can be fixed within the number of people hours allocated in the budget but paint a complete picture of your findings. A cost-importance analysis that prioritizes UX problems based on a ratio of estimated cost to perceived importance (Chapter 16) may be a key element in demonstrating how you chose the problems to be fixed. This analysis can
also highlight other problems currently out of reach but that could be fixed with more resources.
Define your priorities and relate them directly to business goals. This is easier if you used UX targets driven by UX goals (Chapter 10), based on business and product goals. You need an "explicit connection between the business or test goals and the results" (Theofanos & Quesenbery, 2005; Theofanos et al., 2005). Screen shots and video clips, made anonymous, illustrating actual problem encounters might be useful in engaging them in the whole evaluation scene.
17.5.4 Reporting to Your Customer or Client
As with most audiences, it is best to not start by hitting them square on with what is wrong with the system. This audience needs first to understand the whole concept of engineering for UX and the methods you use and how they help improve the product. They also have to be in favor of using this process before you tell them the process has revealed that their baby is ugly, but can be fixed. So, if they are unfamiliar with the UX process, a first goal may be to educate them about it so that they will understand the rest of what you have to report.
Another common goal is to impress them that you know what you are doing, you are earning your keep, and that the project is going well.
If you want to include UX problems, go easy on the doses of bad news.
Clients and customers will not want to hear that there is a whole list of problems with the design of their system. This undermines confidence and makes them nervous. For clients, UX problems are best described with scenarios that show stories of how design flaws affected users and how your UX engineering process finds and fixes those problems. Here is where screen shots and very short before-and-after video clips (made anonymous with proper video blurring or with written permission of participants) can be effective.
17.5.5 Formative Evaluation Reporting Format and Vocabulary
Consistency in reporting UX problems is important for all audiences. Evaluation reports are, above all, a means of communication, and understanding is hampered by wildly varying vocabulary, differences among diagnoses and descriptions of the same kinds of problems, the language and style of expression in UX problem descriptions, and level of description contained (e.g., describing surface observables versus the use of abstraction to get at the nature of the underlying problem). Standards for reporting formative results, such as the CIF for formative results (discussed earlier in this chapter), help control broad variation in content, structure, and quality of UX problem reports.
It can take several sentences, or even a paragraph or more, to convey the essence of a problem and its potential solutions. As a more readable alternative to putting all that text in a table (e.g., a cost-importance table), you can
put identification numbers in the problem-and-solution columns and write out the descriptive text in paragraph form. For example, you might put a "1" in the problem column of your table and maybe "1a" and "1b" to represent two possible solutions in that column. Then, in the accompanying text, you might have:
Problem 1: Help system was difficult to use.
Solution 1a: Use hypertext for the help screens, including a table of contents for general help.
Solution 1b: Use context-sensitive help. For example, when users are in the HotList dialogue box and click on the Help button, they are taken to the HotList help screen instead of the help table of contents or the general help screen.
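One way to keep the table and the descriptive text in sync is to hold the full text in a cross-reference keyed by those identification numbers. The Python sketch below, based on the example above, is only illustrative; a spreadsheet with a second sheet of descriptions works just as well.

# Sketch: keep the table terse (IDs only) and hold the full descriptive
# text in a cross-reference keyed by those IDs, as in the example above.
descriptions = {
    "1":  "Problem: Help system was difficult to use.",
    "1a": "Solution: Use hypertext for the help screens, including a "
          "table of contents for general help.",
    "1b": "Solution: Use context-sensitive help, e.g., the Help button in "
          "the HotList dialogue box goes straight to the HotList help screen.",
}

# Illustrative table row: numeric values are made-up team estimates.
table_rows = [{"problem": "1", "solutions": ["1a", "1b"], "importance": 4, "cost_hours": 8}]

for row in table_rows:
    print(f'Problem {row["problem"]} (importance {row["importance"]}, '
          f'~{row["cost_hours"]}h): {descriptions[row["problem"]]}')
    for s in row["solutions"]:
        print(f"  {s}: {descriptions[s]}")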
Jargon
As UX practitioners we, like most others in technical disciplines, have our own jargon-about UX. But, as UX practitioners, we must also know that our UX problem reports are like "error messages" to designers and that guidelines for error message design apply to our reports as well. And one of those guidelines about messages is to avoid jargon.
So, while we might not put jargon in our interaction designs, we might well be tempted to use our own technical language in our reports about UX. Yes,
our audience is supposed to include UX professionals as interaction designers, but you cannot be sure how much they share your specialized vocabulary.
Spell things out in plain natural language.
Precision and specificity
You are communicating with others to accomplish an outcome that you have in mind. To get the audience to share the vision of that outcome or to even understand what outcome you want, you need to communicate effectively; perhaps the first rule for effective communication is to be precise and specific. It takes more effort to write effective reports.
Sloppy terminology, vague directions, and lazy hand-waving are likely to be met with indifference, and the designers and other practitioners are less likely to understand the problems and solutions we propose in the report. A problem report received this way usually leaves its audience unconvinced that there is a real problem.
So instead of saying a dialogue box message text is hard to understand and recommending that someone write it more clearly, you should, in fact, make your own best effort at rewording to clarify the text and say why your version is better. The criterion for effectiveness is whether the designer who receives your problem report will be able to make better design choices (Dumas, Molich, & Jeffries, 2004).
17.5.6 Formative Reporting Tone
The British are too polite to be honest, but the Dutch are too honest to be polite.
-Candid Dutch saying
All your audiences deserve respect in evaluation reports. Start with customers and clients, for example; most UX practitioners appreciate the need to temper their reports with a modicum of restraint. But even your own team should be addressed with courtesy.
Respect feelings: Bridle your acrimony
Do not attack; do not demean; do not insult. Your goal is to get designers to act on the reports and fix the problems. Their anger and resentment, while possibly offering a measure of joy to the occasional twisted evaluator, will not serve your professional goals. A UX problem report is a dish best served cold.
As Dumas, Molich, and Jeffries put it: "express your annoyance tactfully."
It is true that typos, spelling errors, or grammatical boo-boos are very avoidable. They tend to agitate evaluators, but you must be professional and resist impassioned attacks. Avoid comments that use the terms unconscionable, unprofessional, incompetent, no excuse for this, this is nonsense, this is lazy, this is sloppy, or these designers clearly do not care about our users.
Some evaluators believe that being too polite might get in the way or be perceived as a little condescending and that being blunt helps convey the message; "someone obviously needs strong words to make them see how far off base they are. Being 'diplomatic' and euphemistic about the problems
just leads to our reports being discounted and ignored." But please understand that a UX report is not a forum for practitioners with axes to grind. Many designers say they are insulted by emotional rants and that "being blunt is not helpful; it is simply rude" (Dumas, Molich, & Jeffries, 2004).
The bottom line is: be likeable. Likeability breeds persuasiveness (Wilson, 2007) and projects a collaborative atmosphere rather than an adversarial one.
Accentuate the positive and avoid blaming
Most practitioners do realize that they should start with good things to say about the system being evaluated. However, even when encouraged to be positive, some practitioners in studies (Dumas, Molich, & Jeffries, 2004) proved to be reticent in this regard. This may be because their usual audience is the project team, who just wants to know what the problems are so that they can start fixing them. The evaluators may believe that the designers are strictly technical people who would not involve their feelings in their work.
However, even if the report is mainly critical, it is best to start with something positive. Include information about places where participants did not have problems, where they were successful in task completion, and where users expressed great satisfaction or joy of use. Video clips of good things happening can start things off with very positive feelings. The rest is, then: "We are on a roll: How can we make it even better?"
Work hard to present reports about design flaws as opportunities for design improvement, not as a criticism. A good way to do this is to remind them that the goal of formative evaluation is to find problems so that you can fix them. Therefore, a report containing information about problems found is an indication of success in the process. Congratulations, team; your process
is working!
17.5.7 Formative Reporting over Time
Do not delay or postpone evaluation reporting. Get the report out as early as possible. Once the job is done, people who need the results need them immediately. News about problems received later than necessary may not be well received and might have to be "tabled" until the next version. It is especially important to keep your project team in the loop: give them a preliminary report; do not make them wait for the final report a month later.
Keep them updated continuously. The full written report should be sent
out within a few days or weeks, not months, and there should be no surprises by this time.
The UPA Workshop Report (Theofanos et al., 2005) clearly established that most UX professionals deliver more than one evaluation report over time.
Starting immediately after testing, reports lean toward raw, undigested data, notes, and observations. Later reports tend to be more formal, with analysis applied to smooth out raw data and findings. Later, archival reports may be a way of saving all the original recordings, logs, and notes in case the evaluation must be repeated in the future.
17.5.8 Problem Report Effectiveness: The Need to Convince and Get Action with Formative Results
Wilson (2007) poses the question this way: "How do I get the product team (or the "developers") to listen to my recommendations about how to make the product better?" If it involves software implementers or others who may not have been part of the evaluation effort or who care more about programming effort than usability or user experience, your report may need to do some selling.
Without being manipulative, you can offer positive benefits to the project team and management that will buy good will and, quid pro quo, get your UX role taken seriously. Wilson (2007) tells about a usability team report that included a section on "Good Things About Product X." It was so well appreciated by sales and marketing that they gave "bootleg" copies to their customers. In turn, marketing gave the UX team new and extended access to customers.
Law (2006) conducted a study on the factors that influenced designers in deciding which problems to fix. She defines downstream utility as "the effectiveness with which the resolution to a UX problem is implemented," determined analytically in terms of the impact of fixing or not fixing the UX problems. The developer effect is the "developers' bias toward fixing UX problems with particular characteristics." To Law, the persuasive power of usability test results to induce fixes and the effectiveness of the fixes depend on factors such as:
• problem severity: more severe problems are more salient and carry more weight with designers
• problem frequency: more frequently occurring problems are more likely to be perceived as "real"
• perceived relevance of problems: designers who disagreed with usability practitioners about the relevance (similar to "realness") of problems did not fix the problems that practitioners recommended be fixed
Elaborateness of usability problem descriptions and redesign proposals turned out not to be a factor in influencing designers, suggesting diminishing returns for increased verbosity in usability problem descriptions. Similarly (and perhaps counterintuitively), the estimated effort to fix a problem did not seem to be an influence.
Hornbæk and Frøkjær (2005) interviewed designers regarding the utility of redesign proposals in evaluation reports. An essential conclusion was that designers do not want to see just problem descriptions without redesign proposals. Even if they did not take the direction recommended in a redesign proposal, the proposal usually gave them new ideas about how to attack even well-known problems.
Finally, beware of the passive-aggressive reception of your report. In our consulting we have seen designers agree with evaluation reports mainly because their managers had established evaluation and iteration as required parts of the lifecycle process. They agree to make a few changes and to consider the rest. But in these cases it was not a buy-in but a sop to make the UX people go away. No convincing was possible and no door was left open to try.
In the final analysis and depending on the size and makeup of your team, the need to convince designers to make the changes you recommend becomes about cultivating trust. If UX practitioners deliver high-quality UX problem reports, with supporting data, it builds trust with the designers. As the working relationship develops and trust grows, there is less need for convincing.
Sometimes project groups work together for one project and then team up with others for the next project, not working together long enough to develop a real trusting relationship. The less rapport and empathy among team members, the more need for high-quality evaluation reports presented in a consistent format and, possibly, supported with data.
If the trust level is high within your audience, you can keep your evaluation reports simple and focus on results and recommendations rather than persuasion.
17.5.9 Reporting on Large Amounts of Qualitative Data
If you are reporting on a large amount of formative evaluation, about a large number of UX problems, you need to be well organized. If you ramble and jump around among different kinds of problems without an integrated perspective, it will be like a hodgepodge to your audience and you will lose them, along with their support for making changes based on your evaluation.
One possible approach is to use a highly abridged version of the affinity diagram technique (Chapter 4). We showed how to use an affinity diagram to organize work activity data, and you can use the same technique here to organize
all your UX problem data for reporting. Post notes about each problem at the detailed level and group them according to commonalities, for example, with respect to task structure, organization of functionality, or other system structure.
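If your problem notes are already in electronic form, even a trivial grouping pass can produce the clustered structure for the report. The Python sketch below assumes each note carries a team-assigned affinity label; the labels and notes are invented examples.

# Sketch: group UX problem notes by a team-assigned affinity label so the
# report presents clusters (task structure, terminology, etc.) rather than
# a flat hodgepodge. Labels and notes here are illustrative only.
from collections import defaultdict

notes = [
    ("task structure", "Checkout requires revisiting the cart to edit quantity"),
    ("terminology",    "Users did not recognize the term 'portfolio sync'"),
    ("task structure", "Search results lose filters after viewing an item"),
    ("visual design",  "Low-contrast gray text missed by several participants"),
]

groups = defaultdict(list)
for label, note in notes:
    groups[label].append(note)

for label, items in groups.items():
    print(f"{label} ({len(items)} problems)")
    for item in items:
        print(f"  - {item}")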
17.5.10 Your Personal Presence in Formative Reporting
Do not just write up a report and send it out, hoping that will do the job.
If possible, you should be there to make a presentation when you deliver the report. The difference your personal presence at the time of reporting can make in reaching your goals, especially in influencing and convincing, is inestimable. Nothing beats face-to-face communication to set the desired tone and expectations. There is no substitute for being there to answer questions and head off costly misunderstandings. If the audience is distributed geographically, this is a good time to use videoconferencing or at least a teleconference.
Wrapping up UX Evaluation 18
Objectives
After reading this chapter, you will:
1. Understand goal-directed UX evaluation
2. Know how to select suitable UX evaluation methods
3. Understand how to be practical in your approach to UX evaluation, knowing when to be flexible about processes
18.1 GOAL-DIRECTED UX EVALUATION
The bottom line for developing the evaluation plan is to be flexible and do what it takes to make it work. We have given you the basics; it is up to you to come up with the variations. Adapt, evolve, transform, invent, and combine methods (Dray & Siegel, 1999). For example, supplement your lab-based test with an analytical method (e.g., your own UX inspection).
18.1.1 No Such Thing as the "Best UX Evaluation Method"
When you seek a UX evaluation method for your project, it usually is not about which method is "better," but which one helps you meet your goals in a particular evaluation situation. No existing evaluation method can serve every purpose and each has its own strengths and weaknesses. You need to know your goals going in and tailor your evaluation methods to suit.
And goals can vary a lot. In the very early design stages you will likely have a goal of getting the right design, looking at usefulness and a good match to high-level workflow needs, which requires one kind of evaluation. When you get to later stages of refining a design and mainly want feedback about how to improve UX, you need an entirely different kind of evaluation approach.
So much has been written about which UX evaluation method is better, often criticizing methods that do not meet various presumed standards. Too often
these debates continue in absolute terms, when the truth is: It depends! We believe it is not about which method is "better," but which one helps you meet your goals in a particular evaluation situation.
In fact, each aspect of the overall evaluation process depends on your evaluation goals, including design of evaluation sessions (e.g., focus, procedures), types of data to be collected, techniques for data analysis, approaches to reporting results, and (of course) cost. Tom Hewett was talking about evaluation goals way back in the 1980s (Hewett, 1986).
18.2 CHOOSE YOUR UX EVALUATION METHODS
This is an overview of some UX evaluation methods and data collection techniques and how to choose them. Other methods and techniques are available, but most of them will be variations on the themes we present here.
Here we use the term "evaluation method" to refer to a choice of process and the term "technique" as a skill-based activity, usually within a method. For example, lab-based testing is an evaluation method, and critical incident identification is a data collection technique that can be used to collect qualitative data within the lab-based method. One of the earliest decisions you have to make about your approach to UX evaluation in any given stage of any given project is the basic choice of UX evaluation method.
18.2.1 Goals Tied to Resources
Sometimes you just cannot afford a fully rigorous UX evaluation method. Sometimes paying less is necessary; it is a rapid or "discount" method or nothing. Some criticism of discount methods is based on whether they are as effective as more expensive methods, whether they find UX problems as well as other methods such as lab-based UX testing. This misses the point: you choose rapid or "discount" evaluation methods because they are faster and less expensive! They are not generally as effective as rigorous methods, but often they are good enough within an engineering context.
Some criticism is even aimed at whether "discount" methods provide statistically significant results. How can that be an issue when there is no intention for that kind of result? A more appropriate target for critical review of "discount" methods would be about how much you get for how much you pay. Different methods meet different goals. If statistical significance is a goal, only formal summative evaluation will do, and you have to pay for it.
18.2.2 UX Evaluation Methods and Techniques by Stage of Progress
In Table 18-1 we show some simple stages of design representations such as storyboards and prototypes and representative formative evaluation approaches appropriate to those stages.
Design walkthroughs and other design reviews are rapid and flexible methods that employ informal demonstrations of design concepts or early versions of designs to obtain initial reactions before many design details exist. Usually at this point in a project, you have only scenarios, storyboards, screen sketches, or, at most, a low-fidelity (non-interactive) prototype.
So you have to do the "driving"; it is too early for anyone else in a user role to engage in real interaction. Walkthroughs are an important way to get early feedback from the rest of the design team, customers, and potential users. Even early lab-based tests can include walkthroughs (Bias, 1991). Also, sometimes the term is used to refer to a more comprehensive team evaluation, more like a team-based UX inspection.
By the time you have constructed a low-fidelity prototype, a paper prototype, for example, UX inspection methods and lightweight quasi-empirical testing are appropriate. Inspection methods are, in practice, perhaps the most used and most useful UX evaluation methods for this stage.
Sometimes evaluators overlook the value of critical review or UX inspection by a UX expert. Unlike most participants in lab-based testing, an expert will be broadly knowledgeable in the area of interaction design guidelines and will have extensive experience in evaluating a wide variety of interaction styles. The most popular UX inspection method (Chapter 13) is the heuristic evaluation (HE) method.

Table 18-1
Appropriateness of various formative UX evaluation approaches to each stage of progress within the project

• Design scenarios (Chapter 6), storyboards (Chapter 8), and detailed design (Chapter 9): Design walkthroughs (Chapter 13); Local evaluation (Chapter 13)
• Low-fidelity prototypes (Chapter 11): UX inspection (Chapter 13); Quasi-empirical UX testing (Chapter 13)
• High-fidelity prototypes (Chapter 11): RITE (Chapter 13)
• Programmed prototype (Chapter 11) or operational product: Rigorous (e.g., lab-based) UX testing (Chapters 12 and 14 through 17); RITE (Chapter 13); Alpha, beta testing (Chapter 13)
• Post-deployment: User surveys/questionnaires (Chapter 12); Remote UX evaluation (Chapter 13); Automatic evaluation (Chapter 13)
High-fidelity prototypes, including programmed prototypes and operational products, are very complete design representations that merit complete or rigorous evaluation, as by lab-based testing. The RITE method is a good choice here, too, because it is an empirical method that uses participants but is rapid. Alpha and beta testing with selected users and/or customers are appropriate evaluation methods for pre-release versions of the near-final product.
Finally, you can continue to evaluate a system or product even after deployment in the field via remote surveys and/or questionnaires and remote UX evaluation, a method that has the advantage of operating within the context of real-world usage.
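For teams that like checklists, the mapping in Table 18-1 can be kept as a simple lookup, for example in a project wiki or a small script like the sketch below; the wording of the stages and method names follows the table (chapter references omitted), and the structure itself is just a convenience, not part of the method.

# Sketch: Table 18-1 restated as a lookup of candidate formative evaluation
# methods by stage of progress (chapter references omitted).
METHODS_BY_STAGE = {
    "design scenarios, storyboards, detailed design":
        ["design walkthroughs", "local evaluation"],
    "low-fidelity prototypes":
        ["UX inspection", "quasi-empirical UX testing"],
    "high-fidelity prototypes":
        ["RITE"],
    "programmed prototype or operational product":
        ["rigorous (e.g., lab-based) UX testing", "RITE", "alpha/beta testing"],
    "post-deployment":
        ["user surveys/questionnaires", "remote UX evaluation", "automatic evaluation"],
}

def candidate_methods(stage: str) -> list[str]:
    """Return candidate formative evaluation methods for a stage of progress."""
    return METHODS_BY_STAGE.get(stage, [])

print(candidate_methods("low-fidelity prototypes"))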
18.2.3 Synthesize Your Own Hybrid Method
As you gain experience, you will synthesize hybrid approaches as you
go, adapting, transforming, and combining methods to suit each unique situation. For example, in a recent UX session we observed some phenomena we thought might be indicative of a UX problem but which did not lead
to critical incidents occurring with users in the lab. So after the UX lab session, we added our own analytic component to investigate these issues further.
In this particular example, the system was strongly organized and presented to the user by function. Users found it awkward to map the functions into task sequences; they had to bounce all over the interface to access parts that were closely related in task flow. It took some intense team analysis to get a handle on this and to come up with some specialized local prototypes to test further just this one issue and evaluate some alternative design solutions.
This is not the kind of UX problem that will just pop out of a critical incident with a user in the lab as does, for example, a label wording commented on by a user. Rather, it is a deeper UX problem that might have caused users some discomfort, but they may have been unable to articulate the cause. Sometimes a problem requires the attention of an expert evaluator or interaction designer who can apply the necessary abstractions. If we had not been willing to do some additional analysis, not part of the original plan, we might have missed this important opportunity to improve the quality of user experience in our design.
18.3 FOCUS ON THE ESSENTIALS
18.3.1 Evaluate Design Concepts before Details
When practitioners do UX testing, they often think first of low-level UX, the usability of details such as button placement and label wording, and most lab-based usability evaluation ends up being aimed at this level. As we said in earlier chapters, however, the fit of the design to support work practice is fundamental. Find out how users do the work and design a system to support it. What is the point of using low-level usability evaluation to hone the details of what is basically a poor, or at least the wrong, design while blocking your ability to see broader and more useful design ideas?
Our friend and colleague Whitney Quesenbery (2009) has said:
If there is a single looming problem in UX these days, it is that usability analysts too often get caught up in enumerating detailed observations, and spend a good deal less time than they should in thinking carefully about the underlying patterns and causes of the problems. This turns UX into a sort of QA checklist, rather than letting us usability analysts be the analysis partners we should be to the designers. Some of this, of course, is a legacy of having UX evaluation done too late, some is because there is often so much to do in fixing "obvious" problems, but some is because we have not taken seriously our role in supplying insights into human behavior.
In early stages when the concepts are young, formative evaluation should focus on high-level concepts, such as support of the work with the right work roles, workflows, the right general form of presenting the design to the user, mission-critical design features (e.g., potential "hanging chads" in the design), and safety-related features.
18.3.2 User Experience In Situ vs. User Reflections
The more directly the evaluation is related to actual usage, the more precise its indicators of UX problems will be. It is difficult to beat empirical evaluation with a "think-aloud technique" for accessing immediate and precise UX data when evaluating the design of an artifact in the context of usage experience.
Indirect and subjective evaluation such as obtained by a post-performance questionnaire can be less expensive and, if focused carefully on the important issues, can be effective. Alpha and beta testing (Chapter 13) are even less direct to the usage experience, severely limiting their effectiveness as formative evaluation methods.
At least when it comes to detailed formative evaluation, Carter (2007) nicely reminds us to think in terms of "inquiry within experience," evaluating the design of an artifact in the context of usage experience. To Carter, a survey or questionnaire is an abstraction that removes us from the user and the usage events, sacrificing a close connection to user experience in real time while the design is being used.
User surveys or questionnaires, even when administered immediately at the end of a UX testing session, are retrospective. A questionnaire produces only subjective data and, as Elgin (1995) states, "Subjective feedback is generally harder to interpret than objective feedback in a known setting ..." More
importantly, survey or questionnaire data cannot be the immediate and precise data about task performance that are essential for capturing the perishable details necessary to formative UX evaluation.
For evaluating design details, it is about getting into the user's head with "thinking-aloud" techniques. But the further the time of UX data capture from the time of occurrence, the more abstraction and need for retrospective recall, thereby the more loss of details. We looked at techniques for capturing indicators of UX problems and for capturing indicators of success with UX as observed directly within the real-time occurrence of usage experience (Chapter 12).
18.3.3 Evaluating Emotional Impact and Phenomenological Aspects
Do not just focus on usability or usefulness in your evaluations. Remember that one of your most important evaluation goals can be to give emotional impact and phenomenological aspects attention, too. Specific evaluation methods
for emotional impact were detailed in Chapter 12.
18.4 PARTING THOUGHTS: BE FLEXIBLE AND AVOID DOGMA DURING UX EVALUATION
18.4.1 Keep the Flexibility to Change Your Mind
Your organization is paying for evaluation, and it is your responsibility to get the most out of it. The key is flexibility, especially the flexibility to abandon some goals in favor of others in conflict situations. Your evaluation goals can be predetermined, but you can also come up with new or different goals as a result of what happens during evaluation.
As an illustration of staying flexible within goal-directed UX evaluation, consider an evaluation session in which you are doing the typical kind of UX evaluation and something goes wrong for the user. If you stop and ask the user questions about what went wrong and why, you will lose out on gathering quantitative performance data, one of your goals. However, because that data now will not be very useful, and understanding UX problems and their causes is your most important goal, you should stop and talk.
In most formative evaluation cases, when there is a conflict between capturing quantitative (e.g., task timing) and qualitative data (e.g., usability problem identification), qualitative data should take precedence. This is because when user performance breaks down, performance data are no longer useful, but the incident now becomes an opportunity to find the reason for the breakdown.
As part of your pre-evaluation preparation, you should prioritize your goals to prepare for conflicting goal situations. As Carter (2007) says, you should retain the freedom for interruption and intervention (even when the original goal was to collect quantitative data).
18.4.2 Do Not Let UX Evaluation Dogma Supplant Experience and Expertise
Give people a process they can follow and sometimes they hang on to it until it becomes a religion. We encourage you, instead, to do your own critical thinking. You still have to use your head and not just follow a "process."
Be ready to adapt and change directions and techniques. Be ready to abandon empirical testing for thoughtful expert analytic evaluation if the situation so demands.
According to Greenberg and Buxton (2008), "evaluation can be ineffective and even harmful if naively done 'by rule' rather than 'by thought.'" Instead of following a plan unquestioningly, you must make choices as you go that will help you reach your most important goals. It is your job to think about and judge what is happening and where things are going within the project.
Sauro (2004, p. 31) warns against "one-size-fits-all usability pronouncements" made by rote, unencumbered by the thought process behind them.
The dogma of our usual doctrine of usability testing reveres objective observation of users doing tasks. But sometimes we can evaluate a design subjectively, applying our own personal knowledge and expertise as a UX professional. As Greenberg and Buxton (2008, p. 114) put it, "Our factual methods do not respect the subjective: they do not provide room for the experience of the advocate, much less their arguments or reflections or intuitions about a design."
Greenberg and Buxton quote a wonderful passage from the literature of architecture about the "experienced designer-as-assessor." The quote (from Snodgrass and Coyne, 2006, p. 123) defends design evaluation by an architect designer, illustrating that being the designer does not disqualify someone from also being an evaluator.
In fact, such a designer has acquired a rich understanding of architectural design principles, processes, and evaluation criteria. Snodgrass and Coyne make the case that an evaluation done subjectively by an expert need not produce wild and uncontrolled results. The work of expert assessors is built on a common foundation of knowledge, fundamentals, and conventions. The striking similarity to our situation should not be surprising-it is all about design.
18.5 CONNECTING BACK TO THE LIFECYCLE
Congratulations! You have just completed one complete iteration through the Wheel interaction design and evaluation lifecycle template. This ends the process part of the book. You have only to implement your chosen design solutions and realize the benefits of improved usability and user experience, connecting each result back into the UX lifecycle and cycling through design, prototyping, and evaluation again.
When you connect each UX problem back to the lifecycle for fixing and iteration, where do you connect? You have to use your engineering judgment about what each problem needs in order to get fixed.
For most simple UX problems, much of the work for fixing the problems has been done when you worked out design solutions for each of the observed problems. However, a UX problem that seems to involve a lack of understanding of work practice and/or requirements may need to connect back to contextual inquiry and contextual analysis rather than just going back to design.
UX Methods for Agile Development
19.1 INTRODUCTION
Just as our use of the term UX is convenient shorthand for the entire broad concept of designing for user experience, we use the term "SE" to refer to the entire broad concept that embraces the terms software engineering, software, software development, and the software engineering domain. This chapter is about an important way those two domains can come together in an efficient project development environment.
We believe that the rigorous UX process (Chapters 3 through 12 and 14 through 17) is the most effective path to ensuring a quality user experience with systems with complex work domains and complex interaction. However, because the fully rigorous UX process is also the most expensive and time-consuming, it cannot always be applied. Nor is it always appropriate, for example, for systems and products at the other end of the system complexity space in Figure 2-7.
Less-than-perfect development environments, short schedules, and limited budgets demand effective ways to adapt UX process methods to the turmoil and pressure of the real professional world. Anxious customers, especially for systems with simple domains such as commercial products, may demand early deliverables, including previews of prototypes about which they can give
feedback. In such cases practitioners can use abridged versions of the fully rigorous lifecycle process, skipping some process activities altogether and using rapid techniques for others.
Alternatively, the software development side might require an agile development environment. Agile SE approaches, now well known and popularly used, are incremental, iterative, and test-driven means of delivering pieces of useful working software to customers frequently, for example, every two weeks. However, agile SE approaches do not account for UX. Because traditional UX processes do not fit well within a project environment using agile SE methods, the UX side must find ways to adjust their methods to fit SE constraints.
Therefore, the entire system development team needs an overall approach that includes UX while retaining the basics of the SE approach. In this chapter we present a variation of our UX process methods that integrates well with existing agile SE processes by accounting for the constraints those processes impose.
We begin by describing the essence of the agile SE approach and then identify what is needed on the UX side so that the two processes fit together. Finally, we describe an approach that brings the UX lifecycle process and agile SE together, retaining the essentials of each but requiring some adjustments on both sides.
19.2 BASICS OF AGILE SE METHODS
Much of this section is based on Beck (2000), one of the most authoritative sources of information on agile SE development methods as embodied in the approach called eXtreme Programming (XP).1 We have taken words from Beck and other authors and tried to blend them into a summary of the practice.
Accurate representations are credited to these authors while errors in representation are our fault.
19.2.1 Characteristics of Agile SE Methods
Agile SE development methods begin coding very early. Agile SE has a shorter, almost nonexistent, requirements engineering phase and far less documentation than that of traditional software engineering. As typified in XP, agile SE code implementation occurs in small increments and iterations.
Small releases are delivered to the customer after each short iteration, or development cycle. In most cases, these small releases, although limited in
1 There are other "brands" of approaches to agile SE methods beyond XP, including Scrum (Rising & Janoff, 2000), but for convenience we focus on XP.
functionality, are intended to be working versions of the whole system, which run by themselves. Nonetheless, each release is supposed to deliver some useful capability for the customer.
In simplest terms, agile SE development methods "describe the problem simply in terms of small, distinct pieces, then implement these pieces in successive iterations" (Constantine, 2002, p. 3). Each piece is tested until it works and is then integrated into the rest. Next the whole is tested in what Constantine calls "regression testing" until it all works. As a result, the next iteration
always starts with something that works.
To clarify the concept of agile software methods, a group met at a workshop in Snowbird, Utah in February 2001 and worked out an "agile manifesto" (Beck, 2000). From their stated principles behind this manifesto, goals for agile software development emerged:
■ Satisfy the customer by giving them early and continuous deliverables that produce valuable and working software.
■ Recognize that changing requirements are the norm in any software development effort.
■ Understand that time and budget constraints must be managed.
Practitioners of agile SE methods value (Beck, 2000):
■ Individuals and interactions over processes and tools
■ Working software over comprehensive documentation
■ Customer collaboration over contract negotiation
■ Responding to change over following a plan
The agile software development methods are further characterized by the need for communication, especially continuous communication with the customer. Informal communication is strongly preferred over formal. Close communication is emphasized to the point that they have an onsite customer as part of the team, giving feedback continuously.
A main principle of agile SE methods is to avoid Big Design UpFront (BDUF).
This means the approach generally eschews upfront ethnographic and field studies and extensive requirements engineering. The idea is to get code written as soon as possible and resolve problems by reacting to customer feedback later.
And because change is happening everywhere, SE practitioners verify that they are writing the code correctly by the practice of pair programming. Code is written by two programmers working together and sharing one computer and one screen, that is, always having a colleague watching over the programmer's shoulder.
Of course, pair programming is not new with agile methods. Even outside agile SE methods and before they existed, pair programming was a proven technique with a solid track record (Constantine, 2002). Another way they verify the code being written is via regular and continuous testing against an inventory of test cases.
Figure 19-1
Comparison of scope of development activities across methodologies, taken with permission from Beck (1999, Figure 1).
Figure 19-2
Abstraction of an agile SE release iteration.
19.2.2 Lifecycle Aspects
If this process were to be represented by a lifecycle diagram, it would not be a waterfall or even an iteration of stages, but a set of overlapping micro-development activities. In the waterfall approach, developers finish the entire requirements analysis before starting design and the entire design before starting implementation.
But in agile approaches developers do just enough-a micro-level of each activity-to support one small feature request; see Figure 19-1, which illustrates XP as an example agile method. In the middle of these extremes are approaches where these activities are performed in larger scope units.
In these middle-of-the-road approaches, lifecycle activities are applied at the level of overall system components or subsystems. In agile methods, by contrast, they are applied at the level of individual features within those components.
For example, building an e-commerce Website in the waterfall approach would require listing all requirements that must be supported in the Website before starting a top-down design. In an agile approach, the same Website would be built as a series of smaller features, such as a shopping cart or a checkout module.
19.2.3 Planning in Agile SE Methods
In our discussion of how an agile SE method works, we are roughly following XP as a guide. As shown in Figure 19-2, each iteration consists of two parts: planning and a sprint to implement and test the code for one release.
Customer stories
The planning part of each iteration in Figure 19-2 yields a set of customer-written stories, prioritized by cost to implement. A customer story, a key concept in the process, has a role a bit like that of a use case, a scenario, or a requirement. Written on a story (index) card, it is a description of a customer-requested feature: a narrative about how the system is supposed to solve a problem, representing a chunk of functionality that is coherent in some way to the customer.
Story-based planning
Expanding the "planning" box of Figure 19-2, we get the details of how customer stories are used in planning, as shown in Figure 19-3.
As shown in Figure 19-3, developers start the planning process by sitting down with onsite customer representatives. They ask customer representatives to think about the most useful chunks of functionality that can add business or enterprise value. The customer writes stories about the need for these pieces of functionality. This is the primary way that the developers understand users and their needs, indirectly through the customer representatives.
Developers assess the stories and estimate the effort required to implement (program) a solution for each, writing the estimate on the story card. Typically, in XP, each story gets a 1-, 2-, or 3-week estimate in "ideal development time."
The customer sorts and prioritizes the story cards by choosing a small set for which the cost estimates are within a predetermined budget and which represent features they want to include in a "release." Prioritization might result in lists of stories or requirements labeled as "do first," "desired-do, if time," and "deferred-consider next time." Developers break down the stories into development tasks, each written on a task (for the developers
to do) card.
The output of the planning box, which goes to the upcoming implementation sprint, is a set of customer-written stories, prioritized by cost to implement.
Controlling scope
Customer stories are the local currency in what Beck (2000, p. 54) calls the "planning game" through which the customer and the developers negotiate the scope of each release. At the beginning there is a time and effort "budget" of the person-hours or level of effort available for implementing all the stories, usually per release.
Figure 19-3
Customer stories as the basis of planning.
As the customer prioritizes story cards, the total of the work estimates is kept and, when it reaches the budget limit, the developers' "dance card" is full. Later, if the customer wants to "cut in" with another story, they have to decide which existing customer story with an equal or greater value must be removed to make room for the new one. So no one, not even the boss, can just add more features.
This approach gives the customer control of which stories will be implemented but affords developers a tool to battle scope or feature creep. Developer estimates of effort could be way off, probably in most cases underestimating the effort necessary, but at least the budget lets them draw a line.
With experience, developers get pretty good at this estimation given a particular technology platform and application domain.
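To make the planning-game bookkeeping concrete, here is a minimal Python sketch of the story-card selection just described. It is our own illustration, not part of XP or any particular agile tool, and the feature names and numbers in it are hypothetical: each story card carries a developer estimate in ideal weeks and a customer-assigned value, and stories are accepted in value order until the release budget is spent.

```python
from dataclasses import dataclass

@dataclass
class StoryCard:
    title: str           # customer-requested feature, in the customer's words
    estimate_weeks: int  # developer estimate in "ideal development time" (1, 2, or 3)
    value: int           # customer-assigned business value; higher is more important

def plan_release(cards, budget_weeks):
    """Fill the release 'dance card' in value order until the budget is spent."""
    do_first, deferred = [], []
    remaining = budget_weeks
    for card in sorted(cards, key=lambda c: c.value, reverse=True):
        if card.estimate_weeks <= remaining:
            do_first.append(card)
            remaining -= card.estimate_weeks
        else:
            deferred.append(card)  # can "cut in" later only by removing another story
    return do_first, deferred

if __name__ == "__main__":
    cards = [
        StoryCard("Shopping cart", 3, value=9),
        StoryCard("Checkout", 2, value=8),
        StoryCard("Wish list", 2, value=4),
        StoryCard("Gift wrap option", 1, value=2),
    ]
    chosen, deferred = plan_release(cards, budget_weeks=6)
    print("Do first:", [c.title for c in chosen])
    print("Deferred:", [c.title for c in deferred])
```

With cards like these, adding a new story once the budget is full forces exactly the trade described above: a story of equal or greater value has to come off the dance card to make room.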
19.2.4 Sprints in Agile SE Methods
Expanding the "sprint" box of Figure 19-2, as shown in Figure 19-4, each agile SE sprint consists of activities that are described in the following sections.
Acceptance test creation
The customer writes the functional acceptance tests. There is no process for this, so it can be kind of fuzzy, but it does put the customer in control of acceptance of the eventual code. With experience, customers get good at this.
Unit code test creation
The team divides the work by assigning customer stories to be coded in that sprint. A programmer picks a customer story card and finds a programming partner. Before any coding, the pair together writes unit tests that will verify that the required functionality is present in the code yet to be written.
Figure 19-4
An agile SE sprint.
Implementation coding
The programming pairs work together to write the code for modules that support the functionality of the customer story. As they work, the partners do on-the-fly design (the agile SE literature says almost nothing about design).
The programmers do not worry about the higher level architecture; the system architecture supposedly evolves with each new slice of functionality that is added to the overall system. The programming pair integrates this code into the latest version.
Code testing
Next, the programming pair runs the unit code tests designed for the modules just implemented. In addition, they run all code tests again on all modules coded so far until all tests are passed. By testing the new functionality with not only tests written for this functionality, but with all tests written for previous pieces of functionality, SE developers make sure that the realization of this functionality in code is correct and that it does not break any of the previously implemented modules. This allows developers to make code modifications based on changing requirements, while ensuring that all parts of the code continue to function properly.
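As a small, hypothetical illustration of this test-first and regression-testing discipline (our own example, not taken from any particular agile project), the following Python sketch uses the standard unittest module. Tests for a new "shopping cart" story sit alongside the tests written for an earlier story, and the whole suite is run after the new code is integrated.

```python
import unittest

# Code for an earlier story, already integrated and passing its tests.
def item_price(catalog, sku):
    return catalog[sku]

# Code for the new story ("shopping cart"), written after its tests.
def cart_total(catalog, skus):
    return sum(item_price(catalog, sku) for sku in skus)

class EarlierStoryTests(unittest.TestCase):
    def test_item_price(self):
        self.assertEqual(item_price({"A1": 5}, "A1"), 5)

class NewStoryTests(unittest.TestCase):
    def test_cart_total(self):
        catalog = {"A1": 5, "B2": 3}
        self.assertEqual(cart_total(catalog, ["A1", "B2", "A1"]), 13)

    def test_empty_cart(self):
        self.assertEqual(cart_total({}, []), 0)

if __name__ == "__main__":
    # Running the full suite (old and new tests together) is the regression check:
    # the new code must pass its own tests without breaking earlier modules.
    unittest.main()
```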
Acceptance testing and deployment
Developers submit this potentially shippable product functionality to the customer for acceptance review. Upon acceptance, the team deploys this small iterative "release" to the customer.
19.3 DRAWBACKS OF AGILE SE METHODS FROM THE UX PERSPECTIVE
From the SE perspective, much about agile SE methods is, of course, positive. These methods make SE practitioners feel productive and in control because they, and not some overarching design, drive the process. These methods are less expensive, faster, and lighter weight, with early deliverables. The pair-programming aspect also seems to produce high-reliability code with fewer bugs. Nonetheless, agile SE methods are programming methods, developed by programmers for programmers, and they pose some drawbacks from the UX perspective.
Agile SE methods, being driven predominantly by coders, are optimized for code, judged by quality of code, and have a strong bias toward code
concerns. There is no definition or consideration of usability or user experience (Constantine, 2002). Users, user activities, and the user interface are not part of the mix; the user interface becomes whatever the code hackers produce by chance and then must be "fixed" based on customer feedback.
In addition, there is no upfront analysis to glean general concepts of the system and associated work practice. The one customer representative on the team is not even required to be a real user and cannot represent all viewpoints, needs and requirements, usage issues, or usage context. There may be no
real user data at all upfront, and coding will end up being "based only on assumptions about user needs" (Memmel, Gundelsweiler, & Reiterer, 2007, p. 169). There is no identification of user tasks and no process for identifying tasks.
Beyer, Holtzblatt, and Baker (2004) echo this criticism of using a customer representative as the only application domain expert. As they point out, under their "Axiom 2: Make the user the expert," many customer representatives are not also users and, therefore, cannot necessarily speak for the work practice of others. Instead, they recommend devoting an iteration of contextual analysis with real users to requirements definition. They say they have done quick contextual design with five to eight users through early design solutions in one to two weeks.
Beyond these specific drawbacks, the agile SE method has no room for ideation in design. And it can be difficult to scale up the process to work on very large systems, where the number of customer stories grows without bound.
Similarly, it is difficult to scale up to larger development groups; the time
and effort to communicate closely with everyone eventually become prohibitive and close coordination is lost.
19.4 WHAT IS NEEDED ON THE UX SIDE
In some ways, the UX process lifecycle is a good candidate to fit with agile software methods because it is already iterative. But there is a big difference.
The traditional UX lifecycle is built on having a complete understanding of users and their needs long before a single line of software code is ever written. In this section we discuss the considerations necessary to adjust UX methods to adapt to agile SE approaches (Memmel, Gundelsweiler, & Reiterer, 2007).
To work in the UX domain, an agile method must retain some early analysis activities devoted to understanding user work activities and work context and
gleaning general concepts of the system. At the same time, to be compatible with the agile SE side of development, UX methods must:
■ be lightweight
■ emphasize team collaboration and require co-location
■ include effective customer and user representatives
■ adjust UX design and evaluation to be compatible with SE sprint-based incremental releases by switching focus from top-down holistic design to bottom-up features design
■ include ways to control scope
Thoughts on Integrating UX into Agile Projects
Dr. Deborah J. Mayhew, Consultant, Deborah J. Mayhew & Associates,1 and CEO, The Online User eXperience Institute2
I have been working in software development since 1975 and have watched a number of formal, commercial
software engineering methodologies (including one of the first in the industry) emerge, evolve, and be abandoned for the next new idea. The latest one is agile, which has been around for almost 10 years now. When it first emerged
I had been working in the UX field for about 20 years. I had written my book The Usability Engineering Lifecycle (1999b,
Morgan Kaufmann Publishers), which describes a top-down, iterative approach to software user interface design set in the context of the prevalent development methodologies at that time: object-oriented software engineering and iterative development. The Usability Engineering Lifecycle (UEL) fit in quite nicely with those software engineering methodologies.
Then the agile methodology began to get popular. Initially, I learned about this new development methodology by working with a client who was building project management software tools to support agile project managers.
I was trying to help this client design a usable interface to their software tools, and they were developing their tools using the agile methodology, so I had to try to adapt my approach to usability engineering into that methodology.
I struggled with this because it seemed to me that there was an inherent conflict: the agile methodology focuses on very small modules of functionality and code at a time, doing requirements analysis, design and development start to finish on each one, often with different development teams assigned to different modules being developed on overlapping timelines.
In contrast, the successful design of a user interface requires a top-down approach because all the functionality must be presented through a consistent overall architecture. A piecemeal approach to interface design, in which many different developers design separate modules of code, had always been a big stumbling block to good user interface design in the past, and it seemed that the agile methodology would perpetuate that problem even while solving others.
1 http://drdeb.vineyard.net 2 http://www.ouxinstitute.com
So, I was skeptical that good usability engineering practices could be applied successfully in an agile development environment.
I have only recently had another opportunity to work with a development team following some version of the agile methodology and have evolved with them what I think is a successful approach to overcoming the potential conflicts between agile and the UEL approach to designing for an optimal user experience.
My client had already adapted the agile methodology in a way consistent with Nielsen's findings about best practices for integrating usability engineering into agile development projects (Jakob Nielsen's Alertbox, November 4, 2009, http://www.useit.com/alertbox/agile-user-experience.html). In Nielsen's words, the key things are to
Separate design and development, and have the user interface team progress one step ahead of the implementation team ...
and to
Maintain a coherent vision of the user interface architecture. Create the initial vision during a 'sprint zero' period-before any implementation has started-and maintain it through annual (or semi-annual) design vision sprints ...
I believe that these approaches are necessary but not sufficient. They do not solve the problem of separate agile teams creating different modules of code, with no one overseeing the user interface across the whole system. Each team can do visioning up front, and design before development in a "sprint zero" phase, but that does not ensure consistency in the user interface across code modules developed by different teams.
My client had embarked on a very large, multiyear project. They were breaking down what in the end would be one very large functionally rich system into small chunks of functionality and assigning these chunks to different agile teams, each with their own assigned user interface designers. These teams were working mostly independently on overlapping project schedules. They had an approach similar to Nielsen's ideas of a planning phase at the beginning of each project for high-level design, and a "sprint zero" in which detailed user interface design could start and stay a step ahead of coding. There was, however, only haphazard communication and coordination across teams regarding user interface design.
To be a little more concrete, an analogous project to the one I actually worked on would be the development of a system to support the customer support representatives of a credit card company. The functional chunks cast as separate projects in this analogy would be individual user tasks, such as processing requests for monthly payments, balance transfers from other credit cards, adding new credit cards to an account holder's account, closing out an account, contesting a fraudulent charge, and the like. A single customer service representative might be handling all these types of tasks, but different agile teams were designing and developing them.
When I got involved with my client's overall effort, separate projects were at many different stages in their agile processes, some still in the early planning and designing stages, others well along in their sprint process. My role was to provide feedback on design ideas generated by the project interaction designers on the projects, some of which were in early planning, some in sprint zero, and some halfway or more through their sprints.
I started performing heuristic evaluations for individual agile teams, one at a time. As I proceeded from project to project, I naturally started seeing two things: inconsistencies in interface design across projects and less than optimal designs within projects. In response to these observations, I started doing two things.
First, with each heuristic evaluation, I documented some sample redesign ideas in wireframe form to better communicate the issues I was identifying. As I went from one project to the next, I consistently applied the same redesign approaches when I encountered analogous design situations. So, for example, the first time I discovered a need for a widget to expand and contract details, I would document a particular design for that purpose. Then when I encountered the same need on another project, I would be sure to recommend the exact same design to address that need.
Second, I started capturing in list form those design situations that were coming up repetitively across pages and projects (the need to expand and contract information details would be an example of something in that list). This list became the foundation of what would eventually become a set of user interface design standards for the whole system. Those standards might not exactly reflect my design suggestions, but at least everything that a standard could be designed for would be identified and documented in one place.
Sometimes I would design for a situation when I first encountered it, apply it in later situations, and then even later come across the same situation on a new page or project in which my design solution just did not work well. In those cases, I would revisit the consistent design I had generated to date, redesign it, and then reapply the new standard to all analogous situations encountered across all projects I had evaluated to date, as well as going forward.
In this way, a common set of standards was continuously developed and evolved as each new agile project launched and proceeded. Rework was required when the need for a different standard was discovered on new but analogous functionality, but at least there was a single mind (mine) overseeing all the related agile projects so that opportunities for consistency were discovered and ultimately attended to. This just does not happen when different designers are responsible for different modules of functionality and no one is keeping track of the big picture.
I think proceeding in this way to design the overall user interface of a system that is divided up into many agile projects is analogous to designing the system architecture in this methodology. Modules of code designed and developed by separate and relatively independent teams have a similar risk of ending up as a mishmash of inefficient and hard-to-maintain code. Someone needs to oversee the evolution of the final system architecture, and rework may be required to go back and recode modules that did not adhere to the final system architecture model when they were developed initially. If we are willing to do the rework, this is a reasonable way to address both system architecture and user interface architecture in a methodology that certainly has other benefits but carries the risk of resulting in systems with no underlying models that support both technical and human needs.
19.4.1 The UX Component Must Be Lightweight
According to Memmel, Gundelsweiler, and Reiterer (2007), many of the rigorous UX processes out there, such as Mayhew (1999b), Rogers, Sharp, and Preece (2011), and the full Wheel process described in this book, are considered heavyweight processes, too cumbersome for the unstoppable trend of shorter time-to-market and shorter development lifecycles. As a result, developers are turning to lighter-weight development processes.
However, the term "lightweight process" can be thought of as a euphemism for "cutting corners." As Constantine (2001) puts it, "shortcutting a proven process means omitting or short-changing some productive activities, and the piper will be paid, if not now, then later."
Sometimes, however, we have no choice. The project parameters demand fast turnaround and the rigorous process simply will not do. Therefore, we seek something in the middle ground, a lighter-weight process that, although compromising quality somewhat, can still meet schedule demands and allow us to deliver a system or product with value for the customer.
Traveling light means communicating rather than documenting. In general, heavyweight SE processes require detailed, up-to-date documentation and models, whereas lightweight SE processes rely on index cards and hand-drawn abstract models. The artifacts maintained should be few, simple, and valuable (Beck, 2000, p. 42).
19.4.2 The UX Component Requires Collaboration and Co-Location with the SE Team
Traditional UX practice often implied handing off a refined interaction design as a formal hi-fi prototype or a complete wireframe deck. We did our contextual inquiry and analysis independently of their requirements gathering. Now this "fire-walling" of the UX and SE teams will not work. We must work together with the same customer representatives and users.
Our deliverables will now be less formal and somewhat incomplete because the details will be handled on a social channel, meaning we will communicate directly, person to person. Each team, UX and SE, must have access and visibility into the other team's progress, challenges, and bottlenecks so that they can plan to maintain synchronization.
To achieve this intimate communication, the entire project team has to be co-located. You all have to work in the same room, a working arena plus
walls for whiteboards, posters, and diagrams. Everyone has to be continuously present as part of the team-readily available and knowledgeable.
You cannot rely on just email or a call on the phone. When you need to talk with someone else on the team, that person must be sitting with you as you work. This imperative for co-location in the agile approach can be a show-stopper. If, for any reason, your organization cannot afford to keep the entire project team in one location, it will preclude the agile approach.
19.4.3 Effective Customer and User Representatives Are Essential
An important SE requirement is continuous access to one or more co-located customer representatives, but on the UX side we will also need access to real users. Many "methods" call for including customers and users. So when you say that you need a customer representative in your project, others in your organization and in the customer organization may not understand how seriously this role is taken in agile methods and how integral it is to project success. Unless you have articulated criteria for the customer representative role, you are likely to get someone who happens to be available regardless of their real connection to the system.
Your customer and user representatives must truly represent the organization paying for development and must care about the project as a real stakeholder in the outcome. These representatives must have a good knowledge of all work roles, corresponding user classes, workflows, and the work domain. And perhaps the most important requirement is that these representatives must have the authority to make decisions about project scope and enough knowledge about what the organization really needs.
19.4.4 A Paradigm Shift: Depth-First, Vertical Slicing
Almost everything in both the process and the deliverables depends on whether the project approach is breadth first or depth first. Acceptance of the outcome will depend on how well the customer understands these choices and agrees to the approach you choose; it is up to you to set expectations accordingly.
The traditional UX process is breadth first, looking at the whole system broadly from the beginning-methodically and systematically building horizontal slices and integrating and growing them vertically. The resulting product is an integrated system design built top-down or inside-out in "horizontal" layers. This approach involves building a whole elephant
from the inside out, laying down a skeleton, adding inner organs, fleshing it out with muscles to hold it all together, and wrapping it up with a skin.
However, as we said before, this approach works against early deliverables to the customer. There is often nothing to show customers early on. There just is not anything that even looks like an elephant until halfway or even later through the project.
In traditional development methods, associated deliverables will begin with documentation of development work products and descriptions of design-informing models, such as personas, user classes, or task descriptions.
Development does not get to design-representing deliverables such as screen sketches, storyboards, and low-fidelity prototypes until later in the process.
So the customer has to be patient, but "patient" does not describe most customers we have met so far. Nor can you blame them; they do not want to be paying the bills for a long time without seeing any results.
Alternatively, and in almost complete contrast, agile methods are depth first, taking a narrow product scope but starting with more depth, building vertical slices and integrating and growing them horizontally. This is the approach you need when you have limited resources and have short-term demand for design-related deliverables, such as a prototype. The narrow product scope means addressing only a few selected features supporting related user work activities and system functions, but developing them in some depth.
This is like building an elephant by gluing together deep, but narrow, vertical slices. It might be for only a slim section of the backbone, maybe a part of a kidney or a slice of the liver, and a little bit of skin. But you are not going to see anything of the face, the feet, or the tail, for example. In other words, the customer might see some screen sketches and a low-fidelity prototype a lot earlier but they will be limited to a narrow set of features. This agile approach has a benefit in today's development market in that you can get at least something as a running deliverable much faster.
As more and more vertical slices become available, you put them together to construct the whole system. If slices here and there do not quite line up in this integration step, you must adjust them to fit. As you add each new slice, adjust the new slice and/or the rest of the elephant, as needed.
19.4.5 Controlling Scope Is a Necessity
Earlier in this chapter, we explained that agile SE customer stories are the basis for planning, through which the customer and the developers negotiate the scope of each release. At the beginning there is a time and effort "budget" of only so many person-hours or only a certain level of effort available for implementing all the stories.
Exactly the same approach to controlling scope works when UX and SE are integrated, still using the cost to implement the software as the criterion for setting scope boundaries. However, UX plays an involved role in negotiating with customers based on early conceptual design and user experience needs. More about this soon.
19.5 PROBLEMS TO ANTICIPATE
In a special-interest-group workshop at CHI 2009 (Miller & Sy, 2009), a group of UX practitioners met to share their experiences in trying to incorporate a user-centered design approach into the agile SE process. Among the difficulties experienced by these practitioners in their own environments were:
■ sprints too short, with not enough time for customer contact, design, and evaluation
■ inadequate opportunities for user feedback, and the feedback they did get was ignored
■ customer representatives who were weak, uncommitted, or not co-located
■ no shared vision of a broader conceptual design because the focus is on details in a bottom-up approach
■ a risk of piecemeal results
Regarding the last bullet, building a system a little piece at a time is not without risks. Nielsen (2008) claims that agile methods can end up being a terrible way to do usability engineering. His reasons centered mostly on the fact that taking one piece at a time tended to destroy the whole picture of user experience. If requirements come in piecemeal, it is harder to see the big picture or the conceptual design. He claims that a piecemeal process hinders consistency and is a barrier to an integrated design, leading to a fragmented user experience.
Beyer, Holtzblatt, and Baker (2004) also believe that it is difficult to design small chunks of the interaction design without first knowing the basic interaction design architecture-how the system is structured to support user tasks and how the system functions are organized. In contextual design, interaction architecture is established with storyboards and what they call the "user environment design," which they say is just what you need for effective user stories.
In the end, it is up to skilled and experienced UX practitioners to keep the big picture in mind and do as much as possible along the way to maintain coherence in the overall design.
19.6 A SYNTHESIZED APPROACH TO INTEGRATING UX
Because traditional agile SE methods do not consider the user interface, usability, and user experience, there is a need to incorporate some of the user- centered design techniques of UX into the overall system development process.
Most of the related literature is about either adjusting "discount" UX or user-centered design methods to somehow keep pace with existing agile SE
methods or trying to do just selected parts of user-centered design processes in the presence of an essentially inflexible agile SE method.
While it is possible that XP, for example, and some abbreviated user-centered design techniques can coexist and work together, in these add-on approaches the two parts are not really combined (McInerney & Maurer, 2005; Patton, 2002, 2008). This creates a coping scenario for the UX side, as UX practitioners attempt to live with the constraints while trying to ply their own processes within an overall development environment driven solely by the agile SE method.
The traditional user-centered design process, even in rapid or abridged versions, is a fundamental mismatch with the agile SE process, and the two will always have difficulty fitting together within a project. This means that we need to synthesize an approach that allows the UX process to work in an integrated agile environment without compromising essential UX needs, which is the topic of this section.
Here we especially acknowledge the influence of Constantine and Lockwood (2003), Beyer, Holtzblatt, and Baker (2004), Meads (2010), and Lynn Miller (2010). What we have synthesized here is also built on our experience with traditional UX methods and our broad experience in industry consulting and practice that required quicker and less costly design methods and where customers often demanded early deliverables.
19.6.1 Integrating UX into Planning
Figure 19-5 shows a scheme for integrating the UX role into the planning box of Figure 19-2.
Add some small upfront analysis (SUFA)
If we simply try to include the UX role as an add-on to the agile SE process, the entire operation would still proceed without benefit of any upfront analysis or contact with multiple people in the customer organization and with multiple users in all the key work roles. As a result, there would be no initial knowledge of requirements, users, work practice, tasks, or other
design-informing models. This would be crippling for any kind of UX lifecycle process.
Any serious proposal for integrating UX into planning must include an initial abbreviated form of contextual inquiry and contextual analysis, something that we call "Small UpFront Analysis" (SUFA), in the left-most box of Figure 19-5. The UX role works with the customer to perform some limited contextual inquiry and analysis (Chapters 3 and 4).
In addition, the UX person also assists the customer in other responsibilities, such as writing and prioritizing stories. These stories are now called user stories rather than customer stories because their substance comes from users in the upfront analysis.
Although this begins to change the basic agile pattern, it gives the UX team more traction in bringing UX into the overall process. Interest in enhancing this kind of additional upfront analysis is gaining ground. There is some initial agreement (Beyer, Holtzblatt, & Baker, 2004; Constantine & Lockwood, 1999) on the necessity for talking with multiple customer representatives and real users to help understand the overall system and design needs.
Some (Constantine & Lockwood, 1999; Memmel, Gundelsweiler, & Reiterer, 2007) add that a measure of user and/or task modeling would be a very useful supplement in that same spirit. There is obviously a resulting loss of agility but, without these additions, the whole approach might not work for UX.
Beyer and Holtzblatt's "original" approach to upfront analysis and design is called contextual design (Beyer & Holtzblatt, 1998) and has a head start toward agile methods because it is already customer centered. They took another step toward agility with the follow-up book (Holtzblatt, Wendell, & Wood, 2005) and developed that into a true agile method in Beyer, Holtzblatt, and Baker (2004). Much of this section is based on their explication of the agile version of rapid contextual design in this latter reference.
Figure 19-5
Integrating the UX role into planning.
Goals of the SUFA include:
■ understand the users' work and its context
■ identify key work roles, work activities, and user tasks
■ model workflow and activities in the existing enterprise and system
■ forge an initial high-level conceptual design
■ identify selected user stories that reflect user needs in the context of their work practice
Because of the "S" in SUFA, the contextual inquiry and analysis involved must be very limited, but even the most abbreviated contextual studies can yield a great deal of understanding about the work roles and the flow model as well as some initial task modeling. By adding this SUFA we can build a good overview of the system as a framework for talking about the little pieces we will be developing in the agile method.
A broad understanding of scope and purpose of the project (second box from the left in Figure 19-5) will allow us to plan the design and implementation of a series of sprints around the tasks and functionality associated with different work roles. This SUFA has to be focused carefully so that it can occur in a very short cycle-maybe even in one week!
Even though the UX person is trained to do a SUFA and could do it alone, the customer should help with SUFA, as shown at the lower left in Figure 19-5, to be in a better position to later write user stories (next section).
User interviews and observation. Your customer will help you identify users to interview. Create a flow model on the fly and in collaboration with the customer representative. Identify all key work roles in this diagram. Annotate it with all important tasks and activities that can be deduced from the user stories.
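For teams that want even this lightweight flow model in a shareable form, a minimal Python sketch like the one below can serve. It is purely illustrative, borrowing the credit-card customer-support example from the sidebar earlier in this chapter; the role names, tasks, and flows are hypothetical.

```python
# A minimal, hypothetical flow-model representation: work roles, the tasks
# annotated on each role, and the flows between roles.
work_roles = {
    "account holder": ["request balance transfer", "contest a charge"],
    "support rep":    ["process payment request", "close an account"],
    "fraud analyst":  ["review contested charge"],
}

# (from_role, to_role, what flows between them)
flows = [
    ("account holder", "support rep", "phone request"),
    ("support rep", "fraud analyst", "contested-charge case"),
    ("fraud analyst", "support rep", "resolution"),
]

def describe_flow_model(roles, flow_list):
    """Print the roles, their tasks, and the flows in a readable form."""
    for role, tasks in roles.items():
        print(f"{role}: tasks = {', '.join(tasks)}")
    for src, dst, artifact in flow_list:
        print(f"{src} -> {dst}: {artifact}")

if __name__ == "__main__":
    describe_flow_model(work_roles, flows)
```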
Agile contextual inquiry can be as brief or as lengthy as desired or afforded. We suggest interviewing and observing the work practice of at least one or two people in each key work role. There is no recording and no transcript of interviews.
The UX practitioners write notes by hand directly on index cards-a Constantine hallmark. Use small (3 x 5 inch) index cards to discourage verbosity in the notes.
Aim toward effective user stories. We are looking for user stories to drive our small
pieces of interaction design and prototyping. But what kind of user stories do we seek? Stories about work activities, roles, and tasks can still be a good way for designers to start. However, as Meads (2010) says, users are not interested in
tasks per se, but are more interested in features. Following his advice, we focus on features, which are used to carry out related user work activities and tasks within a work context.
UX role helps customer write user stories
In the third box from the left in Figure 19-5, the UX person helps the customer write user stories. Because both roles participated in the SUFA, user story writing will be easier, faster, and more representative of real user needs. The UX
role influences the customer toward creating stories based on workflows observed in the agile contextual inquiry part of SUFA.
UX role helps customer prioritize user stories
By helping the customer representative prioritize the user stories, the UX person can keep an eye on the overarching vision of user experience and a cohesive conceptual design, thereby steering the result toward an effective set of stories for an iteration.
19.6.2 Integrating UX into Sprints
In Figure 19-6 we show UX counterpart activities occurring during an agile SE sprint (Figure 19-4).
While the SE people are doing a sprint, the UX person and customer perform their own version of a sprint, which is shown in Figure 19-6. They begin by picking a story and, with the conceptual design in mind, start ideation and sketching of an interaction design to support the functionality of the user story.
The design is cast in a narrow vertical prototype for just this feature for evaluation. Often time permits only a low-fidelity (e.g., paper) prototype. If there is time, the design partners make a set of wireframes to describe the interaction design. This feature prototype is integrated into their growing overall user interface prototype.
Figure 19-6
UX counterpart of an agile SE sprint.
If there is time, the UX design partners do some user experience evaluation on this one part of the design and iterate the design accordingly. If there is even more time (unlikely in an agile environment), in the spirit of agile SE methods, the UX design partners can run this collection of all evaluations again on the whole integrated prototype to ensure that the addition of this design feature did not break the usability of any previous features.
UX practitioners submit this user interface prototype to the customer for acceptance review. Finally, the team "deploys" this small iterative interaction design "release" by sending it on to agile SE developers for coding as part of their next sprint.
19.6.3 Synchronizing the Two Agile Workflows
We have described agile SE planning and agile SE sprints, plus UX integration into planning and UX integration into sprints earlier in this chapter. But we have not yet talked about how the UX and SE teams work together and synchronize the workflow in their respective parts of the agile process.
Dove-tailed work activities
Miller (2010) proposed a "staggered" approach to parallel-track agile development that featured a "criss-cross" interplay between UX activities and SE activities across multiple cycles of agile development. As Patton (2008) put it in his blog, the overall approach is characterized as "work ahead, follow behind."
As Patton says, UX people on agile teams "become masters of development time travel, nimbly moving back and forth through past, present, and future development work."
Based roughly on Miller's idea, we show a scheme in Figure 19-7 for how UX people and SE people can synchronize their work via a dovetail alternation of activities across progressive iterations.
In the original agile SE approach, SE people started first with sprint 1, taking a set of stories and building a release. That worked when the only thing that was happening was implementation. Now that we are bringing in UX design and evaluation into the mix, we need a few changes.
First, the UX people need some lead time in their sprint 0, in Figure 19-7, to get the interaction designs ready for the SE people in their sprint 1. During this ramping-up sprint 0, SE people can focus on building the software infrastructure and services required to support the whole system, what Miller calls building the "high-development, low-UI features." When the UX people are done with designs for release 1, they hand them off to SE people for
implementation in sprint 1, which includes implementation of both functional stories and interaction design components for that cycle.
Previously, the team had something to release right after sprint 1, but now it takes two sprint cycles, sprint 0 and sprint 1, to get the first release out. This is just a start-up artifact and is not a problem for subsequent releases.
Not all UX design challenges are equal; sometimes there is not enough time to address UX design adequately in a given sprint. Then that design will have to evolve over the design and evaluation activities of more than one sprint.
Also, sometimes in interaction design, we want to try out two variations because we are not sure, so that will have to take place over multiple sprints.
Because of the staggering or dovetailing of activities, people on each part of the team are typically working on multiple things at a time. For example, right after handing off the designs for release 1 to the SE people, the UX people start on designs for release 2 and continue doing this until the end of sprint 1 (while the SE people are coding release 1). In any given sprint, say, sprint n, the UX
people are performing inquiry and planning for sprint n + 2, while doing interaction design (and prototyping) for sprint n + 1, and evaluation of the interaction design for sprint n - 1.
Figure 19-7
Alternating UX and SE workflow in an agile process.
Following the "lifecycle" of a single release, release n, we see that in sprint
n - 1, the UX role designs for release n, to be implemented by the SE
people in sprint n. UX evaluates release n in sprint n + 1. SE fixes it in sprint n + 2 and re-releases it at the end of that sprint.
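Because these offsets are easy to lose track of, here is a small, purely illustrative Python sketch (our own, not from Miller or Patton) that prints which track works on which release in each sprint, following the dovetailing just described.

```python
def dovetail_schedule(num_sprints):
    """Print the staggered UX/SE activities per sprint (illustrative only).

    In sprint n: UX plans for release n + 2, designs release n + 1, and
    evaluates release n - 1; SE implements release n and fixes release n - 2
    based on UX evaluation findings. Sprint 0 is the ramp-up sprint, which SE
    spends on infrastructure (the "high-development, low-UI features").
    """
    for n in range(num_sprints + 1):
        ux = [f"plan release {n + 2}", f"design release {n + 1}"]
        if n >= 2:                       # release 1 is first evaluated in sprint 2
            ux.append(f"evaluate release {n - 1}")
        se = ["build infrastructure"] if n == 0 else [f"implement release {n}"]
        if n >= 3:                       # release 1 is first fixed in sprint 3
            se.append(f"fix release {n - 2}")
        print(f"sprint {n}: UX -> {', '.join(ux)}; SE -> {', '.join(se)}")

if __name__ == "__main__":
    dovetail_schedule(4)
```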
Prototyping and UX evaluation
At the end of each sprint, UX people must be able to deliver their UX design to SE people for implementation in the next sprint. This means they must embody their design solution within some kind of prototype, usually a narrow vertical prototype covering just the feature (the user story, or a few closely related user stories) they are considering. There will be time only for a low-fidelity prototype; these days wireframes are the de facto standard for prototypes in this kind of development environment.
Perhaps an even more agile form of low-fidelity prototype is a design scenario, maybe in the form of storyboards, which can be used as an early and simple vehicle to draw out feedback from the customer and users. Kane (2003, p. 2, Figure 1) shows how a scenario can be seen as a mini-prototype, both narrow and shallow, at the intersection between a vertical and a horizontal prototype.
A scenario can distill "the system to the minimal essential elements needed for useful feedback" (Nielsen, 1994a).
The traditional UX process, of course, calls for extensive UX evaluation of the prototype. However, there will almost never be time to evaluate the prototype in this same sprint, but you will be able to evaluate this UX design in the next sprint.
Prototype integration
UX is all about holistic designs, and you cannot ensure that your emerging design is on track to provide such a holistic user experience unless you have a representation of the overall design and not just little pieces. Therefore, after each new feature is manifest as a prototype (e.g., wireframe), it is integrated into the growing overall user interface prototype, covering all the user stories so far and the broader UX vision. The small feature prototype drives coding but the integrated prototype helps everyone with a coherent view of the overall emerging interaction design.
The value of early delivery
As Memmel, Gundelsweiler, and Reiterer (2007) say, you have the potential to deliver design visions to customers before you have even completed requirements analysis. Compared with the traditional rigorous UX process, this is an amazingly early deliverable and amazingly early involvement of the customer.
Feedback for the first feature can go far beyond just that one feature and its usage. This is the first opportunity for the team to get any real feedback. A lot of additional things will come out of that, not specific to that feature.
For example, you will get feedback on how the process is working. You will get feedback on the overall style of your design. You will hear other questions and issues that customers are thinking about that you would not have access to until much later in the fully rigorous process. Your customers may even reprioritize the story cards based on this interaction. This early feedback fits well with the agile principle of abundant communication.
Exactly what is "delivery" in an agile environment?
A stated goal of agile SE methods is to release a product every few weeks. Why would anyone want to do that? Why would customers and users put up with it? Well, if you are talking only about functional software and not the user interface, customers will love it. They get to see very early manifestations of a working system, however severely the functionality may be limited. Also, agile SE developers can actually release their small iterations of functional code even to end users, as changes in the internal code are not visible to customers or users.
However, multiple releases of the user interface, each with a changing design, are not a good thing for users, who cannot be expected to track the continuous changes. They invest in learning a user interface; so a constant flow of even small changes, even if they are improvements, will not be acceptable.
This is always a risk with an agile approach, so it is up to the UX person to mitigate these transitions by making each release an addition to the capabilities but not completely new. Users should be able to do more, but not necessarily change how they do things that are already delivered.
Continuous delivery
Delivery to customers and users is continuous but in pieces. At the end of any given sprint, call it sprint n, the customer sees a UX prototype of the upcoming release and, in the next sprint, sprint n + 1, they see the full functional implementation of that prototype. In sprint n + 2, they see UX evaluation findings for that same prototype and, in sprint n + 3, they see the final redesign. Each of these points in time is an opportunity for the customer to give feedback on the interaction design.
Planning across iterations
Figure 19-7 shows planning in a single box at the bottom, extending across all the sprint cycles. That is to convey the idea that planning does not occur in discrete little boxes over time at just the right spot in the flow. Planning is more
of an "umbrella" activity, distributed over time and is cumulative in that the process builds up a "knowledge base" founded on agile contextual inquiry with users. The planning process does not start over for the planning of each cycle.
Instead the same knowledge base is consulted, updated, and massaged, working with the original SUFA results and anything added to supplement those results. Because an overview and conceptual design are evolving in the process, this kind of UX planning brings some top-down benefits to an otherwise exclusively bottom-up process.
Communication during synchronization
This kind of interwoven development process brings with it the risk of falling apart if anything goes wrong. This intensifies the need for constant
communication so that everyone remains aware of what everyone else is doing, what progress is being made by others, and what problems are being encountered.
Agile processes can be more fragile than their heavyweight counterparts. Because each part depends on the others in a tightly orchestrated overall activity, if something goes wrong in one place, there is no time to react to surprises and the whole thing can collapse like a house of cards.
Including emotional impact
How can you take emotional impact into account within an agile approach? It is more difficult because an agile approach does not give you good ways to connect to the overall user experience for the system. You will have limited time to create a conceptual design that fosters a strong positive emotional response. You will have to manage and do as much as you can by including emotional impact as part of the small upfront analysis, in design ideation, and in evaluation.
Style guides
Maintaining a style guide throughout the UX part of an agile development process is perhaps even more important than it is in the fully rigorous process (Constantine & Lockwood, 2003). An agile style guide with the minimal design templates, motifs for visual elements, and design "patterns" (e.g., a standard design for a dialogue box) supports reuse, saves the time of reinventing common design elements, and helps ensure design consistency of look and feel across features. Your style guide can also document "best practices" as you gain experience in this agile approach.
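As one concrete possibility (a minimal sketch only, assuming a hypothetical web front end; all token names, colors, and sizes below are invented for illustration), part of such an agile style guide can live directly in code as shared design tokens that every feature team imports:

```typescript
// A minimal sketch of an agile UX style guide captured as shared design tokens.
// Everything here (names, colors, sizes) is a hypothetical example, not a
// recommendation from this chapter.
export const styleGuide = {
  color: {
    primary: "#0057b8",       // brand color for primary actions
    danger: "#c62828",        // destructive actions
    textOnPrimary: "#ffffff",
  },
  font: {
    body: "16px/1.5 Arial, sans-serif",
    label: "14px/1.4 Arial, sans-serif",
  },
  // A reusable design "pattern": the standard dialogue box mentioned above.
  dialog: {
    maxWidthPx: 480,
    buttonOrder: ["cancel", "confirm"],
  },
} as const;
```

Keeping the tokens in one importable module is one way to get the reuse and look-and-feel consistency described above, because each new feature's wireframe-turned-implementation pulls from the same definitions.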
Affordances Demystified
Objectives
After reading this chapter, you will:
1. Have acquired a clear understanding of the concept of affordances
2. Know and understand differences among the types of affordances in UX design
3. Know how to apply the different types of affordances together in UX design
4. Be able to identify false affordances and know how to avoid them in UX design
5. Recognize user-created affordances and their implications to design
To begin with, we gratefully acknowledge the kind permission of Taylor & Francis, Ltd. to use a paper published in Behaviour & Information Technology (Hartson, 2003) as the primary source of material for this chapter. This chapter
is a prerequisite for the next two chapters, on the User Action Framework and
design guidelines, but we hope that you find it an interesting topic on its own.
20.1 WHAT ARE AFFORDANCES?
Although a crucially important and powerful concept, the notion of "affordance," as pointed out by Norman (1999), has suffered misunderstanding and misuse (or perhaps uninformed use) by researchers and practitioners alike and in our literature. In this section we define the general concept of affordances as used in human-computer interaction (HCI) design and give more specific definitions for each of four kinds of affordance.
20.1.1 The Concept of Affordance
The relevant part of what the dictionary says about "to afford" is that it means to offer, yield, provide, give, or furnish. For example, a study window in a house may afford a fine view of the outdoors; the window helps one see that nice view. In HCI design, where we focus on helping the user, an affordance is something that helps a user do something. In interaction design, affordances are characteristics of user interface objects and interaction design features that help users perform tasks.
20.1.2 Definitions of the Different Kinds of Affordance
In an effort to clarify the concept of affordance and how it is used in interaction design, we have defined (Hartson, 2003) four types of affordances, each of which plays a different role in supporting users during interaction, each reflecting user processes and the kinds of actions users make in task performance. Those kinds of affordances are as follows:
• Cognitive affordances help users with their cognitive actions: thinking, deciding, learning, remembering, and knowing about things.
• Physical affordances help users with their physical actions: clicking, touching, pointing, gesturing, and moving things.
• Sensory affordances help users with their sensory actions: seeing, hearing, and feeling (and tasting and smelling) things.
• Functional affordances help users do real work (and play) and get things done, to use the system to do work.
In analysis and design, each type of affordance must be identified for what it is and considered on its own terms. Each type of affordance uses different mechanisms, corresponds to different kinds of user actions, and has different requirements for design and different implications in evaluation and problem diagnosis.
As an example to get you started, consider a button available for clicking somewhere on a user interface. Sensory affordance helps you sense (in this case, see) it. Sensory affordance can be supported in design by, for example, the color or location of the button. Cognitive affordance helps you understand the button by comprehending what the button is used for, via the meaning of its label.
Physical affordance helps you click on this button, so its design support could include the size of the button or its distance from other buttons.
20.2 A LITTLE BACKGROUND
Who "invented" the concept of affordances? Of course we all know it was Donald Norman. Well, not quite. While Norman did introduce the concept to HCI, the
concept itself goes back at least as far as James J. Gibson (1977, 1979), and
probably further.
Gibson was a perceptual psychologist who took an "ecological" approach to perception, meaning he studied the relationship between a living being and its environment, in particular what the environment offers or affords the animal.
Gibson's affordances are the properties and objects of the environment as reckoned relative to the animal, and "the 'values' and 'meanings' of things in the environment [that] can be directly perceived" (Gibson, 1977) by the animal. In this book, human users of computer systems are the animals (are they not, though?).
Norman (1999) begins his interactions article by referring to Gibson's definitions of "afford" and "affordance," as well as to discussions he and Gibson have had about these concepts. Paraphrasing Gibson (1979, p. 127) within an HCI design context, affordance (as an attribute of an interaction design feature) is what that feature offers the user, what it provides or furnishes.
Here Gibson is talking about physical properties. Gibson gives an example of how a horizontal, flat, and rigid surface affords support for an animal for standing or walking. In his ecological view, affordance is reckoned with respect to the animal/user, which is part of the affordance relationship.
Thus, as Norman (1999) points out, Gibson sees an affordance as a physical relationship between an actor (e.g., user) and physical artifacts in the world reflecting possible actions on those artifacts. Such an affordance does not have to be visible, known, or even desirable.
Since Norman brought the term affordance into common usage in the HCI domain with his book The Design of Everyday Things (Norman, 1990), the term has appeared many times in the literature. However, terminology surrounding the concept of affordance in the literature has been used with more enthusiasm than knowledge, and we are left with some confusion.
Beyond Gibson and Norman, Gaver (1991) and McGrenere and Ho (2000) have influenced our thinking about affordances. Gaver (1991) sees affordances in design as a way of focusing on strengths and weaknesses of technologies with respect to the possibilities they offer to people who use them.
He extends the concepts by showing how complex actions can be described in terms of groups of affordances, sequential in time and/or nested in space, showing how affordances can be revealed over time, with successive user actions, for example, in the multiple actions of a hierarchical drop-down menu. Gaver (1991) defined his own terms somewhat differently from those of Norman or Gibson. That McGrenere and Ho (2000) also needed to calibrate their terminology against Gaver's further demonstrates the difficulty of discussing these concepts without access to a richer, more consistent vocabulary.
In most of the related literature, design of cognitive affordances (whatever they are called in a given paper) is acknowledged to be about design for the cognitive part of usability, ease of use in the form of learnability for new and intermittent users (who need the most help in knowing how to do something).
All authors who write about affordances give their own definitions of the concept, but there is scant mention of physical affordance design.
Sensory affordance is neglected even more in the literature. Most other authors include sensory affordance only implicitly and/or lumped in with cognitive affordance rather than featuring it as a separate explicit concept. Thus, when these authors talk about perceiving affordances, including Gaver's (1991) and McGrenere and Ho's (2000) phrase "perceptibility of an affordance," they are referring (in our terms) to a combination of sensing (e.g., seeing) and understanding physical affordances through sensory affordances and cognitive affordances.
Gaver refers to this same mix of affordances when he says "People perceive the environment directly in terms of its potential for action." As we explain in the next section, our use of the term "sense" has a markedly narrower orientation toward discerning via sensory inputs such as seeing and hearing.
20.3 FOUR KINDS OF AFFORDANCES IN UX DESIGN
20.3.1 Cognitive Affordance
Cognitive affordance is a design feature that helps, aids, supports, facilitates, or
enables thinking, learning, understanding, and knowing about something. Cognitive affordances play starring roles in interaction design, especially for less experienced users who need help with understanding and learning.
Because of this role, cognitive affordances are among the most significant usage-centered design features in present-day interactive systems, screen based or otherwise. They are the key to answering Norman's question (1999, p. 39) on behalf of the user: "How do you know what to do?"
As a simple example, the symbol of an icon that clearly conveys its meaning could be a cognitive affordance enabling users to understand the icon in terms of the functionality behind it and the consequences of clicking on it. Another cognitive affordance might be in the form of a clear and concise button label.
Cognitive affordance is usually associated with the semantics or meaning of user interface artifacts. In this regard, cognitive affordance is used as feed forward. It is help with a priori knowledge, that is, knowledge about the associated functionality before selecting an object such as a button, icon, or menu choice. In short, a button label helps you in knowing about what functionality will be invoked if you click on that button.
Communication of meaning via cognitive affordance often depends on shared conventions. The symbols themselves may have no inherent meaning, but a shared convention about the meaning allows the symbol to convey that meaning.
Another use of cognitive affordance is in feedback-helping a user know what happened after a button click, for example, and execution of the corresponding system functionality. Feedback helps users in knowing whether the course of interaction has been successful so far.
20.3.2 Physical Affordance
Physical affordance is a design feature that helps, aids, supports, facilitates, or
enables doing something physically. Adequate size and easy-to-access location could be physical affordance features of an interface button design enabling users to click easily on the button.
Because physical affordance has to do with physical objects, we treat active interface objects on the screen, for example, as real physical objects, as they can be on the receiving end of real physical actions (such as clicking or dragging) by users. Physical affordance is associated with the "operability" characteristics of such user interface artifacts. As many in the literature have pointed out, it is clear that a button on a screen cannot really be pressed, which is why we try to use the terminology "clicking on buttons."
Physical affordances play a starring role in interaction design for experienced or power users who have less need for elaborate cognitive affordances but whose task performance depends largely on the speed of physical actions. Design issues for physical affordances are about physical characteristics of a device or interface that afford physical manipulation. Such design issues include Fitts' law (Fitts, 1954; MacKenzie, 1992), physical disabilities and limitations, and physical characteristics of interaction devices and interaction techniques.
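For concreteness, the commonly used Shannon formulation of Fitts' law (MacKenzie, 1992) predicts the movement time MT to acquire a target of width W at a distance D as

MT = a + b \log_2\left(\frac{D}{W} + 1\right)

where a and b are constants fit empirically for a particular device, user population, and task. Larger targets and shorter distances lower the logarithmic term (the index of difficulty), which is the quantitative side of the physical affordance argument for adequate control size and sensible placement.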
20.3.3 Sensory Affordance
Sensory affordance is a design feature that helps, aids, supports, facilitates, or
enables users in sensing (e.g., seeing, hearing, feeling) something. Sensory affordance is associated with the "sense-ability" characteristics of user interface artifacts, especially when it is used to help the user sense (e.g., see) cognitive affordances and physical affordances. Design issues for sensory affordances include noticeability, discernability, legibility (in the case of text), and audibility (in the case of sound) of features or devices associated with visual, auditory, haptic/tactile, or other sensations.
While cognitive affordance and physical affordance are stars of interaction design, sensory affordance plays a critical supporting role. As an example, the legibility of button label text is supported by an adequate size font and appropriate color contrast between text and background. In short, sensory affordance can be thought of as an attribute that affords cognitive affordance
and physical affordance; users must be able to sense cognitive affordances and physical affordances in order for them to aid the user's cognitive and physical actions.
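To make "appropriate color contrast" concrete, one widely used measure, the WCAG 2.x contrast ratio (not referenced in this chapter; brought in here only as an illustrative yardstick), can be computed from the two colors' relative luminance. A minimal TypeScript sketch, assuming 8-bit sRGB color values:

```typescript
// Compute WCAG-style relative luminance from an [r, g, b] color (0-255 each).
function relativeLuminance([r, g, b]: [number, number, number]): number {
  // Linearize each sRGB channel, then apply the standard luminance weights.
  const lin = (c: number) => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * lin(r) + 0.7152 * lin(g) + 0.0722 * lin(b);
}

// Contrast ratio between foreground (label text) and background colors.
function contrastRatio(
  fg: [number, number, number],
  bg: [number, number, number]
): number {
  const [hi, lo] = [relativeLuminance(fg), relativeLuminance(bg)].sort((a, b) => b - a);
  return (hi + 0.05) / (lo + 0.05);
}

// Black label text on a white button scores 21:1.
console.log(contrastRatio([0, 0, 0], [255, 255, 255]).toFixed(1));
```

With this measure, black text on white scores 21:1 while light gray on white sits close to 1:1; a common rule of thumb requires at least 4.5:1 for normal-size text.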
Why do we call it "sensory affordance" and not "perceptual affordance?" In the general context of psychology, the concepts of sensing and perception are intertwined. To avoid this association, we use the term "sensing" instead of "perception" because it excludes the component of cognition usually associated with perception (Hochberg, 1964). This allows us to separate the concepts of sensory and cognitive affordance into mostly non-overlapping meanings.
While overlapping and borderline cases are interesting to psychologists, HCI designers need to separate the concepts because design issues for user sensory actions are almost entirely orthogonal to design issues for cognitive actions.
As an illustration, consider text legibility, which at a low level is about identifying shapes in displayed text as letters in the alphabet, but not about the meanings of these letters as grouped into words and sentences.
But text legibility can be an area where user perception, sensing, and cognition overlap. To make out text that is just barely or almost barely discernable, users can augment or mediate sensing with cognition, using inference and the context of words in a message to identify parts of the text that cannot be recognized by pure sensing alone. Context can make some candidate letters more likely than others. Users can recognize words in their own language more easily than words in another language or as groups of nonsense letter combinations.
In HCI, however, we seek to avoid marginal design and ensure that designs work for wide-ranging user characteristics. Therefore, we require effective design solutions for both kinds (sensory and cognitive) of affordances, each considered separately, in terms of its own characteristics. Simply put, a label in a user interface that cannot be fully discerned by the relevant user population, without reliance on cognitive augmentation, is a failed HCI design.
Thus, we wish to define sensing at a level of abstraction that eliminates these cases of borderline user performance so that HCI designers can achieve legibility, for example, beyond question for the target user community. In other words, we desire an understanding of affordance that will guide the HCI designer to attack a text legibility problem by adjusting the font size, for example, not by adjusting the wording to make it easier to deduce text displayed in a tiny font.
In our broadest view, a user's sensory experience can include gestalt, even aesthetic, aspects of object appearance and perceptual organization
(Arnheim, 1954; Koffka, 1935), such as figure-ground relationships, and might sometimes include some judgment and lexical and syntactic interpretation in the broadest spatial or auditory sense (e.g., what is this thing I am seeing?), but does not get into semantic interpretation (e.g., what does it mean?).
In the context of signal processing and communications theory, this kind of sensing would be about whether messages are received correctly, but not about whether they are understood.
20.3.4 Functional Affordance
Functional affordances connect physical user actions to the invocation of system, or back-end, functionality. Functional affordances link usability or UX to usefulness and add purpose to physical affordances. They are about higher level user enablement in the work domain and add meaning and goal orientation to design discussions.
We bring Gibson's ecological view into contextualized HCI design by including a purpose in the definition of each physical affordance, namely associated functional affordance. Putting the user and purpose of the affordance into the picture harmonizes nicely with our interaction- and
user-oriented view in which an affordance helps or aids the user in doing something.
Yes, a user can click on an empty or inactive part of the screen, but that kind of clicking is without reference to a purpose and without the requirement or expectation that any useful reaction by the system will come of it. In the context of HCI design, a user clicks to accomplish a goal, to achieve a purpose (e.g., clicking on a user interface object, or artifact, to select it for manipulation or clicking on a button labeled "Sort" to invoke a sorting operation).
McGrenere and Ho (2000) also refer to the concept of application usefulness, something they call "affordances in software," which are at the root of supporting a connection between the dual concepts of usability and usefulness (Landauer, 1995). In an external view it is easy to see a system function as an affordance because it helps the user do something in the work domain.
This again demonstrates the need for a richer vocabulary, and conceptual framework, to take the discussion of affordances beyond user interfaces to the larger context of overall system design. We use the term functional affordance to denote this kind of higher level user enablement in the work domain.
20.3.5 Summary of Affordance Types
Table 20-1 contains a summary of these affordance types and their roles in interaction design.
Table 20-1
Summary of affordance types
Cognitive affordance: a design feature that helps users in knowing something. Example: a button label that helps users know what will happen if they click on it.
Physical affordance: a design feature that helps users in doing a physical action in the interface. Example: a button that is large enough so that users can click on it accurately.
Sensory affordance: a design feature that helps users sense something (especially cognitive affordances and physical affordances). Example: a label font size large enough to be discerned.
Functional affordance: a design feature that helps users accomplish work (i.e., the usefulness of a system function). Example: the internal system ability to sort a series of numbers (invoked by users clicking on the Sort button).
20.4 AFFORDANCES IN INTERACTION DESIGN
20.4.1 Communication and Cultural Conventions
An important function of cognitive affordance is communication, agreement about meaning via words or symbols. Communication is exactly what makes precise wording effective as a cognitive affordance: something to help the user in knowing, for example, what to click on. We see symbols, constraints, and shared conventions as essential underlying mechanisms that make cognitive affordances work, as Norman (1999) says, as "powerful tools for the designer."
In the tradition of The Design of Everyday Things (Norman, 1990), we illustrate with a simple and ubiquitous non-computer device, a device for opening doors. The hardware store carries both round doorknobs and lever-type door handles. The visual design of both kinds conveys a cognitive affordance, helping users think or know about usage through the implied message
their appearance gives to users: "This is what you use to open the door." The doorknob and lever handle each suggests, in its own way, the grasping and rotating required for operation.
But that message is understood only because of shared cultural conventions. There is nothing intrinsic in the appearance of a doorknob that necessarily conveys this information. On another planet, it could seem mysterious and
confusing, but for us a doorknob is an excellent cognitive affordance because almost all users do share the same easily recognized cultural convention.
This "on another planet" idea led to an interesting exercise in a class of graduate students on how cultural conventions influence our perception of affordances. We handed out identical empty Coke bottles to several groups and asked them to look at the bottles, to hold and handle them, and to think about what kinds of uses the inherent affordances evoke. We wanted them to get down to Gibson's ecological level.
Students responded with the usual answers about affordances evinced by sight and touch. Visually, a Coke bottle has obvious affordances as a vessel to hold water, it can hold flowers, or it can serve as a rough volume measuring device. The heft and sturdiness sensed when held in one's hand indicate affordances to serve as a paperweight, a plumb bob on a string, or even an oddly shaped rolling pin.
Then we asked them if they had seen the movie called The Gods Must Be Crazy. Although this cool movie was apparently before the time of most of them, some eyebrows were raised in an "ah ha" moment. In this movie, an empty Coke bottle falls out of the sky from a passing airplane and is found by a family of Bushmen in the deep Kalahari. The Kalahari Bushmen, who have never seen a bottle before, can rely only on its inherent characteristics as clues to physical affordances leading to possible uses. The perceived affordances are not influenced by cultural conventions or practice.
The Bushmen used it for a variety of tasks-to transport water, to pound soft roots and other vegetation, as an entertainment device when they figured out how to use it as a whistle, and eventually as a weapon to attack one another.
That these affordances became so apparent to the Bushmen, but might not be obvious to most people from an industrialized part of the world, indicates the impact of social experience and cultural conventions to influence, and even prejudice, one's perception of an object's affordances.
20.4.2 Cognitive Affordance as "Information in the World"
Norman characterizes a view of cognitive affordance that we share (Norman, 1999, p. 39): "When you first see something you have never seen before, how do you know what to do? The answer, I decided, was that the required information was in the world: the appearance of the device could provide the critical
clues required for its proper operation." This view of cognitive affordance as information in the world to aid understanding is fundamental and resonates
with the ecological view of Gibson. The attribute that communicates the use of an object is an integral part of the object. This definitely works for, say, a label on a user interface button.
20.4.3 Affordance Roles-An Alliance in Design
In most interaction designs, the four types of affordance work together, connected in the design context of a user's work environment. To accomplish work goals, the user must sense, understand, and use affordances within an interaction design.
Each kind of affordance plays a different role in the design of different attributes of the same artifact, including design of appearance, content, and manipulation characteristics to match users' needs, respectively, in the sensory, cognitive, and physical actions they make as they progress through the cycle of actions during task performance.
As Gaver (1991, p. 81) says, thinking of affordances in terms of design roles "allows us to consider affordances as properties that can be designed and analysed in their own terms." Additionally, even though the four affordance roles must be considered together in an integrated view of artifact design, these words from Gaver speak to the need to distinguish individually identifiable affordance roles.
Coming back to the example about devices for opening doors, the simplest is a round doorknob. Its brass color might be a factor in noticing or finding it as you approach the door. The familiar location and shape of a doorknob convey cognitive affordance via the implied message that it is what you use to open the door. A doorknob also affords physical grasping and rotating for door operation. Some designs, such as a lever, are considered to give better physical affordance than that of a round knob because the lever is easier to use with slippery hands or by an elbow when the hands are full. The push bar on double doors is another example of a physical affordance helpful to door users with full hands.
Sometimes the physical affordance to help a user open a door is provided by the door itself; people can open some swinging doors by just pushing on the door. In such cases, designers often help users by installing, for example, a brass plate to show that one should push and where to push. Even though this plate might help avoid handprints on the door, it is a cognitive affordance and not a real physical affordance because it adds nothing to the door itself to help the user in the physical part of the pushing action. Sometimes the word "push" is engraved in the plate to augment the clarity of meaning as a cognitive affordance.
Similarly, sometimes the user of a swinging door must open it by pulling. The door itself does not usually offer sufficient physical affordance for the pulling action so a pull handle is added. A pull handle offers both cognitive and physical affordance, providing a physical means for pulling as well as a visual indication that pulling is required.
As an example of how the concepts might guide HCI designers, suppose the need arises in an interaction design for a button to give the user access to a certain application feature or functionality. The designer would do well to begin by asking if the intended functionality, the functional affordance, is appropriate and useful to the user. Further interaction design questions are moot until this is resolved positively.
The designer is then guided to support cognitive affordance in the button design, to advertise the purpose of the button by ensuring, for example, that its meaning (in terms of a task-oriented view of its underlying functionality) is clearly, unambiguously, and completely expressed in the label wording, to help the user know when it is appropriate to click on the button while performing a task. Then, the designer is asked to consider sensory affordance in support of cognitive affordance in the button design, requiring an appropriate label font size and color contrast, for example, to help the user discern the label text to read it.
The designer is next led to consider how physical affordance is to be supported in the button design. For example, the designer should ensure that the button is large enough to click on it easily to accomplish a step in a task. Designers should try to locate the button near other artifacts used in the same and related tasks to minimize mouse movement between task actions. But also designers should locate each button far enough away from other, non-related, user interface objects to avoid clicking on them erroneously.
Finally, the designer is guided to consider sensory affordance in support of physical affordance in the button design by ensuring that the user notices the button so that it can be clicked. For example, the button must be a color, size, and shape that make it noticeable and must be located in the screen layout so that it is near enough to the user's focus of attention. If the artifact is a feedback message, it also requires attention to sensory affordance (e.g., to notice the feedback), cognitive affordance (e.g., to understand what the message says about a system outcome), and physical affordance (e.g., to click on a button to dismiss the message box).
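As a minimal sketch of how these considerations might land in code for a hypothetical web UI (the element id "record-toolbar", the sortRecords function, and all sizes and colors are assumptions made up for illustration), each affordance role becomes a separate, inspectable design decision for one "Sort" button:

```typescript
// Hypothetical sketch: one button, four affordance roles made explicit.
const sortButton = document.createElement("button");

// Functional affordance: the button exists to invoke useful back-end work.
sortButton.addEventListener("click", () => sortRecords());

// Cognitive affordance: a clear, task-oriented label says what a click will do.
sortButton.textContent = "Sort by date";

// Sensory affordance: legible font size and adequate text/background contrast.
sortButton.style.font = "16px sans-serif";
sortButton.style.color = "#000000";
sortButton.style.background = "#e0e0e0";

// Physical affordance: a generous click target, placed with related controls.
sortButton.style.minWidth = "120px";
sortButton.style.minHeight = "44px";
document.querySelector("#record-toolbar")?.appendChild(sortButton);

// Hypothetical back-end function the button fronts for.
function sortRecords(): void {
  console.log("sorting records by date...");
}
```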
In sum, the concept of affordance does not offer a complete prescriptive approach to interaction design but does suggest the value of considering all four
affordance roles together in the design of an interaction artifact by asking (not necessarily always in this order):
• Is the functionality to which this interaction or artifact gives access useful in achieving user goals through task performance (functional affordance, or purpose of physical affordance)?
• Does the design include clear, understandable cues about how to use the artifact (cognitive affordance) or about system outcomes if the artifact is a feedback message?
• Can users easily sense visual (or other) cues about artifact operation (sensory affordance in support of cognitive affordance)?
• Is the artifact easy to manipulate by all users in the target user classes (physical affordance)?
• Can users easily sense the artifact for manipulation (sensory affordance in support of physical affordance)?
Considering one affordance role but ignoring another is likely to result in a flawed design. For example, if the wording for a feedback message is carefully crafted to be clear, complete, and helpful (good cognitive affordance), but users do not notice the message because it is displayed out of the users' focus of attention (poor sensory affordance) or users cannot read it because the font is too small, the net design is ineffective. A powerful drag-and-drop mechanism may offer good physical and functional affordance for opening files, but lack of a sufficient cognitive affordance to show how it works could mean that most users will not use it.
Another example of a way that cognitive affordance and physical affordance work together in interaction design can also be seen in the context of designing constraints for error avoidance. "Graying out" menu items or button labels to show that inappropriate choices are unavailable at a given point within a task is a simple, but effective, error avoidance design technique.
This kind of cognitive affordance presents a logical constraint to the user, showing visually that this choice can be eliminated from possibilities being considered at this point. In that sense, the grayed-out label is a cognitive affordance on its own, quite different from the cognitive affordance offered by the label when it is not grayed out.
If cognitive and physical affordances are connected in the design, a grayed-out button or menu choice also indicates a physical constraint in that the physical and functional affordance usually offered by the menu item or button to access corresponding functionality is disabled so that a
persistent user who clicks on the grayed-out choice anyway cannot cause harm.
Because these two aspects of graying-out work together so well, many people think of them as a single concept, but the connection of these dual aspects is important.
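A small sketch for a hypothetical web UI (the "delete-button" id, the wording, and the styling are assumptions) shows how a single property can carry both aspects, graying out the visual cue and disabling the click at the same time:

```typescript
// Hypothetical sketch: the two coupled aspects of "graying out" a control.
const deleteButton = document.querySelector<HTMLButtonElement>("#delete-button");

function setDeleteAvailable(available: boolean): void {
  if (!deleteButton) return;
  // Physical/functional constraint: a persistent click now invokes nothing.
  deleteButton.disabled = !available;
  // Cognitive affordance: the grayed-out look says "not a choice right now."
  deleteButton.style.opacity = available ? "1" : "0.4";
  deleteButton.title = available
    ? "Delete the selected items"
    : "Select at least one item to enable Delete";
}

setDeleteAvailable(false); // e.g., nothing is selected yet
```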
20.5 FALSE COGNITIVE AFFORDANCES MISINFORM AND MISLEAD
Because of the power of cognitive affordances to influence users, misuse of cognitive affordances in design can be a force against usability and user experience. When cognitive affordances do not telegraph physical affordances, they are not helpful.
Worse yet, when cognitive affordances falsely telegraph physical affordances, it is worse than not helping; it leads users directly to errors. Gibson calls this "misinformation in affordances"; for example, as conveyed by a glass door that appears to be an opening but does not afford passage. Draper and Barton (1993) call these "affordance bugs."
Sometimes a door has both a push plate and a pull handle as cognitive affordances in its design. The user sees this combination of cognitive affordances as an indication that either pushing or pulling can operate this as a swinging door. When the door is installed or constrained so that it can swing in only one direction, however, the push plate and pull handle introduce conflicting information or misinformation in the cognitive affordances that interfere with the design as a connection to physical affordances.
We know of a door with a push plate and a pull handle that was installed or latched so that it could only be pushed. A "push" sign had been added, perhaps to counter the false cognitive affordance of the pull handle. The label, however, was not always enough to overcome the power of
the pull handle as a cognitive affordance; we observed some people still grab the handle and attempt to pull the door open.
Figure 20-1 contains a photograph of a door sign in a local store that is confusing because of the conflicts among its cognitive affordances.
This sign is on the inside of the door. The explanation from a clerk in the store was that it really means to enter only from the outside and not to go through the door from the inside. One can only nod and sigh. They were probably reusing an available design object, the "Do Not Enter" sign, instead of tailoring a sign more specific to the usage situation. The resulting mashup was nonsense.
Figure 20-1
A door with a confusing sign containing conflicting cognitive affordances.
Another example of a false cognitive affordance showed up in a letter received recently from an insurance company. There was a form at the bottom to fill out and return, with this line appearing just above the form, as seen in Figure 20-2.
Figure 20-2
False cognitive affordances in a form letter that looks like an affordance to cut.
Figure 20-3
False cognitive affordances in a menu bar with links that look like buttons.
Figure 20-4
Radio switch with mixed affordances.
Because that dashed line looked so much like the usual "Cut on this line to detach" cognitive
affordance, one might easily detach the form before realizing that the customer information above would be lost. A better design might simply omit this warning because, without it, the typical user would not even think of ripping the paper.
Examples of false cognitive affordances in user interfaces abound. A common example is seen in Web page links that look like buttons, but do not behave like buttons. The gray background of the links in the top menu bar of a digital
library Website,
Figure 20-3, makes them seem like buttons. A user might click on the
background, assuming it is part of a button, and not get any result. Because the "button" is actually just a hyperlink, it requires clicking exactly on the text.
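One common repair for this particular false affordance, sketched below for a hypothetical menu bar (the selector, padding, and color are assumptions), is to make the entire visual "button" the link's click target so that what looks clickable and what is clickable finally agree:

```typescript
// Hypothetical sketch: extend each menu link's clickable area to cover the
// whole gray "button" it appears to be.
document.querySelectorAll<HTMLAnchorElement>("nav.menu-bar a").forEach((link) => {
  link.style.display = "inline-block"; // the gray area becomes part of the link
  link.style.padding = "8px 16px";
  link.style.background = "#d9d9d9";
});
```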
Below-the-fold issues on Web pages can be compounded by having a horizontal line on a page that happens to fall at the bottom of a screen. Users see the line, a false affordance, and assume falsely that it is the bottom of the page and so do not scroll, missing possibly vital information below.
Sometimes a false cognitive affordance arises from deliberate abuse of a shared convention to deceive the user. Some designers of pop-up advertisements "booby trap" the "X" box in the upper right-hand corner of the pop-up window, making it a link to launch one or more new pop-ups when users
click on the "X", trapping users into seeing more pop-up ads when their intention clearly was to close the window.
As another example, consider a radio with the slider switch, sketched in Figure 20-4a, for selecting between stereo and monaural FM reception. The names for the switch positions (Stereo, Mono) are a good match to the user's model, but the arrows showing which way to slide the switch are unnecessary and introduce confusion when combined with the labels.
The design has mixed cognitive affordances: the names of the modes at the top and bottom of the switch are such a strong cognitive affordance for the user that they conflict with the arrows.
The arrows in Figure 20-4a call for moving the switch up to get monaural reception and down to get stereo. At first glance, however, it looks as though the up position is for stereo (toward the "stereo" label) and down is for monaural, but the arrows make the meaning exactly the opposite. The names alone, as shown in Figure 20-4b, are the more normal and natural way to label the switch.
As another example, Figure 20-5 is a photo of part of the front of an old microwave. The dial marks between the
settings for "Defrost" and "Cook" seem to indicate a range of possible settings but, in fact, it is a binary choice: either "Defrost" or "Cook" but nothing in between. The designer could not resist the temptation to fill in the space between these choices with misleading "design details" that are false affordances.
As a further example, in Figure 20-6, we see a sign that has made the rounds on the Internet, mainly because it is so funny. It is an example of a cognitive affordance that is misleading.
20.6 USER-CREATED AFFORDANCES AS A WAKE-UP CALL TO DESIGNERS
If a device in the everyday world does not suit the user,
we will frequently see the user modify the apparatus, briefly and unknowingly switching to the role of designer. We have all seen the little cognitive or physical affordances added to
devices by users-Post-it™ notes added to a computer monitor or keyboard or a better grip taped to the handle of something. These trails of user-created artifacts blazed in the wake of spontaneous formative evaluation in the process of day-to-day usage are like wake-up messages, telling designers what the users think they missed in the design.
A most common example of trails (literally) of user-made artifacts is seen in the paths worn by people as they walk. Sidewalk designers usually like to make the sidewalk patterns regular, symmetric,
Figure 20-5
Useless dial marks between power settings on a microwave.
Figure 20-6
Misdirection in a cognitive affordance.
Figure 20-7
Glass door with a user-added cognitive affordance (arrow) indicating proper operation.
and rectilinear. However, the most efficient paths for people getting from one place to the other are often less tidy but more direct. Wear patterns in the grass show where people need or want to walk and, thus, where the sidewalks should have been located. The rare and creative sidewalk designer will wait until seeing the worn paths, employing the user-made artifacts as clues to drive the design.
Sometimes the affordances are already there but they are not effective. As Gaver says, when affordances suggest actions different from the way something is designed, errors are common and signs are necessary. The signs are artifacts, added because the designs themselves did not carry sufficient cognitive affordance.
We have all seen the cobbled design modifications to everyday things, such as an explanation written on a device, an important feature highlighted with a circle or a bright color, or a feature (e.g., instructions) moved to a location where it is more likely to be seen. Users add words or pictures to mechanisms to explain how to operate them, enhancing cognitive affordance.
Neither do physical affordances escape these design lessons from users. You see added padding to prevent bruised knuckles. A farmer has a larger handle welded onto a tractor implement, enhancing the physical affordance of the factory-made handle, which offered inadequate leverage. User-created artifacts also extend
to sensory affordances.
For example, a homeowner replaces the street number sign on her house with a larger one, making it easier to see. Such user-made artifacts are a variation on the "user-derived interfaces" theme of Good et al. (1984), through which designers, after observing users perform tasks in their own way, modified interaction designs so that the design would have worked for those users.
Example:
In Figure 20-7, a photo of a glass door in a convenience store, we show an example of a user-added cognitive affordance. The glass and stainless steel
design is elegant: the perfectly symmetric layout and virtually unnoticeable hinges contribute to the uncluttered aesthetic appearance, but these same attributes work against cognitive affordance for its operation.
The storeowner noticed many people unsure about which side of the stainless steel bar to push or pull to open the door, often trying the wrong side first. To help his customers with what should have been an easy task in the first place, he glued a bright yellow cardboard arrow to the glass, pointing out the correct place to operate the door.
Example:
The icons shown in Figure 20-8 are for lightness and darkness settings on a home office copier/printer, icons with ambiguous meanings. The icon for lighter copies showed more white, but the white can be interpreted as part of the copy, making it seem denser than the icon for darker copies, so the user had to add his own label, as you can see in the figure.
These trails of often inelegant but usually effective artifacts added by frustrated users leave a record of affordance improvements that designers should consider for all their users. Perhaps if designers of the everyday things that Norman (1990) discusses had included usability testing in the field, they would have found these problems before the products went to market.
In the software world, most applications have only very limited capabilities for users to set their preferences. Would it not be much nicer for software users if they could modify interaction designs as easily as applying a little duct tape, a Post-it, or extra paint here and there?
In Figure 20-9 we show how a car owner created an artifact to replace an inadequate physical affordance-a built-in drink holder that was too small and too flimsy for today's super-sized drinks.
During one trip, the user improvised with a shoe, resulting in this interesting example of a user-installed artifact.
As an example, consider a desktop printer used occasionally to print a letter on a single sheet of letterhead stationery. Inserting the stationery on top of the existing plain paper supply in the printer does this rather easily. The only problem is that it is not easy to determine the correct orientation of the sheet to be inserted because:
Figure 20-8
A user-created cognitive affordance explaining copier darkness settings.
Figure 20-9
A user-made automobile cup-holder artifact, used with permission from Roundel magazine, BMW Car Club of America, Inc. (Howarth, 2002).
• there is no clear mental model of how the sheet travels through the interior mechanism of the printer
• printers can vary in this configuration
• the design of the printer itself gives no cognitive affordance for loading a single sheet of letterhead
Figure 20-10
A user-created cognitive affordance to help users know how to insert blank letterhead stationery.
Figure 20-11
A user-created cognitive affordance added to a roadside sign; see arrow on post to left of the sign.
Thus, the user attached his own white adhesive label, shown in Figure 20-10, that says "Stationery: upside down, face up," adding yet another user-created artifact attesting to inadequate design. As
Norman (1990, p. 9) says, "When simple things need pictures, labels, or instructions, the design has failed."
As you know, the world is full of examples of user-created cognitive affordances, attesting to the need for better design for everyone. As another example here, in Figure 20-11 we show a road sign at a country road corner in Maine.
We were looking for the campground and the sign confirmed that we were close, but we were not sure which way to turn to get there. Then we saw the arrow that someone else had added on the post to the left of the sign, which helped us complete our task. It was also an indication that we were not the first to encounter this UX problem.
20.7 EMOTIONAL AFFORDANCES
Because of the importance of emotional impact as part of the user experience, we see the possibility of a new type of affordance-an emotional affordance. We suggest the value of considering emotional affordances in interaction design,
affordances that help lead users to a positive emotional response. This means features or design elements that make an emotional connection with the user. These will include design features that connect to our subconscious and intuitive appreciation of fun, aesthetics, and challenges to growth.
This new kind of affordance plays well into the original Gibson ecological view of affordances that are about the relationship between a living being and its environment. This is just what we are talking about with respect to emotional impact, especially phenomenological aspects. Gibson's affordances are about values and meanings that can be perceived directly in the environment.
Apple products are bristling with emotional affordances, and it is an operational concept in the automobile design world. The mobile world is trying to leverage emotional affordances to attract customers. Let us work together to make this a new kind of affordance in full standing with the others.
The Interaction Cycle and the User Action Framework
21.1 INTRODUCTION
21.1.1 Interaction Cycle and User Action Framework (UAF)
The Interaction Cycle is our adaptation of Norman's "stages-of-action" model (Norman, 1986) that characterizes sequences of user actions typically occurring in interaction between a human user and almost any kind of machine. The User Action Framework (Andre et al., 2001) is a structured knowledge base containing information about UX design, concepts, and issues.
Within each part of the UAF, the knowledge base is organized by immediate user intentions involving sensory, cognitive, or physical actions. Below that level the organization follows principles and guidelines and becomes more detailed and more particularized to specific design situations as one goes deeper into the structure.
To clarify the distinction, the Interaction Cycle is a representation of user interaction sequences and the User Action Framework is a knowledge base of interaction design concepts, the top level of which is organized as the stages of the Interaction Cycle.
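Purely as a speculative sketch (none of these type names, stage labels, or example strings come from the UAF itself; they are placeholders loosely following the Interaction Cycle described below), the layered organization just described could be held as a recursive classification structure:

```typescript
// Hypothetical sketch of the UAF's layered organization as a data type.
type InteractionCycleStage =
  | "planning"        // what to do
  | "translation"     // how to do it, in terms of actions
  | "physicalActions" // doing it
  | "outcomes"        // internal system effects
  | "assessment";     // evaluating feedback against goals

interface UAFNode {
  stage: InteractionCycleStage;                     // top level: Interaction Cycle part
  intention?: "sensory" | "cognitive" | "physical"; // immediate user intention
  guideline: string;                                // principle or guideline at this level
  children: UAFNode[];                              // deeper, more situation-specific issues
}

// Example node: a diagnosis category for an unclear button label.
const example: UAFNode = {
  stage: "translation",
  intention: "cognitive",
  guideline: "Labels should clearly express the effect of clicking",
  children: [],
};
```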
21.1.2 Need for a Theory-Based Conceptual Framework
As Gray and Salzman (1998, p. 241) have noted, "To the naïve observer it might seem obvious that the field of HCI would have a set of common categories with which to discuss one of its most basic concepts: Usability. We do not. Instead we have a hodgepodge collection of do-it-yourself categories and various collections of rules-of-thumb."
As Gray and Salzman (1998) continue, "Developing a common categorization scheme, preferably one grounded in theory, would allow us to compare types of usability problems across different types of software and interfaces." We believe that the Interaction Cycle and User Action Framework help meet this need.
They are an attempt to provide UX practitioners with a way to frame design
issues and UX problem data within the structure of how designs support user
actions and intentions.
As Lohse et al. (1994) state, "Classification lies at the heart of every scientific field. Classifications structure domains of systematic inquiry and provide concepts for developing theories to identify anomalies and to predict future research needs." The UAF is such a classification structure for UX design concepts, issues, and principles, designed to:
• Give structure to the large number of interaction design principles, issues, and concepts
• Offer a more standardized vocabulary for UX practitioners in discussing interaction design situations and UX problems
• Provide the basis for more thorough and accurate UX problem analysis and diagnosis
• Foster precision and completeness of UX problem reports based on essential distinguishing characteristics
Although we include a few examples of design and UX problem issues to illustrate aspects and categories of the UAF in this chapter, the bulk of such examples appear with the design guidelines (Chapter 22), organized on the UAF structure.
21.2 THE INTERACTION CYCLE
21.2.1 Norman's Stages-of-Action Model of Interaction
Norman's stages-of-action model, illustrated in Figure 21-1, shows a generic view of a typical sequence of user actions as a user interacts with almost any kind of machine.
The stages of action naturally divide into three major kinds of user activity. On the execution (Figure 21-1, left) side, the user typically begins at the top of the
figure by establishing a goal, decomposing goals into tasks and intentions, and mapping intentions to action sequence specifications. The user manipulates system controls by executing the physical actions (Figure 21-1, bottom left), which cause internal system state changes (outcomes) in the world
(the system) at the bottom of the figure.
On the evaluation
(Figure 21-1, right) side, users perceive, interpret, and
evaluate the outcomes with respect to goals and intentions through perceiving the system state by sensing feedback from the system (state changes in "the world" or the system). Interaction success is evaluated by comparing outcomes with the original goals. The interaction is successful if the actions in the cycle so far have brought the user closer to the goals.
Norman's model, along with the structure of the analytic evaluation method called the cognitive walkthrough (Lewis et al., 1990), had an essential influence on our Interaction Cycle. Both ask questions about whether the user can determine what to do with the system to achieve a goal in the work domain, how to do it in terms of user actions, how easily the user can perform the required physical actions, and (to a lesser extent in the cognitive walkthrough method) how well the user can tell whether the actions were successful in moving toward task completion.
21.2.2 Gulfs between User and System
Originally conceived by Hutchins, Hollan, and Norman (1986), the gulfs of execution and evaluation were described further by Norman (1986). The two gulfs represent places where interaction can be most difficult for users and where designers need to pay special attention to designing to help users. In the gulf of execution, users need help in knowing what actions to make on what objects. In the gulf of evaluation, users need help in knowing whether their actions had the expected outcomes.
Figure 21-1
Norman's (1990) stages-of-action model, adapted with permission.
The gulf of execution
The gulf of execution, on the left-hand side of the stages-of-action model in Figure 21-1, is a kind of language gap-from user to system. The user thinks of goals in the language of the work domain. In order to act upon the system to pursue these goals, their intentions in the work domain language must be translated into the language of physical actions and the physical system.
As a simple example, consider a user composing a letter with a word processor. The letter is a work domain element, and the word processor is part of the system. The work domain goal of "creating a permanent record of the letter" translates to the system domain intention of "saving the file," which translates to the action sequence of "clicking on the Save icon." A mapping or translation between the two domains is needed to bridge the gulf.
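Restated as data (a hypothetical shape, invented only to make the translation explicit), the same example maps a work-domain goal through a system-domain intention down to an action sequence:

```typescript
// Hypothetical sketch of bridging the gulf of execution for the letter example.
const letterExample = {
  workDomainGoal: "create a permanent record of the letter",
  systemIntention: "save the file",
  actionSequence: ["move the pointer to the Save icon", "click on the Save icon"],
};
console.log(letterExample);
```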