Thursday, October 6, 2016

Task Performance Indicator

In an increasingly complex world, it has become harder to quantify customer experience and satisfaction. For most tech companies, it is no longer enough for new software to simply feel simpler and faster to its users: every element of design and user experience must be quantified, or it is deemed useless. That demand is what makes customer experience so hard to measure. Previously, products were generally tested on a handful of consumers, and that was thought to be good enough to reveal any flaws or areas for improvement. Companies soon realized that such a small sample would not be representative of the whole population, so they started releasing beta versions of upcoming products to many more users and collecting feedback from that larger sample. Even so, there was still no systematic way to quantify user experience.

That all changed when Gerry McGovern, the founder of Customer Carewords, developed a remote testing method that measures the impact of changes on customer experience. He created what is called the Task Performance Indicator (TPI), a reliable and repeatable metric that assesses changes to an app, website, or any other product against a defined set of customer top tasks. The TPI gives the product a score out of 100, and it is repeatable over time: if nothing has changed in the product, its TPI score should be essentially the same six months later.

How the TPI works

The TPI assesses a product by presenting the user with a "task question". Once they understand what they have to do, the user will attempt to complete the task. At the end of the task, they must provide an answer to the question and say how confident they are in their answer.

The following are a number of factors that affect the TPI score:

  • Time: At the beginning of the task, a Target Time is established: the time in which the task should be completed under ideal conditions. The more a user exceeds this time, the lower the TPI score. 
  • Time out: The person takes longer than the maximum time allocated, which varies depending on the product being tested. 
  • Confidence: At the end of each task, the users are asked how confident they are in their answer. For example: low confidence in a correct answer would have a negative impact on the TPI score while high confidence in a correct answer would be positive. 
  • Minor Wrong: The person is unsure or their answer is almost correct. 
  • Disaster: The person has high confidence, but the wrong answer. Acting on this wrong answer could have serious consequences, so a disaster has a large negative impact on the TPI score. 
  • Gives up: The person gives up on the task. This will also have a large negative impact on the TPI score because it indicates that user experience is confusing, hard to understand, or frustrating to operate. 
A TPI of 100 means that the user has successfully completed the task within the target time and is confident in their answer.
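
To make the interplay of these factors concrete, here is a minimal Python sketch of how a single attempt might be scored. The weights, the confidence multipliers, the `Attempt` record, and the `attempt_score` and `task_score` functions are all assumptions invented for illustration; the actual TPI calculation is not given in this post and may differ considerably.

```python
from dataclasses import dataclass

# Illustrative weights only (assumed, NOT the real TPI formula).
OUTCOME_BASE = {
    "success": 100.0,     # correct answer
    "minor_wrong": 50.0,  # unsure, or almost correct
    "timeout": 0.0,       # exceeded the maximum time
    "gives_up": 0.0,      # abandoned the task
    "disaster": 0.0,      # confident but wrong
}
CONFIDENCE_FACTOR = {"high": 1.0, "medium": 0.9, "low": 0.75}


@dataclass
class Attempt:
    """One participant's attempt at one task question (hypothetical record)."""
    outcome: str      # a key of OUTCOME_BASE
    seconds: float    # time the participant took
    confidence: str   # a key of CONFIDENCE_FACTOR


def attempt_score(attempt: Attempt, target_seconds: float) -> float:
    """Score one attempt from 0 to 100 under the assumed weights."""
    base = OUTCOME_BASE[attempt.outcome]
    # No time penalty within the target time; beyond it, the score
    # shrinks in proportion to how far the target was exceeded.
    time_factor = target_seconds / max(attempt.seconds, target_seconds)
    return base * time_factor * CONFIDENCE_FACTOR[attempt.confidence]


def task_score(attempts: list[Attempt], target_seconds: float) -> float:
    """Average the attempt scores for one task question."""
    return sum(attempt_score(a, target_seconds) for a in attempts) / len(attempts)


if __name__ == "__main__":
    on_time = Attempt("success", seconds=30.0, confidence="high")
    slow = Attempt("success", seconds=120.0, confidence="high")
    print(attempt_score(on_time, target_seconds=60.0))  # 100.0
    print(attempt_score(slow, target_seconds=60.0))     # 50.0
```

With these assumed weights, a confident, correct answer inside the target time scores 100, and taking twice the target time halves the score.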

Here is an example of what a TPI score result looks like.

In this case, the product received a TPI score of 61, which is considered fair. The pie chart shows how often users fail to achieve what they need to do, which is 29% of the time. The biggest problem, however, is the time it takes to complete these tasks: 3.5 times the target time! This means that people can figure out how to use this product, but it takes them far more time than it should. Unless the developers fix this problem, the product will remain merely "fair". 
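
The two headline figures in a result like this, the failure rate and the ratio of actual time to target time, can be derived from raw attempt records. The sketch below assumes each attempt is stored as a `(succeeded, seconds)` pair; the `summarize` function and the sample data are made up for illustration and are not the data behind the chart above.

```python
def summarize(attempts, target_seconds):
    """Return (failure_rate, time_ratio) for a list of (succeeded, seconds) pairs.

    failure_rate: share of attempts that did not succeed.
    time_ratio:   mean time of successful attempts divided by the target time,
                  or None if nobody succeeded.
    """
    failures = sum(1 for succeeded, _ in attempts if not succeeded)
    success_times = [seconds for succeeded, seconds in attempts if succeeded]
    failure_rate = failures / len(attempts)
    if success_times:
        mean_time = sum(success_times) / len(success_times)
        time_ratio = mean_time / target_seconds
    else:
        time_ratio = None
    return failure_rate, time_ratio


if __name__ == "__main__":
    # Hypothetical data: four attempts at a task with a 60-second target.
    sample = [(True, 120.0), (True, 180.0), (False, 300.0), (True, 150.0)]
    failure_rate, time_ratio = summarize(sample, target_seconds=60.0)
    print(f"failure rate: {failure_rate:.0%}, time ratio: {time_ratio}x")
```

In this made-up sample, one attempt in four fails and the successful attempts take 2.5 times the target time on average.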

Now that we know how to analyze the results, we come to the second, and perhaps most important, part of the equation:

How to Develop Task Questions

Well-designed questions are needed to provide useful data on which the developers can base their changes. Usually, around a dozen different task questions are tested for each product, with each focusing on a particular aspect of the product. Here are a few guidelines to keep in mind when designing these tasks: 

  • Based on customer top tasks. The tasks chosen must be ones that are in high demand from users. If you try to improve the performance on tiny tasks, ones which are in low demand from users, you may actually cause a decline in the overall experience. Do not focus on every single tiny detail of the product, instead focus on the most used parts of the product. 
  • Repeatable. Create task questions that you can test again in the future. This is the only way you will be able to see the results of your changes and whether they were useful improvements or not. 
  • Representative and typical. The task questions shouldn't be particularly difficult. They should be something every user will come across at least once in their experience with the product. 
  • Universal. Everyone can do it. Every test participant must be able to do each task. Use this as a way to check whether you've chosen representative and typical tasks, because everyone will be able to do those. There is no point testing something that only someone with specific skills can do because it means the product has not been made properly for its target users. 
  • One task, one unique answer. There should be exactly one thing the user is trying to do, and exactly one correct answer. This makes for easy comparisons of results. 
  • Does not contain clues. The test user should not be able to find clues that will assist them in the completion of the task because that blurs results as not every user might use the clue or even see it.
  • Short. The participant is seeing each task question for the first time, so aim to keep the task short and its answer concise. 
  • No change within testing period. The product should not undergo changes while you are testing as that will lead to useless results that cannot be compared with each other. Wait until you've finished testing to make changes and improve the product. 
Here is an example:

The top tasks for customers of the Organization for Economic Cooperation and Development (OECD), an economic and policy advice organization, are as follows:
  1. Compare country statistical data.
  2. Retrieve statistics on a particular topic.
  3. Access and review working papers. 
Based on these top tasks, and using the aforementioned guidelines, the following task questions were developed. Note how every single guideline mentioned above is present in each of these questions.
  1. In 2008, was Vietnam on the list of countries that received official development assistance?
  2. Did more males per capita die of heart attacks in Canada than in France in 2004?
  3. Find the title of the latest working paper about improvements to New Zealand's tax system. 
(This information was retrieved from the following website: and is used here solely as an example.)

This brings us to the actual testing situation:

How to Run the Test

Testing 10-12 task questions usually takes about an hour, and you should repeat the process with numerous participants, ideally more than 15.
The beauty of the TPI method is that there is no need to assemble the participants in a lab to perform the testing; it can be done remotely. The data collected would also be more accurate because people will act more naturally in their own environment. This also reduces costs and makes the testing process more accessible to users as it is easier for them to give an hour of their time rather than spend a whole morning at a lab.
One can deliver the instructions over a Skype call, leave the participant to work on the task, and reconnect once they have finished or time has been called.

What to Do with the Results

For large companies, data like this indicates how efficient their product will be, and that translates into how many people will use it and, ultimately, into profits. Obviously this is an integral part of the company's work, and it may seem distant from our world.
What many people don't realize is how easily the TPI can be applied to our everyday work in technology. As an example, anyone taking I.T. courses at our school will know that there will be project work at some point in the course, extensively if your teacher is Mr. Dodemont. From my personal experience, he places a lot of value on the documentation and testing of your product. He expects very organized and professional testing criteria created by the students, and extensive evidence that testing was done in a thorough and organized manner. This could be hard for students who struggle to identify ways to do that.
This is where they could use the TPI system of testing, as it is easy to create and execute, yet offers reliable, repeatable data that is easy to present. It would also be very easy to show the growth and improvement of a project with the TPI system.
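
As a rough sketch of how that growth could be shown, the Python snippet below compares per-task scores from two rounds of testing. The `score_changes` function, the task names, and the scores are all hypothetical, chosen only to illustrate the idea of re-running the same task questions after making changes.

```python
def score_changes(before, after):
    """Return the change in score for every task tested in both rounds."""
    return {task: after[task] - before[task] for task in before if task in after}


if __name__ == "__main__":
    # Hypothetical scores for a school project, before and after a redesign.
    round_one = {"log in": 40, "submit homework": 70}
    round_two = {"log in": 65, "submit homework": 68}
    for task, change in score_changes(round_one, round_two).items():
        print(f"{task}: {change:+d}")
```

Because the task questions stay the same between rounds, a positive change suggests the redesign helped on that task, while a negative one flags something to look at again.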

Next time we are unsure whether our projects will work effectively, we can fall back on the Task Performance Indicator to find out!

1 comment:

  1. Very specific informations. I like the image of the chart, but you can make that image a little bigger so it is easier to see.