UX Evaluation: User Testing
If there are 2 user profiles, how many participants are needed?
- 12 participants - 5 participants per profile to cover the hypothesis, plus 2 extra to account for no-shows
If there is only one user profile how many participants are needed?
- 7 participants, as there need to be at least 6 to test the hypothesis and one more to account for no-shows
How many participants are needed for correlational studies, and how many usability issues do they identify?
- 8 to 10 participants identify about 80 percent of usability issues
How to calculate the NPS
- Calculate the difference between the percentage of promoters and the percentage of detractors - the promoters are consumers who give a score of 9 or 10 - the detractors are those who give a score of 0 to 6 - the undecided (passives) give the company a score of 7 or 8 - the NPS is presented as an absolute number between -100 and 100
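The NPS calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function name `net_promoter_score` and the sample ratings are made up for the example.

```python
def net_promoter_score(ratings):
    """Return the NPS for a list of 0-10 'would you recommend' ratings."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)   # scores of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # scores of 0 to 6
    # Scores of 7 or 8 count toward the total but toward neither group.
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical ratings from ten participants: 4 promoters, 3 detractors.
ratings = [10, 9, 9, 8, 7, 6, 3, 10, 5, 8]
print(net_promoter_score(ratings))  # 40% promoters - 30% detractors = 10.0
```

The result is reported as a plain number (here 10), not as a percentage.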
Correlational studies
- Carried out with a homogeneous sample of participants who must perform a certain number of tasks with the evaluation object - can't claim causation
Performance
- NASA TLX - good to poor - how successful were you in performing the task - how satisfied were you with the performance
effort
- NASA TLX - low to high - how hard did you have to work (mentally or physically) to accomplish your level of performance
frustration
- NASA TLX - low to high - how irritated, stressed, or annoyed were you with the content
Mental demand
- NASA TLX - low to high - how much mental and perceptual activity was required - was the task easy or demanding, simple or complex
Physical demand
- NASA TLX - low to high - how much physical activity was required - was the task easy or demanding, slack or strenuous
Temporal demand
- NASA TLX - low to high - how much pressure did you feel due to the pace at which the task occurred - was the pace slow or rapid
Moderator
- Neutral person who conducts the user test sessions. They lead the sessions and give instructions to the participant to complete the user test tasks - observes the reactions and actions of the participant - this allows contextual information to be gathered - at the end they conduct an interview in which the participant is asked about their overall impressions and experiences. A question guide is used to make sure each interview is consistent
Is there one magic structure that you should use when writing the mandate?
- No, it's up to you and your experience to structure the sections of the report and, if necessary, add additional sections
context of use
- Refers to the users, tasks, equipment, and the physical and social environment in which a product is used - very different contexts (who the users are and where they are) add usability constraints - the usage contexts have to be anticipated to ensure that the interface is adequately effective in the contexts users are expected to be in
What are the advantages of the SAM scale?
- The SAM scale is highly correlated with implicit measures and captures part of the user's automatic and unconscious reaction - it is useful for quickly measuring the perception of emotions
What measures are used to measure someone's cognitive load when using an interface?
- The NASA TLX - The customer effort score (CES)
NASA TLX
- The scale was developed in 1988 by Hart and Staveland for NASA to measure the mental load perceived by users after interaction with a technology - it is used at the end of each task or at the end of the test - there are six dimensions: mental demand, physical demand, temporal demand, performance, effort, and frustration level
Are there any rules governing how many words should be used in a stand-alone report?
- There aren't; the focus should be on making sure that the report makes sense on its own, without seeing the presentation - this includes taking advantage of boxes and footnotes to add details essential to understanding your analysis
Guerrilla testing
- Used to conduct fast and inexpensive user tests - use it in the formative mode to get low-cost feedback as often as possible - could lead to wrong conclusions and be more costly in the long run
What are the different parts of the mandate
- a description of the object of the evaluation - the goals of the evaluation - the research questions - the user groups targeted by the evaluation - the experience boundaries - the format of the deliverables
inductive reasoning for classifying info
- appropriate where prior knowledge concerning the topic is limited or when the technological field is not mature enough - categories or themes emerge during the analysis of the data by the UX researcher - this is the approach that is most often used
inductive reasoning coding
- begins without predetermined categories - used for exploratory research - first, read all the collected observations through once, fully, to learn the vocabulary the person uses - on the second reading, the researcher will gradually see different themes emerge - the researcher will create as many themes as they think are necessary - finally, review the different themes and merge any if necessary
What are the four main categories of measure
- behavioral - cognitive and emotional state measures - perceptual and attitudinal measures - measure of intention
What are the three main types of experimental design
- between subject design - within-subjects design - correlational design
How is the measure of success coded for analysis?
- binary values with only two modalities (0 = failure, 1 = success) - sometimes it is useful to compare partial success, in which case you can code 0 = failure, 1 = partial success, 2 = success - Preference: if you want to compare interface A vs. B, there has to be a binary variable: if the participant prefers A, code 1; if they prefer B, code 0
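As a rough sketch of how these codes could be summarized, the snippet below computes a success rate from binary codes and tallies a three-level coding. The function names and the sample data are hypothetical and only illustrate the coding scheme described above.

```python
from collections import Counter


def success_rate(outcomes):
    """Share of successes for binary codes (0 = failure, 1 = success)."""
    if any(code not in (0, 1) for code in outcomes):
        raise ValueError("binary coding expects only 0 or 1")
    return sum(outcomes) / len(outcomes)


def outcome_counts(outcomes):
    """Tally a three-level coding (0 = failure, 1 = partial success, 2 = success)."""
    counts = Counter(outcomes)
    return {code: counts.get(code, 0) for code in (0, 1, 2)}


# Hypothetical task results for eight participants (binary coding).
print(success_rate([1, 0, 1, 1, 0, 1, 1, 1]))  # 6 successes out of 8

# Hypothetical results for six participants (three-level coding).
print(outcome_counts([2, 1, 0, 2, 2, 1]))
```

The same `success_rate` function works for a preference variable coded 1 = prefers A, 0 = prefers B, in which case it returns the share of participants who preferred A.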
How do you report the findings of the tests?
- by the severity of the need or problem that was found - there is no objective test to establish what is most critical or severe; you have to use your judgement to determine this - it's important to use examples from the data to show the severity
What consequences does not testing a product have?
- can have serious economic consequences for an organization - research has shown that discovering and correcting problems very early on in the product development costs 100 times less than fixing the problem while the product/service is already in production
What can happen if the evaluation mandate doesn't clearly define target groups
- can lead to very heterogeneous results - will be hard to make concrete recommendations to the developers - lead to wrong conclusions and lead the team in the wrong directions
What should the participant be informed with before they start?
- clear instructions for the task at hand - if the task has a time limit, inform the participant - anticipate situations that may arise and give the moderator instructions for them in the protocol
How to clearly establish the object of the evaluation
- clearly identify the artifact to be evaluated by saying the formal name and version - specify on which software platforms the evaluation will be carried out - specify the functions or function enabled by the artifact
What are the steps of the scientific method
- coming up with research questions - lit review - formulate hypothesis - test hypothesis through experiment - data is analyzed to see if it confirms the hypothesis - leading to answers to the research questions
How should headlines be written
- like a newspaper title: they are descriptive, giving you an idea of what the content is going to be, and they should also be interesting - just writing "Results" is boring and doesn't make people want to read on
What type of variable is measurement of scales and where are they found
- discrete and ordinal between two limits
What is important to mention during the intro of user testing?
- duration of the study - how much they will be paid - that they are allowed to ask questions
within-subjects design
- each subject is consecutively and randomly exposed to different experimental manipulations.
What are the 12 dimensions of WebQual
- ease of understanding - intuitive operations - response time - trust in website - tailored communications - informational fit-to-task - emotional appeal - visual appeal - innovativeness - consistent image - online completeness - relative advantage
The participant selection should be in line with the what?
- evaluation objective - the objective is not always to test all user groups as it is important to choose only the user group that will provide answers to the research questions
What should the test plan include?
- experimental design - profiles of participants and recruitment strategy - the interfaces being evaluated - scenarios and tasks - the test environment - tools and measures - the detailed protocol
What are the two main ways to conduct evaluation
- expert evaluation: based on the judgement of one or more evaluators who assess a system to identify potential usability issues, including deviations from established performance criteria - user-based: participation of representative users performing tasks to identify usability issues or address other dimensions - conformity assessment: used to demonstrate that specified requirements for a process or system are met
What are the three types of measurements
- explicit - implicit - think aloud
When and how should content transcription happen?
- ideally it should happen in real time, with two people taking notes and comparing them afterwards - it is important to do this as close to the actual session as possible so as not to forget context
When to do a two-tailed test
- if we hypothesize that there is a difference between two phenomena without assuming the direction of the difference
When is one tailed test needed
- if we hypothesize not only that there is a difference, but also its direction (that one value will be smaller or larger than the other)
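To make the one-tailed versus two-tailed distinction concrete, here is a small sketch using an exact binomial test on a preference count (how many of n participants prefer interface A, with "no preference" as the null hypothesis). The choice of the binomial test, the function name `binomial_p_values`, and the numbers are assumptions for illustration only; they are not taken from the course material.

```python
from math import comb


def binomial_p_values(k, n):
    """Exact p-values for k 'prefers A' answers out of n, under H0: p = 0.5.

    Returns (one_tailed, two_tailed). The one-tailed value tests the
    directional hypothesis "A is preferred more often than B".
    """
    upper = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n  # P(X >= k)
    lower = sum(comb(n, i) for i in range(0, k + 1)) / 2 ** n  # P(X <= k)
    one_tailed = upper
    # With p = 0.5 the distribution is symmetric, so the two-tailed value
    # is twice the smaller tail, capped at 1.
    two_tailed = min(1.0, 2 * min(upper, lower))
    return one_tailed, two_tailed


# Hypothetical result: 10 of 12 participants prefer interface A.
one_tailed, two_tailed = binomial_p_values(10, 12)
print(round(one_tailed, 4), round(two_tailed, 4))
```

The two-tailed p-value is larger because it also counts equally extreme results in the opposite direction, which is why a directional hypothesis should be stated before looking at the data.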
Eye tracking
- implicit - behavioral - measures the user attention during the task
The feature usage rate
- in some cases it may be useful to measure participants actions - only rely on video recording of the sessions and encode the results afterwards
The preference test
- in the context of an evaluation, choices are often about alternative designs - different versions are presented and at the end participants have to say which one they prefer - this is often linked to the success rate and completion time - represented as a ratio - behavioral measure of preference
Self-Assessment Manikin
- is a discrete non-verbal pictographic scale developed by Bradley and Lang in the 1980s - it measures three constructs: valence, arousal, and dominance - valence is defined as the directionality of the emotion (from unhappy to happy) - arousal is the intensity of the emotion (from non-excited to excited) - dominance is the participant's level of control during the interaction (from controlled to in control)
Customer satisfaction score
- is a tool to measure satisfaction through a single question - clear and quick to answer, and rarely skipped by participants - example question: are you satisfied with our product / the answers provided by our employees / the service rendered? - discrete 5-point scale that goes from very satisfied to not satisfied at all
Net Promoter Score
- is one of the most common measures of customer experience used by businesses around the world - it is used to measure user loyalty - used before or after the task or test
What is the role of interview guide
- is to collect additional info during the experiment
Does there have to be compensation?
- it depends on what is being required of the participants - if the session is short, it's possible to get people to participate without compensation - but for a one-hour session it is best to offer compensation, which doesn't have to be monetary
Will a formal report and oral presentation always be required?
- it typically depends on what is required by the client and organization, and on how the results will be used - when you're doing an evaluation as part of a development project, a report in the form of a presentation will be requested - the report can be informal as well
Why choose a correlational design
- it's a matter of time and resources - the test budget might not be enough to recruit different groups of participants - can be used as an alternative to experimental designs, especially for formative tests when the project is still in development
How many groupings should an affinity diagram lead to?
- less than 10
How to enter time measurements
- limited to upper bound by maximum time allowed for a task - converted to seconds
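A minimal sketch of this data-entry rule: convert each recorded time to seconds and cap it at the maximum time allowed for the task. The function name and the assumption that times are logged as minutes plus seconds are illustrative only.

```python
def to_capped_seconds(minutes, seconds, max_seconds):
    """Convert a recorded time to seconds, limited by the task's time cap."""
    total = minutes * 60 + seconds
    # A participant who ran out of time is entered at the upper bound.
    return min(total, max_seconds)


# Hypothetical task with a 5-minute (300-second) limit.
print(to_capped_seconds(2, 30, 300))  # finished in 2 min 30 s
print(to_capped_seconds(6, 10, 300))  # exceeded the limit, entered as the cap
```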
Research questions
- linked to the tasks that the user performs on the interface and should be drafted with an action verb - clearly operationalized
Recruiting very specific individuals for an evaluation can be what (3 words)
- logistically complex - time consuming - costly
What does the correlational design make it possible to do?
- make some correlations within the sample to outline avenues for further reflection
What is a requirement for an experiment
- manipulation: an evaluation where the characteristics of the participants and the interface are not manipulated is not an experiment - in short, without manipulation it is not possible to have a formal hypothesis or to rigorously measure the effect on a dependent variable
The customer effort score
- measures the user's effort to perform an interaction with the object being evaluated - example question: how much effort did you have to put into using our digital service/application? - discrete five-anchor scale: very weak, weak, neutral, strong, and very strong - to compute it, divide the sum of the CES of all of the participants by the total number of participants
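The CES average can be sketched as below. Mapping the five anchors to the numbers 1 to 5 is an assumption made for this example (the notes only name the anchors), and the function name and answers are hypothetical.

```python
# Assumed numeric coding of the five anchors (not specified in the notes).
CES_ANCHORS = {
    "very weak": 1,
    "weak": 2,
    "neutral": 3,
    "strong": 4,
    "very strong": 5,
}


def customer_effort_score(answers):
    """Average CES: sum of all participants' scores divided by their number."""
    values = [CES_ANCHORS[answer] for answer in answers]
    return sum(values) / len(values)


# Hypothetical answers from four participants.
print(customer_effort_score(["weak", "neutral", "strong", "strong"]))
```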
Where should user testing be conducted
- must take place in a controlled space so the external stimuli will not disturb the user - this won't replicate the exact experience of being in the environment but it's possible to conclude that the issues would be worse in real context
What measures is the CES usually used with?
- net promoter score - customer satisfaction score
What are the two scores that measure intention
- net promoter score or NPS - customer satisfaction score CSAT
Is it necessary to write the significance, the name of the test, and other statistical details
- no, at least not in the main body of the report; instead an asterisk can be used so that people can see the level of significance in a footnote - it's also common not to include the test name or other statistical details, as they can clutter the report
Is 7 to 10 participants enough to test a hypothesis
- no, it isn't; there need to be at least 12
Should you always use all 12 dimensions of WebQual
- no, as it depends on the context of the study
Is it common to make a complete textual analysis
- no, it is not; instead it's important to pick the important insights from the interview rather than capture every word of the user
In a within-subject design are all the participants exposed to all of the conditions
- no, there are going to be cases where there are too many conditions for each participant to go through - this could be impossible in general, but especially since a usability test should never go on for over an hour, to avoid fatigue
Need
- prerequisite identified as necessary for a user or a set of users to achieve a desired result, implicit or declared, in a specific context
What are ways to reduce systematic bias?
- pre-registration process via an online form that you can advertise on different platforms - put a short list of questions to qualify participants and thus generate diversity among your groups of participants
How should you end the report
- provide the client with food for thought on future research questions that would deepen or develop understanding of the user experience
What type of data is quantitative analysis used for
- psychometric-type data such as standardized measures scales and behavioral measures like time and success
What reminders containing instruments should be included
- related to the measurement tools - if you are using a camera, it is important to indicate in the detailed log when the recording will start - if you are using oculometric or neuropsychological measurements, provide instructions for mounting and calibrating these instruments
Why is it good not to have too many target groups
- requires more participants which may require more financial and other resources
The user testing process
- start from the research aim and associated questions - through stakeholder interviews you will understand what they want to test - form a hypothesis - test the hypothesis using a usability study - analyze the results and see if the data supports the hypothesis - the observations will be converted into recommendations listed according to severity
What biases is your perception subjected to
- retrospective bias: it can be difficult for the user to accurately recall the interaction
What is a contextual interview
- seeks to understand the why behind specific actions or reactions from the participants
What should interviews be
- short - focused
deductive reasoning categorization
- start with known theories and preconceived codes/ categories - the goal is to build upon a known typology
What might a participant do before they begin the task?
- take a pre-test survey - link to the survey should be included in the protocol - may be useful to have a PowerPoint slide with the instruction and link - if there are multiple surveys there should be identifiers to connect the surveys with the person
How to calculate the WebQual results
- take the average of the three questions corresponding to each of the dimension
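As an illustration of this scoring rule, the sketch below averages the three item ratings that belong to each dimension. The function name, the two dimensions shown, and the ratings are hypothetical sample data; a real study would include whichever of the 12 dimensions it uses.

```python
from statistics import mean


def webqual_dimension_scores(responses):
    """Map each dimension name to the average of its three item ratings."""
    scores = {}
    for dimension, items in responses.items():
        if len(items) != 3:
            raise ValueError(f"{dimension!r} should have exactly three items")
        scores[dimension] = mean(items)
    return scores


# Hypothetical ratings from one participant for two dimensions.
responses = {
    "visual appeal": [6, 7, 5],
    "response time": [4, 4, 7],
}
print(webqual_dimension_scores(responses))
```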
What could happen if the target group sample size is small
- the company has to be able to put the necessary resources into recruiting and know that this can be the most costly part
What happens when the experience boundaries are hard to define
- the experience may not start or end with the participants using the product/service as they may have to research, open the box, and put it together for example - there could be various different end points - with both of these it should be established with the stakeholders what they want the start and end point to be
What does the experimental design identify
- the manipulated factors or independent variables (for example, versions A and B of an interface) - as well as the dependent variable on which the independent variable is likely to have an effect (for example, the success or failure of a task by the user)
Where should the important findings and recommendations be put in the report
- the most important findings should be presented early on in the executive summary or highlight sections
Why is the deductive approach easy for coding
- the researcher uses an already existing coding scheme with different categories - a revision will take place to adapt it to fit the objectives - will measure how many times a theme appears and compare the counts - typically used to verify an existing hypothesis
What measures are used to measure someone's emotions when using an interface
- the self-assessment manikin (SAM) Scale - The Affective Slider
What do participant tell in the interview
- their thoughts - their opinions - what they believe - their value - what motivates them - why they did what they did
If there are three user profiles how many participants are there?
- there need to be 4 participants for each user profile - since there is overlap, there don't have to be as many participants per profile - there have to be 2 extra participants - 14 in total
How to identify tasks that the users will perform during the evaluation
- they should be directly related to the objective and the research questions - keep in mind the expected duration of the evaluation, as this tells you how many tasks you can do
What are the inclusion and exclusion criteria
- to identify what are the characteristics that will determine whether or not user is eligible for the usability tests
What are the steps to making a affinity diagram
- transcribe unitary ideas, that is, each category of idea, on Post-it notes. These are ideas that were recorded verbatim. Note how many times each occurred to keep in mind how important each one is - group these based on higher-level concepts and give a name to the larger categories
What is the challenge in carrying out user tests? What is the solution
- user testing typically has a very small sample size, but statistics typically need a large sample size to make assumptions about the data. This is why non-parametric statistics are used, because they don't make those assumptions
What does testing allow us to do
- uncover usability issues - provide ideas on how to improve the interface and functionalities of the product/service - will ensure that the users will want to use it again
Is usability a binary measure and explain?
- usability is not a binary measure and it's a gradient
What can result from systematic bias?
- usability issues can be missed because the test wasn't done with the right users
How to reflect the voice of the user in the report?
- use excerpts from your participants' verbatim to support the results or give an example of their reaction - use the exact words of the participants even if the vocabulary or syntax isn't great - give the participant's number or first name to identify them
What are the six dimensions of usability
- utility - effectiveness - efficiency - satisfaction - learnability - accessibility
Affective slider
- was developed in 2016 by Betella and Verschure with the objective of replacing the SAM scale in online surveys - two slider controls are used to allow the users to indicate their level of pleasure and arousal by dragging the cursor on the response bar
What are the three core elements?
- what are the typical tasks engaged in by your target users - what is the typical duration or desired duration to complete those tasks - what are the success criteria associated with each of those task
Sources of info for interview guide
- what does the mandate want the test to clarify - improve the questions with info from the unofficial heuristic review that was conducted for the study
What can systematic bias result from?
- when you cut corners and choose convenience it's possible that the users can be too similar that could skew the results - the participants may not be representative of the population
It is useful to have an interface image? If so, how should it be used?
- yes, the report should use arrows to point to the different elements of interest for which you have recommendations
System Usability Scale (SUS)
- developed by Brooke - used to evaluate the perception of an application after its use - it has 10 items with 5 anchor points - normative nature
What are the two different ways that content analysis can be used? Depending on what?
-inductive and deductive - context and the maturity of what is being studied
How many more participants should recruit to account for the no shows?
10-20 percent or 14 participant
What is the general rule for one hour long test
3-5 tasks
How long should interviews be
5 to 15 minutes long
_______ to _______ will discover what percent of usability issues
7 to 10 80 percent
User Interface element
A basic component of a user interface that is presented to the user by the interactive system
How many questions does the interview have
A few questions
Usability Defect
A product attribute that leads to a mismatch between user intention and/or user actions and the system attributes and behavior
content analysis
Qualitative research method that allows for subjective interpretation of written or oral documents through a systematic classification process. The goal is to group them into different themes. There is typically a lot of detail under each theme, but it depends on the objectives of the study
User
Refers to the person who interacts with a system, product, or service.
Perception
Refers to the user's perceived experience: the conscious experience that the user processes cognitively and is able to verbalize and explain
quantitative analysis
Represents the second type of analysis necessary to carry out a rigorous evaluation
Modal window
Requires users to interact with it in some way before they can return to the system
What comes after the objective of evaluation
Research questions
A UX evaluation is first the study of an interface using _________________
Scientific approach
Measure of intention
Seek to understand future actions in relation to the experience and what is being evaluated. Measured before and after a task
Message box
Small window that provides information to users and requires them to take action before going forward
Usability is measured for a population of user with
Specific characteristics
What is the terms of reference
Specifies how you and your team members will work together to accomplish common goals. Creates shared expectations but also accountabilities among the team members
How to calculate and analyze: the SUS
For odd-numbered items, subtract 1 from the response. For even-numbered items, subtract the response from 5. Add the recalculated items. Multiply the sum by 2.5
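The SUS scoring can be sketched as follows, using the standard rule that odd-numbered items contribute (response - 1) and even-numbered items contribute (5 - response), with the sum multiplied by 2.5. The function name `sus_score` and the sample responses are illustrative.

```python
def sus_score(responses):
    """Compute a 0-100 SUS score from ten 1-5 responses in item order."""
    if len(responses) != 10:
        raise ValueError("the SUS has exactly 10 items")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("each response must be between 1 and 5")
        if item_number % 2 == 1:
            total += response - 1  # odd items: positively worded
        else:
            total += 5 - response  # even items: negatively worded
    return total * 2.5


# Hypothetical participant who answered 3 (the midpoint) on every item.
print(sus_score([3] * 10))  # 50.0
```

Each item contributes 0 to 4 points after recoding, so ten items give 0 to 40, and the 2.5 multiplier stretches that to the 0 to 100 range.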
What is important to avoid no matter what the recruitment method
Systematic bias
What is the next step after the evaluation purpose
Terms of reference
satisfaction
The absence of discomfort and a positive attitude towards the use of the product/service
Usability
The degree to which a digital product or service can be used by users in order to achieve specific goals with effectiveness, efficiency, and satisfaction for a product/service in a specific context
Effectiveness
The precision and level of completion with which users can achieve a specific goal
Objective of evaluation
The purpose is not to explain the specific problem while using the object. But a specific objective which will make it possible to identify important issues. To justify the investment that needs to be made. It should include one or several of the usability attributes
Efficiency
The resources devoted to the accuracy and level of completion with which users reach the objectives
After the introduction, what should be presented as soon as possible
The test environment in order to foster as much trust as possible. Depending on the environment and the tools used it could intimidate the participants. If there are cameras be sure to point all the cameras out. If there is a one-way mirror be sure to explain who is behind the mirror.
Note taker
Having one is also recommended; they note observations during the test
Counterbalancing
This is where all participants are exposed to all the conditions, but in different orders, with an equal number of participants being exposed to each order.
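One way to build such an assignment is full counterbalancing: list every possible order of the conditions and hand them out in rotation so each order gets the same number of participants. This is only one possible scheme, and the function name and participant labels below are made up for the example.

```python
from itertools import permutations


def counterbalanced_orders(conditions, participants):
    """Assign each participant one order of conditions, balanced across orders."""
    orders = list(permutations(conditions))
    if len(participants) % len(orders) != 0:
        raise ValueError(
            "the number of participants must be a multiple of the number of orders"
        )
    # Rotate through the orders so each one is used equally often.
    return {
        participant: orders[index % len(orders)]
        for index, participant in enumerate(participants)
    }


# Hypothetical within-subjects test of interfaces A and B with four participants.
plan = counterbalanced_orders(["A", "B"], ["P1", "P2", "P3", "P4"])
for participant, order in plan.items():
    print(participant, " then ".join(order))
```

With many conditions the number of possible orders grows quickly (3 conditions give 6 orders, 4 give 24), so full counterbalancing is mostly practical for two or three conditions.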
Why is it important to have a standard speech?
To ensure consistency in the information communicated to the participants before the test
What is the aim of the evaluation
To highlight usability issues and ways to improve the user experience to ensure uninterrupted use of the interface - the usability of the system may lead to usability errors, which will be judged on their severity
What is purpose of the interview
To know why the participant did what they did
Why do participant complete a post task survey
To measure the perception, attitude, and state of the individual in relation to the task
What can content analysis be used to analyze
Virtually any type of qualitative UX data methods
SUS Score less than 25
Worst imaginable
Are quantitative and qualitative data necessary to have complete perspective
Yes, an assessment that does not use these two types of data would be incomplete and might not be able to properly meet the needs of the mandate
Should you state the limits, and if so, where?
Yes, as all studies have limits, the client should know what yours were. In the conclusion there should be a reminder of the methodological choices and the limits
Behavioral measures
aim to capture the user actions and choices during a task
SUS score 40 to 52
Acceptable
Evaluation mandate
Aims to define the objective and the scope of the technological interface to be evaluated. This is crucial step in the preparation of an evaluation project. A poorly written mandate can have serious consequences on the process
User Interface
All components of an interactive system that provides information and controls for the user to accomplish specific tasks with the interactive system
Checkboxes
Allow the user to select one or more options from a set.
What does choosing to compare different experiences for the same participant allow
Allows you to answer questions relating to intra-participant designs. This type of design leads to data organized in pairs, i.e. each participant is associated with at least two measures. For example, comparing the completion time of two different interfaces with the same group of participants
Text fields
Allow the user to enter text. They can accept a single line or multiple lines of text
Dropdown list
Allows users to select one item at a time, but is more compact, allowing you to save space.
Breadcrumbs
Allow users to identify their current location within the system by providing a clickable trail of preceding pages to navigate by
Tooltip
Allows users to see hints when they hover over an item indicating the name or purpose of the item
Date picker
Allows users to select a date and/or time. By using a picker the information is consistently formatted and input into the system
List boxes
Allows users to select multiple items at a time but are more compact and can support a longer list of options if needed
What does choosing to compare the user experience between two groups of participants allow
Allows you to answer questions related to inter-participant designs. This type of design leads to obtaining independent measures (each participant is associated with only one measure). For example, comparing the completion time of the same interface using two groups of participants
What does the ux calculator allow you to do?
Allows you to run statistical tests tailored to the main analyses that a UX professional will have to perform as part of assessment
Radio buttons
Are used to allow users to select a single item at a time.
What does the International Organization for Standardization define user experience as
A person's perceptions and responses resulting from the use or intended use of a product
Perceptual and attitudinal measures
Assess the participant's subjective appreciation of the experience and the object being evaluated. This can be assessed before and after a task
Between subject design
Assigns each participant to a single experimental condition and compares the results between groups. In other words, participants only see one experimental condition. The average of the results obtained by all participants in one condition is compared with the average of the results obtained in other conditions
SUS score 26 to 39
Bad
What type of measure is success rate of task
Behavioral measure
SUS score 86 to 100
Best imaginable
Explicit measure
Best known and most frequently used. Aims to measure the phenomena while the users are consciously aware of them. Collected before the test, after the task, and after the test
Toggle
Button allows the user to change a setting between two states. They are most effective when the on/off states are visually distinct
Which of these evaluation objectives is not well defined and why? A. User satisfaction after a first experience B. The ability of users to quickly learn features C. The experience of a new application D. The accessibility of a website to a user with visual impairment
C. It's too vague: what do they want to know about the experience? What do we want to measure? This doesn't tell us anything. None of the 6 dimensions of usability is mentioned here.
implicit measures
Captured automatically and often unconsciously during these tests
Psychophysiological measures
Capture the participant's cognitive and emotional reactions using sensors
Type 1 error: how can this happen
Concluding that there are no issues when there are. The selection of users didn't really match the population, and the expertise of the users was too high relative to the targeted users
Type 2 error: examples
Concluding that there are issues when there aren't. The users selected are too novice and don't understand how to use the interface
Requirements
Conditions that must be met by a system, system component, product, or service to satisfy a contract, standard, or specification
Dropdown box
Consists of a button that, when clicked, displays a list of mutually exclusive items
What type of interview is used?
Contextual interview
What kind of variable are time measurements?
Continuous
What are the majority of evaluations in UX research?
Correlational studies
Cognitive and emotional state measures
Deal with the experience as lived and perceived by the participants. Assessed during and after the test.
Iterative approach
The development team interacts with the actual users of the digital product or service during the development phase to get their feedback and adapt the design accordingly.
What type of variable is the measure of success?
Discrete
pagination
Divides content up between pages and allows users to skip between pages or go in order through the content
The ISO CIF (ISO/IEC 25066:2016) format is often required when
During conformity assessments or by institutional clients
The scientific method is an ___________ approach to the development of knowledge based on _______________ and observation
Empirical; experimentation
SUS score 74 to 85
Excellent
What is the first step of the test plan?
Experimental design
Accessibility
Extent to which products, systems, services, and environments can be used by people from a population with the broadest range of characteristics and abilities to achieve a goal in a specific context of use
Learnability
Extent to which products or services allow users to quickly become familiar with them and make good use of all of their features and capabilities
Testing the experience of new digital products and services before making them available to users allows you to discover and correct a problem early in their development. This costs 10 times less than correcting the same problem when the product or service is already available. True or false, and why?
False: it costs 100 times less, not 10 times
An evaluation can be either _____________ or ______________
Formative or summative
SUS score 53 to 73
Good
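The four SUS bands that appear in these cards (Bad, Good, Excellent, Best imaginable) can be sketched as a simple lookup. The function name is hypothetical, and the sketch deliberately returns None for scores outside those four bands, since labels for the remaining ranges are not listed here.

```python
# Bands taken from the SUS cards in this set: (low, high, label), both ends inclusive.
SUS_BANDS = [
    (26, 39, "Bad"),
    (53, 73, "Good"),
    (74, 85, "Excellent"),
    (86, 100, "Best imaginable"),
]

def sus_label(score):
    """Return the adjective for a SUS score, or None if no listed band covers it."""
    for low, high, label in SUS_BANDS:
        if low <= score <= high:
            return label
    return None
```

For example, `sus_label(68)` falls in the 53 to 73 band and returns "Good", while `sus_label(45)` returns None because no card in this set covers that range.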
In what format should the number of participants who liked a UI or succeeded at a task be reported, and why?
As a ratio: the number of participants over the total number of participants tested. Percentages should be avoided because they suggest an inference about the total population, and the sample is often not random or big enough to support one.
Button
Indicates an action upon touch and is usually labeled using text, an icon, or both
Dialogue
Interaction between the user and an interactive system as a sequence of user actions and system responses in order to achieve a goal
What should the evaluation typically end with?
Interview to collect impressions and perceptions
Prototype
A representation of all or part of the interactive system which is limited in some way but can be used for analysis, design, and evaluation
Test plan
A tool that the UX researcher uses to organize and structure the evaluation. Its purpose is to document all activities needed for the test preparation, execution, and analysis. It must include all the assets and instruments needed to make the evaluation possible. The test plan should also define the environment in which the test will be run. It is also a means of communicating the test process to internal and external stakeholders. Defining the evaluation is also a way of getting feedback to improve the overall process.
Think Aloud method
An alternative to implicit measures. It involves asking users to describe out loud their emotional and cognitive reactions while performing a task. This method helps to understand the reasoning and motivation behind each user action. It is used in exploratory studies when seeking to understand the user's mental models. It tends to slow down the participants, so task completion times are not accurate.
System
Defined as a combination of interacting elements organized to achieve one or more stated purposes. It is very broad and integrates all the underlying software and applications.
Summative evaluation
Done when the design is near completion or already complete. Its aim is to evaluate the quality of a developed system and check that all the requirements are met.
Utility
The extent to which a person believes that the use of a technology improves their ability to perform a task
What is the goal of the scientific approach?
To provide rigorous data to make design decisions.
What is the objective of the experimental design?
To structure the evaluation in such a way as to answer the research questions.
Affinity Diagram
A useful tool for classifying qualitative data. In a content analysis, it can help group common ideas and reduce the number of codes.
Why is counterbalancing used?
It controls for the learning effect: the same number of participants start with each condition and then go through the other conditions in different orders. This lets researchers rule out the possibility that people prefer version B only because they became familiar with the task while using version A.
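One possible way to set this up, sketched under the assumption that every ordering of the conditions is used in rotation. The function name and participant IDs are hypothetical.

```python
from itertools import permutations

def counterbalance(participants, conditions):
    """Cycle through every ordering of the conditions, one ordering per participant."""
    orders = list(permutations(conditions))
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}

# Four hypothetical participants, two versions to compare.
plan = counterbalance(["P1", "P2", "P3", "P4"], ["A", "B"])
```

Here P1 and P3 see A then B, while P2 and P4 see B then A. The orders are only used equally often when the number of participants is a multiple of the number of orderings.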
Success rate
A binary variable that represents whether or not a task was completed according to specific success criteria. However, evaluators may want to count partial successes, in which case it is no longer a binary variable. It is generally reported as a ratio, which avoids generalizing about the population.
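A minimal sketch of reporting binary outcomes as a ratio rather than a percentage. The function name and the session data are hypothetical.

```python
def success_ratio(outcomes):
    """Summarize binary task outcomes as 'successes/total'."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("at least one participant outcome is required")
    successes = sum(1 for completed in outcomes if completed)
    return f"{successes}/{len(outcomes)}"

# Hypothetical session: 7 participants, True means the success criteria were met.
print(success_ratio([True, True, False, True, True, False, True]))  # 5/7
```

Partial successes would need a different encoding (for example, a score per participant), which this sketch does not attempt.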
What happens if the mandate is too narrow?
It might miss the real issues of an interface altogether
What happens if the evaluation mandate is too vague or broad?
It will lead to deviation from the recipient's expectations and potentially disappoint them
Objects of the task
Key units of information or data with which users interact to perform the task
Formative evaluation
Made during the development of a system; aims to provide recommendations for the system's improvement
Observers
May be present during the test sessions. They observe what the participants say and do in relation to the system.
WebQual
Measures the quality of an e-commerce website. Developed by Loiacono, Watson, and Goodhue. Items ask "To what extent do you agree with the following statements?", generally with anchor points from 1 = strongly disagree to 7 = strongly agree.
UX Evaluation is based on a ________ approach
Multi-method
It is important to develop a ________ framework in the report
Narrative
The design of a new system and its interface requires a good understanding of user __________ in order to establish the __________
Needs; requirements
Is it necessary to include all the data you have found in the report?
No. It is often too much, and the goal is to answer the research questions using only the relevant data. If necessary, the additional information can be put in the appendix.
Should the instructions always be detailed and directive?
No. Occasionally you want to understand how users navigate the interface by themselves. The instructions should still be detailed and give them a scenario.
The ISO CIF (ISO/IEC 25066:2016) format is often useful to
Meet the requirements of a tender and assess compliance.
What is the most difficult task in UX evaluation?
Participant recruitment strategy
Latin square design
Participants are randomly assigned to some of the experimental conditions. Each condition is seen by an equal number of participants.
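One common construction, offered here as an assumption rather than something the card specifies, is a cyclic Latin square of presentation orders: each row is an order that participants can be randomly assigned to. The function name is hypothetical.

```python
def latin_square(conditions):
    """Build a cyclic Latin square: row i is the condition list rotated left by i."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]

# Three hypothetical conditions give three presentation orders.
square = latin_square(["A", "B", "C"])
# Each condition appears exactly once in every row and in every column.
```

With three conditions the rows are A B C, B C A, and C A B, so every condition occupies every position once. Assigning the same number of participants to each row keeps exposure balanced.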
The purpose of the report is not only to ______________ but to _____________
Present the results of the test; answer the research questions