EECS 578 Week 4

Evaluation Techniques

(material from: Information Anxiety by Richard Saul Wurman, The Psychology of Human-Computer Interaction by Stuart Card, Thomas Moran, and Allen Newell, Designing the User Interface 3rd Ed. by Ben Shneiderman, Human-Computer Interaction 2nd Ed. by Dix et al.)

3 goals:
    assess the extent of the system's functionality
    assess the effect of the interface on the user
    identify any specific problems with the system

Jonas Salk spent 98% of his time documenting things that didn't work before he found the thing that did

Kenneth Boulding: "The moral of evolution is that nothing fails like success because successful adaption leads to the loss of adaptability ... This is why a purely technical evaluation can be disastrous. It trains people only in thinking of things that have been thought of and this will eventually lead to disaster"

How evaluation is done depends on many factors
    stage of design
    novelty of project
    number of expected users
    criticality of the user interface
    cost of the product and finances allocated for testing
    time available
    experience of the design and evaluation team

Different types of evaluation

Evaluating the Design
    cognitive walkthrough
        detailed review of a sequence of actions
        main focus is on how easy the system is to learn for a new user
        given
            - description of system prototype
            - description of task user is to perform
            - complete written list of actions user must perform
            - indication who the users are and what experience they have
    heuristic evaluation
        multiple evaluators
        main focus is evaluating early designs
        10 heuristics
            visibility of system status
            match between system and real world
            user control and freedom
            consistency and standards
            error prevention
            recognition rather than recall
            flexibility and efficiency of use
            aesthetic and minimalist design
            help users recognize, diagnose, and recover from errors
            help and documentation
    review-based evaluation
            look through existing literature for previous related experiments
    model-based evaluation
            GOMS, keystroke-level model, etc
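
As a rough illustration of model-based evaluation, a keystroke-level model prediction can be sketched in a few lines. The operator times below are the commonly quoted textbook averages (in seconds) and should be treated as assumptions; the task string and the name klm_estimate are hypothetical, made up for this example.

```python
# Keystroke-level model (KLM) sketch.
# Operator times are commonly quoted averages in seconds; treat them as
# assumptions and recalibrate for a real user population.
OPERATOR_SECONDS = {
    "K": 0.20,  # press a key (average skilled typist)
    "P": 1.10,  # point with a mouse at a target on the display
    "H": 0.40,  # move the hands between keyboard and mouse
    "M": 1.35,  # mental preparation before an action
}


def klm_estimate(operators):
    """Return the predicted execution time for a sequence of KLM operators."""
    return sum(OPERATOR_SECONDS[op] for op in operators)


# Hypothetical task: think, reach for the mouse, point at a text field,
# return to the keyboard, then type four characters.
fill_field = "MHPH" + "K" * 4
print(f"predicted time: {klm_estimate(fill_field):.2f} s")
```

Comparing such estimates for two candidate designs gives a prediction of expert, error-free execution time before any user is involved; it says nothing about learning or errors.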

Evaluating the Implementation

qualitative
quantitative

    - Expert reviews
    - Usability Testing
        in the laboratory - controlled but may be unrealistic and short term focus
        in the field - longer term, more realistic but harder to control

        informal testing with mockups
        thinking aloud
        video and audio tapes

        tends to emphasize first time usage and limited number of features
        pilot studies are very important to find errors in the testing procedure
            run through the entire experiment with a small group of subjects
        participation should be voluntary and FULLY informed
        participants should feel that the system, not they, is being tested
        important to collect data about the participant's background
        privacy of records is very important

    - Surveys
        Questionnaire for User Interaction Satisfaction (QUIS)
        www.lap.umd.edu/QUISFolder/quisHome.html
    - Acceptance Tests
        establish specific testable criteria for the application
            time to learn, speed of usage, rate of errors
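
One way to make acceptance criteria testable is to write them down as explicit upper bounds and check measured values against them. This is only a sketch: the criterion names and every threshold below are hypothetical, chosen for illustration and not taken from any real product.

```python
# Hypothetical acceptance criteria. Each value is an upper bound; all
# numbers are illustrative assumptions.
CRITERIA = {
    "time_to_learn_min": 30.0,  # minutes until a new user completes the benchmark task
    "task_time_s": 60.0,        # mean seconds per benchmark task
    "errors_per_task": 0.5,     # mean errors per benchmark task
}


def acceptance_report(measured, criteria=CRITERIA):
    """Map each criterion name to True if the measured value is within its bound."""
    return {name: measured[name] <= limit for name, limit in criteria.items()}


def accepted(measured, criteria=CRITERIA):
    """The application passes only if every criterion is met."""
    return all(acceptance_report(measured, criteria).values())


# Hypothetical measurements from a test session.
session = {"time_to_learn_min": 24.0, "task_time_s": 71.5, "errors_per_task": 0.3}
print(acceptance_report(session))
print("accepted:", accepted(session))
```

Stating the bounds before testing begins keeps the pass/fail decision from drifting toward whatever the system happens to achieve.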

Controlled Experiments
come up with a hypothesis that is testable and measurable.
set up an experiment where certain control variables are varied

        subjects
            match expected users
            should have at least 10 subjects, in general more is better
        variables
            variables that are manipulated - independent variables
                each independent variable can have a number of different values - levels
            variables that are measured - dependent variables
            manipulate independent variables to produce different conditions for comparison
            dependent variables should be affected only by the independent variables
        hypothesis
            prediction that varying the independent variables will affect the dependent variables in a certain way
            goal is to show that this prediction is correct
            reject the null hypothesis (no difference in the dependent variable between levels of the independent variable)
            compute a test statistic and compare it against a chosen level of significance
            if the result is significant, the observed differences are unlikely, at that level of certainty, to have occurred by chance
        experimental method
            between groups (randomized) - each subject assigned to a different condition
                each user only does 1 condition
                experimental condition - the variable has been manipulated
                control - experimental condition without manipulating the variable
                need more subjects
                differences among subjects can bias the results
            within groups
                each user performs under each condition
                possible problems with transfer of learning effects
                need fewer users
        statistics
            LOOK at the data and SAVE the data
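
The notes do not name a particular statistical test. As one possible sketch of testing the null hypothesis in a between-groups design, an exact permutation test on the difference in group means needs only the standard library. The function name and the task times below are hypothetical, invented for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(control, experimental):
    """Two-sided exact permutation test on the difference in group means.

    Under the null hypothesis the condition labels are interchangeable, so
    every way of splitting the pooled data into groups of the same sizes is
    equally likely. The p-value is the fraction of splits whose mean
    difference is at least as large as the one actually observed.
    """
    pooled = list(control) + list(experimental)
    n_a = len(control)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(experimental) - mean(control))

    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs((total - sum_a) / n_b - sum_a / n_a)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical task completion times in seconds, 5 subjects per condition.
control_times = [12.1, 11.8, 13.0, 12.6, 12.9]
experimental_times = [10.2, 10.9, 11.1, 10.5, 11.4]

p = permutation_p_value(control_times, experimental_times)
print(f"p = {p:.4f}")
print("reject null at 0.05:", p < 0.05)
```

Exhaustive enumeration is only practical for small groups, since the number of splits grows very quickly; for larger samples a random subset of splits or a conventional test such as a t-test would be the usual substitute.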

in-depth discussion of "Evaluation of Mouse, Rate-Controlled Isometric Joystick, Step Keys, and Text Keys for Text Selection on a CRT" by Stuart Card, William English, and Betty Burr of Xerox Palo Alto Research Center, in Ergonomics 21:601-613

Department of Electrical Engineering and Computer Science
University of Illinois at Chicago