Transcript and Presenter's Notes

Title: Evaluation


1
Evaluation
  • CS 7450 - Information Visualization
  • April 4, 2002
  • John Stasko

2
Area Focus
  • Most of the research in InfoVis that we've
    learned about this semester has been the
    introduction of a new visualization technique or
    tool
  • Fisheyes, cone trees, hyperbolic displays,
    tilebars, themescapes, sunburst, jazz, ...
  • Isn't my new visualization cool?

3
Evaluation
  • How does one judge the quality of work?
  • Different measures
  • Impact on community as a whole, influential
    ideas
  • Assistance to people in the tasks they care about

4
Strong View
  • Unless a new technique or tool helps people in
    some kind of problem or task, it doesn't have any
    value

5
Broaden Thinking
  • Sometimes the chain of influence can be long and
    drawn out
  • System X influences System Y, which influences
    System Z, which is incorporated into a practical
    tool that is of true value to people
  • This is what research is all about (typically)

6
Evaluation in HCI
  • Takes many different forms
  • Qualitative, quantitative, objective, subjective,
    controlled experiments, interpretive
    observations, ...
  • Which ones are best for evaluating InfoVis
    systems?

7
Controlled Experiments
  • Good for measuring performance or comparing
    multiple techniques
  • What do we measure?
  • Performance, time, errors, ... (see the logging
    sketch below)
  • Strengths, weaknesses?
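
As a hedged illustration of the kind of measurement this implies, here is a minimal Python sketch of logging per-trial time and error counts. The Trial fields, the perform_task callback, and the trials.csv path are assumptions for illustration, not details of any study discussed in these slides.

```python
# Hypothetical per-trial logger for a controlled experiment.
# The Trial fields, task callback, and CSV path are illustrative assumptions.
import csv
import time
from dataclasses import dataclass, asdict

@dataclass
class Trial:
    participant: str
    tool: str        # e.g., "ToolA" or "ToolB"
    task: str
    seconds: float   # task completion time
    errors: int      # count of incorrect actions

def run_trial(participant, tool, task, perform_task):
    """Time one task; perform_task() returns an error count in this sketch."""
    start = time.perf_counter()
    errors = perform_task()
    return Trial(participant, tool, task, time.perf_counter() - start, errors)

def save_trials(trials, path="trials.csv"):
    """Write all trials to a CSV file for later analysis."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(Trial.__dataclass_fields__))
        writer.writeheader()
        writer.writerows(asdict(t) for t in trials)
```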

8
Subjective Assessments
  • Find out people's subjective views on tools
  • Was it enjoyable, confusing, fun, difficult, ...?
  • This kind of personal judgment strongly
    influences use and adoption, sometimes even
    overcoming performance deficits

9
Qualitative, Observational Studies
  • Watch systems being used (you can learn a lot)
  • Is it being used in the way you expected?
  • Ecological validity
  • Can suggest new designs and improvements

10
Running Studies
  • Beyond our scope here
  • You should learn more about this in 6750 or 6455

11
Confounds
  • Very difficult in InfoVis to compare "apples to
    apples"
  • UI can influence utility of visualization
    technique
  • Different tools were built to address different
    user tasks

12
Examples
  • Let's look at a few example studies that attempt
    to evaluate different InfoVis systems
  • Two are taken from a good journal issue whose
    focus is Empirical Studies of Information
    Visualizations
  • International Journal of Human-Computer Studies,
    Nov. 2000, Vol. 53, No. 5

13
InfoVis for Web Content
  • Study compared three techniques for finding and
    accessing information within typical web
    information hierarchies
  • Windows Explorer style tool
  • Snap/Yahoo style category breakdown
  • 3D hyperbolic tree with 2D list view (XML3D)

Risden, Czerwinski, Munzner, and Cook, IJHCS '00
14
XML3D
15
Snap
16
Folding Tree
17
Information Space
  • Took the 12,000-node Snap hierarchy and ported it
    to the 2D tree and XML3D tools
  • Fast T1 connection

18
Hypothesis
  • Since XML3D has more information encoded it will
    provide better performance
  • But maybe 3D will throw people off

19
Methodology
  • 16 participants
  • Tasks broken out by
  • Old category vs. New category
  • One parent vs. Multiple parents
  • Participants used XML3D and one of the other
    tools per session (order varied)
  • Time to complete each task was measured, as well
    as a judgment of the quality of the task response

20
Example Tasks
  • Old - one
  • Find the Lawnmower category
  • Old - multiple
  • Find photography category, then learn what
    different paths can take someone there
  • New - one
  • Create new Elementary Schools category and
    position appropriately
  • New - multiple
  • Create new category, position it, determine one
    other path to take people there

21
Results
  • General
  • Used the ANOVA technique (see the analysis sketch
    below)
  • No difference between the two 2D tools, so their
    data were combined
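
As a rough illustration of this kind of analysis (not the authors' actual code or data), the sketch below runs a two-way ANOVA on task times with statsmodels; the trials.csv file and its tool, task_type, and seconds columns are hypothetical.

```python
# Hedged sketch of a two-way ANOVA on task completion times.
# The trials.csv file and its columns (tool, task_type, seconds) are assumptions.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.read_csv("trials.csv")  # one row per participant x task

# Model time as a function of tool, task type, and their interaction.
model = ols("seconds ~ C(tool) * C(task_type)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```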

22
Results
  • Speed
  • Participants completed tasks faster with the
    XML3D tool
  • Participants were faster on tasks with an
    existing category; the effect was larger when a
    single parent was involved

23
Results
  • Consistency
  • No significant difference across all conditions
  • Quality of placements, etc., was pretty much the
    same throughout

24
Results
  • Feature Usage
  • What aspect of XML3D tool was important?
  • Analyzed people's use of parts of the tool
  • 2D list elements - 43.9% of the time
  • 3D graph - 32.5% of the time

25
Results
  • Subjective ratings
  • Conventional 2D received a slightly higher
    satisfaction rating, 4.85 vs. 4.5 on a 1-7 scale
  • Not significant (one way to compare such ratings
    is sketched below)
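
Purely as an illustration of how such 1-7 ratings could be compared, the sketch below applies a rank-based test to two invented sets of scores; the values are placeholders, and the study may well have used a different test.

```python
# Illustrative comparison of two groups of 1-7 satisfaction ratings.
# The rating values below are invented placeholders, not study data.
from scipy.stats import mannwhitneyu

ratings_2d = [5, 6, 4, 5, 7, 5, 4, 6]      # hypothetical
ratings_xml3d = [4, 5, 5, 4, 6, 4, 5, 3]   # hypothetical

# Likert-style ratings are ordinal, so a rank-based test is a common choice.
u_stat, p_value = mannwhitneyu(ratings_2d, ratings_xml3d, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.3f}")
```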

26
Discussion
  • XML3D provides more focus+context than the other
    two tools, which may aid performance
  • It appeared that the integration of the 3D graph
    with the 2D list view was important
  • Maybe new visualization techniques like this work
    best when coupled with more traditional displays

27
Handout Paper
  • Empirical study of 3 InfoVis tools
  • Eureka, Spotfire, InfoZoom
  • Discuss methods and results
  • What task types were the questions?

Kobsa, InfoVis '01
28
Findings
  • Interaction Problems
  • Eureka
  • Hidden labels, 3 or more vars., correlations
  • InfoZoom
  • Correlations
  • Spotfire
  • Cognitive set-up costs, scatterplot bias

29
Findings
  • Success depends on
  • Properties of visualization
  • Operations that can be performed on vis
  • Concrete implementation of paradigm
  • Visualization-independent usability problems
  • Would have liked even more discussion on how
    tools assisted with different classes of user
    tasks

30
Space-Filling Hierarchy Views
  • Compare Treemap and Sunburst with users
    performing typical file/directory-related tasks
  • Evaluate task performance on both correctness and
    time

Stasko, Catrambone, Guzdial, and McDonald, IJHCS '00
31
Tools Compared
Treemap
SunBurst
32
Hierarchies Used
  • Four in total
  • Used sample files and directories from our own
    systems (better than random)

Small Hierarchy (500 files): versions A and B
Large Hierarchy (3000 files): versions A and B
33
Methodology
  • 60 participants
  • Each participant works with only a small or a
    large hierarchy in a session
  • Training at start to learn the tool
  • Order varied across participants (an assignment
    sketch follows below)

Session orders: SB A then TM B / TM A then SB B / SB B then TM A / TM B then SB A
32 participants on small hierarchies, 28 on large hierarchies
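
A minimal sketch, assuming a simple round-robin scheme, of assigning participants to the four session orders listed above; the participant IDs and the assign helper are hypothetical, and the study's actual assignment procedure may have differed.

```python
# Hypothetical round-robin assignment to the four counterbalanced orders above.
from itertools import cycle

ORDERS = [
    ("SunBurst A", "Treemap B"),
    ("Treemap A", "SunBurst B"),
    ("SunBurst B", "Treemap A"),
    ("Treemap B", "SunBurst A"),
]

def assign(participants):
    """Cycle participants through the four session orders."""
    return {p: order for p, order in zip(participants, cycle(ORDERS))}

# e.g., the 32 small-hierarchy participants: P01..P32
schedule = assign([f"P{i:02d}" for i in range(1, 33)])
print(schedule["P01"], schedule["P05"])
```
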
34
Tasks
  • Identification (naming or pointing out) of a file
    based on size, specifically the largest and
    second-largest files (Questions 1-2)
  • Identification of a directory based on size,
    specifically the largest (Q3)
  • Location (pointing out) of a file, given the
    entire path and name (Q4-7)
  • Location of a file, given only the file name
    (Q8-9)
  • Identification of the deepest subdirectory (Q10)
  • Identification of a directory containing files of
    a particular type (Q11)
  • Identification of a file based on type and size,
    specifically the largest file of a particular
    type (Q12)
  • Comparison of two files by size (Q13)
  • Location of two duplicated directory structures
    (Q14)
  • Comparison of two directories by size (Q15)
  • Comparison of two directories by number of files
    contained (Q16)
35
Hypothesis
  • Treemap will be better for comparing file sizes
  • Uses more of the area
  • Sunburst would be better for searching files and
    understanding the structure
  • More explicit depiction of structure
  • Sunburst would be preferred overall

36
Small Hierarchy
Correct task completions (out of 16 possible)
37
Large Hierarchy
Correct task completions (out of 16 possible)
38
Performance Results
  • Ordering effect for Treemap on large hierarchies
  • Participants did better after seeing SB first
  • Performance was relatively mixed; trends favored
    Sunburst, but the results were not clear-cut
  • Oodles of data!

39
Subjective Preferences
  • Subjective preference: SB (51), TM (9), unsure
    (1) (one way to test this split is sketched below)
  • People felt that TM was better for size tasks
    (not borne out by the data)
  • People felt that SB was better for determining
    which directories are inside others
  • Identified it as being better for structure
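
As an illustration only, the sketch below checks whether the reported SB/TM split departs from an even split using a chi-square goodness-of-fit test; this is not necessarily the analysis performed in the study, and the single "unsure" response is left out.

```python
# Illustrative chi-square goodness-of-fit test on the reported preference
# counts (SunBurst = 51, Treemap = 9); the lone "unsure" response is excluded.
from scipy.stats import chisquare

observed = [51, 9]
chi2, p = chisquare(observed)  # default expectation: an even 30/30 split
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```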

40
Strategies
  • How a person searched for files etc. mattered
  • Jump out to total view, start looking
  • Go level by level

41
Summary
  • Why do evaluation of InfoVis systems?
  • We need to be sure that new techniques are really
    better than old ones
  • We need to know the strengths and weaknesses of
    each tool, and know when to use which tool

42
Challenges
  • There are no standard benchmark tests or
    methodologies to help guide researchers
  • Moreover, there's simply no one correct way to
    evaluate
  • Defining the tasks is crucial
  • Would be nice to have a good task taxonomy
  • Data sets used might influence results
  • What about individual differences?
  • Can you measure abilities (cognitive, visual,
    etc.) of participants?

43
SHW5
  • Design and evaluation of some info vis system(s)
  • Focus, methodology
  • Benefits, confounds

44
References
  • All papers referred to in the slides
  • Martin and Mirchandani, Fall '99 slides

45
Upcoming
  • Automating Design (Cathy)
  • Animation