Galaxy of News: An Approach to Visualizing and Understanding Expansive News Landscapes. Earl Rennison. In UIST '94, ACM Symposium on User Interface Software and Technology. New York: ACM Press, 1994.

1
Galaxy of News: An Approach to Visualizing and
Understanding Expansive News Landscapes. Earl
Rennison. In UIST '94, ACM Symposium on User
Interface Software and Technology. New York: ACM
Press, 1994.
  • Paper presentation by Mark Sharp
  • 17:610:554 Information Visualization, Prof.
    Spoerri
  • 11/11/2002

2
Paper Summary
  • PROBLEM: Accessing and understanding news
    information is not well supported by the
    information infrastructure.
  • VISION: An intelligent infrastructure that
    automatically builds the correlations and
    relationships between news articles and
    constructs an environment that allows readers to
    dynamically explore and gain understanding.

3
How does it work?
  • Articles have features (metadata) extracted by
    parsing algorithms, then they are clustered by
    ARN (a neural network algorithm) and mapped to a
    3D space layout.
  • Nodes: keyword hierarchy / headlines / full text.
  • Zoom in with left mouse button, out with right
    (direct manipulation).
  • Animation (4D) helps the user understand what the
    system is doing. Motion is an early/pre-attentive
    visual cue.
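
The steps above can be sketched roughly as follows. This is not the ARN algorithm from the paper: a naive keyword extractor and a group-by-shared-keyword step stand in for the parsing and clustering stages, and the 3D layout is omitted. The names `STOPWORDS`, `extract_keywords`, and `build_hierarchy`, as well as the sample headlines, are hypothetical. The sketch only illustrates the three node levels (keyword / headlines / full text) that zooming moves between.

```python
from collections import defaultdict

# Hypothetical stopword list; a real system would use a far richer parser.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "for", "is"}

def extract_keywords(headline):
    """Naive stand-in for feature extraction: lowercase non-stopword tokens."""
    tokens = (w.strip(".,:;!?").lower() for w in headline.split())
    return {t for t in tokens if t and t not in STOPWORDS}

def build_hierarchy(articles):
    """Group articles under each keyword: keyword -> headline -> full text."""
    hierarchy = defaultdict(dict)
    for headline, body in articles:
        for kw in extract_keywords(headline):
            hierarchy[kw][headline] = body
    return dict(hierarchy)

# Made-up sample articles, for illustration only.
articles = [
    ("Senate debates budget", "Full text of the budget article."),
    ("Budget cuts hit schools", "Full text of the schools article."),
]
tree = build_hierarchy(articles)
# Zooming in corresponds to descending one level at a time.
headlines_under_budget = sorted(tree["budget"])
```

Here `tree["budget"]` plays the role of a keyword node, its keys are the headline nodes, and its values are the full-text nodes.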

10
Model components
  • Temporal and behavior: interaction controls,
    level-of-detail, user orientation cues,
    transition to new views.
  • Spatial construction: can be 2-, 3-, or
    n-dimensional; uses relationships; dynamic
    (appropriate for news).
  • Relationships: designer-specified, e.g. temporal
    ordering.
  • News base: not raw data; objects and annotations
    (keywords, slugwords, location, time, subject,
    etc.) manually or automatically derived from raw
    data.
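
As a minimal sketch of the news base and one designer-specified relationship, assuming a simple in-memory representation: the class `NewsObject`, the function `temporal_ordering`, and the sample entries are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class NewsObject:
    """One entry in the news base: annotations derived from a raw article."""
    headline: str
    time: datetime
    location: str = ""
    subject: str = ""
    keywords: set = field(default_factory=set)

def temporal_ordering(objects):
    """A designer-specified relationship: link each object to the next in time."""
    ordered = sorted(objects, key=lambda o: o.time)
    return [(a.headline, b.headline) for a, b in zip(ordered, ordered[1:])]

# Made-up sample entries, for illustration only.
base = [
    NewsObject("Vote passes", datetime(1994, 11, 3), subject="politics"),
    NewsObject("Bill introduced", datetime(1994, 11, 1), subject="politics"),
    NewsObject("Debate opens", datetime(1994, 11, 2), subject="politics"),
]
links = temporal_ordering(base)
```

The spatial construction would then place objects so that related pairs in `links` end up near each other; that layout step is not shown.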
11
reading
writing
14
Which early / pre-attentive visual processes are
leveraged?
  • Position
  • Proximity
  • Motion
  • Brightness
  • Size
  • Color

15
What is working?
  • Principled (algorithmic) feature extraction and
    clustering.
  • Direct manipulation.
  • True zooming (seamless exploration of categories,
    document labels, and full texts).
  • Dynamic updating of content (new articles).

16
What is not working or clear?
  • Clustering based on skinny metadata rather than
    full text vectors.
  • Keywords are single words, not terms.
  • Relationships?

17
What surprised you?
  • Naivete about understanding and media studies.

18
Key Insights: what I learned
  • Detailed look into the architecture of a true
    large text corpus info viz system with many
    desirable features.

19
What is the key contribution?
  • True zooming (seamless integration of all levels)
    is feasible in large text corpora.

20
Take-away messages? What can be generalized?
  • Computational feasibility forces some
    compromises.
  • What is not working
  • Human heuristics (relationships?)
  • BUT help is on the way (bigger iron)

21
3 questions for group and class discussion.
  • Are volume and lack of organization really our
    biggest problems with modern news information?
  • Would you use Galaxy of News? Why or why not?
  • What other kinds of text data would you like to
    see this approach applied to? How might a
    different domain affect the specification of
    metadata object representations and/or
    relationships?

22
TileBars: Visualization of Term Distribution
Information in Full Text Information Access.
Marti Hearst. Proceedings of the ACM
SIGCHI Conference on Human Factors in Computing
Systems (CHI), pp. 59-66, Denver, CO, May 1995.
  • Paper presentation by Mark Sharp
  • 17:610:554 Information Visualization, Prof.
    Spoerri
  • 11/11/2002

23
Paper Summary
  • PROBLEM: Traditional IR is focused on text
    databases consisting of titles and abstracts;
    its assumptions are not necessarily appropriate
    for full text.
  • VISION: Utilize term distribution within the text
    as well as overall frequency to model document
    relevance. Replace opaque ranking with a
    transparent means for swift appraisal of the
    query-document relationship.

24
How does it work?
  • TextTiling algorithm partitions full text into
    adjacent, non-overlapping, multi-paragraph
    segments reflecting subtopic structure based on
    term co-occurrence and repetition.
  • Segments are scored for similarity to query
    terms.
  • Display shows document length, term frequency,
    and term distribution across segments.
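
A rough sketch of the segment-and-score steps above, under loud assumptions: `split_segments` is a fixed-size paragraph grouping that stands in for TextTiling (it does not detect subtopic boundaries), and scoring is a plain count of query-term hits per segment. Both function names and the sample text are hypothetical.

```python
import re

def split_segments(text, paragraphs_per_segment=2):
    """Stand-in for TextTiling: fixed-size runs of paragraphs, not subtopics."""
    paras = [p.strip() for p in text.split("\n\n") if p.strip()]
    return [" ".join(paras[i:i + paragraphs_per_segment])
            for i in range(0, len(paras), paragraphs_per_segment)]

def score_segments(segments, term_sets):
    """Return one row per term set: hits of that set's terms in each segment."""
    rows = []
    for terms in term_sets:
        wanted = {t.lower() for t in terms}
        row = []
        for seg in segments:
            words = re.findall(r"[a-z0-9]+", seg.lower())
            row.append(sum(1 for w in words if w in wanted))
        rows.append(row)
    return rows

# Made-up four-paragraph document, for illustration only.
doc = "Law and order.\n\nThe law changed.\n\nNetworks grow.\n\nA network law."
segments = split_segments(doc)
rows = score_segments(segments, [["law"], ["network", "networks"]])
```

Each inner list in `rows` has one entry per segment, so its length reflects document length and its values give the term distribution.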

25
  • Length of rectangle = length of document
  • Each gray square = 1 tile (segment)
  • Tile darkness = term freq.
  • Query term sets = tile rows
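
The encoding above can be approximated in plain text. This is only an illustrative sketch: the four-character `SHADES` scale, the function `render_tilebar`, and the choice to scale darkness to the largest count in the bar are assumptions, not details from the paper.

```python
SHADES = " .:#"  # hypothetical 4-level gray scale, lightest to darkest

def render_tilebar(rows):
    """Render term-set rows as text; one character per tile, darker = more hits."""
    peak = max((c for row in rows for c in row), default=0)
    top = len(SHADES) - 1
    lines = []
    for row in rows:
        chars = []
        for count in row:
            # Integer rounding of count / peak onto the shade scale.
            level = 0 if peak == 0 else (count * top + peak // 2) // peak
            chars.append(SHADES[level])
        lines.append("[" + "".join(chars) + "]")
    return lines

# Two term-set rows over a four-segment document (made-up counts).
bar = render_tilebar([[3, 0, 1, 0], [0, 0, 3, 2]])
```

The width of each bracketed row equals the number of segments, and a column that is dark in both rows marks a segment where both term sets occur.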
28
Which early / pre-attentive visual processes are
leveraged?
  • Length
  • Position
  • Darkness (gray scale)

29
What is working?
  • Elegant representation of document length.
  • Adjacency of tiles between term rows shows
    overlap.
  • Gray scale leverages relative (vs. absolute)
    judgment.
  • Meaningful labels (start of text).
  • Direct click link from tiles to text segments.
  • Starting TREC/TIPSTER evaluation.

30
What is not working or clear?
  • Depends on skillful Boolean query formulation
    (e.g. no stopwords).
  • Doesn't appear to be scalable to large queries
    (>3 conjunctive terms).

31
What surprised you?
  • Because they do have a natural visual hierarchy,
    varying shades of gray show varying quantities
    better than color.

32
Key Insights: what I learned
  • Relevance ranking is not the only game in town
    for putting cognitive cues on multi-document
    retrievals.

33
What is the key contribution?
  • Text segmentation can enhance traditional
    (whole-document) IR as well as fact retrieval.
  • Novel paradigms for text retrieval can be both
    principled and computationally efficient.

34
Take-away messages? What can be generalized?
  • Marti Hearst is a major player in text mining /
    text visualization.

35
3 questions for group and class discussion.
  • Instead of integer term frequency, what else
    could be used to color the tiles for relevance?
  • How might documents be ranked?