Document Collections - PowerPoint PPT Presentation

1
Document Collections
  • cs5984 Information Visualization
  • Chris North

2
Where are we?
  • Multi-D
  • 1D
  • 2D
  • Hierarchies/Trees
  • Networks/Graphs
  • Document collections
  • 3D
  • Design Principles
  • Empirical Evaluation
  • Java Development
  • Visual Overviews
  • Multiple Views
  • Peripheral Views

3
Structured Document Collections
  • Multi-dimensional
    • author, title, date, journal, ...
  • Trees
    • Dewey Decimal
  • Networks
    • web, citations

4
Envision
  • Ed Fox, et al.
  • Multi-D
  • similar to Spotfire

5
Unstructured Document Collections
  • Focus on Full Text
  • Examples
  • digital libraries, encyclopedia
  • Web, homepages, photo collections
  • Tasks
  • search, keyword
  • Browse
  • Themes, subjects, topics, library coverage
  • Size, distributions

6
Visualization Strategies
  • Cluster Maps
  • Keyword Query
  • Relationships
  • Reduced representation
  • User controlled layout
  • (Covered today: Cluster Maps, Keyword Query)

7
Cluster Map
  • Create a map of the document collection
  • Similar documents near each other
  • Dissimilar documents far apart
  • Grocery store concept

8
Document Vectors
    Term      Doc1  Doc2  Doc3
    aardvark     1     2     0
    banana       2     1     0
    chris        0     0     3
  • Compute similarity between each pair of docs
  • Lay out documents in a 2-D map by similarity
    • similar to the spring model for graph layout
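
The slide does not say which similarity measure is used. As a minimal sketch, assuming cosine similarity (a common choice for term-count vectors, but an assumption here), the pairwise similarities for the three example documents could be computed like this. The names `counts` and `cosine_similarity` are illustrative only.

```python
import math

# Term counts from the slide. Each vector is ordered as
# (aardvark, banana, chris).
counts = {
    "Doc1": [1, 2, 0],
    "Doc2": [2, 1, 0],
    "Doc3": [0, 0, 3],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two term-count vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Similarity for every unordered pair of documents.
names = sorted(counts)
similarity = {
    (a, b): cosine_similarity(counts[a], counts[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}

for pair, value in similarity.items():
    print(pair, round(value, 3))
```

Doc1 and Doc2 share two terms and score high, while Doc3 shares no terms with either and scores zero, so a similarity-based layout would place Doc1 and Doc2 close together and Doc3 far away.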

9
Cluster Algorithms
  • Partition clustering: partition into k subsets
    • Pick k seeds
    • Iteratively attract nearest neighbors
  • Hierarchical clustering: dendrogram
    • Group nearest-neighbor pair
    • Iterate
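
The two strategies above can be sketched roughly as follows. This is an illustration under stated assumptions, not the algorithm used by any system in these slides: the partition version is a k-means-style loop with caller-supplied seeds, the hierarchical version uses single-link (nearest-neighbor) merging, and both use Euclidean distance. All function names are hypothetical.

```python
def dist(p, q):
    """Euclidean distance between two vectors (assumed distance measure)."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def partition_cluster(points, seeds, iterations=10):
    """Partition clustering: k seeds repeatedly attract their nearest points."""
    centers = [list(s) for s in seeds]
    assignment = []
    for _ in range(iterations):
        # Each point joins the cluster of its nearest center.
        assignment = [
            min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for p in points
        ]
        # Each center moves to the mean of the points it attracted.
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = [sum(x) / len(members) for x in zip(*members)]
    return assignment


def hierarchical_cluster(points):
    """Hierarchical clustering: merge the closest pair of clusters, then iterate.

    Returns the merge history (a flat encoding of the dendrogram) as
    (cluster_a, cluster_b, distance) tuples of point indices.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single link: distance between the closest members.
                d = min(
                    dist(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# The three document vectors from the "Document Vectors" slide.
docs = [(1, 2, 0), (2, 1, 0), (0, 0, 3)]
labels = partition_cluster(docs, seeds=[docs[0], docs[2]])
dendrogram = hierarchical_cluster(docs)
print(labels)
print(dendrogram)
```

With these inputs both methods group Doc1 with Doc2 first and keep Doc3 apart, which matches the intuition from the term counts.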

10
Kohonen Maps
  • Xia Lin, Document Space
  • samal, ying
  • http://faculty.cis.drexel.edu/sitemap/index.html

11
(No Transcript)
12
Themescapes, Cartia
  • PNL
  • Mountain height = cluster size

13
WebSOM
  • http://websom.hut.fi/websom/

14
Map.net
  • http://maps.map.net/start

15
Cluster Map
  • Good
    • Map of collection
    • Major themes and sizes
    • Relationships between themes
    • Scales up
  • Bad
    • Where to locate documents with multiple themes?
      • On both mountains? Between mountains?
    • Relationships between documents? Within documents?
    • Algorithm becomes (too) critical

16
Keyword Query
  • Keyword query, Search engine
  • Rank ordered list
  • Information Retrieval
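
As a minimal sketch of a keyword query that returns a rank-ordered list, the example below scores each document by how often the query keywords occur in it. The scoring rule is an assumption chosen for brevity (real search engines use more refined weighting), and the sample texts are hypothetical, built to match the term counts on the "Document Vectors" slide.

```python
def rank_documents(query, documents):
    """Return (name, score) pairs, best match first.

    Score is the total number of query-keyword occurrences in the text
    (an assumed, deliberately simple ranking rule). Documents with no
    matching keyword are left out of the list.
    """
    keywords = query.lower().split()
    scored = []
    for name, text in documents.items():
        words = text.lower().split()
        score = sum(words.count(k) for k in keywords)
        if score > 0:
            scored.append((name, score))
    # Highest score first; ties are broken alphabetically by name.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical texts whose word counts mirror the earlier example vectors.
documents = {
    "Doc1": "aardvark banana banana",
    "Doc2": "aardvark aardvark banana",
    "Doc3": "chris chris chris",
}

ranking = rank_documents("banana", documents)
print(ranking)
```

Doc3 never appears in the result for this query, which illustrates how a keyword query narrows the collection but also hides documents that do not use the chosen words.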

17
Tilebars
  • Hearst, Tilebars
  • reenal, xueqi
  • http://elib.cs.berkeley.edu/tilebars/

18
VIBE
  • Korfhage, http://www.pitt.edu/korfhage/interfaces.html
  • Documents are positioned between query keywords using a
    spring model
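
One way to read the spring model: if each keyword is a fixed anchor on the screen and pulls a document with a zero-rest-length spring whose stiffness is proportional to the keyword's weight in that document, the document comes to rest at the weighted average of the anchor positions. The sketch below computes that equilibrium directly. It is an assumption-based illustration, not VIBE's actual implementation; the anchor coordinates and the function name `place_document` are hypothetical.

```python
def place_document(anchors, weights):
    """Rest position of a document pulled toward keyword anchors.

    anchors: keyword -> (x, y) screen position of that keyword
    weights: keyword -> strength of the document's match to that keyword
    Returns None when the document matches no keyword at all.
    """
    total = sum(weights.values())
    if total == 0:
        return None  # no spring pulls on this document
    x = sum(w * anchors[k][0] for k, w in weights.items()) / total
    y = sum(w * anchors[k][1] for k, w in weights.items()) / total
    return (x, y)


# Hypothetical anchor layout: three query keywords placed in a triangle.
anchors = {
    "aardvark": (0.0, 0.0),
    "banana": (1.0, 0.0),
    "chris": (0.5, 1.0),
}

# Weights reuse the term counts from the "Document Vectors" slide.
doc1 = place_document(anchors, {"aardvark": 1, "banana": 2, "chris": 0})
doc3 = place_document(anchors, {"aardvark": 0, "banana": 0, "chris": 3})
print(doc1, doc3)
```

Doc1 lands on the line between "aardvark" and "banana", closer to "banana", while Doc3 sits exactly on the "chris" anchor.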

19
VR-VIBE
20
Keyword Query
  • Good
    • Reduces the browsing space
    • Map according to user's interests
  • Bad
    • What keywords do I use?
    • What about other related documents that don't use
      these keywords?
    • No initial overview
    • Mega-hit, zero-hit problem

21
Assignment
  • Thurs: Document Collections
  • Bederson, Image Browsing
  • Rui, anusha
  • Card, Web Book and Web Forager
  • mrinmayee, ming
  • Demo your hw3 Tues or Thurs

22
Next Week
  • Tues: 3-D data
  • Kniss, Interactive Volume Rendering with Direct Manip
  • xueqi, mahesh
  • Thurs: Workspaces
  • Robertson, Task Gallery
  • supriya, varun
  • Upson, AVS
  • christa, jun
  • Thanksgiving break
  • Tues 27: Debates
  • Kobsa, Empirical comparison of comm infovis systems
  • kunal, zhiping

23
Upcoming Sched
  • Tues: 3-D data
  • Thurs: Workspaces
  • Thanksgiving break
  • Tues 27: Debates
  • Thurs 29: How (not) to lie with visualization
  • Dec: project presentations
  • Dec 7: CHI 2-pagers due, student posters due