TileBars: Visualization of Term Distribution Information in Full Text Information Access

1
TileBars: Visualization of Term Distribution
Information in Full Text Information Access
  • Presenter: Reenal Mahajan
  • Discusser: Xueqi

2
TileBars
  • What are TileBars?
  • Why do we need them?
  • Titles and abstracts vs. full-length texts
  • Context and structure
  • Response to a query
  • Mantra used?
  • Show the relative length of each retrieved document
  • Show the frequency of the topic words
  • Show the distribution of the topic words
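The three "show" goals above can be illustrated with a small text-only sketch. It assumes the document has already been split into tiles (subtopic segments) and that each query facet is a term set; the names `tokenize`, `tilebar`, and `render`, the naive tokenizer, and the shading characters are all illustrative choices, not part of the presented system.

```python
import re

def tokenize(text):
    """Lowercase word tokens; a deliberately naive tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())

def tilebar(tiles, term_sets):
    """Return one row per term set and one column per tile.

    Each cell counts the tokens in that tile that belong to the term set.
    """
    grid = []
    for terms in term_sets:
        wanted = {t.lower() for t in terms}
        row = [sum(1 for tok in tokenize(tile) if tok in wanted) for tile in tiles]
        grid.append(row)
    return grid

def render(grid, shades=" .:#"):
    """Draw each row as text; later characters in `shades` mean more hits."""
    top = len(shades) - 1
    return ["".join(shades[min(count, top)] for count in row) for row in grid]

# Hypothetical three-tile document and a two-facet query.
tiles = [
    "Medical imaging helps doctors reach a diagnosis.",
    "The software runs on a network of workstations.",
    "Diagnosis software supports medical diagnosis in clinics.",
]
term_sets = [["medical", "medicine"], ["diagnosis", "diagnose"]]

grid = tilebar(tiles, term_sets)
for line in render(grid):
    print("[" + line + "]")
```

The width of each printed bar is the number of tiles, so it stands in for relative document length; the shading of a cell reflects term frequency, and the position of shaded cells reflects term distribution.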

3
Standard information retrieval
  • Two common approaches: similarity search and
    Boolean queries
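To make the contrast concrete, here is a minimal sketch of both approaches over a toy collection. The Boolean side is a plain AND over term presence, and the similarity side ranks by cosine similarity of raw term counts; both are simplifying assumptions, and the function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def boolean_and(docs, query_terms):
    """Return ids of documents containing every query term (unranked)."""
    wanted = {t.lower() for t in query_terms}
    return [doc_id for doc_id, text in docs.items() if wanted <= set(tokens(text))]

def cosine(a, b):
    """Cosine similarity between two term-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def similarity_search(docs, query):
    """Rank documents by cosine similarity to the query, best first."""
    q = Counter(tokens(query))
    scored = [(cosine(q, Counter(tokens(text))), doc_id) for doc_id, text in docs.items()]
    return [doc_id for score, doc_id in sorted(scored, key=lambda p: -p[0]) if score > 0]

docs = {
    "d1": "medical diagnosis with expert systems",
    "d2": "network diagnosis tools",
    "d3": "medical records and medical billing",
}
print(boolean_and(docs, ["medical", "diagnosis"]))
print(similarity_search(docs, "medical diagnosis"))
```

Neither result says where in a document the terms occur or how long the document is: the Boolean query yields an unordered set, and the ranked list reduces each document to a single score.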

4
Problems with standard information retrieval
  • Reasons behind relevance rankings are unclear
  • Document length cannot be seen easily
  • Term distribution is not known
  • No way to express preferences

5
Importance of document structure
  • Different document structures
  • Many subtopic discussions
  • Sequence of subtopics set against a backdrop of
    one or more main topics
  • User should be able to query about a coherent
    subpart or subtopic
  • TextTiling
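The idea of splitting a text into subtopic segments can be sketched as follows. This is a heavily simplified stand-in for TextTiling, not the published algorithm: it compares bags of words in the sentence blocks on either side of each gap and cuts where the similarity is a local minimum below a fixed threshold. The block size, the threshold, and all function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter

def bag(sentences):
    """Bag of lowercase words for a block of sentences."""
    return Counter(w for s in sentences for w in re.findall(r"[a-z]+", s.lower()))

def cosine(a, b):
    """Cosine similarity between two term-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def gap_scores(sentences, block=2):
    """Similarity between the blocks on either side of each sentence gap."""
    return [
        cosine(bag(sentences[max(0, i - block):i]), bag(sentences[i:i + block]))
        for i in range(1, len(sentences))
    ]

def tile(sentences, block=2, threshold=0.1):
    """Split where the gap score is a local minimum below the threshold."""
    scores = gap_scores(sentences, block)
    tiles, start = [], 0
    for g, score in enumerate(scores):
        left = scores[g - 1] if g > 0 else 1.0
        right = scores[g + 1] if g + 1 < len(scores) else 1.0
        if score < threshold and score <= left and score <= right:
            tiles.append(sentences[start:g + 1])  # gap g follows sentence g
            start = g + 1
    tiles.append(sentences[start:])
    return tiles

# Hypothetical four-sentence text with an obvious topic shift.
sentences = [
    "Doctors improved diagnosis accuracy.",
    "Diagnosis errors worry doctors.",
    "Routers forward network packets.",
    "Network packets cross routers quickly.",
]
for segment in tile(sentences):
    print(segment)
```

On real text a fixed threshold is fragile, since function words shared across topics keep similarities above zero; the sketch only shows the shape of the computation, with vocabulary shifts marking subtopic boundaries.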

6
TileBar
  • TextTiling
  • Term sets

7
Results for the query Medical Diagnosis

8
Specifying constraints

9
Implementation notes
  • The current implementation uses 132,000
    documents from the Ziff portion
  • The interface uses the Tcl/Tk X11-based toolkit
  • The search engine uses TDB, implemented in Common
    Lisp

Demo

10
Pros and Cons
  • Displays term distribution information along
    with relevance information
  • Passage-based retrieval
  • Patterns can be scanned and deciphered quickly,
    aiding fast judgment
  • TileBars can be misleading when the query omits
    words that might be used to discuss a topic

11
Related work
  • The Cube of Contents
  • InfoCrystal

12
Conclusions and Future Work
  • User studies should be done to determine how
    users interpret the meaning of the term
    distributions and how they may be used in
    relevance feedback
  • Determine in what situations the users'
    expectations are not met
  • Determine what additional information will help
    prevent misconceptions