IST 2140 Information Storage and Retrieval

Slides: 15
Provided by: eras

Transcript and Presenter's Notes


1
IST 2140 Information Storage and Retrieval
  • Week 1

2
What is information?
  • "something that (1) is represented by a set of
    symbols, (2) has some structure, and (3) can be
    read and to some extent understood by users of
    information" (Meadow)
  • Data to information to knowledge continuum

3
What is Information Retrieval?
  • the location and presentation to a user of
    information relevant to an information need
    expressed as a query (Korfhage)
  • "an information retrieval system is a device
    interposed between a potential user of
    information and the information collection
    itself. For a given information problem, the
    purpose of the system is to capture wanted items
    and to filter out unwanted items" (Harter)
  • "information retrieval deals with the
    representation, storage, and access to documents
    or representatives of documents (document
    surrogates)" (Salton)
  • Document may be text or ??

4
What is an Information Need? 
  • Information problem that causes the user to act
  • Four levels
  • Visceral: recognition of a deficiency, though not
    cognitively defined (gap or anomalous state of
    knowledge; defect in mental model)
  • Conscious: user's characterization of the
    deficiency
  • Formalized: user's clear articulation
  • Compromised: formalized statement as limited by
    the search system

5
The IR process
  • first approach to computer IR: IR is a simple
    matching process: QUERY ----> FILE ----> ANSWER
  • we now realize it is an extremely complex process
    because:
  • Information need is amorphous/hard to express
  • Document representation is inexact/ambiguous
  • Probabilistic rather than deterministic process
  • cf. retrieval from a DBMS
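
As a rough illustration of the "simple matching" view above, the sketch below treats retrieval as a deterministic lookup: every record containing the query term is returned, with no notion of relevance. The three records and the function name simple_match are invented for illustration and do not come from the course materials.

```python
# Hypothetical three-record "file"; the texts are made up for illustration.
FILE = [
    "The suspension bridge spans the river",
    "Bridge is a card game for four players",
    "Dental bridge replaces a missing tooth",
]


def simple_match(query, records):
    """QUERY -> FILE -> ANSWER: return every record containing the query term."""
    term = query.lower()
    # Exact term match on whitespace-separated, lowercased words.
    return [r for r in records if term in r.lower().split()]


answer = simple_match("bridge", FILE)
for record in answer:
    print(record)
```

All three records match the query exactly, yet they concern three unrelated subjects. A lookup of this kind has no way to tell which of them, if any, meets the user's information need.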

6
Sample search
  • Search "bridge" on Google
  • http://www.google.com
  • And on Northern Light
  • http://www.northernlight.com
  • The problem of homonyms is just one of many in IR

7
Major functions of information services
8
What are the components of an IR system?
  • Document processing (indexing)
  • Query input
  • Document-query matching
  • Output module
  • Feedback module
  • User interface
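
A minimal sketch of how the first four components might fit together, assuming a tiny in-memory collection and Boolean AND matching. The documents, function names, and the choice of AND semantics are all assumptions made for illustration. The feedback module and user interface are left out.

```python
from collections import defaultdict

# Hypothetical document collection, invented for illustration.
DOCS = {
    1: "information retrieval systems match queries to documents",
    2: "database systems return exact answers to queries",
    3: "users express an information need as a query",
}


def build_index(docs):
    """Document processing (indexing): map each term to the ids of docs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def match_all(index, query):
    """Query input and document-query matching: Boolean AND over the query terms."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Terms absent from the index contribute an empty posting set.
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings)


def show(docs, doc_ids):
    """Output module: list the matching documents in id order."""
    return [f"{i}: {docs[i]}" for i in sorted(doc_ids)]


INDEX = build_index(DOCS)
results = match_all(INDEX, "information systems")
print(show(DOCS, results))
```

Only document 1 contains both query terms, so it is the only one returned. Document 3 uses "query" rather than "queries", which shows how exact term matching misses word variants unless the indexing step normalizes them.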

9
Processing In a Boolean IR System
10
Some questions for IR
  • how does an individual approach an IR system?
  • how can the content ("value") of an item be
    represented?
  • how can the individual and the information be
    brought together ("matched")?
  • what is an appropriate way to present the
    results?
  • how can the success or failure of information
    seeking activity be measured?
  • how should the ISAR system be structured for
    maximum effectiveness? maximum efficiency?

11
Who builds information retrieval systems?
  • Commercial information providers
  • DIALOG, LEXIS-NEXIS, WestLaw, STN
  • Text Software vendors
  • InMagic, AskSam
  • IR Researchers
  • SMART, InQuery, Okapi, ZPRISE, Panoptics, Lemur
  • Web search engine designers
  • AltaVista, Google, Lycos, etc.  

12
Information retrieval system design and analysis
  • The system-centered approach
  • Salton, Croft, Van Rijsbergen, Lewis
  • The user-centered approach
  • Belkin, Marchionini, Fidel, Ingwersen

13
Sources of Information on IR
  • journals
  • Information Retrieval (IR)
  • Information Processing & Management (IPM)
  • Journal of the American Society for Information
    Science and Technology (JASIST)
  • Transactions on Information Systems (TOIS)
  • conferences
  • ACM SIGIR (Special Interest Group on Information
    Retrieval)
  • American Society for Information Science and
    Technology
  • TREC (Text REtrieval Conference)
