Information Retrieval and Evaluation
Transcript and Presenter's Notes

1
Information Retrieval and Evaluation
  • Session 12
  • LBSC 690
  • Information Technology

2
Agenda
  • CSCW
  • The search process
  • Information retrieval
  • Evaluation

3
Computer Supported Cooperative Work (CSCW)
  • Work
  • Grounded in the study of work processes
  • Cooperative
  • Assumes a shared objective, task
  • Technology-supported
  • Computers are just one type of tool used
  • Groupware

4
Key Issues in CSCW
  • Shared information space
  • Group awareness
  • Coordination
  • Concurrency control
  • Multi-user interfaces
  • Heterogeneous environments

5
Case Study: Virtual Reference
  • Required functions
  • System architecture
  • Adoption

6
Case Study 2: Your Project Team!
  • Face to face meetings
  • Teleconferences
  • Shared workspace on WAM
  • IM-synchronized work sessions
  • NetMeeting?

7
Educational Computing
  • Computer Assisted Education
  • What most people think of first
  • Computer Managed Instruction
  • What most people really do first!
  • Computer Mediated Communication
  • All that CSCW stuff applied to education
  • Computer-Based Multimedia
  • Just another filmstrip machine?

8
Rationales
  • Pedagogic
  • Use computers to teach
  • Vocational
  • Computer programming is a skill like typing
  • Social
  • Computers are a part of the fabric of society
  • Catalytic
  • Computers are symbols of progress

9
Conditions for Success
  • Most prerequisites are not computer-specific
  • Need, know-how, time, commitment, leadership,
    incentives, expectations
  • The most important barrier is time
  • Teacher time is by far the most important factor

10
Alternatives
  • Facilities
  • Computer classrooms (e.g., teaching theaters)
  • Computers IN classrooms (e.g., HBK 2119)
  • Objectives
  • Computer Literacy
  • Not so in the Maryland teaching theaters
  • Comparatively few technology classes

11
Discussion Point: Computers as Educational Media
  • What are the most salient characteristics of each?
  • Books
  • Video
  • Computers

12
Distance Education
  • Correspondence courses
  • Focus on dissemination and evaluation
  • Instructional television
  • Dissemination, interaction, and evaluation
  • Computer-Assisted Instruction
  • Same three functions w/ubiquitous technology

13
What is Information Retrieval?
  • Information
  • How is it different from data?
  • How is it different from knowledge?
  • Types of information (text, audio, video, etc.)
  • Retrieval
  • Finding information that satisfies an information
    need

14
Types of Information Needs
  • Retrospective
  • Searching the past
  • Different queries posed against a static
    collection
  • Time invariant
  • Prospective
  • Searching the future
  • Static query posed against a dynamic collection
  • Time dependent

15
Retrospective Search
  • Find documents about this
  • Find documents about the use of nuclear energy in
    China
  • Known item search
  • Find the class home page
  • Answer seeking
  • Is Lexington or Louisville the capital of
    Kentucky?
  • Directed exploration
  • Who makes videoconferencing systems?

16
Prospective Search
  • Filtering
  • Make a binary decision about each incoming
    document
  • Spam or not Spam?
  • Routing
  • Sort incoming documents into different bins?
  • Categorizing news headlines: World? Nation? Local? Sports?

17
Information Retrieval Paradigm
(Diagram: the retrieval cycle; labels include Document Delivery, Search, Browse, Query, Select, Examine, and Document.)
18
Supporting the Search Process
(Diagram: the search process; labels include Source Selection and Choose.)
19
Supporting the Search Process
(Diagram: the search process; labels include Source Selection.)
20
Discussion Point: Libraries vs. Computerized Systems
  • Documents: types, acquisition, indexing
  • Queries: formulation and reformulation
  • Search: matching queries with documents
  • Browsing/selection/examination
  • Document delivery and use

21
Why is Information Retrieval hard?
  • What do users really want?
  • User diversity: task, knowledge state (subject, technology, etc.)
  • Information problem → information need → query
  • System limitations
  • Document coverage
  • Supporting tools/functionalities
  • Complexity of human languages
  • Ambiguity, synonymy, polysemy, etc.

22
Human-Machine Synergy
  • Machines are good at
  • Doing simple things accurately and quickly
  • Scaling to larger collections in sublinear time
  • People are better at
  • Accurately recognizing what they are looking for
  • Evaluating intangibles such as quality
  • Both are pretty bad at
  • Mapping consistently between words and concepts

23
Search Component Model
(Diagram: an Information Need leads, via Query Formulation, to a Query; Query Processing and Document Processing each apply a Representation Function to produce a Query Representation and a Document Representation; a Comparison Function matches the two and yields a Retrieval Status Value; Human Judgment of documents against the information need determines Utility.)
24
Query/Document Representation
  • Using controlled vocabulary
  • Using free text
  • Reasons/concerns
  • Space
  • Speed
  • Effectiveness

25
Discussion Point: Controlled Vocabulary vs. Free Text
  • Required tools/resources
  • Requirements for user/indexer
  • Ease of learning
  • Ease of use
  • Search quality
  • Ease of automation

26
Bag of Words Representation
  • Bag: a set that can contain duplicates
  • Ignore syntax
  • "The quick brown fox jumped over the lazy dog's back"
  • → back, brown, dog, fox, jump, lazy, over, quick, the, the
  • Vector: values recorded in any consistent order
  • back, brown, dog, fox, jump, lazy, over, quick, the
  • → 1 1 1 1 1 1 1 1 2

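A minimal sketch of this representation in Python, assuming a toy tokenizer (the crude suffix stripping below stands in for a real stemmer); the sentence and vocabulary come from the slide:

  from collections import Counter

  def tokenize(text):
      """Lowercase, strip punctuation, and crudely normalize word forms."""
      tokens = []
      for word in text.lower().split():
          word = word.strip(".,!?\"")
          if word.endswith("'s"):                      # dog's -> dog
              word = word[:-2]
          if word.endswith("ed") and len(word) > 4:    # jumped -> jump
              word = word[:-2]
          elif word.endswith("s") and len(word) > 3:   # dogs -> dog
              word = word[:-1]
          tokens.append(word)
      return tokens

  def bag_of_words(text, vocabulary):
      """Count each term and lay the counts out in vocabulary order."""
      counts = Counter(tokenize(text))
      return [counts[term] for term in vocabulary]

  vocab = ["back", "brown", "dog", "fox", "jump", "lazy", "over", "quick", "the"]
  print(bag_of_words("The quick brown fox jumped over the lazy dog's back.", vocab))
  # -> [1, 1, 1, 1, 1, 1, 1, 1, 2]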
27
Bag of Words Example
  • Document 1: "The quick brown fox jumped over the lazy dog's back."
  • Document 2: "Now is the time for all good men to come to the aid of their party."
  • Stopword list: for, is, of, the, to
(Table of term counts for each document not shown in transcript.)
28
Exact-Match (Boolean) Retrieval
  • Limit the bag of words to absent and present
  • Boolean values, represented as 0 and 1
  • Represent terms as a bag of documents
  • Same representation, but rows rather than columns
  • Combine the rows using Boolean operators
  • AND, OR, NOT
  • Result set: every document with a 1 remaining

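A minimal sketch in Python, treating each term as a set of document numbers and the Boolean operators as set operations. The toy inverted index below is hypothetical, chosen to be consistent with the example on the next slide:

  # Each term maps to the set of documents that contain it (a "bag of documents").
  index = {
      "dog":   {3, 5},
      "fox":   {3, 5, 7},
      "good":  {2, 6, 8},
      "party": {6, 8},
      "over":  {3, 8},
  }

  def docs(term):
      return index.get(term, set())

  print(docs("dog") & docs("fox"))                       # AND -> {3, 5}
  print(docs("fox") - docs("dog"))                       # NOT -> {7}
  print(docs("dog") | docs("fox"))                       # OR  -> {3, 5, 7}
  print((docs("good") & docs("party")) - docs("over"))   #     -> {6}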
29
Boolean Retrieval Example
  • dog AND fox
  • Doc 3, Doc 5
  • dog NOT fox
  • Empty
  • fox NOT dog
  • Doc 7
  • dog OR fox
  • Doc 3, Doc 5, Doc 7
  • good AND party
  • Doc 6, Doc 8
  • good AND party NOT over
  • Doc 6

30
Why Boolean Retrieval Works
  • Boolean operators approximate natural language
  • Find documents about a good party that is not
    over
  • AND can discover relationships between concepts
  • good party
  • OR can discover alternate terminology
  • excellent party
  • NOT can discover alternate meanings
  • Democratic party

31
Why Boolean Retrieval Fails
  • Natural language is way more complex
  • She saw the man on the hill with a telescope
  • AND discovers nonexistent relationships
  • Terms in different paragraphs, chapters,
  • Guessing terminology for OR is hard
  • good, nice, excellent, outstanding, awesome,
  • Guessing terms to exclude is even harder!
  • Democratic party, party to a lawsuit,

32
Boolean Retrieval
  • Strengths
  • Accurate, if you know the right strategies
  • Efficient for the computer
  • Weaknesses
  • Often results in too many documents, or none
  • Users must learn Boolean logic
  • Sometimes finds relationships that don't exist
  • Words can have many meanings
  • Choosing the right words is sometimes hard

33
Ranked Retrieval Paradigm
  • Some documents are more relevant to a query than
    others
  • Not necessarily true under Boolean retrieval!
  • Best-first ranking can be superior
  • Select n documents
  • Put them in order, with the best ones first
  • Display them one screen at a time
  • Users can decide when they want to stop reading

34
Ranked Retrieval Challenges
  • Best first is easy to say but hard to do!
  • The best we can hope for is to approximate it
  • Will the user understand the process?
  • It is hard to use a tool that you don't understand
  • Efficiency becomes a concern

35
Similarity-Based Retrieval
  • Assume the most useful documents are the most similar to the query
  • Weight terms based on two criteria
  • Repeated words are good cues to meaning
  • Rarely used words make searches more selective
  • Compare weights with query
  • Add up the weights for each query term
  • Put the documents with the highest total first

36
Counting Terms
  • Terms tell us about documents
  • If "rabbit" appears a lot, it may be about rabbits
  • Documents tell us about terms
  • "the" is in every document, so it is not discriminating
  • Documents are most likely described well by rare
    terms that occur in them frequently
  • Higher term frequency is stronger evidence
  • Low collection frequency makes it stronger still

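A minimal sketch of this kind of scoring in Python, using a simple tf × log(N/df) weight (one of many possible schemes; the Okapi weight on a later slide is a refined version). The toy documents are hypothetical:

  import math
  from collections import Counter

  def rank(query_terms, docs):
      """Score each document by summing tf * log(N/df) over the query terms
      it contains, then return the documents best-first."""
      N = len(docs)
      df = Counter()                       # how many documents contain each term?
      for tokens in docs.values():
          df.update(set(tokens))
      scores = {}
      for doc_id, tokens in docs.items():
          tf = Counter(tokens)             # repeated terms are stronger evidence
          scores[doc_id] = sum(
              tf[t] * math.log(N / df[t])  # rare terms (low df) get higher weight
              for t in query_terms if tf[t]
          )
      return sorted(scores.items(), key=lambda item: item[1], reverse=True)

  docs = {
      "d1": ["rabbit", "rabbit", "carrot", "the"],
      "d2": ["rabbit", "fox", "the"],
      "d3": ["fox", "the"],
  }
  print(rank(["rabbit", "the"], docs))
  # "the" occurs in every document (df = N), so its weight log(N/df) is 0;
  # d1 ranks first because "rabbit" is both repeated there and relatively rare.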
37
Discussion Point: Which Terms to Emphasize?
  • Major factors
  • Uncommon terms are more selective
  • Repeated terms provide evidence of meaning
  • Adjustments
  • Give more weight to terms in certain positions
  • Title, first paragraph, etc.
  • Give less weight to each term in longer documents
  • Ignore documents that try to spam the index
  • Invisible text, excessive use of the meta field, ...

38
Okapi Term Weights
TF component
IDF component
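The weight formula itself appears only as a figure in the original slide. For reference, a commonly used form of the Okapi (BM25) term weight is sketched below; the exact variant and constants shown on the slide may differ. Here tf_{i,j} is the frequency of term i in document j, dl_j the document length, avdl the average document length, N the number of documents, df_i the number of documents containing term i, and k_1 and b are tuning constants:

w_{i,j} = \underbrace{\frac{tf_{i,j}}{k_1\left((1-b) + b\,\frac{dl_j}{avdl}\right) + tf_{i,j}}}_{\text{TF component}} \cdot \underbrace{\log\frac{N - df_i + 0.5}{df_i + 0.5}}_{\text{IDF component}}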
39
Index Quality
  • Crawl quality
  • Comprehensiveness, dead links, duplicate
    detection
  • Document analysis
  • Frames, metadata, imperfect HTML, ...
  • Document extension
  • Anchor text, source authority, category, language, ...
  • Document restriction (ephemeral text suppression)
  • Banner ads, keyword spam, ...

40
Indexing Anchor Text
  • A type of document expansion
  • Terms near links describe content of the target
  • Works even when you can't index content
  • Image retrieval, uncrawled links, ...

41
Browsing Results: User Goals
  • Identify documents for some form of delivery
  • Query Enrichment
  • Relevance feedback
  • User designates "more like this" documents
  • System adds terms from those documents to the
    query
  • Manual reformulation
  • Better approximation of information need
  • What can the system do?
  • Assist the user to identify relevant documents
  • Assist the user to identify potentially useful terms

42
Selection Interfaces
  • One dimensional lists
  • What to display? title, source, date, summary,
    ratings, ...
  • What order to display? retrieval status value,
    date, alphabetic, ...
  • How much to display? number of hits
  • Other aids? related terms, suggested queries, ...
  • Two dimensional displays
  • Clustering, projection, contour maps, VR
  • Navigation: jump, pan, zoom

43
Example Interfaces
  • Google: keyword in context
  • Teoma: query refinement suggestions
  • Vivisimo: clustered results
  • Kartoo: clustered visualization

44
Queries on the Web (1999)
  • Low query construction effort
  • 2.35 (often imprecise) terms per query
  • 20% use operators
  • 22% are subsequently modified
  • Low browsing effort
  • Only 15% view more than one page
  • Most look only "above the fold"
  • One study showed that 10% don't know how to scroll!

45
Types of User Needs
  • Informational (30-40% of AltaVista queries)
  • What is a quark?
  • Navigational
  • Find the home page of United Airlines
  • Transactional
  • Data: What is the weather in Paris?
  • Shopping: Who sells a Vaio Z505RX?
  • Proprietary: Obtain a journal article

46
Searching Other Languages
(Diagram: the cross-language search process; labels include Query Formulation, Document, and Use.)
47
(No Transcript)
48
Speech Retrieval Architecture
(Architecture diagram: components include Query Formulation, Speech Recognition, Automatic Search, Boundary Tagging, Content Tagging, and Interactive Selection.)
49
The GALE Project
(Architecture diagram: an Acquisition stage (Transcribe, Translate, Tokenize) turns Text and Speech into term sequences; Extraction (Detect, Classify, Normalize) identifies Entities, Roles, Relations, Events, and Sentiment; Presentation (Scope, Select, Synthesize) produces the Summary Visualization, Explanation, and Report; User Modeling (Interests, Knowledge) supports Interact and Control. Evaluation labels: Rosetta Distillation Utility, BAE Distillation GNG, NIST Translation GNG.)
50
Hands On Try Some Search Engines
  • Web Pages (using spatial layout)
  • http://kartoo.com/
  • Images (based on image similarity)
  • http://elib.cs.berkeley.edu/photos/blobworld/
  • Multimedia (based on metadata)
  • http://singingfish.com
  • Movies (based on recommendations)
  • http://www.movielens.umn.edu
  • Grey literature (based on citations)
  • http://citeseer.ist.psu.edu/

51
Evaluating Retrieval Systems
  • User-centered strategy
  • Given several users, and at least 2 retrieval
    systems
  • Have each user try the same task on both systems
  • Measure which system works the best
  • System-centered strategy
  • Given documents, queries, and relevance judgments
  • Try several variations on the retrieval system
  • Measure which ranks more good docs near the top

52
Defining Relevance
  • A central problem in information science
  • Relevance relates a topic and a document
  • Not static
  • Influenced by other documents
  • Two general types
  • Topical relevance: is this document about the correct subject?
  • Situational relevance: is this information useful?

53
Good Effectiveness Measures
  • Capture some aspect of what the user wants
  • Have predictive value for other situations
  • Different queries, different document collection
  • Easily replicated by other researchers
  • Easily compared
  • Optimally, expressed as a single number

54
Set-Based Measures
  • Collection size = A + B + C + D;  Relevant = A + C;  Retrieved = A + B
  • Precision = A / (A + B)
  • Recall = A / (A + C)
  • Miss = C / (A + C)
  • False alarm (fallout) = B / (B + D)

When is precision important? When is recall
important?
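A minimal sketch of these measures in Python; the relevant and retrieved document sets below are hypothetical:

  def set_based_measures(relevant, retrieved, collection_size):
      """Compute the set-based measures from the A/B/C/D counts (assumes non-empty sets)."""
      a = len(relevant & retrieved)        # relevant and retrieved
      b = len(retrieved - relevant)        # retrieved but not relevant
      c = len(relevant - retrieved)        # relevant but missed
      d = collection_size - a - b - c      # neither
      return {
          "precision": a / (a + b),
          "recall":    a / (a + c),
          "miss":      c / (a + c),
          "fallout":   b / (b + d),
      }

  # 5 relevant documents, 4 retrieved, 3 in common, in a collection of 20:
  print(set_based_measures({1, 2, 3, 4, 5}, {3, 4, 5, 9}, collection_size=20))
  # -> precision 0.75, recall 0.6, miss 0.4, fallout ~0.067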
55
Another View
(Venn diagram: the space of all documents, with overlapping Relevant and Retrieved sets; the intersection is Relevant and Retrieved, and everything outside both sets is Not Relevant, Not Retrieved.)
56
Precision and Recall
  • Precision
  • How much of what was found is relevant?
  • Often of interest, particularly for interactive
    searching
  • Recall
  • How much of what is relevant was found?
  • Particularly important for law, patents, and
    medicine

57
Abstract Evaluation Model
(Diagram: a Query and a set of Documents feed Ranked Retrieval, which produces a Ranked List; Evaluation compares the Ranked List against Relevance Judgments to produce a Measure of Effectiveness.)
58
Which is the Best Rank Order?
(Figure: six candidate ranked lists, labeled A through F, not shown in transcript.)
59
Evaluating Ranked Retrieval
(Example: precision measured at the rank of each relevant document is 1.00, 1.00, 0.60, 0.57, and 0.50; Average Precision (AP) = 0.73.)
  • min(W+, W-) < 8 → difference is not significant (two-tailed, p = 0.05)
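A minimal sketch of the average precision computation in Python; the ranked list below is hypothetical, constructed so that the five relevant documents fall at ranks that reproduce the precision values above:

  def average_precision(ranked_list, relevant):
      """Mean of the precision values at the rank of each relevant document;
      relevant documents that are never retrieved contribute zero."""
      hits = 0
      precisions = []
      for rank, doc in enumerate(ranked_list, start=1):
          if doc in relevant:
              hits += 1
              precisions.append(hits / rank)
      return sum(precisions) / len(relevant)

  # Relevant documents r1..r5 appear at ranks 1, 2, 5, 7, and 10.
  ranked = ["r1", "r2", "n1", "n2", "r3", "n3", "r4", "n4", "n5", "r5"]
  print(round(average_precision(ranked, {"r1", "r2", "r3", "r4", "r5"}), 2))  # 0.73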
60
Precision-Recall Curves
(Figure not shown in transcript. Source: Ellen Voorhees, NIST.)
61
User Studies
  • Goal is to account for interface issues
  • By studying the interface component
  • By studying the complete system
  • Formative evaluation
  • Provide a basis for system development
  • Summative evaluation
  • Designed to assess performance

62
Quantitative User Studies
  • Select independent variable(s)
  • e.g., two methods of displaying document summary
  • Select dependent variable(s)
  • e.g., time to find an answer
  • Run subjects in different orders
  • Average out learning and fatigue effects
  • Compute statistical significance
  • Null hypothesis: the independent variable has no effect
  • Rejected if p < 0.05

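As an illustration of this last step, a minimal sketch of a paired significance test in Python, using the Wilcoxon signed-rank test implied by the min(W+, W-) criterion on the earlier slide; the timing data are hypothetical:

  from scipy.stats import wilcoxon

  # Seconds for the same subjects to find an answer with two summary displays.
  time_display_a = [41, 55, 38, 62, 47, 50, 44, 58]
  time_display_b = [35, 49, 40, 51, 42, 45, 39, 50]

  statistic, p_value = wilcoxon(time_display_a, time_display_b)
  if p_value < 0.05:
      print(f"Reject the null hypothesis (p = {p_value:.3f})")
  else:
      print(f"No significant difference detected (p = {p_value:.3f})")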
63
Search Time: Passages Help?
(Figure: chart of search times, not shown in transcript.)
64
Qualitative User Studies
  • Observe user behavior
  • Instrumented software, eye trackers, etc.
  • Face and keyboard cameras
  • Think-aloud protocols
  • Interviews and focus groups
  • Organize the data
  • For example, group it into overlapping categories
  • Look for patterns and themes
  • Develop a grounded theory

65
Questionnaires
  • Demographic data
  • e.g., computer experience
  • Basis for interpreting results
  • Subjective self-assessment
  • Which did they think was more effective?
  • Often at variance with objective results!
  • Preference
  • Which interface did they prefer? Why?

66
Summary
  • Search is a process engaged in by people
  • Human-machine synergy is the key
  • Content and behavior offer useful evidence
  • Evaluation must consider many factors