Information Retrieval and Evaluation
Transcript and Presenter's Notes

1
Information Retrieval and Evaluation
  • Session 12
  • LBSC 690
  • Information Technology

2
Agenda
  • CSCW
  • The search process
  • Information retrieval
  • Evaluation

3
Computer Supported Cooperative Work (CSCW)
  • Work
  • Grounded in the study of work processes
  • Cooperative
  • Assumes a shared objective, task
  • Technology-supported
  • Computers are just one type of tool used
  • Groupware

4
Key Issues in CSCW
  • Shared information space
  • Group awareness
  • Coordination
  • Concurrency control
  • Multi-user interfaces
  • Heterogeneous environments

5
Case Study: Virtual Reference
  • Required functions
  • System architecture
  • Adoption

6
Case Study 2: Your Project Team!
  • Face to face meetings
  • Teleconferences
  • Shared workspace on WAM
  • IM-synchronized work sessions
  • NetMeeting?

7
Educational Computing
  • Computer Assisted Education
  • What most people think of first
  • Computer Managed Instruction
  • What most people really do first!
  • Computer Mediated Communication
  • All that CSCW stuff applied to education
  • Computer-Based Multimedia
  • Just another filmstrip machine?

8
Rationales
  • Pedagogic
  • Use computers to teach
  • Vocational
  • Computer programming is a skill like typing
  • Social
  • Computers are a part of the fabric of society
  • Catalytic
  • Computers are symbols of progress

9
Conditions for Success
  • Most prerequisites are not computer-specific
  • Need, know-how, time, commitment, leadership,
    incentives, expectations
  • The most important barrier is time
  • Teacher time is by far the most important factor

10
Alternatives
  • Facilities
  • Computer classrooms (e.g., teaching theaters)
  • Computers IN classrooms (e.g., HBK 2119)
  • Objectives
  • Computer Literacy
  • Not so in the Maryland teaching theaters
  • Comparatively few technology classes

11
Discussion Point: Computers as Educational Media
  • What are the most salient characteristics of each?
  • Books
  • Video
  • Computers

12
Distance Education
  • Correspondence courses
  • Focus on dissemination and evaluation
  • Instructional television
  • Dissemination, interaction, and evaluation
  • Computer-Assisted Instruction
  • Same three functions w/ubiquitous technology

13
What is Information Retrieval?
  • Information
  • How is it different from data?
  • How is it different from knowledge?
  • Types of information (text, audio, video, etc.)
  • Retrieval
  • Finding information that satisfies an information
    need

14
Types of Information Needs
  • Retrospective
  • Searching the past
  • Different queries posed against a static
    collection
  • Time invariant
  • Prospective
  • Searching the future
  • Static query posed against a dynamic collection
  • Time dependent

15
Retrospective Search
  • Find documents about this
  • Find documents about the use of nuclear energy in
    China
  • Known item search
  • Find the class home page
  • Answer seeking
  • Is Lexington or Louisville the capital of
    Kentucky?
  • Directed exploration
  • Who makes videoconferencing systems?

16
Prospective Search
  • Filtering
  • Make a binary decision about each incoming
    document
  • Spam or not Spam?
  • Routing
  • Sort incoming documents into different bins?
  • Categorizing news headlines: World? Nation? Local? Sports?

17
Information Retrieval Paradigm
(Diagram: the retrieval cycle; labels include Document Delivery, Search, Browse, Query, Select, Examine, and Document.)
18
Supporting the Search Process
(Diagram: the search process; labels include Source Selection and Choose.)
19
Supporting the Search Process
(Diagram: the search process; labels include Source Selection.)
20
Discussion Point: Libraries vs. Computerized Systems
  • Documents: types, acquisition, indexing
  • Queries: formulation and reformulation
  • Search: matching queries with documents
  • Browsing/selection/examination
  • Document delivery and use

21
Why is Information Retrieval hard?
  • What do users really want?
  • User diversity: task, knowledge state (subject, technology, etc.)
  • Information problem → information need → query
  • System limitations
  • Document coverage
  • Supporting tools/functionalities
  • Complexity of human languages
  • Ambiguity, synonymy, polysemy, etc.

22
Human-Machine Synergy
  • Machines are good at
  • Doing simple things accurately and quickly
  • Scaling to larger collections in sublinear time
  • People are better at
  • Accurately recognizing what they are looking for
  • Evaluating intangibles such as quality
  • Both are pretty bad at
  • Mapping consistently between words and concepts

23
Search Component Model
(Diagram: an Information Need leads, via Query Formulation, to a Query; Query Processing and Document Processing each apply a Representation Function to produce a Query Representation and a Document Representation; a Comparison Function matches the two and yields a Retrieval Status Value; Human Judgment of documents against the information need determines Utility.)
24
Query/Document Representation
  • Using controlled vocabulary
  • Using free text
  • Reasons/concerns
  • Space
  • Speed
  • Effectiveness

25
Discussion Point: Controlled Vocabulary vs. Free Text
  • Required tools/resources
  • Requirements for user/indexer
  • Ease of learning
  • Ease of use
  • Search quality
  • Ease of automation

26
Bag of Words Representation
  • Bag: a set that can contain duplicates
  • Ignore syntax
  • "The quick brown fox jumped over the lazy dog's back"
  • → back, brown, dog, fox, jump, lazy, over, quick, the, the
  • Vector: values recorded in any consistent order
  • back, brown, dog, fox, jump, lazy, over, quick, the
  • → 1 1 1 1 1 1 1 1 2

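A minimal sketch of this representation in Python, assuming a toy tokenizer (the crude suffix stripping below stands in for a real stemmer); the sentence and vocabulary come from the slide:

  from collections import Counter

  def tokenize(text):
      """Lowercase, strip punctuation, and crudely normalize word forms."""
      tokens = []
      for word in text.lower().split():
          word = word.strip(".,!?\"")
          if word.endswith("'s"):                      # dog's -> dog
              word = word[:-2]
          if word.endswith("ed") and len(word) > 4:    # jumped -> jump
              word = word[:-2]
          elif word.endswith("s") and len(word) > 3:   # dogs -> dog
              word = word[:-1]
          tokens.append(word)
      return tokens

  def bag_of_words(text, vocabulary):
      """Count each term and lay the counts out in vocabulary order."""
      counts = Counter(tokenize(text))
      return [counts[term] for term in vocabulary]

  vocab = ["back", "brown", "dog", "fox", "jump", "lazy", "over", "quick", "the"]
  print(bag_of_words("The quick brown fox jumped over the lazy dog's back.", vocab))
  # -> [1, 1, 1, 1, 1, 1, 1, 1, 2]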
27
Bag of Words Example
  • Document 1: "The quick brown fox jumped over the lazy dog's back."
  • Document 2: "Now is the time for all good men to come to the aid of their party."
  • Stopword list: for, is, of, the, to
(Table of term counts for each document not shown in transcript.)
28
Exact-Match (Boolean) Retrieval
  • Limit the bag of words to absent and present
  • Boolean values, represented as 0 and 1
  • Represent terms as a bag of documents
  • Same representation, but rows rather than columns
  • Combine the rows using Boolean operators
  • AND, OR, NOT
  • Result set: every document with a 1 remaining

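A minimal sketch in Python, treating each term as a set of document numbers and the Boolean operators as set operations. The toy inverted index below is hypothetical, chosen to be consistent with the example on the next slide:

  # Each term maps to the set of documents that contain it (a "bag of documents").
  index = {
      "dog":   {3, 5},
      "fox":   {3, 5, 7},
      "good":  {2, 6, 8},
      "party": {6, 8},
      "over":  {3, 8},
  }

  def docs(term):
      return index.get(term, set())

  print(docs("dog") & docs("fox"))                       # AND -> {3, 5}
  print(docs("fox") - docs("dog"))                       # NOT -> {7}
  print(docs("dog") | docs("fox"))                       # OR  -> {3, 5, 7}
  print((docs("good") & docs("party")) - docs("over"))   #     -> {6}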
29
Boolean Retrieval Example
  • dog AND fox
  • Doc 3, Doc 5
  • dog NOT fox
  • Empty
  • fox NOT dog
  • Doc 7
  • dog OR fox
  • Doc 3, Doc 5, Doc 7
  • good AND party
  • Doc 6, Doc 8
  • good AND party NOT over
  • Doc 6

30
Why Boolean Retrieval Works
  • Boolean operators approximate natural language
  • Find documents about a good party that is not
    over
  • AND can discover relationships between concepts
  • good party
  • OR can discover alternate terminology
  • excellent party
  • NOT can discover alternate meanings
  • Democratic party

31
Why Boolean Retrieval Fails
  • Natural language is way more complex
  • She saw the man on the hill with a telescope
  • AND discovers nonexistent relationships
  • Terms in different paragraphs, chapters,
  • Guessing terminology for OR is hard
  • good, nice, excellent, outstanding, awesome,
  • Guessing terms to exclude is even harder!
  • Democratic party, party to a lawsuit,

32
Boolean Retrieval
  • Strengths
  • Accurate, if you know the right strategies
  • Efficient for the computer
  • Weaknesses
  • Often results in too many documents, or none
  • Users must learn Boolean logic
  • Sometimes finds relationships that don't exist
  • Words can have many meanings
  • Choosing the right words is sometimes hard

33
Ranked Retrieval Paradigm
  • Some documents are more relevant to a query than
    others
  • Not necessarily true under Boolean retrieval!
  • Best-first ranking can be superior
  • Select n documents
  • Put them in order, with the best ones first
  • Display them one screen at a time
  • Users can decide when they want to stop reading

34
Ranked Retrieval Challenges
  • Best first is easy to say but hard to do!
  • The best we can hope for is to approximate it
  • Will the user understand the process?
  • It is hard to use a tool that you don't understand
  • Efficiency becomes a concern

35
Similarity-Based Retrieval
  • Assume the most useful documents are the most similar to the query
  • Weight terms based on two criteria
  • Repeated words are good cues to meaning
  • Rarely used words make searches more selective
  • Compare weights with query
  • Add up the weights for each query term
  • Put the documents with the highest total first

36
Counting Terms
  • Terms tell us about documents
  • If "rabbit" appears a lot, it may be about rabbits
  • Documents tell us about terms
  • "the" is in every document, so it is not discriminating
  • Documents are most likely described well by rare
    terms that occur in them frequently
  • Higher term frequency is stronger evidence
  • Low collection frequency makes it stronger still

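A minimal sketch of this kind of scoring in Python, using a simple tf × log(N/df) weight (one of many possible schemes; the Okapi weight on a later slide is a refined version). The toy documents are hypothetical:

  import math
  from collections import Counter

  def rank(query_terms, docs):
      """Score each document by summing tf * log(N/df) over the query terms
      it contains, then return the documents best-first."""
      N = len(docs)
      df = Counter()                       # how many documents contain each term?
      for tokens in docs.values():
          df.update(set(tokens))
      scores = {}
      for doc_id, tokens in docs.items():
          tf = Counter(tokens)             # repeated terms are stronger evidence
          scores[doc_id] = sum(
              tf[t] * math.log(N / df[t])  # rare terms (low df) get higher weight
              for t in query_terms if tf[t]
          )
      return sorted(scores.items(), key=lambda item: item[1], reverse=True)

  docs = {
      "d1": ["rabbit", "rabbit", "carrot", "the"],
      "d2": ["rabbit", "fox", "the"],
      "d3": ["fox", "the"],
  }
  print(rank(["rabbit", "the"], docs))
  # "the" occurs in every document (df = N), so its weight log(N/df) is 0;
  # d1 ranks first because "rabbit" is both repeated there and relatively rare.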
37
Discussion Point: Which Terms to Emphasize?
  • Major factors
  • Uncommon terms are more selective
  • Repeated terms provide evidence of meaning
  • Adjustments
  • Give more weight to terms in certain positions
  • Title, first paragraph, etc.
  • Give less weight to each term in longer documents
  • Ignore documents that try to spam the index
  • Invisible text, excessive use of the meta field, ...

38
Okapi Term Weights
TF component
IDF component
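The weight formula itself appears only as a figure in the original slide. For reference, a commonly used form of the Okapi (BM25) term weight is sketched below; the exact variant and constants shown on the slide may differ. Here tf_{i,j} is the frequency of term i in document j, dl_j the document length, avdl the average document length, N the number of documents, df_i the number of documents containing term i, and k_1 and b are tuning constants:

w_{i,j} = \underbrace{\frac{tf_{i,j}}{k_1\left((1-b) + b\,\frac{dl_j}{avdl}\right) + tf_{i,j}}}_{\text{TF component}} \cdot \underbrace{\log\frac{N - df_i + 0.5}{df_i + 0.5}}_{\text{IDF component}}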
39
Index Quality
  • Crawl quality
  • Comprehensiveness, dead links, duplicate
    detection
  • Document analysis
  • Frames, metadata, imperfect HTML, ...
  • Document extension
  • Anchor text, source authority, category, language, ...
  • Document restriction (ephemeral text suppression)
  • Banner ads, keyword spam, ...

40
Indexing Anchor Text
  • A type of document expansion
  • Terms near links describe content of the target
  • Works even when you can't index content
  • Image retrieval, uncrawled links, ...

41
Browsing Results: User Goals
  • Identify documents for some form of delivery
  • Query Enrichment
  • Relevance feedback
  • User designates "more like this" documents
  • System adds terms from those documents to the
    query
  • Manual reformulation
  • Better approximation of information need
  • What can the system do?
  • Assist the user to identify relevant documents
  • Assist the user to identify potentially useful terms

42
Selection Interfaces
  • One dimensional lists
  • What to display? title, source, date, summary,
    ratings, ...
  • What order to display? retrieval status value,
    date, alphabetic, ...
  • How much to display? number of hits
  • Other aids? related terms, suggested queries, ...
  • Two dimensional displays
  • Clustering, projection, contour maps, VR
  • Navigation: jump, pan, zoom

43
Example Interfaces
  • Google: keyword in context
  • Teoma: query refinement suggestions
  • Vivisimo: clustered results
  • Kartoo: clustered visualization

44
Queries on the Web (1999)
  • Low query construction effort
  • 2.35 (often imprecise) terms per query
  • 20% use operators
  • 22% are subsequently modified
  • Low browsing effort
  • Only 15% view more than one page
  • Most look only "above the fold"
  • One study showed that 10% don't know how to scroll!

45
Types of User Needs
  • Informational (30-40% of AltaVista queries)
  • What is a quark?
  • Navigational
  • Find the home page of United Airlines
  • Transactional
  • Data: What is the weather in Paris?
  • Shopping: Who sells a Vaio Z505RX?
  • Proprietary: Obtain a journal article

46
Searching Other Languages
(Diagram: the cross-language search process; labels include Query Formulation, Document, and Use.)
47
(No Transcript)
48
Speech Retrieval Architecture
(Architecture diagram: components include Query Formulation, Speech Recognition, Automatic Search, Boundary Tagging, Content Tagging, and Interactive Selection.)
49
The GALE Project
(Architecture diagram: an Acquisition stage (Transcribe, Translate, Tokenize) turns Text and Speech into term sequences; Extraction (Detect, Classify, Normalize) identifies Entities, Roles, Relations, Events, and Sentiment; Presentation (Scope, Select, Synthesize) produces the Summary Visualization, Explanation, and Report; User Modeling (Interests, Knowledge) supports Interact and Control. Evaluation labels: Rosetta Distillation Utility, BAE Distillation GNG, NIST Translation GNG.)
50
Hands On Try Some Search Engines
  • Web Pages (using spatial layout)
  • http://kartoo.com/
  • Images (based on image similarity)
  • http://elib.cs.berkeley.edu/photos/blobworld/
  • Multimedia (based on metadata)
  • http://singingfish.com
  • Movies (based on recommendations)
  • http://www.movielens.umn.edu
  • Grey literature (based on citations)
  • http://citeseer.ist.psu.edu/

51
Evaluating Retrieval Systems
  • User-centered strategy
  • Given several users, and at least 2 retrieval
    systems
  • Have each user try the same task on both systems
  • Measure which system works the best
  • System-centered strategy
  • Given documents, queries, and relevance judgments
  • Try several variations on the retrieval system
  • Measure which ranks more good docs near the top

52
Defining Relevance
  • A central problem in information science
  • Relevance relates a topic and a document
  • Not static
  • Influenced by other documents
  • Two general types
  • Topical relevance: is this document about the correct subject?
  • Situational relevance: is this information useful?

53
Good Effectiveness Measures
  • Capture some aspect of what the user wants
  • Have predictive value for other situations
  • Different queries, different document collection
  • Easily replicated by other researchers
  • Easily compared
  • Optimally, expressed as a single number

54
Set-Based Measures
  • Collection size = A + B + C + D;  Relevant = A + C;  Retrieved = A + B
  • Precision = A / (A + B)
  • Recall = A / (A + C)
  • Miss = C / (A + C)
  • False alarm (fallout) = B / (B + D)

When is precision important? When is recall
important?
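A minimal sketch of these measures in Python; the relevant and retrieved document sets below are hypothetical:

  def set_based_measures(relevant, retrieved, collection_size):
      """Compute the set-based measures from the A/B/C/D counts (assumes non-empty sets)."""
      a = len(relevant & retrieved)        # relevant and retrieved
      b = len(retrieved - relevant)        # retrieved but not relevant
      c = len(relevant - retrieved)        # relevant but missed
      d = collection_size - a - b - c      # neither
      return {
          "precision": a / (a + b),
          "recall":    a / (a + c),
          "miss":      c / (a + c),
          "fallout":   b / (b + d),
      }

  # 5 relevant documents, 4 retrieved, 3 in common, in a collection of 20:
  print(set_based_measures({1, 2, 3, 4, 5}, {3, 4, 5, 9}, collection_size=20))
  # -> precision 0.75, recall 0.6, miss 0.4, fallout ~0.067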
55
Another View
(Venn diagram: the space of all documents, with overlapping Relevant and Retrieved sets; the intersection is Relevant and Retrieved, and everything outside both sets is Not Relevant, Not Retrieved.)
56
Precision and Recall
  • Precision
  • How much of what was found is relevant?
  • Often of interest, particularly for interactive
    searching
  • Recall
  • How much of what is relevant was found?
  • Particularly important for law, patents, and
    medicine

57
Abstract Evaluation Model
(Diagram: a Query and a set of Documents feed Ranked Retrieval, which produces a Ranked List; Evaluation compares the Ranked List against Relevance Judgments to produce a Measure of Effectiveness.)
58
Which is the Best Rank Order?
(Figure: six candidate ranked lists, labeled A through F, not shown in transcript.)
59
Evaluating Ranked Retrieval
(Example: precision measured at the rank of each relevant document is 1.00, 1.00, 0.60, 0.57, and 0.50; Average Precision (AP) = 0.73.)
  • min(W+, W-) < 8 → difference is not significant (two-tailed, p = 0.05)
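A minimal sketch of the average precision computation in Python; the ranked list below is hypothetical, constructed so that the five relevant documents fall at ranks that reproduce the precision values above:

  def average_precision(ranked_list, relevant):
      """Mean of the precision values at the rank of each relevant document;
      relevant documents that are never retrieved contribute zero."""
      hits = 0
      precisions = []
      for rank, doc in enumerate(ranked_list, start=1):
          if doc in relevant:
              hits += 1
              precisions.append(hits / rank)
      return sum(precisions) / len(relevant)

  # Relevant documents r1..r5 appear at ranks 1, 2, 5, 7, and 10.
  ranked = ["r1", "r2", "n1", "n2", "r3", "n3", "r4", "n4", "n5", "r5"]
  print(round(average_precision(ranked, {"r1", "r2", "r3", "r4", "r5"}), 2))  # 0.73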
60
Precision-Recall Curves
(Figure not shown in transcript. Source: Ellen Voorhees, NIST.)
61
User Studies
  • Goal is to account for interface issues
  • By studying the interface component
  • By studying the complete system
  • Formative evaluation
  • Provide a basis for system development
  • Summative evaluation
  • Designed to assess performance

62
Quantitative User Studies
  • Select independent variable(s)
  • e.g., two methods of displaying document summary
  • Select dependent variable(s)
  • e.g., time to find an answer
  • Run subjects in different orders
  • Average out learning and fatigue effects
  • Compute statistical significance
  • Null hypothesis: the independent variable has no effect
  • Rejected if p < 0.05

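As an illustration of this last step, a minimal sketch of a paired significance test in Python, using the Wilcoxon signed-rank test implied by the min(W+, W-) criterion on the earlier slide; the timing data are hypothetical:

  from scipy.stats import wilcoxon

  # Seconds for the same subjects to find an answer with two summary displays.
  time_display_a = [41, 55, 38, 62, 47, 50, 44, 58]
  time_display_b = [35, 49, 40, 51, 42, 45, 39, 50]

  statistic, p_value = wilcoxon(time_display_a, time_display_b)
  if p_value < 0.05:
      print(f"Reject the null hypothesis (p = {p_value:.3f})")
  else:
      print(f"No significant difference detected (p = {p_value:.3f})")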
63
Search Time: Passages Help?
(Figure: chart of search times, not shown in transcript.)
64
Qualitative User Studies
  • Observe user behavior
  • Instrumented software, eye trackers, etc.
  • Face and keyboard cameras
  • Think-aloud protocols
  • Interviews and focus groups
  • Organize the data
  • For example, group it into overlapping categories
  • Look for patterns and themes
  • Develop a grounded theory

65
Questionnaires
  • Demographic data
  • e.g., computer experience
  • Basis for interpreting results
  • Subjective self-assessment
  • Which did they think was more effective?
  • Often at variance with objective results!
  • Preference
  • Which interface did they prefer? Why?

66
Summary
  • Search is a process engaged in by people
  • Human-machine synergy is the key
  • Content and behavior offer useful evidence
  • Evaluation must consider many factors