Improving Search Effectiveness in the Legal E-Discovery Process using Relevance Feedback



1
Improving Search Effectiveness in the Legal
E-Discovery Process using Relevance Feedback
  • Feng Charlie Zhao, University of Washington
  • Douglas W. Oard, University of Maryland
  • Jason R. Baron, National Archives and Records
    Administration
  • fcz_at_u.washington.edu, oard_at_umd.edu,
    jason.baron_at_nara.gov

2
Meet-and-Confer Alternatives
3
High Recall is Possible
Precision = RelRet / Ret, Recall = RelRet / Rel
TREC 2008 Interactive Task, Topic 103, High
OCR-accuracy documents only
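
The two measures on this slide can be sketched in a few lines. This is only an illustration of the definitions, not the evaluation code used for TREC; the function name and the document IDs are made up for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: IDs of the documents the search returned (Ret)
    relevant:  IDs of the documents judged relevant (Rel)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    rel_ret = len(retrieved & relevant)  # RelRet: relevant AND retrieved
    precision = rel_ret / len(retrieved) if retrieved else 0.0
    recall = rel_ret / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents retrieved, 3 relevant overall, 2 in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case half of what was retrieved is relevant, while one of the three relevant documents was missed.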
4
Meet-and-Confer Alternatives
5
Boolean Misses Many Relevant
  • Mean estR = 0.33 (26 topics)
  • Missed 67% of relevant documents (on average)
  • Max estR = 0.99, Topic 127 (sanitation procedures)
  • Min estR = 0.00, Topic 142 (contingent sales)

Figure: Estimated Recall for each of 26 topics
TREC 2008 Ad Hoc Task, Consensus Boolean Queries
6
Boolean Misses Many Highly Relevant
  • Mean estR = 0.42 (24 topics)
  • Missed 58% of highly relevant docs (on average)
  • Max estR = 1.00, Topic 137 (intellectual property
    rights)
  • Min estR = 0.00, Topic 147 (returns of cigarettes)

Figure: Estimated Recall for each of 24 topics
TREC 2008 Ad Hoc Task, Consensus Boolean Queries
7
Meet-and-Confer Alternatives
8
Research Questions
  • Can incremental disclosure (with query
    renegotiation) increase recall without increasing
    the total manual review effort?
  • How many review stages are needed?
  • What criteria can be used to recognize when
    renegotiation might be helpful?

9
Research Method
  • Select a collection with relevance judgments
      • We used the TREC 2005 Robust Track
      • Almost 1 million news stories from 3 sources
  • Model renegotiation as relevance feedback
      • Issue initial query
      • Simulate review of some number of documents
      • Lock in those results
      • Add the best terms from relevant docs to query
  • Measure Recall@N at end of each stage
      • Simulates completion of responsive review
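
The simulation loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' experimental code: the term-overlap scoring, the "most frequent new terms" expansion rule, the six-document collection, and every function name are hypothetical stand-ins for whatever ranking and feedback method was actually used.

```python
from collections import Counter


def score(query_terms, doc_terms):
    # Assumed toy ranking: count occurrences of query terms in the document.
    counts = Counter(doc_terms)
    return sum(counts[t] for t in query_terms)


def rank(query_terms, docs, exclude):
    # Best-first order over documents not yet reviewed (ties broken by ID).
    candidates = [d for d in docs if d not in exclude]
    return sorted(candidates, key=lambda d: (-score(query_terms, docs[d]), d))


def expand(query_terms, docs, relevant_seen, k):
    # Assumed feedback rule: add the k most frequent new terms
    # found in the relevant documents reviewed so far.
    counts = Counter(
        t for d in relevant_seen for t in docs[d] if t not in query_terms
    )
    best = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return list(query_terms) + [t for t, _ in best]


def simulate(docs, qrels, query, stage_sizes, k=2):
    """Return recall at the end of each review stage."""
    reviewed = []              # locked-in results from earlier stages
    query_terms = list(query)  # initial query
    recalls = []
    for size in stage_sizes:
        batch = rank(query_terms, docs, set(reviewed))[:size]
        reviewed.extend(batch)  # simulate review, then lock in
        found = [d for d in reviewed if d in qrels]
        recalls.append(len(found) / len(qrels))  # Recall@N, N = len(reviewed)
        query_terms = expand(query_terms, docs, found, k)  # "renegotiate"
    return recalls


# Hypothetical toy collection and relevance judgments.
docs = {
    "d1": ["tobacco", "marketing", "youth"],
    "d2": ["tobacco", "advertising", "youth"],
    "d3": ["advertising", "youth", "campaign"],
    "d4": ["tobacco", "tax", "revenue"],
    "d5": ["weather", "forecast"],
    "d6": ["youth", "campaign", "advertising"],
}
qrels = {"d1", "d2", "d3", "d6"}

staged = simulate(docs, qrels, ["tobacco", "marketing"], stage_sizes=[2, 2])
single = simulate(docs, qrels, ["tobacco", "marketing"], stage_sizes=[4])
print("two stages:", staged, "one stage:", single)
```

Both runs review four documents in total. On this toy data the two-stage run ends with higher recall than the single-stage run, because the expanded query surfaces relevant documents that share no terms with the initial query.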

10
Why Not Legal Track Collection?
Scanned
OCR
Metadata
Philip Moxx's. U.S.A. x.dramc.
cvrrespoaa.aa Benffrts Departmext Riehgtpwna,
Yfeia Ta Dishlbutfon Data aday 90,1997. From
Lisa Fislla Sabj.csr CIGNA WeWedng Newsbttsr
-Yntsre StratsU During our last CIGNA Aatfoa Plan
meadng, tlu iasuo of wLetSae to i0op
per'Irwng artieles aod discontinue mndia6 CIGNA
Well-Being aawslener to om employees was a msiter
of disanision . I Imvm done somme reaearcgtgt, and
wanted to pruedt you with my Sadings and
pcdiminary recwmmeadatioa for PM's atratezy
Ieprding l4aas aewelattee . I believe .vayone'a
input is valusble, and would epproolate hoarlng
fmaa aaeh of you on whetlne you concur with my
reeommendatioa
  • Title: CIGNA WELL-BEING NEWSLETTER - FUTURE STRATEGY
  • Organization Authors: PMUSA, PHILIP MORRIS USA
  • Person Authors: HALLE, L
  • Document Date: 19970530
  • Document Type: MEMO, MEMORANDUM
  • Bates Number: 2078039376/9377
  • Page Count: 2
  • Collection: Philip Morris
11
How Many Expansion Terms?

Arithmetic Progression Partitions
12
Relevance-Guided Partitioning
  • Judge relevance in best-first order
      • Requires a relevance ranking technique
      • Stop when N1 relevant docs have been found
  • Renegotiate the query
  • Judge unseen docs in best-first order
      • Stop when N2 relevant docs have been found
  • Renegotiate the query
  • Judge all remaining unseen documents
      • Or iterate more renegotiation stages, if desired
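
The procedure above can be sketched as a loop over stopping thresholds. This is only an illustration under assumptions the slide does not settle: N1 and N2 are counted per stage rather than cumulatively, "remaining unseen documents" means unseen documents that match the final query, and the scoring, expansion rule, toy data, and function names are all hypothetical.

```python
from collections import Counter


def score(query_terms, doc_terms):
    # Assumed toy ranking: count occurrences of query terms in the document.
    counts = Counter(doc_terms)
    return sum(counts[t] for t in query_terms)


def best_first(query_terms, docs, seen):
    # Unseen documents that match the query, highest score first.
    hits = [d for d in docs if d not in seen and score(query_terms, docs[d]) > 0]
    return sorted(hits, key=lambda d: (-score(query_terms, docs[d]), d))


def expand(query_terms, docs, relevant_seen, k):
    # Assumed renegotiation rule: add the k most frequent new terms
    # from the relevant documents found so far.
    counts = Counter(
        t for d in relevant_seen for t in docs[d] if t not in query_terms
    )
    best = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return list(query_terms) + [t for t, _ in best]


def relevance_guided_review(docs, qrels, query, thresholds, k=2):
    """Review in stages; stage i stops after thresholds[i] relevant docs."""
    seen = []        # review order across all stages
    stage_ends = []  # documents reviewed by the end of each stage
    query_terms = list(query)
    for target in thresholds:  # e.g. [N1, N2]
        found_this_stage = 0
        for d in best_first(query_terms, docs, set(seen)):
            seen.append(d)
            if d in qrels:
                found_this_stage += 1
                if found_this_stage == target:
                    break  # threshold reached: time to renegotiate
        stage_ends.append(len(seen))
        relevant_seen = [d for d in seen if d in qrels]
        query_terms = expand(query_terms, docs, relevant_seen, k)
    # Final stage: judge every remaining unseen document the query matches.
    seen.extend(best_first(query_terms, docs, set(seen)))
    stage_ends.append(len(seen))
    return seen, stage_ends


# Hypothetical toy collection and relevance judgments.
docs = {
    "d1": ["tobacco", "marketing", "youth"],
    "d2": ["tobacco", "advertising", "youth"],
    "d3": ["advertising", "youth", "campaign"],
    "d4": ["tobacco", "tax", "revenue"],
    "d5": ["weather", "forecast"],
    "d6": ["youth", "campaign", "advertising"],
}
qrels = {"d1", "d2", "d3", "d6"}

order, stage_ends = relevance_guided_review(
    docs, qrels, ["tobacco", "marketing"], thresholds=[1, 1]
)
print(order, stage_ends)
```

Unlike fixed-size partitions, the stage boundaries here depend on how quickly relevant documents turn up, so a topic with few relevant documents triggers renegotiation later than one with many.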

13
How Many Iterations?
50 Topics, Title Queries, TREC-2005 Robust Track Collection
14
Research Questions
  • Can incremental disclosure (with query
    renegotiation) increase recall without increasing
    the total manual review effort?
      • Yes, although renegotiation takes time and effort
  • How many review stages are needed?
      • At least two (maybe more, if many relevant docs)
  • What criteria can be used to recognize when
    renegotiation might be helpful?
      • Number of relevant documents found so far