Detecting Genre Shift

Mark Dredze, Tim Oates, Christine Piatko. Paper to appear at EMNLP-10.

Learn more at: https://ebiquity.umbc.edu

1
Detecting Genre Shift

Mark Dredze, Tim Oates, Christine Piatko
Paper to appear at EMNLP-10

2
Natural Language Processing and Machine Learning

Extracting findings from scientific papers
Genetic epidemiology (development domain)
PubMed search produces thousands of papers
Manually reviewed to extract findings
Findings determine relevant papers/studies
Automate this process with ML/NLP methods
Create searchable database of findings
Allow machine inference over findings
Suggest new scientific hypotheses

3
Genre Shift in Statistical NLP
told that John Paul Stevens is retiring this
summer
Named Entity Recognition
4
Supervised Machine Learning for Named Entity Recognition
Sentence: Today the Atlantic Ocean is in an uproar and North Carolina remains in a state of anxiety.
Windowed Text                        Label
Today the Atlantic Ocean is          B
the Atlantic Ocean is in             I
Atlantic Ocean is in an              O
Ocean is in an uproar                O
is in an uproar and                  O
in an uproar and North               O
an uproar and North Carolina         O
uproar and North Carolina remains    B
and North Carolina remains in        I
North Carolina remains in a          O
5
Supervised Machine Learning for Named Entity Recognition
Windowed Text                  Label
Today the Atlantic Ocean is    B
the Atlantic Ocean is in       I
Atlantic Ocean is in an        O
Feature Vector                                    Label
today, the, atlantic, ocean, is, U, L, U, U, L    B
the, atlantic, ocean, is, in, L, U, U, L, L       I
atlantic, ocean, is, in, an, U, U, L, L, L        O
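The mapping from a five-token window to a feature vector can be sketched in a few lines. This is only an illustration of the pattern visible on the slide (lowercased words followed by one capitalization flag per token, U for an uppercase initial and L otherwise). The function name `window_features` and the whitespace tokenization are assumptions, not the authors' code.

```python
def window_features(tokens, size=5):
    """Return one feature list per window of `size` consecutive tokens.

    Each feature list holds the lowercased words of the window followed by
    a capitalization flag per token: "U" if the token starts with an
    uppercase letter, "L" otherwise. (Hypothetical helper for illustration.)
    """
    rows = []
    for start in range(len(tokens) - size + 1):
        window = tokens[start:start + size]
        words = [tok.lower() for tok in window]
        shapes = ["U" if tok[:1].isupper() else "L" for tok in window]
        rows.append(words + shapes)
    return rows


sentence = ("Today the Atlantic Ocean is in an uproar and "
            "North Carolina remains in a state of anxiety")
tokens = sentence.split()  # naive whitespace tokenization (assumption)
for row in window_features(tokens)[:3]:
    print(", ".join(row))
```

In the slide's table each label (B, I, O) appears to describe the center token of its window, so a classifier trained on these rows predicts one tag per window.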
6
Genre Shift in Statistical NLP
told that John Paul Stevens is retiring this
summer
PRESIDENT BARACK OBAMA IS URGING MEMBERS TO
Named Entity Recognition
???
7
This is a Pervasive Problem

Extracting regulatory pathways from online
bioinformatics journals using a parser trained on
the WSJ
Finding faces in images of disaster victims using
a model trained on mug shot images
Identifying RNA sequences that regulate gene
expression in a lab in Baltimore using a model
trained on data gathered in a lab in Germany

When things change in a way that's harmful, we'd
like to know!
8
Data Streams Change Over Time
Sentiment classification from movie reviews

Natural drift
Users unaware of system limitations

9
Detecting Genre Shift
Genre shift hurts system performance (accuracy)

Two problems
Detect changes in stream of numbers (A-distance)
Convert document stream to stream of informative
numbers (margin)

10
Detecting Genre Shift
Genre shift hurts system performance (accuracy)

Measure accuracy directly
Requires labeled examples!
Look for changes in feature distributions
Words become more/less common
New words appear

11
Measuring Changes in Streams: The A-Distance
A nonparametric, distribution-independent measure
of changes in univariate, real-valued data
streams (Kifer, Ben-David, and Gehrke, 2004)
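A rough sketch of how such a detector might look: the real line is cut into tiles (intervals), the fraction of points per tile is compared between a fixed reference window and the most recent window, and a change is flagged when twice the largest difference exceeds a threshold ε. The factor of two follows the usual definition of the A-distance as I understand it. The function names, the fixed-reference/sliding-window scheme, and the tile edges are assumptions made for illustration.

```python
from bisect import bisect_right


def tile_proportions(values, edges):
    """Fraction of values in each tile.

    Tiles are the intervals between the sorted `edges`, plus one open-ended
    tile on each side, so there are len(edges) + 1 tiles.
    """
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return [c / len(values) for c in counts]


def a_distance(window_a, window_b, edges):
    """Twice the largest per-tile difference in empirical probability."""
    p = tile_proportions(window_a, edges)
    q = tile_proportions(window_b, edges)
    return 2 * max(abs(x - y) for x, y in zip(p, q))


def detect_change(stream, n, edges, epsilon):
    """Index of the first point at which the distance between the first n
    points and the most recent n points exceeds epsilon, or None."""
    reference = stream[:n]
    for end in range(2 * n, len(stream) + 1):
        current = stream[end - n:end]
        if a_distance(reference, current, edges) > epsilon:
            return end - 1
    return None
```

With windows of n = 50 and ε = 0.5, for example, a tile must gain or lose more than a quarter of the window's points before a change is reported.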
12
Measuring Changes in Streams: The A-Distance
> ε
13
Measuring Changes in Streams: The A-Distance
> ε
14
Changes in Document Streams
President Barack Obama is urging members to
15
Changes in Document Streams
Feature counts: Obama 4, embassy 1
President Barack Obama is urging members to
16
Changes in Document Streams
Feature    X    W
Obama      4    1.6
embassy    1    0.1
President Barack Obama is urging members to
17
Changes in Document Streams
Feature    X    W
Obama      4    1.6
embassy    1    0.1
President Barack Obama is urging members to

W·X = margin
sign of W·X is class label (+/-)
magnitude of W·X is certainty in label
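As a small worked example, the margin for the two features shown on the slide can be computed as a dot product. The pairing of values with features (Obama: X = 4, W = 1.6; embassy: X = 1, W = 0.1) is read off the slide residue and is therefore an assumption, as is the `margin` helper itself.

```python
def margin(weights, features):
    """Dot product W.X over the features present in the document."""
    return sum(weights.get(name, 0.0) * value
               for name, value in features.items())


# Values as they appear on the slide; the feature/value pairing is assumed.
weights = {"Obama": 1.6, "embassy": 0.1}
features = {"Obama": 4, "embassy": 1}

m = margin(weights, features)       # 4 * 1.6 + 1 * 0.1, about 6.5
label = "+" if m >= 0 else "-"      # sign gives the class label
certainty = abs(m)                  # magnitude gives the certainty
print(label, round(certainty, 2))
```

No label is needed to compute `m`, which is what makes the margin usable on an unlabeled stream.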

18
Why Margins?

We have an easy way of producing them from
unlabeled examples!
We want to track feature changes
Margins are linear combinations of feature values
Removing important features yields smaller
margins
Only track features that matter; features with
zero (small) weight don't affect margin (much)
Spoiler alert! Tracking margins works really
well for unsupervised detection of genre shifts.

19
Accuracy vs. Margins
DVD to Electronics
20
Accuracy vs. Margins
DVD to Electronics
Average in block
Average over last 100 instances
21
Accuracy vs. Margins
DVD to Electronics
22
Confidence Weighted Margins

Margins can be viewed as measure of confidence
We detect when confidence in classifications
drops
Confidence Weighted (CW) learning refines this
idea
Gaussian distribution over weight vectors
Mean of weight vector $\mu \in \mathbb{R}^N$
Diagonal covariance matrix $\Sigma \in \mathbb{R}^{N \times N}$
Low variance → high confidence
Normalized margin: $\mu \cdot x / (x^\top \Sigma x)^{0.5}$
Called VARIANCE in slides that follow

Example weights: µ = 1.6 with σ = 0.02; µ = 0.1 with σ = 1.74
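The normalized margin from this slide can be sketched for a diagonal covariance as follows. The example reuses the numbers visible on the slides (means 1.6 and 0.1, paired with 0.02 and 1.74) and treats the latter as diagonal covariance entries; that reading, the feature vector x = (4, 1), and the function name are assumptions.

```python
import math


def normalized_margin(mu, sigma_diag, x):
    """mu . x divided by sqrt(x^T Sigma x) for a diagonal Sigma.

    `sigma_diag` holds the diagonal entries of Sigma (assumed layout).
    Returns 0.0 when the variance term is zero, e.g. for an all-zero x.
    """
    mean_margin = sum(m * xi for m, xi in zip(mu, x))
    variance = sum(s * xi * xi for s, xi in zip(sigma_diag, x))
    if variance == 0.0:
        return 0.0
    return mean_margin / math.sqrt(variance)


mu = [1.6, 0.1]            # mean weights from the slide
sigma_diag = [0.02, 1.74]  # assumed to be the matching diagonal entries
x = [4, 1]                 # feature values from the earlier slides

print(normalized_margin(mu, sigma_diag, x))
```

A document that relies mostly on the high-variance feature gets a much smaller normalized margin than its raw margin alone would suggest, which is the sense in which low variance means high confidence.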
23
Experiments

Datasets
Sentiment classification between domains (Blitzer
et al., 2007)
DVDs, electronics, books, kitchen appliances
Spam classification between users (Jiang and
Zhai, 2007)
Named entity classification between genres (ACE
2005)
News articles, broadcast news, telephone, blogs,
etc.
Algorithms
Baselines: SVM, MIRA, CW
Our method: VARIANCE

24
Experiments

Simulated domain shifts between each pair of
genres
38 pairs, 10 trials each with different random
instance orderings
500 source examples
1500 target examples
False change
11 datasets with no shift, 10 trials with
different random instance orderings
If no shift found then detection recorded as end
of target examples when computing averages
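The protocol above might be simulated roughly as follows: build a stream of randomly ordered source examples followed by randomly ordered target examples, then score each trial by how many target examples were seen before detection, recording the full target length when nothing is detected. All function names here are hypothetical. Treating a detection before the shift as an error is my assumption; the slides do not say how that case was handled.

```python
import random


def make_shift_stream(source, target, n_source=500, n_target=1500, seed=0):
    """Randomly ordered source examples followed by target examples."""
    rng = random.Random(seed)
    return rng.sample(source, n_source) + rng.sample(target, n_target)


def detection_delay(detected_index, n_source=500, n_target=1500):
    """Number of target examples seen up to and including detection.

    A missed shift (None) is recorded as the end of the target examples.
    """
    if detected_index is None:
        return n_target
    if detected_index < n_source:
        # Assumption: a detection before the shift is a false positive.
        raise ValueError("detection occurred before the shift")
    return detected_index - n_source + 1


def mean_delay(detected_indices, n_source=500, n_target=1500):
    """Average delay over trials, e.g. 10 random orderings of one pair."""
    delays = [detection_delay(i, n_source, n_target) for i in detected_indices]
    return sum(delays) / len(delays)
```

Recording misses as the full target length keeps the average defined for every trial, at the cost of understating how late a missed shift really is.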

25
Comparing Algorithms
26
SVM vs. VARIANCE
27
SVM vs. VARIANCE
28
Summary of Results Thus Far

VARIANCE detected shifts faster than
SVM 34 times out of 38
MIRA 26 times out of 38
CW 27 times out of 38

29
Gradual Shifts
30
What if you have labels?

STEPD: a Statistical Test of Equal Proportions to
Detect concept drift (Nishida and Yamauchi, 2007)
Monitors accuracy of classifier from stream of
labeled examples
Parameters: window size W and threshold α
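For readers unfamiliar with this kind of detector, here is a sketch of a test of equal proportions applied to a stream of correct/incorrect outcomes: accuracy over the most recent W examples is compared with accuracy over all earlier examples using a two-proportion z-test with continuity correction. This is my understanding of the general idea rather than a faithful reimplementation. The one-sided test, the default values, and the function names are assumptions.

```python
import math


def equal_proportions_p_value(correct_old, n_old, correct_recent, n_recent):
    """One-sided p-value for 'recent accuracy is lower than older accuracy'.

    Returns 1.0 when recent accuracy is not lower (a simplification).
    """
    p_old = correct_old / n_old
    p_recent = correct_recent / n_recent
    pooled = (correct_old + correct_recent) / (n_old + n_recent)
    scale = 1.0 / n_old + 1.0 / n_recent
    spread = math.sqrt(pooled * (1.0 - pooled) * scale)
    if spread == 0.0 or p_recent >= p_old:
        return 1.0
    z = (abs(p_old - p_recent) - 0.5 * scale) / spread
    # Upper-tail probability of a standard normal variable.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def detect_accuracy_drop(outcomes, window=30, alpha=0.003):
    """`outcomes` holds 1 for a correct prediction and 0 for an error.

    Returns the index at which the p-value first falls below alpha,
    or None if it never does. Defaults are illustrative only.
    """
    for end in range(2 * window, len(outcomes) + 1):
        recent = outcomes[end - window:end]
        old = outcomes[:end - window]
        p = equal_proportions_p_value(sum(old), len(old),
                                      sum(recent), len(recent))
        if p < alpha:
            return end - 1
    return None
```

Unlike the margin-based detector, every element of `outcomes` requires a gold label, which is the cost the slide title alludes to.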

31
Comparison to STEPD
32
What about false positives?
33
The A-Distance: Choosing Parameters
P
> ε
34
The A-Distance: Choosing Parameters
P
> ε
35
The A-Distance: Choosing Parameters

A-distance paper gives bounds on FPs and FNs
Bounds depend on n and ε
Bounds do not depend on tiling!
So loose as to be meaningless
No guidance on how to choose tiling
What if tiles lie outside support of data?

36
Better Bounds

$P_A$: true probability of a point falling in tile A
$h$: number of points that actually fell in A
$p_A = h/n$: ML estimate of $P_A$
Define $P'_A$, $h'$, and $p'_A$ for the second window
Suppose $P_A = P'_A$; then any change detected is a
false positive

What is the probability that $|p_A - p'_A| > \epsilon/2$?
> ε
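If the true tile probability were known, the question on this slide could be answered exactly for a single tile: the counts in the two windows are independent binomial variables with the same parameter, so the false positive probability is a sum over all pairs of counts whose proportions differ by more than ε/2. The sketch below does this by brute force. It covers one tile only, assumes both windows have the same size n, and uses hypothetical function names.

```python
from math import comb


def binomial_pmf(n, p):
    """Probabilities of 0..n successes in n Bernoulli(p) trials."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]


def false_positive_probability(p_tile, n, epsilon):
    """P(|h/n - h'/n| > epsilon/2) when both windows share tile
    probability p_tile, i.e. when no real change has occurred."""
    pmf = binomial_pmf(n, p_tile)
    threshold = epsilon / 2
    total = 0.0
    for h, prob_h in enumerate(pmf):
        for h2, prob_h2 in enumerate(pmf):
            if abs(h - h2) / n > threshold:
                total += prob_h * prob_h2
    return total


# Example: one tile holding half of the probability mass, n = 200.
print(false_positive_probability(0.5, 200, 0.2))
```

Because the result depends on `p_tile`, the choice of tiling affects the false positive rate, which a bound in n and ε alone cannot capture.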
37
Posterior Over PA

B(a, b) is the Beta function over a + b Bernoulli
trials
a trials have one outcome (point lands in tile A)
b trials have the other (point lands in some
other tile)
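The posterior formula itself did not survive in this transcript. As a point of reference only, under a uniform prior on $P_A$ (an assumption, not something the slides state), observing $h$ of $n$ points in tile A gives the standard Beta posterior, written here with the conventional Beta function, whose arguments may be offset by one from the slide's own convention:

```latex
p(P_A \mid h, n)
  = \frac{P_A^{\,h}\,(1 - P_A)^{\,n-h}}{B(h+1,\; n-h+1)},
\qquad
B(a, b) = \int_0^1 t^{\,a-1} (1 - t)^{\,b-1}\, dt
```

Here $h$ counts the trials in which a point lands in tile A and $n - h$ counts those in which it lands in some other tile.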

38
False Positives Two Cases
39
Don't worry, I'm not going to explain this (much)
40
Probability of a FP (n = 200)
41
Probability of FN
42
Minimizing Expected Loss
43
Moving Forward
44
Genre Shift Fix
told that John Paul Stevens is retiring this
summer
PRESIDENT BARACK OBAMA IS URGING MEMBERS TO
Named Entity Recognition
45
Genre Shift Fix
told that John Paul Stevens is retiring this
summer
PRESIDENT BARACK OBAMA IS URGING MEMBERS TO
President Barack Obama is urging members to
Named Entity Recognition
46
Conclusion

Changes in margins convey useful information
about changes in classification accuracy
No need for labeled examples!
The A-distance applied to margin streams finds
genre shifts with few false positives/negatives
Confidence weighted margins normalized by
variance detect shifts faster than SVM, MIRA, or
(non-normalized) CW margins
Our approach even works with gradual shifts and
compares favorably to shift detectors that use
labeled examples

47
Thank you!
