Chapter 15 System Errors Revisited - PowerPoint PPT Presentation

Slides: 32

Provided by: aer9

Learn more at: https://www.cse.unr.edu

Transcript and Presenter's Notes

Title: Chapter 15 System Errors Revisited

1
Chapter 15 System Errors Revisited

Ali Erol
10/19/2005

2
System Errors Revisited

Quantify the accuracy of FAR and FRR estimates.
Confidence intervals are a well-known technique used in statistical analysis.
See references [22, 23].
The algorithm of the first three authors [23] has been experimentally demonstrated to provide better confidence interval estimates.

3
FAR/FRR

Definition
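$\mathrm{FRR}(x) = \mathrm{Prob}(s_m \le x \mid H_0) = F(x)$
$\mathrm{FAR}(y) = \mathrm{Prob}(s_n > y \mid H_a) = 1 - \mathrm{Prob}(s_n \le y \mid H_a) = 1 - G(y)$
We need
$F(x) = \mathrm{Dist}(x)$: genuine (matching) score distribution function (DF)
$G(y) = \mathrm{Dist}(y)$: imposter (non-matching) score distribution function (DF)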

4
FAR/FRR

Instead we have
Set of genuine scores $X = \{X_1, X_2, \ldots, X_M\}$
Set of imposter scores $Y = \{Y_1, Y_2, \ldots, Y_N\}$
We estimate
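
The estimator formulas on this slide were not preserved in the transcript. A minimal sketch, assuming the usual empirical estimates (the fraction of genuine scores at or below the threshold for FRR, and the fraction of imposter scores above it for FAR); the function names and the score values are hypothetical:

```python
def empirical_frr(genuine_scores, x):
    """Fraction of genuine scores at or below threshold x (estimate of F(x))."""
    return sum(1 for s in genuine_scores if s <= x) / len(genuine_scores)


def empirical_far(imposter_scores, y):
    """Fraction of imposter scores above threshold y (estimate of 1 - G(y))."""
    return sum(1 for s in imposter_scores if s > y) / len(imposter_scores)


# Hypothetical similarity scores, for illustration only.
X = [0.9, 0.8, 0.75, 0.4, 0.95]   # genuine scores, M = 5
Y = [0.1, 0.3, 0.5, 0.2]          # imposter scores, N = 4

print(empirical_frr(X, 0.45))  # one of the five genuine scores is <= 0.45
print(empirical_far(Y, 0.45))  # one of the four imposter scores is > 0.45
```

Both estimates depend on the particular score sets, which is why their accuracy becomes the question on the next slide.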

5
Problem

What is the accuracy of these error rates?
The number of biometric samples
The quality of the samples
Data collection procedure (e.g. 10 consecutive
samples)
Subjects involved, the acquisition device etc.

6
An Estimation Problem

Given
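x: a random variable ($F(x)$ denotes Dist(x))
$X = \{X_1, X_2, \ldots, X_M\}$: sample set
Estimate $\theta = E(x)$
Solution: the sample mean, $\hat{\theta} = \frac{1}{M}\sum_{i=1}^{M} X_i$ (unbiased estimator)
Error: r, the deviation of $\hat{\theta}$ from $\theta$

7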
Biased/Unbiased Estimators

For an unbiased estimator we have $E(\hat{\theta}) = \theta$.
Example: Gaussian model. Estimate the mean $\theta_1$ and the variance $\theta_2$ using the maximum likelihood criterion,
i.e. maximize $\mathrm{Prob}(X \mid \theta_1, \theta_2)$.
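
For the Gaussian example, a small sketch of the standard estimators may help: the sample mean, the maximum likelihood variance (divide by M, biased) and the corrected variance (divide by M - 1, unbiased). The function names and data are hypothetical, and these are the textbook forms rather than formulas recovered from the slide:

```python
def sample_mean(xs):
    """Sample mean; an unbiased estimator of the mean."""
    return sum(xs) / len(xs)


def variance_ml(xs):
    """Maximum likelihood variance: divides by M, biased low by a factor (M - 1) / M."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def variance_unbiased(xs):
    """Corrected variance: divides by M - 1, unbiased."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


X = [1.0, 2.0, 3.0, 4.0]  # hypothetical samples, M = 4
print(sample_mean(X), variance_ml(X), variance_unbiased(X))
```

The two variance estimates differ by the factor M / (M - 1), so the bias matters most for small sample sets.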

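$\hat{\theta}_1 = \frac{1}{M}\sum_{i=1}^{M} X_i$ (unbiased estimator)
$\hat{\theta}_2 = \frac{1}{M}\sum_{i=1}^{M} (X_i - \hat{\theta}_1)^2$ (biased estimator)
$\hat{\theta}_2' = \frac{1}{M-1}\sum_{i=1}^{M} (X_i - \hat{\theta}_1)^2$ (unbiased estimator)

8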
Confidence Interval

Assume F(x) is given; then Dist(r) can be calculated.
r is a function of $\hat{\theta}$, which is a function of X.
Calculate the $(1-\alpha)\cdot 100\%$ certainty interval (next slide):
$r \in [\eta_1(\alpha, X), \eta_2(\alpha, X)]$
which leads to a $(1-\alpha)\cdot 100\%$ confidence interval for $\theta$ given by
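
The resulting interval was not preserved in the transcript. A sketch of the usual derivation, assuming the error is defined as $r = \hat{\theta} - \theta$ and writing $\eta_1, \eta_2$ for the lower and upper bounds on r (both the sign convention and the symbol names are assumptions):

```latex
\begin{align*}
\mathrm{Prob}\bigl(\eta_1 \le \hat{\theta} - \theta \le \eta_2\bigr) &= 1 - \alpha \\
\Longrightarrow\quad
\mathrm{Prob}\bigl(\hat{\theta} - \eta_2 \le \theta \le \hat{\theta} - \eta_1\bigr) &= 1 - \alpha
\end{align*}
```

Under that assumption the interval for $\theta$ is $[\hat{\theta} - \eta_2,\ \hat{\theta} - \eta_1]$; with the opposite sign convention for r the bounds swap roles.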

9
Confidence Interval

Example
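Discard $\alpha/2$ of the probability mass at the lower and upper ends of Dist(r).
Find the r values corresponding to the interval boundaries (called quantiles).

[Figure: Dist(r) plotted against r]
$\mathrm{Prob}(q(\alpha/2) \le r \le q(1-\alpha/2)) = 1 - \alpha$

10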
Confidence Interval

Interpretation
Generate sample sets X from F(x).
Calculate a confidence interval for each X.
$(1-\alpha)\cdot 100\%$ of these intervals contain $\theta$.

11
Parametric Method

The $X_i$ are identically distributed.
Assume the $X_i$ are independent (not true in general).
Then $\hat{\theta}$ can be taken to be normally distributed by the central limit theorem (large M).
Result
E.g. for 95% confidence, z = 1.96
The interval gets smaller with increasing M and $\alpha$.
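
The result formula is missing from the transcript. A minimal sketch of the standard normal-approximation interval, mean plus or minus z times s / sqrt(M), assuming independent samples; `parametric_ci` and the data are hypothetical, and this is the textbook form rather than necessarily the slide's exact expression:

```python
import math
from statistics import NormalDist, mean, stdev


def parametric_ci(xs, alpha=0.05):
    """Normal-approximation CI for E(x): mean +/- z * s / sqrt(M)."""
    m = len(xs)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    half_width = z * stdev(xs) / math.sqrt(m)
    centre = mean(xs)
    return centre - half_width, centre + half_width


# Hypothetical scores, for illustration only.
X = [0.62, 0.71, 0.55, 0.80, 0.67, 0.74, 0.59, 0.69]
lo95, hi95 = parametric_ci(X, alpha=0.05)
lo99, hi99 = parametric_ci(X, alpha=0.01)
print((lo95, hi95), (lo99, hi99))
```

The half-width shrinks like 1 / sqrt(M), and a larger alpha (lower confidence) gives a smaller z, which matches the last bullet.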

12
Non-Parametric Method

Assume F(x) is available.

[Diagram: f(x), sample set X, additional sample sets, the resulting random variable and its density]

13
Non-Parametric Method

Fact: for large B we have
Define error to be
Calculate Dist(r)
Solution

14
Non-Parametric Method

Interval calculation: sorting and counting
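
A minimal sketch of the sort-and-count step, assuming F(x) is known (here a hypothetical Gaussian with mean 0.7 and standard deviation 0.1) and that the interval is read directly off the B sorted estimates; all names and parameter values are illustrative:

```python
import random


def sort_and_count_interval(values, alpha=0.05):
    """Sort the B values and cut off a fraction alpha/2 at each end."""
    ordered = sorted(values)
    b = len(ordered)
    k = int(b * alpha / 2)            # number of values discarded per end
    return ordered[k], ordered[b - 1 - k]


def simulate_estimates(draw, m, b, rng):
    """Draw B sample sets of size M from the known distribution; return the B sample means."""
    return [sum(draw(rng) for _ in range(m)) / m for _ in range(b)]


rng = random.Random(0)
estimates = simulate_estimates(lambda r: r.gauss(0.7, 0.1), m=50, b=2000, rng=rng)
low, high = sort_and_count_interval(estimates, alpha=0.05)
print(low, high)
```

No distributional assumption about the estimate is needed here; the quantiles come from counting positions in the sorted list.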

15
Bootstrap Method

F(x) is not available; all we have is X.
How do we generate the additional sample sets?
Solution (the bootstrap method): sampling with replacement (SR) from X.
Put the samples in a bag; draw one, record it, and put it back.
Draw M samples from X, B times. Some samples $X_i$ may not appear in every set.
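
The bag-drawing procedure can be sketched as follows; `bootstrap_sets` and the score values are hypothetical:

```python
import random


def bootstrap_sets(xs, b, rng):
    """Return B sets, each made of M = len(xs) draws with replacement from xs."""
    m = len(xs)
    return [[rng.choice(xs) for _ in range(m)] for _ in range(b)]


rng = random.Random(1)
X = [0.62, 0.71, 0.55, 0.80, 0.67, 0.74, 0.59, 0.69]   # hypothetical scores, M = 8
sets = bootstrap_sets(X, b=1000, rng=rng)

# Count the bootstrap sets in which at least one original sample is missing.
incomplete = sum(1 for s in sets if len(set(s)) < len(X))
print(incomplete, "of", len(sets), "sets miss at least one original sample")
```

Because each draw is returned to the bag, duplicates are common and most sets leave out some of the original samples.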

16
Bootstrap Method (Imperfections)

The $X_i$ are not independent.
In SR the dependence between samples is not replicated.
Effect of treating dependent samples as independent:
Variance of $\hat{\theta}$ is smaller
Leads to smaller CIs

17
Subset Bootstrap

Potential sources of dependency
All samples from the same person (e.g. multiple
fingers)
All samples from same biometric (e.g. finger)
Partition X into independent subsets
Apply SR on subsets.
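
A minimal sketch of the idea, assuming the subsets are lists of scores grouped by person and that the statistic of interest is the mean score; the data and function names are hypothetical. Plain SR over individual scores is included only for comparison:

```python
import random
from statistics import pstdev


def plain_bootstrap_means(scores, b, rng):
    """SR over individual scores, ignoring any grouping."""
    m = len(scores)
    return [sum(rng.choice(scores) for _ in range(m)) / m for _ in range(b)]


def subset_bootstrap_means(subsets, b, rng):
    """SR over whole subsets, so scores within a subset stay together."""
    n = len(subsets)
    means = []
    for _ in range(b):
        drawn = [rng.choice(subsets) for _ in range(n)]
        scores = [s for subset in drawn for s in subset]
        means.append(sum(scores) / len(scores))
    return means


# Hypothetical genuine scores, one list per person (scores within a person are similar).
by_person = [[0.81, 0.79, 0.83], [0.60, 0.62, 0.58], [0.71, 0.70, 0.73], [0.90, 0.92, 0.88]]
all_scores = [s for subset in by_person for s in subset]

rng = random.Random(2)
plain = plain_bootstrap_means(all_scores, b=500, rng=rng)
subset = subset_bootstrap_means(by_person, b=500, rng=rng)
print(pstdev(plain), pstdev(subset))
```

With strongly clustered scores like these, the subset version gives a visibly larger spread of the estimate, which is the effect the partitioning is meant to capture.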

18
Subset Bootstrap (An example)

Fingerprint database
P persons
c fingers per person $\Rightarrow D = cP$ fingers
d samples per finger
DB size: cPd
Matching pairs
d(d-1) per finger
cd(d-1) per person
cPd(d-1) = Dd(d-1) total
Using a symmetric or an asymmetric matcher does not make any difference [23].

19
Subset Bootstrap (An Example)

$X_1 \subset X_2$
$X_1$: P = 10, c = 2, D = 20, d = 8 $\Rightarrow$ M = 1120
$X_2$: P = 50, c = 2, D = 100, d = 8 $\Rightarrow$ M = 5600
Finger-based partition: set the subsets to be the samples from the same finger (i.e. D subsets of d(d-1) matching scores).
Person-based partition: set the subsets to be the samples from the same person (i.e. P subsets of cd(d-1) matching scores).
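
The counts for the two example sets follow from the formulas on the previous slide; a short check, with `matching_pair_counts` as a hypothetical helper:

```python
def matching_pair_counts(P, c, d):
    """Matching-pair counts for P persons, c fingers per person, d samples per finger."""
    D = c * P                      # total number of fingers
    per_finger = d * (d - 1)       # ordered pairs of distinct samples of one finger
    per_person = c * per_finger
    total = D * per_finger         # M, the number of genuine scores
    return {"D": D, "per_finger": per_finger, "per_person": per_person, "total": total}


x1 = matching_pair_counts(P=10, c=2, d=8)
x2 = matching_pair_counts(P=50, c=2, d=8)
print(x1)
print(x2)
```

For both sets the finger-based partition has D subsets of 56 scores and the person-based partition has P subsets of 112 scores.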

20
Subset Bootstrap (An Example)

We expect
CI1 (light gray) to be larger than CI2 (dark gray), because $X_1$ has a smaller number of samples
CI2 (dark gray) to be contained in CI1 (light gray), because $X_1 \subset X_2$
The intervals are larger for person-based partitioning, because there is dependency between fingers of the same person

21
CIs for FAR/FRR

Calculate CIs for each threshold $T = t_0$ and a given $\alpha$.

22
CI for FRR

Given genuine score set X
Generate
Calculate
Sort and count
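
The generate, calculate, sort-and-count steps can be sketched for a single threshold as below. This assumes plain SR over individual scores (so it ignores the dependence discussed earlier) and the empirical FRR as the statistic; the names, seed, and synthetic scores are hypothetical:

```python
import random


def frr(scores, t0):
    """Empirical FRR at threshold t0: fraction of genuine scores <= t0."""
    return sum(1 for s in scores if s <= t0) / len(scores)


def bootstrap_frr_ci(scores, t0, alpha=0.05, b=1000, rng=None):
    """Percentile CI for FRR(t0) from B bootstrap sets of the genuine scores."""
    rng = rng or random.Random()
    m = len(scores)
    # Generate B bootstrap sets and calculate FRR(t0) on each, then sort.
    estimates = sorted(
        frr([rng.choice(scores) for _ in range(m)], t0) for _ in range(b)
    )
    k = int(b * alpha / 2)         # count off alpha/2 of the estimates at each end
    return estimates[k], estimates[b - 1 - k]


rng = random.Random(3)
X = [rng.gauss(0.7, 0.1) for _ in range(200)]   # synthetic genuine scores
t0 = 0.6
low, high = bootstrap_frr_ci(X, t0, alpha=0.05, b=1000, rng=rng)
print(frr(X, t0), (low, high))
```

The same routine applies to FAR by swapping in the imposter scores and counting scores above the threshold.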

23
CI for FAR

Given imposter score set Y
Generate
Calculate
Sort and count

24
Subset Bootstrap for FAR

Imposter scores Y are not independent.
We are using multiple impressions of the same finger.
Let $I_{xk}$ be the kth finger impression from subject x; then $\mathrm{sim}(I_{a1}, I_{b1})$, $\mathrm{sim}(I_{a1}, I_{b2})$, $\mathrm{sim}(I_{a2}, I_{b3})$ are not statistically independent.
Use a finger only once: for D fingers we have only D/2 such pairs.
There is actually dependency between X and Y as well.

25
Subset Bootstrap for FAR

Fingerprint database
P persons
c fingers per person $\Rightarrow D = cP$ fingers
d samples per finger
DB size: cPd
Non-matching pairs
$N = d^2 D(D-1) = P\,[(dc)^2(P-1) + d^2 c(c-1)]$
$d^2(D-1)$ per finger
$(dc)^2(P-1) + d^2 c(c-1)$ per person
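
A quick numerical check that the finger-based and person-based counts agree; the helper name is hypothetical and the parameter values are borrowed from the earlier matching-pair example:

```python
def non_matching_pair_counts(P, c, d):
    """Non-matching-pair counts for P persons, c fingers per person, d samples per finger."""
    D = c * P
    per_finger = d * d * (D - 1)                                    # one finger against all other fingers
    per_person = (d * c) ** 2 * (P - 1) + d * d * c * (c - 1)       # other persons + own other fingers
    return {
        "per_finger": per_finger,
        "per_person": per_person,
        "total_by_finger": D * per_finger,
        "total_by_person": P * per_person,
    }


counts = non_matching_pair_counts(P=10, c=2, d=8)
print(counts)
```

Both totals come out equal, since $cP - 1 = c(P-1) + (c-1)$.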

26
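Subset Bootstrap for FAR

[Diagram: DB partition into $I_1, \ldots, I_i, \ldots, I_N$; matching $I_i$ against the other parts gives the subsets $Y_1 = I_i \times I_1, \ldots, Y_{N-1} = I_i \times I_N$]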

Finger (N = D): take $I_i$ (d elements) and match it against each $I_k$, $k \ne i$ ($d^2$ pairs each); then we have $d^2(D-1)$ pairs. Repeat with all $I_i$ to construct the subsets $Y_k$.
Person (N = P): take $I_i$ (cd elements) and match it against each $I_k$, $k \ne i$ ($(dc)^2$ pairs each); then we have $(dc)^2(P-1)$ pairs. Inside $I_i$ we have $d^2 c(c-1)$ pairs. Repeat with all $I_i$ to construct the subsets $Y_k$.
Not completely independent: we use each $I_i$ many times.

27
Subset Bootstrap for FRR

Person subset is a better estimate

28
How good are the CIs?

There exists a true confidence interval (At the
beginning we assumed F(x) is known)
The CI we calculate is just one estimate.
How accurate is that estimate?

29
How good are the CIs?

We estimate E(x)
Ideal test: assume F(x) is available
Generate
Calculate
Assume and test if

30
How good are the CIs?

Practical test (for comparison)
1. Randomly split X into two subsets $X_a$ and $X_b$.
2. Calculate the estimate $\hat{\theta}_b$ from $X_b$ and the interval $CI_a$ from $X_a$.
3. Test whether $\hat{\theta}_b$ falls inside $CI_a$.
Repeat 1-3 many times and count the number of hits, i.e. estimate the probability of falling into $CI_a$.
The hit rate is not equal to the confidence. Assume the estimates have a normal distribution.
The higher the hit rate, the better the estimates.
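
A minimal sketch of the split-and-test loop. It assumes that the quantity tested is the mean of one half against a plain bootstrap percentile interval computed from the other half; the transcript does not preserve which estimate is tested, so that pairing, the function names, and the synthetic data are all assumptions:

```python
import random


def bootstrap_mean_ci(xs, alpha, b, rng):
    """Percentile CI for the mean from B bootstrap sets of xs."""
    m = len(xs)
    means = sorted(sum(rng.choice(xs) for _ in range(m)) / m for _ in range(b))
    k = int(b * alpha / 2)
    return means[k], means[b - 1 - k]


def hit_rate(xs, trials, alpha, b, rng):
    """Fraction of random splits in which the mean of X_b falls inside CI_a."""
    hits = 0
    for _ in range(trials):
        shuffled = list(xs)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        xa, xb = shuffled[:half], shuffled[half:]          # step 1: random split
        low, high = bootstrap_mean_ci(xa, alpha, b, rng)   # step 2: CI from X_a
        theta_b = sum(xb) / len(xb)                        # step 2: estimate from X_b
        if low <= theta_b <= high:                         # step 3: test for a hit
            hits += 1
    return hits / trials


rng = random.Random(4)
X = [rng.gauss(0.7, 0.1) for _ in range(60)]   # synthetic scores
rate = hit_rate(X, trials=100, alpha=0.05, b=200, rng=rng)
print(rate)
```

Since both the interval and the tested estimate are random, the hit rate is expected to fall below the nominal 95%; it is useful as a relative score for comparing CI methods rather than as a coverage estimate.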

31
How good are the CIs?