Chapter 15 System Errors Revisited - PowerPoint PPT Presentation

About This Presentation
Slides: 32
Provided by: aer9
Learn more at: https://www.cse.unr.edu
Transcript and Presenter's Notes

1
Chapter 15: System Errors Revisited
  • Ali Erol
  • 10/19/2005

2
System Errors Revisited
  • Quantify the accuracy of the FAR and FRR estimates.
  • Confidence intervals are a well-known technique used
    in statistical analysis.
  • See references [22, 23].
  • The algorithm of the first three authors [23] has been
    experimentally shown to provide better
    confidence interval estimates.

3
FAR/FRR
  • Definition
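  • $\mathrm{FRR}(x) = \mathrm{Prob}(s_m \le x \mid H_0) = F(x)$
  • $\mathrm{FAR}(y) = \mathrm{Prob}(s_n > y \mid H_a) = 1 - \mathrm{Prob}(s_n \le y \mid H_a) = 1 - G(y)$
  • We need:
  • $F(x) = \mathrm{Dist}(x)$: genuine (matching) score distribution function (DF)
  • $G(y) = \mathrm{Dist}(y)$: imposter (non-matching) score distribution function (DF)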

4
FAR/FRR
  • Instead we have
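  • Set of genuine scores $X = \{X_1, X_2, \ldots, X_M\}$
  • Set of imposter scores $Y = \{Y_1, Y_2, \ldots, Y_N\}$
  • We estimate $\widehat{\mathrm{FRR}}(x) = \hat F(x) = \frac{1}{M}\,\#\{X_i \le x\}$ and $\widehat{\mathrm{FAR}}(y) = 1 - \hat G(y) = \frac{1}{N}\,\#\{Y_j > y\}$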
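As a minimal sketch of estimating the error rates from finite score sets, the empirical FRR and FAR at a threshold can be computed by counting. The function names and the score values are made up for illustration; the convention (reject when a score is at or below the threshold) follows the FRR/FAR definitions on the previous slide.

```python
def empirical_frr(genuine_scores, threshold):
    """Fraction of genuine (matching) scores at or below the threshold."""
    return sum(s <= threshold for s in genuine_scores) / len(genuine_scores)


def empirical_far(impostor_scores, threshold):
    """Fraction of imposter (non-matching) scores above the threshold."""
    return sum(s > threshold for s in impostor_scores) / len(impostor_scores)


# Made-up scores, for illustration only.
X = [0.91, 0.85, 0.40, 0.77, 0.95]   # genuine scores, M = 5
Y = [0.10, 0.35, 0.55, 0.20]         # imposter scores, N = 4

frr = empirical_frr(X, 0.5)   # one of five genuine scores is <= 0.5
far = empirical_far(Y, 0.5)   # one of four imposter scores is > 0.5
```

With so few samples, moving a single score across the threshold changes the estimate by 1/M or 1/N, which is why the accuracy of these estimates needs to be quantified.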

5
Problem
  • What is the accuracy of these error rate estimates? It depends on:
  • The number of biometric samples
  • The quality of the samples
  • Data collection procedure (e.g. 10 consecutive
    samples)
  • Subjects involved, the acquisition device etc.

6
An Estimation Problem
  • Given
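  • $x$: a random variable ($F(x)$ denotes $\mathrm{Dist}(x)$)
  • $X = \{X_1, X_2, \ldots, X_M\}$: sample set
  • Estimate $\theta = E(x)$
  • Solution: $\hat\theta = \frac{1}{M}\sum_{i=1}^{M} X_i$ (unbiased estimator)
  • Error: $r = \hat\theta - \theta$
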
7
Biased/Unbiased Estimators
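  • For an unbiased estimator we have $E(\hat\theta) = \theta$
  • Example: Gaussian model. Estimate the mean $\theta_1$ and the
    variance $\theta_2$ using the maximum likelihood criterion,
    i.e. maximize $\mathrm{Prob}(X \mid \theta_1, \theta_2)$
  • $\hat\theta_1 = \frac{1}{M}\sum_i X_i$ (unbiased estimator)
  • $\hat\theta_2 = \frac{1}{M}\sum_i (X_i - \hat\theta_1)^2$ (biased estimator)
  • $\frac{1}{M-1}\sum_i (X_i - \hat\theta_1)^2$ (unbiased estimator)
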
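The difference between the biased and the unbiased variance estimator can be checked by simulation. The sketch below is illustrative only: it assumes a standard Gaussian (true variance 1), a small sample size M = 5, and made-up function names. Averaged over many sample sets, the maximum likelihood estimate (divide by M) settles near (M-1)/M, while the corrected estimate (divide by M-1) settles near 1.

```python
import random


def ml_variance(xs):
    """Maximum likelihood variance estimate: divides by M (biased)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def unbiased_variance(xs):
    """Corrected variance estimate: divides by M - 1 (unbiased)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def average_estimates(M=5, trials=20000, seed=0):
    """Average both estimators over many Gaussian sample sets of size M."""
    rng = random.Random(seed)
    ml_total = unbiased_total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(M)]
        ml_total += ml_variance(xs)
        unbiased_total += unbiased_variance(xs)
    return ml_total / trials, unbiased_total / trials


avg_ml, avg_unbiased = average_estimates()
print(f"ML average: {avg_ml:.3f}, unbiased average: {avg_unbiased:.3f}")
```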
8
Confidence Interval
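  • Assume $F(x)$ is given; then $\mathrm{Dist}(r)$ can be
    calculated
  • $r$ is a function of $\hat\theta$, which is a function of $X$
  • With $(1-\alpha) \cdot 100\%$ certainty (next slide):
  • $r \in [\varphi_1(\alpha, X), \varphi_2(\alpha, X)]$
  • Taking $r = \hat\theta - \theta$, this leads to a $(1-\alpha) \cdot 100\%$ confidence interval for
    $\theta$ given by $\theta \in [\hat\theta - \varphi_2(\alpha, X),\ \hat\theta - \varphi_1(\alpha, X)]$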

9
Confidence Interval
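  • Example:
  • Discard $\alpha/2$ of the probability mass at the lower and upper ends of $\mathrm{Dist}(r)$
  • Find the $r$ values at the interval
    boundaries (called quantiles):
    $\mathrm{Prob}(q(\alpha/2) \le r \le q(1-\alpha/2)) = 1 - \alpha$

[Figure: density Dist(r) plotted against r]
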
10
Confidence Interval
  • Interpretation
  • Generate sample sets X from F(x)
  • Calculate confidence intervals for each X
  • $(1-\alpha) \cdot 100\%$ of these intervals contain $\theta$.
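This interpretation can be illustrated by simulation. The sketch below assumes, purely for illustration, that F(x) is a Gaussian with known mean and standard deviation, so that the error r (taken here as the estimate minus the true value) has a known normal distribution whose quantiles are available exactly. All parameter values and the function name are made up.

```python
import random
from statistics import NormalDist


def coverage(theta=0.3, sigma=1.0, M=50, alpha=0.1, trials=4000, seed=1):
    """Fraction of sample sets whose confidence interval contains theta."""
    rng = random.Random(seed)
    # Assumption: F is N(theta, sigma^2), so r = theta_hat - theta
    # is distributed as N(0, sigma^2 / M).
    dist_r = NormalDist(0.0, sigma / M ** 0.5)
    q_lo = dist_r.inv_cdf(alpha / 2)        # lower quantile of Dist(r)
    q_hi = dist_r.inv_cdf(1 - alpha / 2)    # upper quantile of Dist(r)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(theta, sigma) for _ in range(M)]
        theta_hat = sum(xs) / M
        lo, hi = theta_hat - q_hi, theta_hat - q_lo
        hits += lo <= theta <= hi
    return hits / trials


print(f"fraction of intervals containing theta: {coverage():.3f}")
```

For alpha = 0.1 the printed fraction should be close to 0.9, up to simulation noise.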

11
Parametric Method
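  • The $X_i$ are identically distributed
  • Assume the $X_i$ are independent (not true in general)
  • Then $\hat\theta$ can be taken to be normally
    distributed by the central limit theorem (large
    $M$).
  • Result: $\theta \in [\hat\theta - z\,\hat\sigma/\sqrt{M},\ \hat\theta + z\,\hat\sigma/\sqrt{M}]$, where $\hat\sigma$ is the sample standard deviation
  • E.g. for 95% confidence, $z = 1.96$
  • The interval gets smaller with increasing $M$ and $\alpha$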
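A minimal sketch of the parametric interval, assuming independent samples and the normal approximation for the sample mean. The function name and the data are hypothetical; `NormalDist().inv_cdf` from the standard library supplies the value of z for a given confidence level.

```python
from statistics import NormalDist, mean, stdev


def parametric_ci(xs, alpha=0.05):
    """Normal-approximation confidence interval for the mean of xs."""
    M = len(xs)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    half_width = z * stdev(xs) / M ** 0.5
    centre = mean(xs)
    return centre - half_width, centre + half_width


xs = [0.2, 0.4, 0.4, 0.5, 0.9, 0.3, 0.6, 0.7]   # made-up samples
lo95, hi95 = parametric_ci(xs, alpha=0.05)
lo80, hi80 = parametric_ci(xs, alpha=0.20)       # larger alpha, narrower interval
lo_big, hi_big = parametric_ci(xs * 4, alpha=0.05)  # more samples, narrower interval
```

Repeating the data four times is only a device to show the effect of M; repeated samples are of course not independent.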

12
Non-Parametric Method
  • Assume F(x) is available.

[Figure: density f(x) of the random variable; the sample set X and additional sample sets are drawn from it]

13
Non-Parametric Method
  • FACT For large B we have
  • Define error to be
  • Calculate Dist(r)
  • Solution

14
Non-Parametric Method
  • Interval calculation: sorting and counting
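The sorting-and-counting step can be sketched as follows. The formulas on the original slides did not survive in this transcript, so the details here are assumptions: F(x) is represented by a hypothetical `draw` function, the error of each additional sample set is measured against the average of all B estimates (standing in for the true value when B is large), and the interval is read off the sorted errors.

```python
import random


def quantile_interval(errors, alpha):
    """Sort the errors and count off alpha/2 of them at each end."""
    ordered = sorted(errors)
    B = len(ordered)
    k = int(B * alpha / 2)                 # number discarded at each end
    return ordered[k], ordered[B - 1 - k]


def monte_carlo_ci(draw, xs, B=2000, alpha=0.1, seed=0):
    """CI for the mean, using B additional sample sets drawn from F."""
    rng = random.Random(seed)
    M = len(xs)
    theta_hat = sum(xs) / M
    estimates = [sum(draw(rng) for _ in range(M)) / M for _ in range(B)]
    centre = sum(estimates) / B            # stands in for theta (large B)
    errors = [e - centre for e in estimates]
    q_lo, q_hi = quantile_interval(errors, alpha)
    return theta_hat - q_hi, theta_hat - q_lo


def draw(rng):
    """Hypothetical F(x): a Beta(2, 5) distribution, for illustration."""
    return rng.betavariate(2, 5)


sample_rng = random.Random(42)
xs = [draw(sample_rng) for _ in range(40)]
lo, hi = monte_carlo_ci(draw, xs)
```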

15
Bootstrap Method
  • $F(x)$ is not available; all we have is $X$.
  • How do we generate the additional sample sets?
  • Solution (i.e. the bootstrap method): sampling with
    replacement from $X$.
  • Put the samples in a bag; draw one, record it and put it
    back.
  • Draw $M$ samples from $X$, $B$ times. Some samples $X_i$
    may be missing from a given set.
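A minimal bootstrap sketch using only the standard library. `random.Random.choices` draws with replacement, which is the "bag" procedure described above. The interval construction (quantiles of the bootstrap errors around the original estimate) is one common variant and is an assumption here, as are the function names and data.

```python
import random


def bootstrap_sets(xs, B, rng):
    """B resampled sets, each of M samples drawn with replacement from xs."""
    M = len(xs)
    return [rng.choices(xs, k=M) for _ in range(B)]


def bootstrap_ci(xs, B=1000, alpha=0.1, seed=0):
    """Bootstrap confidence interval for the mean of xs."""
    rng = random.Random(seed)
    theta_hat = sum(xs) / len(xs)
    estimates = [sum(s) / len(s) for s in bootstrap_sets(xs, B, rng)]
    errors = sorted(e - theta_hat for e in estimates)
    k = int(B * alpha / 2)                      # count off alpha/2 per end
    q_lo, q_hi = errors[k], errors[B - 1 - k]
    return theta_hat - q_hi, theta_hat - q_lo


data_rng = random.Random(7)
xs = [data_rng.random() for _ in range(30)]     # made-up samples
lo, hi = bootstrap_ci(xs)

# A single resampled set usually repeats some samples and omits others.
one_set = bootstrap_sets(xs, 1, random.Random(0))[0]
distinct = len(set(one_set))
```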

16
Bootstrap Method (Imperfections)
  • The $X_i$ are not independent.
  • In sampling with replacement (SR), the dependence between samples is not
    replicated.
  • Effect of treating dependent samples as independent:
  • The variance of $\hat\theta$ is smaller
  • This leads to smaller CIs

17
Subset Bootstrap
  • Potential sources of dependency
  • All samples from the same person (e.g. multiple
    fingers)
  • All samples from same biometric (e.g. finger)
  • Partition X into independent subsets
  • Apply SR to the subsets, i.e. resample whole subsets rather than individual samples.
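The effect of resampling subsets instead of individual scores can be sketched with synthetic data. Everything in this example is an assumption for illustration: 30 hypothetical subjects, 10 scores each, where all scores of one subject share a common offset and are therefore dependent. Resampling whole subsets keeps that dependence, so the spread of the bootstrap estimates comes out larger than with plain resampling.

```python
import random
from statistics import stdev


def plain_bootstrap_means(scores, B, rng):
    """Resample individual scores with replacement."""
    n = len(scores)
    return [sum(rng.choices(scores, k=n)) / n for _ in range(B)]


def subset_bootstrap_means(subsets, B, rng):
    """Resample whole subsets with replacement, then pool their scores."""
    n = len(subsets)
    means = []
    for _ in range(B):
        drawn = rng.choices(subsets, k=n)
        pooled = [s for subset in drawn for s in subset]
        means.append(sum(pooled) / len(pooled))
    return means


rng = random.Random(0)
subsets = []
for _ in range(30):                       # 30 hypothetical subjects
    offset = rng.gauss(0.0, 1.0)          # shared by all scores of a subject
    subsets.append([offset + rng.gauss(0.0, 0.1) for _ in range(10)])
all_scores = [s for subset in subsets for s in subset]

plain_spread = stdev(plain_bootstrap_means(all_scores, 500, rng))
subset_spread = stdev(subset_bootstrap_means(subsets, 500, rng))
print(f"plain: {plain_spread:.3f}, subset: {subset_spread:.3f}")
```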

18
Subset Bootstrap (An example)
  • Fingerprint database:
  • $P$ persons
  • $c$ fingers per person $\Rightarrow D = cP$ fingers
  • $d$ samples per finger
  • DB size: $cPd$
  • Matching pairs:
  • $d(d-1)$ per finger
  • $cd(d-1)$ per person
  • $cPd(d-1) = Dd(d-1)$ total
  • Using a symmetric or an asymmetric matcher does not
    make any difference [23].

19
Subset Bootstrap (An Example)
  • $X_1 \subset X_2$
  • $X_1$: $P = 10$, $c = 2$, $D = 20$, $d = 8$ $\Rightarrow M = 1120$
  • $X_2$: $P = 50$, $c = 2$, $D = 100$, $d = 8$ $\Rightarrow M = 5600$
  • Finger based partition: set the subsets to be the
    samples from the same finger (i.e. $D$ subsets of
    $d(d-1)$ matching scores)
  • Person based partition: set the subsets to be the
    samples from the same person (i.e. $P$ subsets of
    $cd(d-1)$ matching scores)
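The counts in this example follow directly from the matching-pair formulas and can be checked with a few lines of arithmetic. The function name is made up; the formulas count ordered pairs, as on the slide.

```python
def matching_pair_counts(P, c, d):
    """Matching-pair counts for P persons, c fingers each, d samples per finger."""
    D = c * P                              # number of fingers
    per_finger = d * (d - 1)               # ordered pairs of samples of one finger
    return {
        "D": D,
        "db_size": D * d,
        "per_finger": per_finger,          # subset size, finger based partition
        "per_person": c * per_finger,      # subset size, person based partition
        "total": D * per_finger,           # M
    }


x1 = matching_pair_counts(P=10, c=2, d=8)
x2 = matching_pair_counts(P=50, c=2, d=8)
print(x1["total"], x2["total"])   # prints: 1120 5600
```

In both data sets the finger based subsets hold 56 scores each and the person based subsets hold 112; only the number of subsets differs.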

20
Subset Bootstrap (An Example)
  • We expect:
  • $CI_1$ (light gray) to be larger than $CI_2$ (dark
    gray),
  • because $X_1$ has a smaller number of samples
  • $CI_2$ (dark gray) to be contained in $CI_1$ (light
    gray),
  • because $X_1 \subset X_2$
  • The intervals are larger for person based
    partitioning:
  • there is dependency between fingers of the same
    person

21
CIs for FAR/FRR
  • Calculate CIs for each threshold $T = t_0$ and a given
    $\alpha$

22
CI for FRR
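  • Given the genuine score set $X$
  • Generate $B$ bootstrap sets from $X$
  • Calculate the FRR estimate at $t_0$ for each bootstrap set
  • Sort the estimates and count to find the interval boundaries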

23
CI for FAR
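  • Given the imposter score set $Y$
  • Generate $B$ bootstrap sets from $Y$
  • Calculate the FAR estimate at $t_0$ for each bootstrap set
  • Sort the estimates and count to find the interval boundaries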
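The generate, calculate, sort-and-count procedure for an error rate at a fixed threshold can be sketched as below. This is an illustration under assumptions, not the method of [23]: it resamples individual scores (no subsets), takes the interval boundaries directly from the sorted bootstrap rates, and uses synthetic Gaussian scores and made-up names.

```python
import random


def rate_ci(scores, is_error, B=1000, alpha=0.1, seed=0):
    """Bootstrap interval for the fraction of scores that count as errors."""
    rng = random.Random(seed)
    n = len(scores)
    rates = sorted(
        sum(is_error(s) for s in rng.choices(scores, k=n)) / n
        for _ in range(B)
    )
    k = int(B * alpha / 2)                 # count off alpha/2 at each end
    return rates[k], rates[B - 1 - k]


data_rng = random.Random(1)
X = [data_rng.gauss(0.7, 0.15) for _ in range(300)]   # synthetic genuine scores
Y = [data_rng.gauss(0.3, 0.15) for _ in range(300)]   # synthetic imposter scores
t0 = 0.5

frr_hat = sum(s <= t0 for s in X) / len(X)
far_hat = sum(s > t0 for s in Y) / len(Y)
frr_lo, frr_hi = rate_ci(X, lambda s: s <= t0)   # FRR: genuine score rejected
far_lo, far_hi = rate_ci(Y, lambda s: s > t0)    # FAR: imposter score accepted
```

Repeating the call for a range of values of `t0` gives a confidence band around the estimated FRR and FAR curves.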

24
Subset Bootstrap for FAR
  • Imposter scores $Y$ are not independent
  • We are using multiple impressions of the same
    finger.
  • Let $I_{xk}$ be the $k$-th finger impression from subject $x$;
    then $\mathrm{sim}(I_{a1}, I_{b1})$, $\mathrm{sim}(I_{a1}, I_{b2})$, $\mathrm{sim}(I_{a2}, I_{b3})$ are
    not statistically independent
  • If each finger is used only once, $D$ fingers give
    only $D/2$ such pairs
  • There is actually dependency between $X$ and $Y$
25
Subset Bootstrap for FAR
  • Fingerprint database:
  • $P$ persons
  • $c$ fingers per person $\Rightarrow D = cP$ fingers
  • $d$ samples per finger
  • DB size: $cPd$
  • Non-matching pairs:
  • $N = d^2 D(D-1) = P\,[(dc)^2(P-1) + d^2 c(c-1)]$
  • $d^2(D-1)$ per finger
  • $(dc)^2(P-1) + d^2 c(c-1)$ per person
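The two ways of counting the non-matching pairs (finger by finger, or person by person) should give the same total. A quick arithmetic check, with a made-up function name and the database sizes used earlier in the chapter:

```python
def non_matching_pair_counts(P, c, d):
    """Non-matching (ordered) pair counts for P persons, c fingers, d samples."""
    D = c * P
    per_finger = d ** 2 * (D - 1)
    # Pairs against other persons, plus pairs between different
    # fingers of the same person.
    per_person = (d * c) ** 2 * (P - 1) + d ** 2 * c * (c - 1)
    return {
        "per_finger": per_finger,
        "per_person": per_person,
        "total_by_finger": D * per_finger,
        "total_by_person": P * per_person,
    }


counts = non_matching_pair_counts(P=10, c=2, d=8)
print(counts["total_by_finger"], counts["total_by_person"])   # prints: 24320 24320
```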

26
Subset Bootstrap for FAR
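[Figure: DB partition into subsets $I_1, \ldots, I_i, \ldots, I_N$; matching $I_i$ against the other subsets gives the score sets $Y_1 = I_i \times I_1, \ldots, Y_{N-1} = I_i \times I_N$]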
  • Finger ($N = D$): take $I_i$ ($d$ elements) and match it
    against each $I_k$, $k \ne i$ ($d^2$ pairs each), which gives $d^2(D-1)$
    pairs. Repeat with all $I_i$ to construct the subsets
    $Y_k$.
  • Person ($N = P$): take $I_i$ ($cd$ elements) and match it
    against each $I_k$, $k \ne i$ ($(dc)^2$ pairs each), which gives
    $(dc)^2(P-1)$ pairs. Inside $I_i$ we have $d^2 c(c-1)$
    pairs. Repeat with all $I_i$ to construct the
    subsets $Y_k$.
  • Not completely independent: we use each $I_i$ many times.
27
Subset Bootstrap for FRR
  • Person subset is a better estimate

28
How good are the CIs?
  • There exists a true confidence interval (At the
    beginning we assumed F(x) is known)
  • The CI we calculate is just one estimate.
  • How accurate is that estimate?

29
How good are the CIs?
  • We estimate E(x)
  • Ideal test: assume $F(x)$ is available
  • Generate
  • Calculate
  • Assume and test if

30
How good are the CIs?
  • Practical test (for comparison):
  • 1. Randomly split $X$ into two subsets $X_a$ and $X_b$
  • 2. Calculate $\hat\theta_b$ and $CI_a$
  • 3. Test whether $\hat\theta_b \in CI_a$
  • Repeat steps 1-3 many times and count the number of
    hits, i.e. estimate the probability of falling into $CI_a$
  • The hit rate is not equal to the confidence. Assume
    the estimates have a normal distribution.
  • The higher the hit rate, the better the
    estimates are.
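The split test can be sketched as follows. Several details are assumptions for illustration: the estimate from one half is tested against the interval from the other half, the interval is the simple normal-approximation one rather than a bootstrap interval, and the data are synthetic. The second function gives the hit rate one would expect if both half-sample estimates were independent and normal with equal variance, which shows why the hit rate falls below the nominal confidence.

```python
import random
from statistics import NormalDist, mean, stdev


def hit_rate(xs, alpha=0.1, repeats=1000, seed=0):
    """Fraction of random splits where mean(X_b) falls inside CI_a."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = len(xs) // 2
    hits = 0
    for _ in range(repeats):
        shuffled = rng.sample(xs, len(xs))          # random split
        xa, xb = shuffled[:half], shuffled[half:]
        half_width = z * stdev(xa) / len(xa) ** 0.5
        hits += abs(mean(xb) - mean(xa)) <= half_width
    return hits / repeats


def expected_hit_rate_normal(alpha):
    """Hit rate if both estimates are independent, normal, equal variance."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # The difference of two such estimates has sqrt(2) times the spread.
    return 2 * NormalDist().cdf(z / 2 ** 0.5) - 1


data_rng = random.Random(3)
xs = [data_rng.gauss(0.0, 1.0) for _ in range(400)]   # synthetic samples
observed = hit_rate(xs)
expected = expected_hit_rate_normal(0.1)
print(f"observed: {observed:.3f}, expected under normality: {expected:.3f}")
```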

31
How good are the CIs?
  • $\alpha = 0.1$
  • Person based partitioning provides more accurate
    confidence intervals
  • 73.10 is very close to the expected value