Title: Keystroke Biometric Recognition Studies on Long-Text Input Under Ideal and Application-Oriented Conditions
1 Keystroke Biometric Recognition Studies on Long-Text Input Under Ideal and Application-Oriented Conditions
- Mary Villani, Charles Tappert, and Sung-Hyuk Cha
2 Objective
- For long-text input of 600 keystrokes
- Determine the viability of the keystroke biometric
- Two independent variables
- Different entry modes: copy and free text
- Different keyboards: desktop and laptop
3 Advantages of Keystroke Biometric
- Keyboards commonly used
- Not intrusive
- Inexpensive
- Can Frequently Re-authenticate the User
4 Keystroke Biometric System Components
- Data Capture Applet
- Feature Extractor
- Pattern Classifier
5 Data Capture Applet
6 Sample Raw Feature Data
Sample raw feature data file for "Hello World"
7 239 Feature Measurements
- 78 Key Press Duration Measures
- (39 means and 39 standard deviations)
- 70 Key Transition Type 1 Measures
- (35 means and 35 standard deviations)
- 70 Key Transition Type 2 Measures
- (35 means and 35 standard deviations)
- 21 Other Measures (percentages and rates)
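The duration and transition measures above can be illustrated with a minimal sketch. The event layout `(key, press_ms, release_ms)`, the sample timings, and all function names are hypothetical. The definitions of the two transition types (type 1 as release-to-press, type 2 as press-to-press) are an assumption, since this transcript does not spell them out.

```python
from statistics import mean, pstdev

# Hypothetical raw events in typing order: (key, press_ms, release_ms).
events = [
    ("h", 0, 90),
    ("e", 150, 230),
    ("l", 300, 380),
    ("l", 450, 520),
    ("o", 600, 700),
]

# Feature count stated on the slide: a mean and a standard deviation for each
# of 39 duration, 35 type 1, and 35 type 2 features, plus 21 other measures.
TOTAL_FEATURES = 2 * 39 + 2 * 35 + 2 * 35 + 21


def durations(events):
    """Key press duration: release time minus press time, per keystroke."""
    return [release - press for _, press, release in events]


def transitions_type1(events):
    """Assumed type 1: release of one key to press of the next key."""
    return [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]


def transitions_type2(events):
    """Assumed type 2: press of one key to press of the next key."""
    return [nxt[1] - cur[1] for cur, nxt in zip(events, events[1:])]


def mean_and_std(samples):
    """Return the (mean, population standard deviation) pair for one feature."""
    return mean(samples), pstdev(samples)
```

In a full extractor, each of the 39 duration features and 35 transition features would collect its own sample list (for example, all occurrences of one letter) before `mean_and_std` is applied.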
8 Type 1 and 2 Transition Measures
9 Key Press Duration Features and Fallback Hierarchy
Hierarchy tree for the 39 duration features (each
oval), each represented by a mean and a standard
deviation.
10 Key Transition Features and Fallback Hierarchy
Hierarchy tree for the 35 transition features
(each oval), each represented by a mean and a
standard deviation for each of the type 1 and
type 2 transitions.
11 Fallback for Few Samples
- Mean and standard deviation computation when the number of samples n(i) is less than k_fallback-threshold
- Similar to NLP backoff statistics for n-grams
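The fallback idea can be sketched as follows. The blending formula is an assumed form (a weighted average of the feature's own mean and its parent's mean in the hierarchy), not the formula from the slide, which this transcript does not reproduce. The names `fallback_mean`, `k_threshold`, and `k_weight` and their default values are hypothetical.

```python
def fallback_mean(n, mu, mu_parent, k_threshold=5, k_weight=1.0):
    """Blend a sparse feature's mean with its parent node's mean.

    Assumed form: when the feature has fewer than k_threshold samples,
    return a weighted average in which the feature's own mean carries
    weight n and the parent's mean carries weight k_weight. With enough
    samples, the feature's own mean is returned unchanged.
    """
    if n >= k_threshold:
        return mu
    return (n * mu + k_weight * mu_parent) / (n + k_weight)
```

With no samples at all (`n = 0`) the result is exactly the parent's mean, and as `n` grows the feature's own statistics dominate, which is the same behavior backoff gives for rare n-grams.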
12 Two Preprocessing Steps
- Outlier removal
- Remove samples more than 2σ from µ
- Prevents feature skewing from pauses
- Standardization
- Scales to range 0-1 to give roughly equal weight
to each measure
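A minimal sketch of the two preprocessing steps is shown below. It assumes a single outlier-removal pass and min-max scaling as the way to reach the 0-1 range. Both choices, and the function names, are illustrative assumptions rather than details confirmed by the slides.

```python
from statistics import mean, pstdev


def remove_outliers(samples, n_sigma=2.0):
    """Drop samples more than n_sigma standard deviations from the mean."""
    mu = mean(samples)
    sigma = pstdev(samples)
    if sigma == 0:
        return list(samples)  # all samples identical, nothing to remove
    return [x for x in samples if abs(x - mu) <= n_sigma * sigma]


def standardize(values):
    """Min-max scale one feature's values into the range 0-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature carries no information
    return [(v - lo) / (hi - lo) for v in values]


# A long pause (1000 ms) among otherwise similar timings is removed.
cleaned = remove_outliers([100, 105, 95, 102, 98, 1000])
```

Outlier removal works on the raw timing samples of one feature for one text, while standardization works on the values of one feature across all texts.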
13 Pattern Classifier
- Nearest Neighbor Classifier using Euclidean
Distance
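As an illustration, a nearest neighbor classifier over standardized feature vectors might look like the sketch below. The two-dimensional vectors and the subject labels are made up for brevity. The real vectors would hold the 239 measurements.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(query, training):
    """Return the label of the training vector closest to the query.

    training is a list of (label, feature_vector) pairs.
    """
    label, _ = min(training, key=lambda item: euclidean(query, item[1]))
    return label


# Hypothetical standardized feature vectors for two subjects.
training = [("alice", [0.1, 0.2]), ("bob", [0.8, 0.9])]
predicted = nearest_neighbor([0.15, 0.25], training)
```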
14 Experimental Design: Six Main Experiments per Six Arrows
15 Experimental Design: Keyboards (independent variable 1)
- Desktop keyboards: mostly (100) Dell desktops in a classroom environment
- Laptop keyboards: about 90 Dell laptops, some IBM, HP, Apple (greater variety of laptop than desktop keyboards)
16 Experimental Design: Input Modes (independent variable 2)
- Copy task: input of a specified text of about 600 keystrokes (with corrections)
- Free text: input created as arbitrary emails (at least 600 keystrokes)
17 Subject Participation
18 Participation By Experiment
Each subject entered 5 texts in at least two quadrants. A total of 36 participated in all four quadrants.
[Figure: quadrant diagram of Desktop/Laptop by Copy/Free Text, with arrows numbered 1-6 and labeled with subject counts of 52, 40, 47, 93, 41, and 40 subjects]
19 Five Sub Experiments for Each of the Six Arrows
[Figure: arrow between two quadrants, labeled a through e]
- a. Training and testing on data in quadrant at first end of arrow (leave-one-out procedure)
- b. Training and testing on data in quadrant at second end of arrow (leave-one-out procedure)
- c. Combining data at each arrow end (leave-one-out procedure)
- d. Training on first end, testing on second
- e. Training on second end, testing on first
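The five sub-experiments can be sketched with two evaluation routines: leave-one-out for a, b, and c, and a separate train/test split for d and e. The data below is invented, and the function names are hypothetical. The nearest neighbor rule is repeated here only to keep the example self-contained.

```python
import math


def nearest_label(query, training):
    """Label of the training vector with the smallest Euclidean distance."""
    def distance(item):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(query, item[1])))
    return min(training, key=distance)[0]


def leave_one_out_accuracy(samples):
    """Sub-experiments a, b, c: classify each sample against all the others."""
    correct = 0
    for i, (label, vector) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_label(vector, rest) == label:
            correct += 1
    return correct / len(samples)


def train_test_accuracy(train, test):
    """Sub-experiments d, e: train on one arrow end, test on the other."""
    correct = sum(nearest_label(v, train) == label for label, v in test)
    return correct / len(test)


# Hypothetical samples from the two quadrants at the ends of one arrow.
first_end = [("s1", [0.10, 0.10]), ("s1", [0.12, 0.10]),
             ("s2", [0.90, 0.90]), ("s2", [0.88, 0.90])]
second_end = [("s1", [0.20, 0.15]), ("s1", [0.22, 0.15]),
              ("s2", [0.80, 0.85]), ("s2", [0.78, 0.85])]

acc_a = leave_one_out_accuracy(first_end)               # a
acc_b = leave_one_out_accuracy(second_end)              # b
acc_c = leave_one_out_accuracy(first_end + second_end)  # c
acc_d = train_test_accuracy(first_end, second_end)      # d
acc_e = train_test_accuracy(second_end, first_end)      # e
```

Leave-one-out needs at least two samples per subject in the pool, otherwise a held-out sample has no same-subject neighbor to match.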
20 Results, Experiment 1: 36 subjects participated in all quadrants
21 Results, Experiment 2: 36 subjects participated in all quadrants
22 Results, Experiment 3: 36 subjects participated in all quadrants
23 Results, Experiment 4: 36 subjects participated in all quadrants
24 Results, Experiment 5: 36 subjects participated in all quadrants
25 Results, Experiment 6: 36 subjects participated in all quadrants
26 36-Subject Summary
27 All-Subject Summary: Supports 36-Subject Results
28 Conclusions
- Best accuracies for same keyboard and same input mode
- Accuracy dropped significantly for different keyboards or for different input modes
- Accuracy for different input modes better than accuracy for different keyboards
- Accuracy for copy mode somewhat better than accuracy for free-text mode
- Accuracy decreased as the number of subjects increased
29 Long-Text Input Applications
- Identify the author of inappropriate email and possibly even IM
- Authenticate the student taking online exams
30 Future Work
- Try more sophisticated classifiers
- Neural Networks
- Support Vector Machines
- Explore the data with data mining
31 Questions?