A COMPARISON OF COMMERCIAL SPEECH RECOGNITION COMPONENTS FOR USE IN POLICE CRUISERS
Transcript and Presenter's Notes


1
A COMPARISON OF COMMERCIAL SPEECH RECOGNITION
COMPONENTS FOR USE IN POLICE CRUISERS
  • 3rd Annual Intelligent Vehicle Systems Symposium
  • Andrew L. Kun
  • Brett Vinciguerra
  • June 11, 2003

2
Outline of Presentation
  • Introduction - What, Why and How?
  • Background
  • Speech Recognition Evaluation Program Software
  • Testing
  • Results and Discussion
  • Conclusion

3
Project54 Overview
  • UNH / NHSP / DOJ (University of New Hampshire, New Hampshire State Police, U.S. Department of Justice)
  • Integrates
  • Controls
  • Standard Interface

4
(No Transcript)
5
(No Transcript)
6
Introduction
  • What was the goal of this research?
  • Compare SR engine and microphone combinations
  • Accuracy and efficiency
  • Quantitatively

7
Introduction
  • Why was this research important?
  • Limit distraction
  • Limit frustration
  • Standard Process

8
Introduction
  • How was this goal accomplished?
  • 16 combinations (4 engines x 4 mics) evaluated
  • Speech Recognition Evaluation Program (SREP)
  • Simulates
  • Classifies
  • Calculates

9
Introduction
  • Accuracy
  • % of correct commands versus total commands
  • Efficiency
  • Penalizes false recognitions
  • Weighted scoring

10
Outline of Presentation
  • Introduction - What, Why and How?
  • Background
  • Speech Recognition Evaluation Program Software
  • Testing
  • Results
  • Discussion
  • Conclusion

11
SR ENGINE OPTIONS
  • Speed of Speech
  • Discrete
  • Continuous
  • Type of Application
  • Command-and-control
  • Dictation
  • User-Dependency
  • Speaker dependent
  • Speaker independent
  • Field of Application
  • PC
  • Telephone
  • Noise robust
  • Grammar File

12
Comparing SR Engines
  • Field test
  • Simulated tests
  • Speaker source
  • Background noise
  • Number of speakers

13
Accuracy Ratings
  • Not consistent
  • Different conditions
  • Hyde's Law
  • "Because speech recognizers have an accuracy of
    98%, tests must be arranged to prove it."

14
Component Requirements
  • Speech Recognition Engine
  • Must be SAPI 4.0 compliant
  • Microphone
  • Must be far-field
  • Mountable on dashboard
  • Cancel noise
  • Array
  • Directional

15
Outline of Presentation
  • Introduction - What, Why and How?
  • Background
  • Speech Recognition Evaluation Program Software
  • Testing
  • Results and Discussion
  • Conclusion

16
(No Transcript)
17
(No Transcript)
18
(Flowchart: SREP nested test loops: LOOP ENGINES → LOOP BACKGROUND → LOOP COMMANDS)
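
A minimal Python sketch of the nesting the flowchart implies; all names
here (run_tests, mix, recognize) are hypothetical, not from the actual
SREP code:

def run_tests(engines, backgrounds, commands, mix, recognize):
    """mix(cmd, bg) -> test waveform; recognize(engine, wav) -> hypothesis."""
    results = []
    for engine in engines:              # LOOP ENGINES
        for bg in backgrounds:          # LOOP BACKGROUND
            for cmd in commands:        # LOOP COMMANDS
                hypothesis = recognize(engine, mix(cmd, bg))
                results.append((engine, bg, cmd, hypothesis))
    return results
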
19
Obtaining Sound Files
  • Laptop w/ SoundBlaster
  • Earthworks M30BX
  • Background recorded on patrol
  • Speech commands in lab
  • Microsoft Audio Collection Tool
  • 5 Speakers (4 male, 1 female)
  • 40 phrases

20
Processing Sound Files
  • Matlab script
  • Signal strength = variance(signal) + mean(signal)²
  • Set volume and signal-to-noise ratio
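
A Python/NumPy sketch of this processing step (the original was a Matlab
script; these function names are illustrative): signal strength is the
mean-square value, and each command is rescaled to hit the desired SNR
relative to the background recording.

import numpy as np

def signal_strength(x):
    """Mean-square strength: variance(signal) + mean(signal)^2."""
    return np.var(x) + np.mean(x) ** 2

def scale_to_snr(command, background, snr_db):
    """Rescale the command so its strength sits snr_db above the background."""
    target = signal_strength(background) * 10.0 ** (snr_db / 10.0)
    gain = (target / signal_strength(command)) ** 0.5   # amplitude gain
    return command * gain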

21

22
Control File Structure
  • Background Noises
  • WAV filename
  • Desired SNR
  • Signal strength
  • Description of file
  • Voice Commands
  • WAV filename
  • Number of loops
  • Signal strength
  • Phrase
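
The slides list the fields but not the file syntax; a hedged Python sketch
of equivalent records (dataclass names and example values are hypothetical):

from dataclasses import dataclass

@dataclass
class BackgroundNoise:
    wav_filename: str        # e.g. "patrol_idle.wav" (illustrative)
    desired_snr_db: float    # SNR at which commands are mixed in
    signal_strength: float   # precomputed mean-square strength
    description: str         # free-text description of the recording

@dataclass
class VoiceCommand:
    wav_filename: str
    num_loops: int           # how many times to replay the command
    signal_strength: float
    phrase: str              # expected transcription, e.g. "LIGHTS"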

23
Outline of Presentation
  • Introduction - What, Why and How?
  • Background
  • Speech Recognition Evaluation Program Software
  • Testing
  • Results and Discussion
  • Conclusion

24
PRODUCTS TESTED
  • Four microphones
  • A, B, C and D.
  • Four SR engines
  • 1, 2, 3, and 4.
  • 16 unique combinations
  • A1 through D4

25
(No Transcript)
26
SR ENGINES
  • SR Engine 1
  • Microsoft SR Engine 4.0
  • SR Engine 2
  • Microsoft SR Engine 4.0 (telephone mode)
  • SR Engine 3
  • Dragon NaturallySpeaking 4.0
  • SR Engine 4
  • IBM ViaVoice 8.01

27
PREPARATION
  • Freshly installed engines
  • Minimum training
  • Default settings
  • Microphone Set-up Wizard

28
TEST SCENARIO
  • Identical conditions
  • 42-phrase grammar
  • 10 speech commands
  • 5 speakers
  • 6 background noises
  • 3 SNR levels
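  • Assuming full coverage, that is 5 × 6 × 3 = 90 test conditions, or 900 command utterances per engine-microphone combination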

29
(No Transcript)
30
Outline of Presentation
  • Introduction - What, Why and How?
  • Background
  • Speech Recognition Evaluation Program Software
  • Testing
  • Results and Discussion
  • Conclusion

31
ACCURACY BY ENGINE
32
ACCURACY BY MIC
33
RANKED ACCURACY
34
Efficiency Score
  • Specific to Project54
  • False recognitions

35
Efficiency Score
  • SAID → HEARD
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LOSS = 0 (all commands correctly recognized)

36
Efficiency Score
  • SAID → HEARD
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → UNRECOGNIZED
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LOSS = 1 (one command unrecognized)

37
Efficiency Score
  • SAID → HEARD
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → SIREN ON
  • SIREN OFF → SIREN OFF
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LIGHTS → LIGHTS
  • LOSS = 1.5 (a false recognition forces an extra corrective command)

38
Efficiency Score
  • Scoring system
  • Correctly recognized: 1.5 points
  • Unrecognized: 0.5 points
  • Falsely recognized: 0 points
  • Eff. = ((correct × 1.5) + (unrec. × 0.5)) / 13.5 × 100%
  • Extreme scores
  • All correct → Eff. = 100%
  • All unrecognized → Eff. = 33%
  • All falsely recognized → Eff. = 0%
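
A small Python check of the scoring formula (the slide's 13.5 normalizer
equals the all-correct score for nine scored commands; this sketch derives
the maximum from the counts instead of hard-coding it):

def efficiency(correct, unrecognized, false_recognized):
    """Project54 efficiency score, as a percentage."""
    score = correct * 1.5 + unrecognized * 0.5   # false recognitions score 0
    max_score = (correct + unrecognized + false_recognized) * 1.5
    return 100.0 * score / max_score

print(efficiency(9, 0, 0))   # all correct            -> 100.0
print(efficiency(0, 9, 0))   # all unrecognized       -> 33.3
print(efficiency(0, 0, 9))   # all falsely recognized -> 0.0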

39
RANKED EFFICIENCY
40
WINNER
  • Accuracy
  • Configuration C2: accuracy = 70.3%
  • Efficiency
  • Configuration C2: efficiency = 72.4%
  • Logical choices
  • Microphone C
  • SR Engine 2

41
WHY LOW ACCURACIES?
  • Speakers' SR experience
  • Limited training
  • Training Environment
  • Default settings
  • Microphone and speaker placement
  • SNR
  • Absolute scores not important

42
Outline of Presentation
  • Introduction - What, Why and How?
  • Background
  • Speech Recognition Evaluation Program Software
  • Testing
  • Results and Discussion
  • Conclusion

43
CONCLUSION
  • The main goal of this research was
  • SR engine and microphone combinations
  • Accuracy and efficiency
  • Quantitatively

44
CONCLUSION
  • This research was important in order to
  • Limit distraction
  • Limit frustration

45
CONCLUSION
  • The goal was reached by
  • Evaluating 16 combinations (4 engines x 4 mics)
  • Speech Recognition Evaluation Program (SREP)
  • Simulated
  • Classified
  • Calculated

46
CONCLUSION
  • Configuration C2
  • Most accurate
  • Most efficient

SR ENGINE 2 = Microsoft SR Engine 4.0 (telephone mode)
47
CURRENT STATUS
  • 9 vehicles on road
  • 300 in production
  • Now supports non-SAPI 4.0 engines
  • Evaluating new engines

48
MORE INFORMATION
  • www.project54.unh.edu
  • andrew.kun@unh.edu
  • brettv@unh.edu