Title: Generating Speech Recognition Grammars with Compositional Semantics from Unification Grammars
1. Generating Speech Recognition Grammars with Compositional Semantics from Unification Grammars
- Johan Bos
- Language Technology Group
- The University of Edinburgh
2. Some History: Automatic Speech Recognition (1990-1999)
Spoken input: "Go to the kitchen!"
ASR output hypotheses: "Go too the kitchen", "Go to a kitchen", "Go to the kitchen", "Go it at", "Go and take it"
- ASR output is a lattice or a set of strings
- Many non-grammatical productions
- Use parser to select string and produce logical
form
3. Some History: Automatic Speech Recognition (2000-2001)
ASR output hypotheses: "Go to the kitchen", "Go to a kitchen", "Go and take it"
- Put linguistic knowledge in language models
- ASR output contains grammatical productions
- Use parser to produce logical form
4. Automatic Speech Recognition (2002)
- Put compositional semantics in language models
- ASR output comprises logical forms (e.g., a DRS)
- No need for subsequent parsing
5. Aims
- Introduce a generic compositional semantics in state-of-the-art speech recognition software
- Investigate how a linguistically motivated grammar can be used as the basis for a language model
- Implement and test the results with NUANCE speech recognition software
6. Structure of Talk
- Theory
- Compile unification grammar into GSL
- Left-recursion elimination
- Providing means for compositional semantics
- Practice
- Implementation
- Empirical Results
- Evaluation
- Conclusions
7. Generating Speech Grammars
- Many ASR systems allow language models to be built from restricted context-free grammars (GSL, JSpeech)
- Normally no support for feature unification, although some offer slot-filling
- Limited expressiveness
- Left-recursion is not allowed
8. GSL (NUANCE)
- The NUANCE ASR can be configured by a recognition package consisting of a recognition grammar (GSL), the pronunciations of the words, and an acoustic model
- GSL (and similar approaches) is attractive because it allows tuning to a particular application in a convenient way
- Tedious to build for serious applications
9. Example of a GSL Grammar
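The grammar listing on this slide is an image that does not survive in this transcript. As a stand-in, a minimal hand-written GSL fragment in the same spirit, assuming Nuance GSL notation: ( ) for sequences, [ ] for alternatives, :var for binding a subgrammar's return value, {<slot $var>} for slot filling, and a leading dot for a top-level grammar (the robot-domain vocabulary is invented):

    .Command [
      (go to the Place:p)  {<goal $p>}
      (turn Direction:d)   {<dir $d>}
    ]
    Place [
      (kitchen)  {return(kitchen)}
      (hallway)  {return(hallway)}
    ]
    Direction [
      (left)   {return(left)}
      (right)  {return(right)}
    ]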
10. Unification Grammars
- Linguistically Motivated
- Typically hand-crafted and wide-coverage
- Express syntactic and semantic properties of linguistic constituents
- Use feature unification to constrain derivations and to build logical forms
11. Example of a Unification Grammar we work with
- Mostly atomic feature values
- Untyped Grammar
- Range of values extensionally determined
- Complex features for traces
- Feature sem to hold semantic representation
- Semantic representations are expressed as Prolog
terms
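The grammar fragment itself is shown as an image on the slide. A minimal DCG-style Prolog sketch of the kind of rule the bullets describe (all predicate, feature, and semantic names are hypothetical):

    % Atomic agreement feature, a complex gap feature for traces,
    % and the sem value built up as a Prolog term by unification.
    np(Agr, nogap, app(DetSem, NSem)) -->
        det(Agr, DetSem),
        n(Agr, NSem).

    % Lexical entries; feature values are drawn from a finite,
    % extensionally determined range (here: sg).
    det(sg, sem_the)   --> [the].
    n(sg, sem_kitchen) --> [kitchen].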
12. Compiling UGs to GSL
- Create a context-free backbone of the UG
- Use syntactic features in the translation to non-terminal symbols in GSL
- Previous work:
- Rayner et al. 2000, 2001
- Dowding et al. 2001 (typed unification grammar)
- Kiefer & Krieger 2000 (HPSG)
- Moore 2000
- Previous work does not concern semantics
- UNIANCE compiler (SICStus Prolog)
13. Compilation Steps (UNIANCE)
- Input UG rules and lexicon
- Feature Instantiation
- Redundancy Elimination
- Packing and Compression
- Left Recursion Elimination
- Incorporating Compositional Semantics
- Output rules in GSL format (pipeline sketch below)
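A minimal Prolog sketch of how such a pipeline could be chained; the predicate names are hypothetical, not UNIANCE's actual code:

    % Hypothetical top-level driver threading the grammar
    % through each compilation step in the order listed above.
    uniance_compile(UGRules, Lexicon, GSL) :-
        instantiate_features(UGRules, Lexicon, Backbone),
        eliminate_redundancy(Backbone, Reduced),
        pack_and_compress(Reduced, Packed),
        eliminate_left_recursion(Packed, NonLR),
        add_compositional_semantics(NonLR, Semantic),
        format_as_gsl(Semantic, GSL).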
14. Feature Instantiation
- Create a context-free backbone of the unification grammar
- Collect the range of feature values by traversing grammar and lexical rules (for features with a finite number of possible values)
- Disregard the feature SEM
- Result is a set of rules of the form C0 → C1 … Cn (instantiated example below), where each Ci has the structure cat(A,F,X) with
- A a category symbol,
- F a set of instantiated feature-value pairs,
- X the semantic representation
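For instance (a hypothetical rule, not from the slide): if the feature agr is found to range over {sg, pl}, a noun phrase rule is instantiated into two backbone rules:

    cat(np,[agr=sg],X) → cat(det,[agr=sg],D) cat(n,[agr=sg],N)
    cat(np,[agr=pl],X) → cat(det,[agr=pl],D) cat(n,[agr=pl],N)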
15. Eliminating Redundant Rules
- Rules might be redundant with respect to the application domain (or the grammar might be ill-formed)
- Two reasons for a production to be redundant:
- a non-terminal member of an RHS does not appear in any production as LHS
- an LHS category (other than the start symbol) does not appear as an RHS member
- Remove such rules until a fixed point is reached (see the sketch below)
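A minimal Prolog sketch of the fixed-point loop, assuming rules are represented as rule(LHS,RHS) terms over category atoms, with start as the start symbol and terminals already factored out (this representation is illustrative, not UNIANCE's actual one):

    :- use_module(library(lists)).

    % A rule is redundant if some RHS category is never defined,
    % or its LHS (other than the start symbol) is never used.
    redundant(Rules, rule(L, Rhs)) :-
        member(rule(L, Rhs), Rules),
        (   member(C, Rhs),
            \+ member(rule(C, _), Rules)
        ;   L \== start,
            \+ ( member(rule(_, R), Rules), member(L, R) )
        ).

    % Remove redundant rules until a fixed point is reached.
    reduce(Rules, Rules) :-
        \+ redundant(Rules, _).
    reduce(Rules, Out) :-
        redundant(Rules, Rule),
        !,
        select(Rule, Rules, Rest),
        reduce(Rest, Out).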
16. Packing and Compression
- Pack together rules that share LHSs
- Compress productions by replacing a set of rules with the same RHS by a single production
- Replace the pair Ci → C and Cj → C (i ≠ j) by Ck → C (Ck a new category)
- Substitute Ck for all occurrences of Ci and Cj in the grammar
17. Eliminating Left Recursion
- Left-recursive rules are common in linguistically motivated grammars
- GSL does not allow left recursion
- Standard way of eliminating left recursion: Aho et al. 1986, Greibach Normal Form
- Here we only consider immediate left-recursion
- Replace the pair A → A B, A → C by A → C A′, A′ → B A′ and A′ → ε
- Put differently: by A → C A′, A′ → B A′, A → C and A′ → B (worked example below)
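A standard worked instance (hypothetical categories), with np in the role of A, pp as B, and det n as C:

    Before:        np → np pp        np → det n
    After:         np → det n np′    np′ → pp np′    np′ → ε
    Without ε (as GSL requires):
                   np → det n np′    np → det n
                   np′ → pp np′      np′ → pp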
18. LR Elimination Rule I
- For each left-recursive category with non-recursive grammar productions of the form
- cat(A,F,X) → C1 … Cn
- extend the grammar with productions
- cat(A,F,Z(X)) → C1 … Cn cat′(A,F,Z)
All dependencies in Ci on X are preserved
19. LR Elimination Rule II
- For each left-recursive category, replace all productions of the form
- cat(A,F,X) → cat(A,F,Y) C1 … Cn
- by the following two productions
- cat′(A,F,λY.Z(X)) → C1 … Cn cat′(A,F,Z)
- cat′(A,F,λY.X) → C1 … Cn
All dependencies of Y and Ci on X are preserved (worked example below)
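A worked instance of both rules on a hypothetical left-recursive NP grammar, where the semantics of the recursive rule modifies the embedded NP:

    Original:
      cat(np,F,mod(Y,P)) → cat(np,F,Y) cat(pp,G,P)
      cat(np,F,the(N))   → [the] cat(n,F,N)

    Rule I adds (keeping the original non-recursive rule):
      cat(np,F,Z(the(N))) → [the] cat(n,F,N) cat′(np,F,Z)

    Rule II replaces the left-recursive rule by:
      cat′(np,F,λY.Z(mod(Y,P))) → cat(pp,G,P) cat′(np,F,Z)
      cat′(np,F,λY.mod(Y,P))    → cat(pp,G,P)

Parsing "the kitchen on the left" (with onleft as the assumed PP semantics) then yields (λY.mod(Y,onleft))(the(kitchen)), which β-converts to mod(the(kitchen),onleft).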
20. Example
[Figure: a left-recursive derivation and the corresponding right-recursive derivation after the transformation]
21. Incorporating Compositional Semantics
- At this stage we have a set of rules of the form LHS → C, where C is a set of ordered pairs of RHS categories and corresponding semantic values
- Convert LHS and RHS to GSL categories (straightforward)
- Bookkeeping required to associate semantic variables with GSL slots
- Semantic operations are composed using the built-in strcat/2 function (sketched below)
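A minimal sketch of such a generated rule, assuming Nuance GSL's return() and the binary strcat() the slide mentions (the category names and term syntax are illustrative):

    NP [
      (Det:d N:n)
        {return(strcat("app(" strcat($d strcat("," strcat($n ")")))))}
    ]

If Det returns the string "the" and N returns "kitchen", NP returns "app(the,kitchen)", a Prolog-readable semantic term.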
22. Example (Input UG)
23. Example (GSL Output)
24. Example (NUANCE Output)
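These three slides are figures that do not survive in this transcript. A hypothetical end-to-end illustration of the correspondence (all names and data invented):

    Input UG rule (Prolog):  np(Agr, app(D,N)) --> det(Agr,D), n(Agr,N).
    GSL output:              one copy of the NP rule sketched under slide 21
                             per instantiated value of Agr
    NUANCE output (spoken "the kitchen"):
                             recognized string  "the kitchen"
                             semantics string   "app(the,kitchen)"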
25. Practical Results
- Does adding general semantic representations to GSL have any effect on recognition speed?
- Do GSL grammars generated using this method produce useful language models for speech recognition?
26. Evaluation Experiment
- Corpus of several hundred spoken English utterances from 24 different native speakers (instructions to a mobile robot)
- Development data (380 utterances)
- Evaluation data (388 utterances)
- Unification grammar with 80 rules, four of which suffer from left-recursion (modification, coordination)
27. Example Instruction
- Er head to the end of the street
- Turn left
- Take the first left
- Er go right down the road past the first right and it's the next building on your right
28. Adding Probabilities to GSL
- Include probabilities to increase recognition accuracy
- Done by bootstrapping the GSL grammar
- Use the first version of the GSL grammar to parse a domain-specific corpus
- Create a table with syntactic constructions and their frequencies
- Choose the closest attachment in case of structural ambiguities
- Add the obtained probabilities to the original GSL grammar (sketched below)
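A sketch of that final step, assuming GSL's ~weight annotation on alternatives (the categories and numbers are invented):

    Place [
      (kitchen)~0.5
      (hallway)~0.3
      (bathroom)~0.2
    ]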
29. Evaluation Results (1)
- Generate two GSL grammars
- One without compositional semantics
- One with compositional semantics
- Results
30. Evaluation Results (2)
- Obtained GSL grammar compiled with nuance-compile using the option -dont_flatten
- Recall: percentage of recognized utterances
- Precision: 100 minus the Word Error Rate
31. Conclusions (Possibly Bad News)
- No means for robust processing
- Integration with Statistical Models non-trivial
- Only cases of immediate left-recursion are covered
- Moore (2000) uses Greibach Normal Form with regular expressions in GSL
- Unclear how to integrate compositional semantics into that approach
32. Conclusions (Good News)
- The grammar-based approach to speech recognition: post-processing of speech output is restricted to β-conversion
- No computational overhead
- Empirical evidence that such language models are useful in applications
- Only a small corpus required
33. Acknowledgements
- Work developed as part of the following projects:
- EU project DHomme (Dialogue in the Home Environment)
- EPSRC-funded IBL (Instruction-Based Learning for Mobile Robots)
- Team at Edinburgh: Johan Bos, Tetsushi Oka, Ewan Klein