Title: SSML Extensions for TTS in Indian Languages, II Workshop on Internationalizing SSML, 30-31 May 2006, Greece
- Nixon Patel and Kishore Prahallad
- Bhrigus Inc. Hyderabad, India
- IIIT Hyderabad, India
2. Topics
- About Bhrigus
- Collaborative efforts between Bhrigus and IIIT Hyderabad
- Nature of Indian language scripts: convergence and divergence
- Issues across TTS rendering in all these languages
- Proposed solutions/tags
  - Syllable Element
  - Alien Element
  - Dialect Element
3. Bhrigus voice data solutions
http://www.bhrigus.com
4. About Bhrigus
- Established 2002
- Business: providing IVR, speech and enterprise solutions to BFSI, telcos, contact centers and manufacturing companies
- Key customers: Hewitt Associates, AT&T, Pfizer, Merrill Lynch, Union Pacific Railroad, CDIA, Southwestern Energy, Orange County, Stryker
- SEI CMM Level 4 process implementation under way; ISO 9001:2000 certified by KPMG
5. Speech and Language Technology Lab @ Bhrigus
- Playing a leadership role in the development of ASR and TTS for all official Indian languages, to provide voice solutions for the Indian market
- Collaborations: IIIT Hyderabad, Carnegie Mellon University
- 10-member team and board of advisors
- 3 PhDs and 4 Masters
- Synthesis team, Recognition team, Linguist team and Language resources team
- Initiating SSML and VXML chapters in India
6. Collaborative Efforts
- Bhrigus Inc., Hyderabad: voice-based solution provider
- IIIT Hyderabad: one of the leading universities in India doing speech research
- Telugu TTS: collaborative effort between Bhrigus Inc. and IIIT
- Goal: develop ASR and TTS for all official Indian languages
7. Nature of Indian Language (IL) Scripts
- Basic units of the writing system are Aksharas
- An Akshara is an orthographic representation of a speech sound
- An Akshara is syllabic in nature; typical forms are V, CV, CCV and CCCV (C = consonant, V = vowel)
- An Akshara always ends with a vowel (or nasalized vowel) in written form
- 1652 dialects/native languages
- 22 languages officially recognized
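The Akshara forms listed on this slide (V, CV, CCV, CCCV) can be checked mechanically once an Akshara is available as a sequence of romanized phone symbols. The following is a minimal sketch under stated assumptions: the `VOWELS` set is a small hypothetical sample, not a complete inventory for any Indian language, and the function names are invented for illustration.

```python
# Hypothetical, partial vowel inventory in a romanized notation (assumption).
VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ee", "o", "oo", "ai", "au"}

TYPICAL_FORMS = {"V", "CV", "CCV", "CCCV"}


def akshara_shape(phones):
    """Return the C/V pattern of one Akshara given as a list of phone symbols."""
    return "".join("V" if p in VOWELS else "C" for p in phones)


def is_typical_akshara(phones):
    """True when the Akshara has one of the typical forms V, CV, CCV or CCCV."""
    return akshara_shape(phones) in TYPICAL_FORMS
```

For example, `akshara_shape(["n", "aa"])` gives `"CV"`, while a lone consonant such as `["n"]` is rejected by `is_typical_akshara` because it does not end in a vowel.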
8. Convergence of IL Scripts
- Aksharas are syllabic in nature
- Common phonetic base
- A common set of speech sounds is shared across all languages
- Fairly good (though not exact) correspondence between a sequence of Aksharas and the corresponding sequence of sounds
- Often referred to as letter-to-sound rules
- Written from left to right, as in European languages
- Words are separated by spaces, as in European languages
9. Divergence of IL Scripts
- Each IL has its own script
- All ILs share a common phonetic base; however, the phonotactics of each IL differ from those of the others
- ILs are non-tonal languages, unlike eastern languages such as Chinese
10. How to Represent Indian Language Scripts
- Unicode
  - Useful for rendering the Indian language scripts
  - Not suitable for keying in through a QWERTY keyboard
  - Not suitable for building modules such as text normalization (the Unicode characters cannot be seen in many editors)
- Itrans-3 / OM
  - A transliteration scheme by IISc Bangalore, India and Carnegie Mellon University
  - Useful for keying in and storing the scripts of Indian languages using QWERTY keyboards
  - Useful for processing and for writing modules/rules for letter-to-sound conversion, text normalization, etc.
11. Itrans-3 / OM Notation
12. Why Itrans-3/OM?
- Developed with user readability in mind: easier to read and type
- It is case-insensitive
- The scheme is phonetic in nature: the characters correspond to the actual sounds being spoken
- Thus a single transliteration scheme is used for all the Indian languages, as they share the same set of sounds
- Each character (corresponding to a phone/sound) is no more than three letters long
- Adopted across universities in India and abroad, and by some industrial labs such as Bhrigus Inc.
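Two of the properties above (case-insensitivity and symbols of at most three letters) make it straightforward to split a transliterated word into phone symbols. The sketch below uses greedy longest-match, which is a choice made here for illustration and is not described on the slides. The `SYMBOLS` set is a hypothetical, partial inventory, not the actual Itrans-3/OM table.

```python
# Hypothetical, partial symbol inventory (assumption); the real table is larger.
SYMBOLS = {
    "a", "aa", "i", "ii", "u", "uu", "e", "ee", "o", "oo",
    "k", "kh", "g", "gh", "t", "th", "d", "dh", "n", "p", "b", "m",
    "y", "r", "l", "v", "s", "sh", "h",
}
MAX_LEN = 3  # from the slide: no symbol is longer than three letters


def tokenize(word):
    """Split a transliterated word into phone symbols by greedy longest match."""
    word = word.lower()  # the scheme is case-insensitive
    phones = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(MAX_LEN, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in SYMBOLS:
                phones.append(chunk)
                i += size
                break
        else:
            raise ValueError(f"no symbol matches at position {i} in {word!r}")
    return phones
```

With this inventory, `tokenize("naatoo")` and `tokenize("NAATOO")` both give `["n", "aa", "t", "oo"]`. Greedy matching cannot tell an aspirated `th` from `t` followed by `h`, so a real implementation would need the scheme's own disambiguation rules.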
13. Issues in TTS Rendering in IL
- TTS should be able to pronounce words Akshara (syllable) by Akshara (syllable)
- The languages are heavily influenced by English (alien) words
- Alien words occur in the middle of sentences
- Each language has its own dialects
14. SSML Tag: Phoneme Element <phoneme>
- <phoneme alphabet="itrans-3" ph="n aa t oo"> naatoo </phoneme>
- The ph attribute specifies the phoneme/phone string
- Rendering n, aa, t and oo individually does not make sense to native speakers of Indian languages
- Sounds need to be rendered in terms of syllables
15. Syllable Element <syllable>
- <syllable alphabet="itrans-3" syl="naa too"> naatoo </syllable>
- Renders naa and too, which are Aksharas (syllables)
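The step from the phone string `n aa t oo` to the syllable string `naa too` can be sketched by closing an Akshara at every vowel. This is an illustrative sketch only: `VOWELS` is a hypothetical partial inventory, the function names are invented here, and `<syllable>` is the element proposed in this talk, not part of standard SSML.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical, partial vowel inventory (assumption).
VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ee", "o", "oo", "ai", "au"}


def to_aksharas(phones):
    """Group phones into Aksharas: each unit ends at the next vowel."""
    aksharas, current = [], []
    for p in phones:
        current.append(p)
        if p in VOWELS:
            aksharas.append("".join(current))
            current = []
    if current:
        # Trailing consonants without a vowel are kept as a unit of their own.
        aksharas.append("".join(current))
    return aksharas


def syllable_element(phones, text, alphabet="itrans-3"):
    """Build the proposed <syllable> markup for one word."""
    syl = " ".join(to_aksharas(phones))
    return (
        f"<syllable alphabet={quoteattr(alphabet)} "
        f"syl={quoteattr(syl)}>{escape(text)}</syllable>"
    )
```

Calling `syllable_element(["n", "aa", "t", "oo"], "naatoo")` returns `<syllable alphabet="itrans-3" syl="naa too">naatoo</syllable>`.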
16. Motivation for Loan Word <alien>
- Informal experiments suggested that 33% of the errors of TTS for ILs occur while rendering alien (non-native) words
- Such alien words could be detected automatically, owing to the syllabic properties of the Indian languages
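The slides do not say how the automatic detection works. One crude heuristic consistent with the syllabic properties described earlier is to flag a word whose phones cannot be grouped into Aksharas of the forms V, CV, CCV or CCCV, or that uses a phone outside the native set. The sketch below assumes that heuristic; the inventories are hypothetical and partial, and `looks_alien` is a name invented here.

```python
# Hypothetical, partial native inventories (assumption).
NATIVE_VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ee", "o", "oo"}
NATIVE_CONSONANTS = {"k", "g", "t", "d", "n", "p", "b", "m",
                     "y", "r", "l", "v", "s", "h"}


def looks_alien(phones):
    """Heuristic: True when the phones cannot form well-shaped Aksharas."""
    consonant_run = 0
    for p in phones:
        if p in NATIVE_VOWELS:
            consonant_run = 0  # a vowel closes the current Akshara
        elif p in NATIVE_CONSONANTS:
            consonant_run += 1
            if consonant_run > 3:  # more consonants than CCCV allows
                return True
        else:
            return True  # phone outside the native inventory
    return consonant_run > 0  # word does not end in a vowel
```

Under these assumptions `["n", "aa", "t", "oo"]` is accepted as native, while `["b", "aa", "n", "k"]` is flagged because it ends in a consonant cluster.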
17. Example of Loan Word
- BANK has to be pronounced as /B/ /AE/ /N/ /K/
- The /AE/ phoneme does not exist in the Indian language phone set
- <alien> baank </alien>
- Alien (non-native) words could be rendered using different pronunciation dictionaries or letter-to-sound rules
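The idea of rendering alien words from a different pronunciation dictionary can be sketched as a lookup that switches lexicon whenever a word sits inside the proposed `<alien>` element. Everything named below is an assumption for illustration: the two lexicons are hypothetical, the alien entry simply mirrors the /B/ /AE/ /N/ /K/ example, and the sample input only strings together words from the slides rather than forming a meaningful sentence.

```python
import xml.etree.ElementTree as ET

# Hypothetical lexicons (assumption).
NATIVE_LEXICON = {"naatoo": ["n", "aa", "t", "oo"]}
ALIEN_LEXICON = {"baank": ["b", "ae", "n", "k"]}


def lookup(word, alien):
    """Pick the lexicon by origin; None means letter-to-sound rules are needed."""
    lexicon = ALIEN_LEXICON if alien else NATIVE_LEXICON
    return lexicon.get(word)


def pronunciations(ssml):
    """Return (word, phones) pairs; handles <alien> directly under the root only."""
    root = ET.fromstring(ssml)
    result = [(w, lookup(w, alien=False)) for w in (root.text or "").split()]
    for child in root:
        is_alien = child.tag == "alien"
        for w in (child.text or "").split():
            result.append((w, lookup(w, alien=is_alien)))
        # Text after the closing tag belongs to the native language again.
        for w in (child.tail or "").split():
            result.append((w, lookup(w, alien=False)))
    return result
```

For the input `<speak>naatoo <alien> baank </alien></speak>` this yields the native entry for `naatoo` and the alien entry, including the `ae` phone, for `baank`.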
18. Dialect Element <dialect>
- Each language has its own dialects
- TTS should be able to handle dialects without unloading the language resources
19. Dialect Element <dialect>
- <?xml version="1.0"?><speak version="1.0" xml:lang="tel-in">
- <voice gender="female">
- <dialect name="andhra"> yekkadiki vellaali </dialect>
- <dialect name="telengana" pro="yaadiki poovaale"> yekkadiki vellaali </dialect>
- </voice></speak>
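One way to read the example on this slide is that the `pro` attribute carries the dialect-specific pronunciation of the enclosed text, and that the written text is spoken as-is when `pro` is absent. The sketch below assumes that reading, which the slides do not spell out, and uses Python's standard `xml.etree.ElementTree` to extract what would be spoken for each dialect. The function name is invented for illustration.

```python
import xml.etree.ElementTree as ET

SSML = """<?xml version="1.0"?>
<speak version="1.0" xml:lang="tel-in">
  <voice gender="female">
    <dialect name="andhra"> yekkadiki vellaali </dialect>
    <dialect name="telengana" pro="yaadiki poovaale"> yekkadiki vellaali </dialect>
  </voice>
</speak>"""


def dialect_renderings(ssml):
    """Return (dialect name, text to speak) for every <dialect> element."""
    root = ET.fromstring(ssml)
    renderings = []
    for element in root.iter("dialect"):
        written = " ".join((element.text or "").split())  # trim padding
        spoken = element.get("pro", written)  # assumed fallback to written text
        renderings.append((element.get("name"), spoken))
    return renderings
```

Applied to `SSML`, the function returns `("andhra", "yekkadiki vellaali")` followed by `("telengana", "yaadiki poovaale")`, so both dialects are served from the same document and language setting.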
20. Conclusions
- Bhrigus Inc., Hyderabad is taking a lead position in developing ASR and TTS for Indian languages
- Proposed <syllable>, <alien> and <dialect> elements as SSML extensions
21. References
- Prahallad Lavanya, Prahallad Kishore and GanapathiRaju Madhavi, "A Simple Approach for Building Transliteration Editors for Indian Languages", Journal of Zhejiang University Science, vol. 6A, no. 11, pp. 1354-1361, Oct. 2005.