LOU Xiaoyan, LI Jian, Research and Development Center, Toshiba China

1
LOU Xiaoyan, LI Jian
Research and Development Center, Toshiba China
Suggestions on Tone and Word Boundary of Mandarin for SSML
2
Outline
  • Tone
  • Word boundary

3
Tone (cont)
  • Importance
  • As important as phonemes in tonal languages
  • The same syllable with different tones takes on
    different meanings
  • 妈 (mā), 麻 (má), 马 (mǎ), 骂 (mà)
  • Sandhi phenomenon in tonal languages
  • 你好: ni3 hao3 → ni2 hao3
  • Synthesis with correct tones helps the listener
    catch the meaning of the speech
  • Non-markup behavior
  • Tone is determined by dictionary lookup or by
    applying rules.
  • Errors may occur, especially when handling
    sandhi.
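
As a rough illustration of the rule-based approach, here is a minimal Python sketch of third-tone sandhi (a tone 3 directly before another tone 3 is read as tone 2). The function name and the (pinyin, tone) data layout are assumptions made for this example. Real sandhi also depends on word and phrase grouping, which this sketch ignores, and that is exactly where rule-based errors tend to appear.

```python
def apply_third_tone_sandhi(syllables):
    """Simplified rule: a tone 3 directly before another tone 3 becomes tone 2.

    `syllables` is a list of (pinyin, tone) pairs using tone numbers 1-5.
    The decision uses the original (lexical) tone of the next syllable.
    """
    result = []
    for i, (pinyin, tone) in enumerate(syllables):
        next_tone = syllables[i + 1][1] if i + 1 < len(syllables) else None
        if tone == 3 and next_tone == 3:
            tone = 2
        result.append((pinyin, tone))
    return result


print(apply_third_tone_sandhi([("ni", 3), ("hao", 3)]))
# [('ni', 2), ('hao', 3)]
```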

4
Suggestion on Tone (cont)
  • Our suggestions
  • Use a Pinyin sequence as the value of the phoneme
    element
  • Use the numbers 1, 2, 3, 4 and 5 to stand for the
    tones yin ping, yang ping, shang sheng, qu sheng
    and the neutral tone in Mandarin
  • Text: 大都 (dàdōu)
  • Pinyin sequence + tone: /da 4/dou 1/
  • Solution 1: a new tone element (optional), with a
    required attribute detail
  • ??
  • Solution 2: new values t and pt of the alphabet
    attribute in the phoneme element
  • ??

  • ??
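
To make the notation concrete, the following Python sketch reads a string in the /da 4/dou 1/ form shown above and maps each tone number to its name. The function name and the exact delimiter handling are assumptions for illustration. The markup examples on this slide were not preserved, so the sketch covers only the Pinyin-plus-tone string, not the proposed elements.

```python
# Tone numbers as defined on the slide.
TONE_NAMES = {
    1: "yin ping",
    2: "yang ping",
    3: "shang sheng",
    4: "qu sheng",
    5: "neutral",
}


def parse_pinyin_tone(sequence):
    """Split a string such as '/da 4/dou 1/' into (syllable, tone) pairs."""
    pairs = []
    for chunk in sequence.split("/"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty pieces before the first and after the last slash
        syllable, tone_text = chunk.split()
        pairs.append((syllable, int(tone_text)))
    return pairs


pairs = parse_pinyin_tone("/da 4/dou 1/")
print(pairs)
# [('da', 4), ('dou', 1)]
print([f"{syllable}: {TONE_NAMES[tone]}" for syllable, tone in pairs])
# ['da: qu sheng', 'dou: yin ping']
```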

5
Note on Tone Markup
  • Possible influence on SSML1.0
  • Solution 1: the tone element cannot contain other
    elements, and can be enclosed by the p, s and
    w (if defined) elements
  • Solution 2: the phoneme element is modified; its
    relation to other elements should not change
  • Tone strings given by markup must not be changed
  • in the text normalization step
  • by the result of looking up the lexicon
  • Tone markup should be ignored in case of
  • an invalid tone value
  • a tone sequence of mismatched length
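
The precedence described above can be sketched as follows in Python. This is only an illustration under assumptions: the function name is invented, tones are passed as lists of integers, and "matching length" is taken to mean one tone per character of the contained text, which the slide does not state explicitly.

```python
VALID_TONES = {1, 2, 3, 4, 5}


def choose_tones(text, markup_tones, default_tones):
    """Return the tones to synthesize for `text`.

    Valid markup tones are used unchanged, overriding `default_tones`
    (for example, tones from lexicon lookup). Invalid markup is ignored.
    """
    # Assumption: one tone per character of the contained text.
    if len(markup_tones) != len(text):
        return list(default_tones)
    if any(tone not in VALID_TONES for tone in markup_tones):
        return list(default_tones)
    return list(markup_tones)


print(choose_tones("你好", [2, 3], [3, 3]))     # [2, 3]  markup wins
print(choose_tones("你好", [2, 7], [3, 3]))     # [3, 3]  invalid tone value
print(choose_tones("你好", [2, 3, 3], [3, 3]))  # [3, 3]  length mismatch
```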

6
Outline
  • Tone
  • Word boundary

7
Word Boundary (cont)
  • The word is the basic unit for sentence parsing
    and understanding.
  • Chinese sentences are composed of sequences of
    Chinese characters, without blanks or spaces to
    mark word boundaries.
  • Difficulties
  • Complex words, such as reduplications and derived
    words, e.g. ???? (very easily),
    ??? (immateriality)
  • Proper nouns, such as location names and person
    names
  • Ambiguous word segmentations
  • A: ?? ? ? ???? (Shanghai is a metropolis)
  • B: ??? ?? ? ?? ?? (Most Shanghainese will say
    that)
  • Non-markup behavior
  • Determine the boundary using language-specific
    knowledge
  • Errors may occur

8
Suggestions on Word Boundary (cont)
  • New element w is suggested
  • ??
  • An optional attribute detail is also recommended
    to mark phrases
  • ??????
  • Here, the phrase is split into three words, which
    contain 3, 2 and 1 Chinese characters
    respectively.

9
Suggestion on Word Boundary (cont)
  • Legal values of the optional attribute detail
  • Not greater than the length of the contained text
  • ??
  • Default value is the length of the contained text
  • ??
  • When the sum of the values is smaller than the
    length of the contained text, the remaining part
    is regarded as a word
  • ??????
  • The first 3 Chinese characters ??? are regarded
    as one word and the remaining ??? are regarded as
    another word
  • When the sum of the values is greater than the
    length of the contained text, this markup should
    be ignored
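
The rules for the detail attribute can be sketched in Python as follows. Several things here are assumptions: the function name, passing the detail values as an already parsed list of integers (the attribute's textual syntax was not preserved), treating non-positive values as invalid, and returning None to signal that the markup is ignored. The six-letter string stands in for six Chinese characters.

```python
def split_words(text, detail=None):
    """Split the contained text of a w element into words.

    Returns a list of words, or None when the markup should be ignored
    and the processor should fall back to its own segmentation.
    """
    if detail is None:
        # Default value: the length of the contained text, i.e. one word.
        return [text]
    if any(n <= 0 for n in detail) or sum(detail) > len(text):
        return None
    words, start = [], 0
    for n in detail:
        words.append(text[start:start + n])
        start += n
    if start < len(text):
        # Sum smaller than the text length: the remainder is one more word.
        words.append(text[start:])
    return words


print(split_words("ABCDEF", [3, 2, 1]))  # ['ABC', 'DE', 'F']
print(split_words("ABCDEF", [3]))        # ['ABC', 'DEF']
print(split_words("ABCDEF"))             # ['ABCDEF']
print(split_words("ABCDEF", [4, 3]))     # None
```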

10
Possible Influence on SSML 1.0
  • Influence on speech synthesizing steps
  • Word segmentation should be done before text
    parsing and structure analysis
  • Relation between SSML 1.0 markup and the word
    segmentation markup w (needs more discussion)
  • The p and s elements can contain the w element
  • The w element can contain audio, emphasis,
    phoneme, prosody, say-as, sub, voice and t (if
    defined)
  • ??
  • ?????
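
A nesting check along these lines could be sketched in Python with the standard xml.etree.ElementTree module. The sample documents are hypothetical and use no XML namespace. Treating p and s as the only allowed parents of w is an assumption, since the slide only says they may contain it and marks the relation as needing more discussion.

```python
import xml.etree.ElementTree as ET

# Assumption: w may appear only directly inside p or s.
W_PARENTS = {"p", "s"}
# Child elements of w as listed on the slide.
W_CHILDREN = {"audio", "emphasis", "phoneme", "prosody",
              "say-as", "sub", "voice", "t"}


def check_w_nesting(xml_text):
    """Return a list of nesting problems involving the proposed w element."""
    root = ET.fromstring(xml_text)
    problems = []
    for parent in root.iter():
        for child in parent:
            if child.tag == "w" and parent.tag not in W_PARENTS:
                problems.append(f"<w> inside <{parent.tag}>")
            if parent.tag == "w" and child.tag not in W_CHILDREN:
                problems.append(f"<{child.tag}> inside <w>")
    return problems


print(check_w_nesting("<speak><s><w>abc</w></s></speak>"))  # []
print(check_w_nesting("<speak><w>abc</w></speak>"))         # ['<w> inside <speak>']
```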

11
Thank you!