1
RUNDKAST: An Annotated Norwegian Broadcast News Speech Corpus

2
Overview

3
Purpose of Rundkast

Databases of broadcast news can be used for a number of research topics in speech technology, such as:
Supplementing existing databases of read speech for training and testing automatic speech recognition and speaker adaptation.
Research on recognition of spontaneous speech.
Research on automatic indexing of audio data.
Research on topic and/or speaker segmentation.
Research on speech/non-speech detection (e.g. background music).
International research cooperation involving speech technology for broadcast news applications.
A corpus of this kind is necessary for language technology research, but none has been available for Norwegian.

4
Overview of Rundkast: http://www.iet.ntnu.no/projects/rundkast/

Database of 77 hours of radio broadcast news from the Norwegian Broadcasting Corporation (NRK)
Read and spontaneous speech, as well as spontaneous dialogs and multi-party discussions
Large variation between speakers, speaking styles and topics
Speaker turns may be rapid, and several speakers may talk simultaneously
Recording quality ranges from studio to telephone (mobile, satellite, etc.)
Frequent occurrences of background noise, jingles, music and audio illustrations
Funded by the Norwegian University of Science and Technology (NTNU)

5
Structure of annotation

6
Hierarchy of annotation levels

Levels: 1) section, 2) speaker turn, and 3) segment
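
The three-level hierarchy can be pictured as nested records. The following is a minimal sketch only, not the corpus's actual file format: the class names, the field names (`kind`, `speaker`, `start`, `end`, `text`) and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Segment:
    """Level 3: the unit that carries the orthographic transcription."""
    start: float  # start time in seconds (assumed field)
    end: float    # end time in seconds (assumed field)
    text: str     # orthographic transcription of the segment


@dataclass
class SpeakerTurn:
    """Level 2: a stretch of speech by one speaker."""
    speaker: str  # hypothetical speaker identifier
    segments: List[Segment] = field(default_factory=list)


@dataclass
class Section:
    """Level 1: a section of the broadcast."""
    kind: str  # hypothetical section label
    turns: List[SpeakerTurn] = field(default_factory=list)


def iter_segments(sections: List[Section]) -> Iterator[Tuple[Section, SpeakerTurn, Segment]]:
    """Walk the hierarchy top-down, yielding each segment with its parents."""
    for section in sections:
        for turn in section.turns:
            for segment in turn.segments:
                yield section, turn, segment


# Invented sample data, for illustration only.
sample = [
    Section(kind="report", turns=[
        SpeakerTurn(speaker="spk1", segments=[
            Segment(0.0, 1.5, "god morgen"),
            Segment(1.5, 3.0, "her er nyhetene"),
        ]),
    ]),
]
```

Iterating with `iter_segments(sample)` gives every segment together with the speaker turn and section it belongs to, which is the access pattern most segment-level processing needs.
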
7
Orthographic transcription

The lowest level in the annotation hierarchy, the segment, is transcribed orthographically.
Orthographic transcription of spoken language is a challenge, especially for Norwegian: the use of dialect is increasingly accepted, even in official settings.
Most of the speech in RUNDKAST does not conform to any standard pronunciation.
The conventions for orthographic transcription in RUNDKAST aim to minimize uncertainty about pronunciations and to facilitate consistency.

8
Orthographic transcription: Main conventions

Words are transcribed with the written forms closest to the actual pronunciations. A limited number of interjections are allowed.
Text codes are used to mark mispronunciations, truncations, and unknown words.
Numbers and symbols are written out as words.
Abbreviations are not used.
Punctuation marks are restricted to comma, period, and question mark.
Spaces are used between spelled-out letters, including in acronyms that are pronounced letter by letter.
Capital letters are used in proper names, spellings, and acronyms, but not at the start of sentences.
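
Two of these conventions are mechanical enough to check automatically: numbers written out as words, and punctuation restricted to comma, period, and question mark. The sketch below is a hypothetical checker, not a tool from the RUNDKAST project. It assumes that text codes (whose syntax is not given here) have been removed beforehand, and it treats every character that is neither a letter, a digit, nor whitespace as punctuation, so a hyphenated word would also be flagged.

```python
import re

# Punctuation marks permitted by the conventions above.
ALLOWED_PUNCTUATION = {",", ".", "?"}


def check_segment(text: str) -> list:
    """Return a list of convention problems found in one transcribed segment."""
    problems = []
    # Numbers must be written out as words, so any digit is a violation.
    if re.search(r"\d", text):
        problems.append("digit found: numbers should be written out as words")
    # Assumption: anything that is not alphanumeric or whitespace counts as punctuation.
    for ch in text:
        if not ch.isalnum() and not ch.isspace() and ch not in ALLOWED_PUNCTUATION:
            problems.append(f"disallowed character: {ch!r}")
    return problems
```

A segment such as `"klokka er fem, sa hun."` passes with no problems, while `"klokka er 5"` is reported for containing a digit. The capitalization rule is not checked, since proper names and acronyms legitimately carry capitals.
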

9
Example annotation in Transcriber
10
Broad phonetic annotation

Part of the data was to be phonetically annotated
Intended for low-level experiments in ASR (new methods), as a smaller Norwegian counterpart to TIMIT
Automatic segmentation for, e.g., unit-selection TTS
Annotation to be based on existing standards, with necessary adjustments
Exploit experience and specifications from the development of Norwegian speech synthesis databases
Suitable level of detail: acoustic boundaries should be labeled, but the annotation should be more phonemic than phonetic
Consistency is of utmost importance!

11
Broad phonetic annotation: Selected data

12
Broad phonetic annotation: Main principles

The annotation is mainly phonemic, using the phoneme symbols closest to the perceived sound
Acoustic boundaries should be marked; some acoustically motivated symbols are included
A transcription as close as possible to the citation form is preferred
Norwegian standard SAMPA is preferred
Some English phonemes are included, as well as dialect variants
Example: 3 variants of the /r/ sound: /r/ (tap/trill), /R/ (uvular fricative), /r\/ (approximant)
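
The /r/ example can be written down as a small lookup table. This is an illustrative sketch only: the dictionary covers just the three symbols named above, and the assumption that a transcription arrives as a list of SAMPA labels is made for the example, not taken from the corpus specification.

```python
# The three /r/ variants named above, keyed by SAMPA symbol.
# Note that the approximant is written r\ in SAMPA, hence the escaped backslash.
R_VARIANTS = {
    "r": "tap/trill",
    "R": "uvular fricative",
    "r\\": "approximant",
}


def describe_r_variants(labels):
    """Return (symbol, description) for every /r/ variant in a label sequence."""
    return [(label, R_VARIANTS[label]) for label in labels if label in R_VARIANTS]
```

Called on a label sequence, the function picks out only the /r/ variants and leaves all other symbols aside, which is enough to count how often each realization occurs in a stretch of annotated speech.
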

13
Broad phonetic annotation: Annotation procedure

Conversion of orthographic transcription to a format suitable for automatic transcription.
Automatic segmentation with a phonotypical transcription using a speech recognizer.
Manual correction of both segments and labels by four phonetics students using Praat.
Format check.
Control of all annotation by one supervisor.

14
Broad phonetic annotation: Comments on deviations

15
Example annotation in Praat
16
Concluding remarks

Availability:
Planned to be included for non-commercial use in a future Norwegian language bank
Will complement other corpora also intended to be included
To be validated by Spex
Planned use in the NTNU SIRKUS project:
Investigation of new paradigms for ASR
Low-level phone recognition experiments initially; multilinguality aspects
Spoken information retrieval
