multiple level articulatory segmental HMMs - PowerPoint PPT Presentation

Slides: 32
Provided by: martinr3
1
ABI speech corpus
Shona D'Arcy, Martin Russell
Electronic, Electrical and Computer Engineering,
University of Birmingham
2
Motivation
  • Wide variation in pronunciation of English in the
    British Isles
  • Perceived as problem for speech technologies
  • No systematic study published to date
  • Need for corpus of speech representing different
    accents

3
Variety of Accents
  • British Isles
  • Northern England
  • Southern England
  • Wales
  • Scotland
  • Northern Ireland
  • Southern Ireland

4
Corpus design
  • What is an accent?
  • What accents are required for corpus?
  • List of 13 accents to be recorded
  • Standard Southern English

5-17
Map labels, added one region per slide
  • London
  • West Country
  • Ireland
  • East Anglia
  • Birmingham
  • Liverpool
  • N Ireland
  • Lancashire
  • Yorkshire
  • Newcastle
  • Glasgow
  • Scottish Highlands

18
Corpus design
  • Who are good subjects?
  • People who had lived in the area most of their
    lives
  • People whose parents had lived there most of
    their lives
  • Between the ages of 18 and 50
  • 10 male and 10 female
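
The subject criteria above could be applied to a list of volunteers roughly as follows. This is only a minimal sketch: the `Volunteer` record, its field names, and the reading of "most of their lives" as "more than half" are assumptions made for illustration, not part of the original slides.

```python
from dataclasses import dataclass


@dataclass
class Volunteer:
    # All field names are hypothetical; the slides define no data format.
    name: str
    age: int
    sex: str              # "M" or "F"
    years_in_area: int
    parents_local: bool   # parents lived in the area most of their lives


def is_eligible(v, min_age=18, max_age=50):
    """Check the age range, long-term residence and local parents."""
    # Assumption: "most of their lives" means more than half of their age.
    lived_most_of_life = v.years_in_area > v.age / 2
    return min_age <= v.age <= max_age and lived_most_of_life and v.parents_local


def select_panel(volunteers, per_sex=10):
    """Take up to `per_sex` eligible speakers of each sex, in the order given."""
    panel = {"M": [], "F": []}
    for v in volunteers:
        if v.sex in panel and is_eligible(v) and len(panel[v.sex]) < per_sex:
            panel[v.sex].append(v)
    return panel
```

Calling `select_panel(volunteers)` returns a dictionary with at most 10 male and 10 female speakers for one location.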

19
Text to be recorded
  • Accent-specific texts
  • "When a sailor in a small craft faces the might
    of the vast Atlantic ocean today"
  • Scribe sentences
  • "Where were you while we were away"
  • Vowels in control contexts
  • Had, who'd, Hudd (to rhyme with bud), hid, heard

20
Texts to be recorded
  • Standard prompts for speech technologies
  • Digit triples
  • Letters, phonetic alphabet
  • GYP, Golf, Yankee, Polo
  • Equipment-specific commands
  • "Climate control 71 degrees"

22
Preparation
  • Recording software
  • List of prompts
  • Venue selection
  • Choose town
  • Library/community centre

23
Hardware
  • Laptop with external soundcard (Edirol)
  • Emkay head-mounted microphone
  • Generic desk-mounted microphone

24
Strategy for subjects
  • Press releases
  • Local media: radio (25), TV (7), newspapers (40)
  • Interviews
  • Free-phone
  • Subject selection
  • Subject criteria
  • Based on audio

25
Statistics
  • 14 locations
  • 20 people per location, aged between 16 and 70
  • 95 hours of recordings
  • Annotated at phrase level
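
As a rough check on these figures, the arithmetic below derives the total number of speakers and the average recording time per speaker. It assumes that every location reached the full 20 speakers and that the 95 hours are spread evenly, neither of which the slides confirm.

```python
# Figures taken from the Statistics slide.
locations = 14
speakers_per_location = 20
total_hours = 95

# Assumption: every location reached the full 20 speakers.
total_speakers = locations * speakers_per_location

# Assumption: recording time is spread evenly across speakers.
minutes_per_speaker = total_hours * 60 / total_speakers

print(total_speakers)                 # 280
print(round(minutes_per_speaker, 1))  # 20.4
```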

26
Lessons learnt
  • Identifying accents
  • Choosing a town likely to have appropriate accent
    and produce enough candidates
  • Choice of recording location

27
Rural settings
  • Association of accent with place not reliable
  • Expectation of situation
  • Less migration, hence less dilution of accents
  • Smaller variations over large area
  • Findings
  • Young people more likely to adopt other accents
  • Mainly older people had a good accent

28
Urban settings
  • Most cities have a homogeneous accent
  • Liverpool, Glasgow
  • Huge variety of accents in London

29
Choice of location
  • Library
  • Relatively consistent noise level
  • Library users available to recruit
  • Other (e.g. community centre, universities)
  • More variation in noise
  • Older people

30
Press release and free-phone
  • Mostly successful (would do it again)
  • Coverage approximately equal in each location
  • Variable response
  • When successful
  • Many volunteers with good accents within our age
    range
  • When unsuccessful
  • Not enough volunteers
  • Had to recruit from the street

31
Future recordings
  • Fill in holes existing in current database
  • Incorporating lessons learnt
  • Record accents not covered in this corpus