1
Perceived Speech Quality Prediction for VoIP
Networks
  • Lingfen Sun
  • Emmanuel Ifeachor

2
Outline
  • Introduction
  • Simulation system
  • Perceived speech quality analysis
  • Impact of loss on speech quality
  • Impact of talkers on speech quality
  • Perceived speech quality prediction using Neural
    Network (NN) method
  • Conclusions and future work

3
Introduction
  • Speech quality Measurement
  • Subjective method (Mean Opinion Score -- MOS)
  • Objective methods
  • Intrusive methods (e.g. ITU P.862 PESQ)
  • Nonintrusive methods (e.g. E-model, NN model)
  • Why do we need to predict speech quality?
  • For online monitoring of VoIP calls
  • For Quality of Service (QoS) control for VoIP
    applications

4
How to predict speech quality?
  • E-model
  • All impairments are mapped to R-scale (R → MOS)
  • Principle "Psychological factors on the
    psychological scale are additive"
  • Static and computational model.
  • NN-model
  • To learn the non-linear relationships between
    network impairments and perceived speech quality
  • To adapt to dynamic IP network conditions.
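The R → MOS step of the E-model can be sketched as follows. This uses the standard conversion from ITU-T G.107; the function name `r_to_mos` is chosen here for illustration and is not from the slides.

```python
def r_to_mos(r: float) -> float:
    """Map an E-model rating factor R to an estimated MOS (ITU-T G.107 conversion)."""
    if r <= 0:
        return 1.0          # ratings at or below 0 map to the lowest score
    if r >= 100:
        return 4.5          # ratings at or above 100 saturate at 4.5
    # Cubic mapping for 0 < R < 100
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


for r in (50, 70, 93.2):
    print(f"R={r}: MOS={r_to_mos(r):.2f}")
```

The mapping is fixed, which is one sense in which the E-model is static: it does not learn from observed network conditions.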

5
Previous work
  • NN databases are based on subjective tests only
  • As subjective testing is time-consuming, costly and
    stringent, available databases are limited and
    cannot cover all the possible scenarios
  • Only a limited number of subjects attended MOS
    tests
  • Limited number of codecs
  • Talker dependency has not been considered.

6
Main objectives of work
  • To undertake a fundamental investigation of the
    impact of packet loss on perceived speech quality
    using an objective measurement algorithm (e.g.
    PESQ)
  • To investigate the impact of different talkers on
    perceived speech quality
  • To develop a robust NN model for speech quality
    prediction based on PESQ.

7
Simulation system structure
[Figure: reference speech → encoder → loss simulator → decoder (simulated VoIP system) → degraded speech; the quality measure (PESQ) compares reference and degraded speech to give the measured MOS]
  • Reference speech is from a speech database

8
Loss Simulator
2-state Gilbert model to simulate packet loss
characteristics
  • Network packet loss + late arrival loss due to
    jitter
  • Unconditional loss probability (ulp, or average
    loss rate): ulp = p / (p + 1 - q)
  • Conditional loss probability (clp): clp = q, to
    reflect burst loss features
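A minimal sketch of such a loss simulator, assuming p is the probability that a packet is lost when the previous one was received and q the probability that it is lost when the previous one was also lost (consistent with clp = q). The function names and the starting state are illustrative assumptions, not taken from the slides.

```python
import random


def gilbert_losses(p, q, n, seed=0):
    """Return n booleans (True = packet lost) drawn from a 2-state Gilbert model.

    p: P(lost | previous packet received)
    q: P(lost | previous packet lost)
    """
    rng = random.Random(seed)
    lost = False  # assumption: the sequence starts in the no-loss state
    pattern = []
    for _ in range(n):
        lost = rng.random() < (q if lost else p)
        pattern.append(lost)
    return pattern


def loss_stats(pattern):
    """Empirical (ulp, clp) of a loss pattern."""
    ulp = sum(pattern) / len(pattern)
    # Outcomes of packets that directly follow a lost packet
    after_loss = [cur for prev, cur in zip(pattern, pattern[1:]) if prev]
    clp = sum(after_loss) / len(after_loss) if after_loss else 0.0
    return ulp, clp


p, q = 0.1, 0.5
ulp, clp = loss_stats(gilbert_losses(p, q, n=200_000, seed=1))
print(f"theory ulp={p / (p + 1 - q):.3f}, simulated ulp={ulp:.3f}, clp={clp:.3f}")
```

For a long sequence the simulated ulp should approach p / (p + 1 - q) and the simulated clp should approach q.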

9
Impact of loss on speech quality
  • How do packet loss and loss burstiness affect
    speech quality?
  • How does packet size affect speech quality?
  • How does codec affect speech quality?
  • → Using PESQ to calculate the perceived MOS score
  • → Average over 300 different random "seeds" to
    reduce the impact from different loss locations
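The seed-averaging idea can be sketched as below. PESQ itself is not called here: `realized_loss` is a stand-in measure chosen only so the example runs, and `gilbert_pattern` and `average_over_seeds` are hypothetical helper names. In the study's setup the measure would be the PESQ score of the speech degraded by each loss pattern.

```python
import random
import statistics


def gilbert_pattern(ulp, clp, n, seed):
    """Loss pattern (True = lost) for a target average loss ulp and conditional loss clp."""
    q = clp
    p = ulp * (1 - q) / (1 - ulp)  # rearranged from ulp = p / (p + 1 - q)
    rng = random.Random(seed)
    lost, pattern = False, []      # assumption: start in the no-loss state
    for _ in range(n):
        lost = rng.random() < (q if lost else p)
        pattern.append(lost)
    return pattern


def average_over_seeds(measure, ulp, clp, n_packets, n_seeds=300):
    """Mean and standard deviation of measure(pattern) over n_seeds loss patterns."""
    scores = [measure(gilbert_pattern(ulp, clp, n_packets, seed))
              for seed in range(n_seeds)]
    return statistics.mean(scores), statistics.stdev(scores)


def realized_loss(pattern):
    """Stand-in measure (NOT PESQ): fraction of packets actually lost."""
    return sum(pattern) / len(pattern)


for clp in (0.1, 0.5, 0.9):
    mean, spread = average_over_seeds(realized_loss, ulp=0.1, clp=clp, n_packets=400)
    print(f"clp={clp}: mean loss={mean:.3f}, std={spread:.3f}")
```

Even with the same ulp and clp, each seed places the losses differently, so a single run can be unrepresentative; averaging over many seeds gives a more stable figure per condition.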

10
Bursty loss analysis (G.729)
11
Bursty loss analysis (G.723.1)
12
Bursty loss effect
  • clp has an obvious impact on the perceived speech
    quality even for the same average loss rate (ulp)
  • When burst loss increases (clp increasing), the
    MOS score decreases and the variation of the MOS
    score also increases.
  • → Identify ulp and clp as input parameters
    related to loss for NN analysis
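As a worked relation between these two inputs and the Gilbert model, assume p is the probability of moving from the no-loss state to the loss state and q the probability of staying in the loss state. Then ulp and clp together fix both transition probabilities, and clp alone fixes the mean burst length:

```latex
\begin{align*}
\mathrm{clp} &= q, \\
\mathrm{ulp} &= \frac{p}{p + 1 - q}
  \;\Longrightarrow\;
  p = \frac{\mathrm{ulp}\,(1 - \mathrm{clp})}{1 - \mathrm{ulp}}, \\
\text{mean burst length} &= \frac{1}{1 - q} = \frac{1}{1 - \mathrm{clp}}.
\end{align*}
```

For example, ulp = 0.2 and clp = 0.5 give p = 0.125 and a mean burst length of 2 packets.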

13
Impact of packet size (G.729)
14
Impact of packet size (G.723.1)
15
Impact of packet size on quality
  • Packet size has, in general, no obvious influence
    on speech quality for a given loss rate.
  • Variation in speech quality for the same network
    loss rate depends on packet size and codec.
  • Variation in quality due to loss location is the
    main obstacle in the prediction of speech quality
  • → To consider loss only during active talkspurt
    frames (not for silence frames or SID frames).
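One way to restrict the loss statistics to active speech is sketched below. The slides do not give the exact definition, so this is an assumption: silence and SID frames are dropped first, and the remaining frames are treated as one continuous sequence. The name `vad_loss_stats` and the per-frame VAD flags are illustrative.

```python
def vad_loss_stats(lost, active):
    """Loss statistics restricted to frames the VAD marks as active speech.

    lost[i]   -- True if frame i was lost
    active[i] -- True if frame i is an active talkspurt frame (not silence/SID)
    Returns (ulp_vad, clp_vad).
    """
    # Keep only the loss flags of active frames (simplification: frames on
    # either side of a silence gap are treated as adjacent).
    speech = [is_lost for is_lost, is_active in zip(lost, active) if is_active]
    if not speech:
        return 0.0, 0.0
    ulp_vad = sum(speech) / len(speech)
    after_loss = [cur for prev, cur in zip(speech, speech[1:]) if prev]
    clp_vad = sum(after_loss) / len(after_loss) if after_loss else 0.0
    return ulp_vad, clp_vad


lost = [False, True, True, False, True, False, False, True]
active = [True, True, True, True, False, False, True, True]
print(vad_loss_stats(lost, active))
```

In this toy pattern the loss at index 4 falls in a silence frame and is ignored, so only losses that hit speech count towards the statistics.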

16
Impact of talker on speech quality
  • To investigate whether difference in talker (male
    or female) has an effect on perceived speech
    quality
  • TIMIT data set and ITU data set are used for
    investigation

17
Talker Dependency
  • For 3 male and 3 female samples

18
Talker Dependency (cont.)
  • For 6 mixed male and female samples

19
Impact of talker on MOS
  • Impact of different talkers on perceived speech
    quality appears to depend mainly on the gender of
    the talker (male or female).
  • The quality for the female talker tends to be
    worse than that of the male talker for the same
    network impairments.
  • → Identify gender (male and female) as one of the
    input parameters for NN analysis.

20
Quality prediction based on NN
  • Developed a neural network model (using Stuttgart
    Neural Network Simulator).
  • Identified four variables as inputs to NN
  • Codec type (G.729, G.723.1 and AMR)
  • Gender (male and female)
  • Unconditional loss probability: ulp (VAD)
  • Conditional loss probability: clp (VAD)
  • One output (MOS)

21
NN structure (for a 4-5-1 net)
  • a three-layer, feed-forward, neural network
    architecture
  • standard Backpropagation learning algorithm
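A 4-5-1 feed-forward net trained with plain backpropagation can be sketched in pure Python as below. This is not the SNNS configuration used in the study: the sigmoid activations, learning rate, input scaling and the class name `Net451` are assumptions, and the training targets are a made-up smooth function, not measured MOS values.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class Net451:
    """Three-layer feed-forward net: 4 inputs, 5 sigmoid hidden units, 1 sigmoid output."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Each weight row ends with a bias weight.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(5)] for _ in range(5)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(6)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
                  for row in self.w_hidden]
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        """One online backpropagation update; returns the squared error before it."""
        hidden, out = self.forward(x)
        delta_out = (out - target) * out * (1.0 - out)
        # Hidden deltas use the output weights from before this update.
        delta_hidden = [delta_out * self.w_out[j] * hidden[j] * (1.0 - hidden[j])
                        for j in range(5)]
        for j, v in enumerate(hidden + [1.0]):
            self.w_out[j] -= lr * delta_out * v
        for j in range(5):
            for i, v in enumerate(x + [1.0]):
                self.w_hidden[j][i] -= lr * delta_hidden[j] * v
        return (out - target) ** 2


# Toy data: inputs are [codec, gender, ulp, clp], each scaled to [0, 1].
rng = random.Random(1)
data = []
for _ in range(200):
    codec, gender = rng.choice([0.0, 0.5, 1.0]), rng.choice([0.0, 1.0])
    ulp, clp = rng.uniform(0.0, 0.4), rng.uniform(0.1, 0.9)
    toy_target = 0.9 - 1.2 * ulp - 0.5 * ulp * clp - 0.05 * gender  # made-up, NOT measured MOS
    data.append(([codec, gender, ulp, clp], toy_target))

net = Net451()
for epoch in range(200):
    mse = sum(net.train_step(x, t) for x, t in data) / len(data)
    if epoch in (0, 199):
        print(f"epoch {epoch}: mse={mse:.4f}")
```

The output here is a value in (0, 1); in practice it would be rescaled to the MOS range, and that scaling is also an assumption of this sketch.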

22
NN database generation
  • Codec: G.729, G.723.1 (6.3 kb/s), AMR (12.2 kb/s)
  • Gender: male and female
  • ulp: 0, 10, 20, 30 and 40%
  • clp: 10, 50 and 90%
  • Packet size: 1 to 5
  • → A total of 362 samples (patterns) were
    generated based on PESQ. 70% were chosen as the
    training set and 30% as the test dataset.
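Reading the 70 and 30 as percentages, a split of the 362 patterns might look like the sketch below. How the patterns were actually selected is not stated in the slides; the random shuffle, the rounding rule and the name `split_patterns` are assumptions.

```python
import random


def split_patterns(patterns, train_fraction=0.7, seed=0):
    """Shuffle the patterns and split them into (training, test) lists."""
    shuffled = list(patterns)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Integers stand in for the 362 (inputs, MOS) patterns.
train, test = split_patterns(range(362))
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, so the same test patterns stay held out across training runs.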

23
NN training process
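[Figure: reference speech passes through the simulated VoIP system to give degraded speech; the quality measure (PESQ) compares the two to give the measured MOS. Network, codec and speech parameters feed the neural network, which outputs the predicted MOS; the difference between measured and predicted MOS drives backpropagation.]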
24
Predicted MOS vs Measured MOS
Train: ρ = 0.967, r = 0.12; Test: ρ = 0.952, r = 0.15
25
Validation of the NN model
  • Generated a validation dataset from other talkers
    and different network loss conditions (total 210
    samples)
  • Obtained ρ = 0.946, r = 0.19 for the validation
    dataset using a trained 4-5-1 neural network.
  • → This suggested that the neural network model
    works well for speech quality prediction in
    general.
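Agreement between predicted and measured MOS is typically summarised by a correlation figure and an error figure. The slides do not define the two statistics they report, so the sketch below uses the Pearson correlation coefficient and the root-mean-square error as assumed stand-ins. The MOS values are illustrative only, not results from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rmse(xs, ys):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Illustrative values only, not results from the study.
measured = [4.1, 3.6, 3.0, 2.4, 1.9]
predicted = [4.0, 3.7, 2.9, 2.6, 1.8]
print(f"correlation={pearson(measured, predicted):.3f}, "
      f"rmse={rmse(measured, predicted):.3f}")
```

Evaluating on talkers and loss conditions that were not in the training set, as the validation step does, shows whether the model generalises rather than memorises its training patterns.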

26
Conclusions
  • Investigated the impact of packet loss, codec and
    talker on perceived speech quality based on PESQ
  • The loss pattern, loss burstiness and the gender
    of the talker have an impact on speech quality.
  • Packet size has, in general, no obvious influence
    on speech quality, but the deviation in speech
    quality depends on packet size and codec.
  • Based on codec, bursty loss rate and gender of
    the talker, an NN model was developed successfully
    for speech quality prediction.

27
Future work
  • Extend to conversational speech quality
    prediction to cater for the impact from delay.
  • Use real VoIP trace data instead of generated
    data from Gilbert loss model.
  • Use more robust neural networks.
  • Application to QoS Control in VoIP systems.