Title: Multi-channel Echo Cancellers for Packet Telephony using a low cost DSP
1Multi-channel Echo Cancellers for Packet
Telephony using a low cost DSP
- Krishna V V, Jitendra Rayala,
- Joseph Yau, Brendon Slade
- DSP Products Division
- LSI Logic
2Plan
- Line Echo Cancellation Overview
- Echo Sources and Cures
- EC for packet Voice
- Echo Canceller Internals
- Multi-channel EC on LSI403LP
- Summary
3Echo Sources in Telephony
Echo arises due to impedance mismatches at hybrids.
Near echo for A (side tone): leakage at AH1, reflection at AH2.
Far echo for A: leakage at BH2 (major component), reflection at BH1.
4When is EC critical?
The need for EC is determined by both echo level (ERL) and round-trip delay.
Typical ERL values range from 6 dB to 12 dB.
Typical round-trip network delays:
- POTS (local calls): less than 10 ms
- POTS (LD, terrestrial): 30-70 ms
- POTS (LD, satellite): 300-500 ms
- Wireless (GSM, CDMA, ...): 100-180 ms
- Packet voice: 120-200 ms
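The ERL figures above are power ratios expressed in decibels. As a minimal sketch (the function name `erl_db` and the synthetic signals are illustrative assumptions, not taken from the slides), ERL can be estimated from a far-end signal and the echo it returns:

```python
import math

def erl_db(far_end, echo):
    """Echo return loss in dB: far-end power divided by returned echo power."""
    p_far = sum(x * x for x in far_end) / len(far_end)
    p_echo = sum(x * x for x in echo) / len(echo)
    return 10.0 * math.log10(p_far / p_echo)

# Synthetic example: the hybrid returns the far-end signal at half amplitude.
far = [math.sin(2 * math.pi * 0.05 * n) for n in range(400)]
echo = [0.5 * x for x in far]
print(round(erl_db(far, echo), 2))  # half amplitude is roughly 6 dB of ERL
```

A half-amplitude echo sits at the low end of the typical 6 dB to 12 dB range, which is the case where cancellation matters most once delay becomes noticeable.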
5Delay in Packet Networks
Overall delay break-up
Revised from "Internet Telephony: Going Like Crazy," by G. Thomsen and Y. Jani, IEEE Spectrum, May 2000.
6Tackling Echo: Telephony Standards
7EC: The Past Decade
8EC for Packet Voice
Question: Why EC in the Gateway?
9EC in Packet Networks
10Packet Voice CPE Detail
11EC: A Black-box View
12G.168 EC Internals
13Some EC Design Options
14Full Tail / Floating Window
[Figure: the actual echo path compared with the full tail solution and the 2-window solution. Labeled spans: 128 ms, 12 ms, 12 ms.]
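To see why a windowed design is attractive, the filter lengths can be compared directly. The sketch below reads the figure labels on this slide as a 128 ms full tail and two 12 ms windows, which is an interpretation since the figure itself is not reproduced, and it assumes an 8 kHz sampling rate; `taps_for` is a hypothetical helper name.

```python
SAMPLE_RATE_HZ = 8000  # assumption: narrowband telephony sampling rate

def taps_for(span_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of FIR taps needed to cover span_ms of echo path."""
    return span_ms * sample_rate_hz // 1000

full_tail_taps = taps_for(128)       # one filter spanning the whole tail
two_window_taps = 2 * taps_for(12)   # two short filters placed on the echoes

print(full_tail_taps, two_window_taps)
```

Under these assumptions the windowed filters cover far fewer taps than the full tail, at the price of having to locate the echoes and position the windows on them.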
15Key Performance Issues
- Fast initial convergence
- Low steady-state residual
- Fast tracking (for occasional path changes)
Big Questions -- How fast? How low?
16Key Performance Issues (Contd)
- Robust to near-end talk
- Robust to double-talk
- Near-end voice quality (measured by PESQ, MOS, ...)
- Near-end background noise contrast
17Adaptation Options
- NLMS (sample rate or block adaptive)
- Enhanced NLMS variants (decorrelation, variable step size, PNLMS, PNLMS++)
- Fast affine projection (FAP)
- Fast RLS (FTRLS, QR-RLS, ...)
- Other methods also exist
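For reference, the first option in the list, sample-rate NLMS, can be sketched in a few lines. This is a textbook floating-point version under stated assumptions, not the fixed-point implementation the slides describe: the function name, the step size `mu`, the regularizer `eps`, and the four-tap `echo_path` are all hypothetical.

```python
import random

def nlms_echo_canceller(far_end, near_end, num_taps, mu=0.5, eps=1e-6):
    """Sample-rate NLMS: returns the residual signal and the final taps."""
    w = [0.0] * num_taps            # adaptive estimate of the echo path
    x = [0.0] * num_taps            # recent far-end samples, newest first
    residual = []
    for ref, mic in zip(far_end, near_end):
        x = [ref] + x[:-1]
        y_hat = sum(wi * xi for wi, xi in zip(w, x))   # FIR echo estimate
        e = mic - y_hat                                # residual echo
        norm = sum(xi * xi for xi in x) + eps          # input energy
        step = mu * e / norm
        w = [wi + step * xi for wi, xi in zip(w, x)]   # normalized update
        residual.append(e)
    return residual, w

random.seed(0)
echo_path = [0.0, 0.5, -0.3, 0.1]                      # hypothetical echo path
far = [random.uniform(-1.0, 1.0) for _ in range(2000)]
near = [sum(h * far[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(len(far))]
res, w = nlms_echo_canceller(far, near, num_taps=8)
print([round(wi, 3) for wi in w[:4]])
```

The example has no near-end speech or noise, so the taps settle close to the simulated echo path. A practical canceller would add update control so that adaptation is frozen during double-talk.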
18Costs of Adaptation
19LSI403LP/LC DSP
Price-Performance Balance
- 120 MHz - 200 MHz clock
- ZSP400 core, up to 4 instructions per cycle
- Dual MACs can perform two 16x16 or one 32x32 operation(s) per cycle
- 48K words of on-chip SRAM (configurable as 16K+32K, 24K+24K, or 32K+16K of PM and DM)
- Two serial ports with TDM support
- As low as $4.00 in volume.
20EC Complexity Break-up
21EC Complexity
NLMS-based example
Data Memory / Ops per Sample
- FIR Filtering: O(N) / O(N)
- Filter Updates: O(N)-O(2N) / O(N)
- Other Logic: ~c / ~M. Costs are almost constant (they depend very weakly on filter length, N).
For lattice structures, a break-up into filtering and update stages is not possible.
Other logic includes several IIR filters, conditional branching, data buffer management, etc., for update control, NLP, CNI, and the V.25 tone disabler.
22Multi-MAC Processors
Percentage load for other logic is significant.
23FIR Filtering Loop
ZSP400 code snippet:
L_ECFilter_Loop:
    lddu r2, r14, 2      ! r2 = Yk, r3 = Yk+1
    lddu r4, r13, 2      ! r4 = Ak, r5 = Ak+1
    mac2.a r2, r4        ! r1r0 = r1r0 + r2*r4 + r3*r5
    agn0 L_ECFilter_Loop
Approximately N/2 cycles per sample, as it can be implemented using the lddu / lddu / mac2.a instruction sequence.
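The N/2 figure can be checked with a small model of the loop. The sketch below processes two taps per pass, mirroring the paired loads and the dual multiply-accumulate, and then converts the pass count into a filtering-only load. It assumes one pass per cycle, an 8 kHz sampling rate, and the 64 ms full-tail length quoted elsewhere in this deck; `fir_dual_mac` is a hypothetical name.

```python
def fir_dual_mac(y, a):
    """FIR dot product taking two taps per pass, as a dual-MAC loop would."""
    assert len(y) == len(a) and len(y) % 2 == 0
    acc = 0
    passes = 0
    for k in range(0, len(y), 2):
        # One pass: two paired loads, then two multiply-accumulates.
        acc += y[k] * a[k] + y[k + 1] * a[k + 1]
        passes += 1
    return acc, passes

SAMPLE_RATE_HZ = 8000                       # assumption: 8 kHz telephony sampling
TAIL_MS = 64                                # full-tail length quoted in the deck
n_taps = TAIL_MS * SAMPLE_RATE_HZ // 1000   # taps needed to span the tail

acc, cycles_per_sample = fir_dual_mac(list(range(n_taps)), [1] * n_taps)
filter_mhz = cycles_per_sample * SAMPLE_RATE_HZ / 1e6
print(n_taps, cycles_per_sample, filter_mhz)
```

Under these assumptions the filtering loop alone costs about 2 MHz per channel. Filter updates and the other logic come on top of that.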
24LSI403LP/LC EC Implementations
- Two Versions
- Full-tail, 64ms echo canceller
- Windowed version (up to 3 discrete echoes)
Notes: All memory in 16-bit words. I/O buffers are for 2.5 ms frame size. Numbers subject to change (on-going revisions!).
25Multi-channel EC Costs
- Processor load (MHz or gates)
- Increases almost linearly with channel count
- For large channel counts, savings are possible
- Data Memory (Channel object)
- Increases linearly with channel count
- On-chip memory is expensive, but reduces power
consumption, offers easier scalability with
multiple cores
26 24 Channels on LSI403LP/LC
- Resources for 24 channels
Notes: Processor load is worst case (all channels
performing adaptation). Extra data memory
requirements can be met by the free program
memory on LSI403LP/LC. The required swap
operations (for some channels only) are estimated
to add an extra load of 6.5 MHz in case of the
windowed version. Multi-chip packaging using
LSI403WLP provides higher channel density. For
example, a dual-processor package can support 32
channels, at a lower clock, without requiring any
memory swaps.
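Since processor load grows almost linearly with channel count, the notes above imply a simple upper bound on the worst-case load each channel may consume. The sketch below assumes the 200 MHz top clock rate and strictly linear scaling. It is a budget check, not the measured resource figures from the slide, and `per_channel_budget_mhz` is a hypothetical helper.

```python
CLOCK_MHZ = 200.0        # top clock rate listed for the LSI403LP/LC
CHANNELS = 24
SWAP_OVERHEAD_MHZ = 6.5  # estimated swap load for the windowed version

def per_channel_budget_mhz(clock_mhz, channels, overhead_mhz=0.0):
    """Largest worst-case per-channel load that still fits on one DSP."""
    return (clock_mhz - overhead_mhz) / channels

print(per_channel_budget_mhz(CLOCK_MHZ, CHANNELS))
print(per_channel_budget_mhz(CLOCK_MHZ, CHANNELS, SWAP_OVERHEAD_MHZ))
```

With the swap overhead included, each channel has to stay just above 8 MHz in the worst case for 24 channels to fit on a single 200 MHz device.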
27Summary
- LSI403LP or LSI403LC can be used to support as many as 24 channels of LEC with 64 ms tail, without any external SRAM.
- Very low cost per channel (under $0.50 per channel).
- Multi-chip packaging for higher channel density.
- Custom ASICs can be built for further cost reduction. Higher performance options using ZSP G2 cores also possible.
Thanks! Questions?
28Backup Slides
29Delay G.114 Guidelines
30Echo Level and Delay
ERL data from Table 1.1, Acoustic Signal Processing for Telecommunication, S. L. Gay and J. Benesty (Eds.), Kluwer Academic Publishers (2000)
31Dealing With Delay (Echo)
- One-way delays in packet voice networks > 100 ms
- As recommended in ITU-T G.131, a network echo canceller (EC) is required.
- EC required only for
- PSTN interfaces on packet voice gateways (PVGs)
- Analog phone (SLIC) interfaces on CPEs
- EC not required for digital IP phones
- AEC may still be needed (for hands-free operation)
- EC tail length: a much misused parameter
- ITU-T G.168 EC was initially developed for PSTN. Can it be applied as-is for packet voice networks?
CPE: Customer Premises Equipment; PVG: Packet Voice Gateway
32Quality of Service (QoS)