WIRELESS COMMUNICATIONS From Systems to Silicon - PowerPoint PPT Presentation

About This Presentation
Title:

WIRELESS COMMUNICATIONS From Systems to Silicon

Description:

WIRELESS COMMUNICATIONS From Systems to Silicon Raghu Rao Wireless Systems Group, Xilinx Inc. Agenda Introduction to Wireless communications Systems design and ... – PowerPoint PPT presentation

Slides: 86
Provided by: usersEceU
Transcript and Presenter's Notes

Title: WIRELESS COMMUNICATIONS From Systems to Silicon


1
WIRELESS COMMUNICATIONS From Systems to Silicon
  • Raghu Rao
  • Wireless Systems Group,
  • Xilinx Inc.

2
Agenda
  • Introduction to Wireless communications
  • Systems design and considerations
  • The wireless environment
  • Link budget
  • MIMO and OFDM Systems
  • High level view of wireless communication systems
  • Mobile WiMax, an example of wireless comm system,
  • Hardware/software partitioning
  • PHY/MAC etc.
  • The Platform FPGA
  • Overview of FPGAs and FPGA tools
  • Building DSP sub-systems on FPGAs
  • Digital baseband
  • FPGA tools and design methodology

3
Communications Roadmap
  • Key markets
  • Core DSP technologies
  • OFDM
  • MIMO
  • IP Network is key
  • Enables new approaches to
  • QoS management
  • Robustness
  • Capacity

4
Wireless Environment
  • Multipaths caused by reflections from various
    objects.

5
Modeling the Channel
  • As the mobile moves through the environment, the
    field strength varies due to
  • Free space path loss
  • Long term (slow) fading
  • Short term (fast) fading

6
Doppler
  • Changes in the received carrier frequency due to
    the relative motion of the mobile to the base
    station
  • $\Delta f = f_d = (v/\lambda)\cos\theta$
  • For f = 900 MHz, v = 70 mph (112 km/h):
  • $f_{D,\max} = v/\lambda = 93.3$ Hz
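As a quick check of the numbers on this slide, the Doppler formula can be evaluated directly. This is a minimal sketch: the function name and the use of the exact speed of light are assumptions made for illustration (the slide's 93.3 Hz corresponds to rounding c to 3e8 m/s).

```python
import math

C = 299_792_458.0  # speed of light in m/s


def doppler_shift_hz(carrier_hz, speed_mps, angle_rad=0.0):
    """Doppler shift f_d = (v / lambda) * cos(theta)."""
    wavelength_m = C / carrier_hz
    return (speed_mps / wavelength_m) * math.cos(angle_rad)


# Slide example: 900 MHz carrier, 112 km/h, motion directly toward the base station.
speed_mps = 112.0 / 3.6
fd_max = doppler_shift_hz(900e6, speed_mps)
print(f"maximum Doppler shift: {fd_max:.1f} Hz")
```

The shift falls to zero when the mobile moves perpendicular to the line to the base station (theta = 90 degrees).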

7
Delay Spread
  • Measure of the time distribution of power in the
    channel impulse response
  • Typical office 25 ns to 60 ns
  • Large Lobbies and atria 100 ns
  • Warehouse and factory floors 100 ns to 200 ns
  • Delay spreads are up to 10 microseconds in
    cellular environments
  • Greater than 3 μs in urban areas
  • 0.5 μs in suburban and open areas

8
Exponential Power Delay Profile
  • If the delay spread of the channel is larger than
    the symbol interval we will see multiple paths in
    our channel.
  • Leads to inter-symbol interference (ISI).
  • Leads to a frequency selective channel.
  • Average energy of the channel impulse response
    follows an exponential power-delay profile.

9
Coherence Bandwidth
  • Maximum frequency bandwidth for which the signals
    are still considered to be correlated.
  • $B_c \approx 1/(2\pi\tau_{rms})$ Hz when considering amplitude
    correlation (correlation coefficient 0.5)
  • $\tau_{rms}$ is the rms delay spread of the channel
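To get a feel for the coherence bandwidth rule, it can be evaluated for delay spreads in the ranges quoted on the Delay Spread slide. This is a minimal sketch; the function name and the specific example values (50 ns, 200 ns, 10 microseconds) are chosen for illustration only.

```python
import math


def coherence_bandwidth_hz(tau_rms_s):
    """Coherence bandwidth Bc = 1 / (2 * pi * tau_rms), correlation coefficient 0.5."""
    return 1.0 / (2.0 * math.pi * tau_rms_s)


# Illustrative rms delay spreads taken from the ranges quoted earlier.
examples = [("office", 50e-9), ("factory floor", 200e-9), ("cellular, worst case", 10e-6)]
for label, tau in examples:
    print(f"{label}: tau_rms = {tau * 1e9:.0f} ns, Bc = {coherence_bandwidth_hz(tau) / 1e3:.1f} kHz")
```

A signal much wider than Bc sees a frequency selective channel, which is the case OFDM is designed to handle.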

10
Coherence Time
  • Maximum time period for which the signals are
    still considered to be correlated.
  • It is used to characterize the time varying
    nature of the channel.
  • Rule of thumb: $9/(16\pi f_m) < T_c < 0.423/f_m$
  • $f_m$ is the maximum Doppler frequency
  • Correlation coefficient 0.5
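The rule of thumb can be evaluated for the Doppler example given earlier (maximum Doppler of about 93.3 Hz). This is a sketch only; the function name is an assumption made for illustration.

```python
import math


def coherence_time_bounds_s(max_doppler_hz):
    """Return (lower, upper) rule-of-thumb bounds on coherence time in seconds."""
    lower = 9.0 / (16.0 * math.pi * max_doppler_hz)
    upper = 0.423 / max_doppler_hz
    return lower, upper


lower, upper = coherence_time_bounds_s(93.3)
print(f"coherence time between {lower * 1e3:.2f} ms and {upper * 1e3:.2f} ms")
```

Channel estimates older than this interval should be treated as stale, which bounds how far apart pilots or training symbols can be placed in time.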

11
Link Budget
  • A link budget is used to compute the range,
    transmit power, receiver sensitivity and other
    requirements of the communication system.
  • In free space the path loss is given by the Friis
    equation: $P_r = P_t G_t G_r \left(\frac{\lambda}{4\pi d}\right)^2$
  • Gt , Gr represent transmit and receive antenna
    gains. Pt , Pr represent the transmit power and
    receive power. λ is the wavelength, d is
    the distance.

12
Link Budget
  • Expressing path loss in dB
  • Note: the path loss exponent depends on
    the environment (2 in free space).
  • To compute the SNR at the baseband we need to
    include thermal noise in the signal bandwidth B,
    and noise figure of the system NF.
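The link budget steps above can be sketched numerically. Everything here is an illustrative assumption rather than the slide's own formulation: the log-distance path loss model with a 1 m reference distance, the parameter name `exponent` for the path loss exponent, the thermal noise density of -174 dBm/Hz (about 290 K), and all example values.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def path_loss_db(distance_m, carrier_hz, exponent=2.0, ref_distance_m=1.0):
    """Free-space loss at the reference distance plus log-distance roll-off."""
    wavelength_m = C / carrier_hz
    ref_loss_db = 20.0 * math.log10(4.0 * math.pi * ref_distance_m / wavelength_m)
    return ref_loss_db + 10.0 * exponent * math.log10(distance_m / ref_distance_m)


def noise_floor_dbm(bandwidth_hz, noise_figure_db):
    """Thermal noise in bandwidth B plus receiver noise figure."""
    return -174.0 + 10.0 * math.log10(bandwidth_hz) + noise_figure_db


def received_snr_db(tx_power_dbm, tx_gain_dbi, rx_gain_dbi, distance_m,
                    carrier_hz, bandwidth_hz, noise_figure_db, exponent=2.0):
    """SNR at baseband before any fading or interference margin is applied."""
    rx_power_dbm = (tx_power_dbm + tx_gain_dbi + rx_gain_dbi
                    - path_loss_db(distance_m, carrier_hz, exponent))
    return rx_power_dbm - noise_floor_dbm(bandwidth_hz, noise_figure_db)


# Hypothetical link: 30 dBm transmitter, 15 dBi and 0 dBi antennas, 1 km range,
# 2.5 GHz carrier, 10 MHz bandwidth, 7 dB noise figure.
for n in (2.0, 3.5):
    snr = received_snr_db(30.0, 15.0, 0.0, 1000.0, 2.5e9, 10e6, 7.0, exponent=n)
    print(f"path loss exponent {n}: SNR = {snr:.1f} dB")
```

Margins for outage, shadow fading and interference would then be subtracted from this SNR.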

13
Link Budget
  • Margin for desired outage taking into account
    receiver structure and antenna diversity.
  • Standards specify outage probabilities
  • WiMax: 90% in the cell, 75% at the boundary of
    the cell.
  • Compensation factors for other impairments
  • Interference from neighbouring cell
  • Shadow fading, etc.
  • Diversity helps achieve the outage probability
    (or reduces the margin for outage) without
    increase in transmit power.

14
Diversity
  • Diversity provides the receiver with multiple
    looks at the transmitted signal.
  • Prob(all channels in a fade) << Prob(any 1
    channel in a fade)
  • Diversity improves link reliability.
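The probability argument can be made concrete under two assumptions that are not stated on the slide: each branch sees Rayleigh fading (so received power is exponentially distributed), and the branches fade independently. The function name and the -10 dB threshold are illustrative.

```python
import math


def fade_probability(threshold_db, branches=1):
    """Probability that every branch is below the threshold.

    threshold_db is relative to the mean received power of a branch.
    Assumes independent Rayleigh-faded branches with equal mean power.
    """
    ratio = 10.0 ** (threshold_db / 10.0)
    single_branch = 1.0 - math.exp(-ratio)
    return single_branch ** branches


for n in (1, 2, 4):
    print(f"{n} branch(es): P(all more than 10 dB below mean) = {fade_probability(-10.0, n):.2e}")
```

Each extra independent branch multiplies the fade probability by the single-branch value, which is why a small number of antennas already gives a large reliability gain.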

[Figure: signal level (dB) versus time for Channel 1, Channel 2, and the combined channel]
15
Diversity Techniques
  • Spatial Diversity
  • Antennas sufficiently spaced apart (> ½
    wavelength).
  • Will result in an independent channel response
    and provide another look at the transmitted
    signal.
  • Frequency Diversity
  • Transmit over multiple carrier frequencies.
  • If the frequencies are sufficiently far
    (coherence bandwidth) apart the channel response
    will be different on the different frequencies.
  • Time Diversity
  • Channel is continuously changing.
  • Transmit signals sufficiently spaced (coherence
    time) apart in time so the 2nd transmission
    sees a different channel compared to the first
    one.
  • Polarization Diversity
  • Signals transmitted on two orthogonal
    polarizations exhibit uncorrelated fading
    statistics.

16
MIMO Systems
  • MIMO systems
  • Multiple Antennas at the transmitter and
    receiver.
  • 3 types of MIMO Systems
  • STBC MIMO systems
  • Diversity gain.
  • Spatial Multiplexing MIMO systems
  • Capacity/throughput gain.
  • Feedback MIMO systems
  • Higher performance thru interference reduction.
  • MISO (multiple input single output) Systems
  • STBC can be used with just 1 receive antenna.
  • Provides diversity gain.
  • To achieve array gain, need knowledge of channel
    at the transmitter (feedback).

17
Spatial Multiplexing
  • A spatial multiplexing MIMO system transmits
    different data symbols from each transmitter.
  • The signals from each transmitter combine over
    the air and are received by multiple receive
    antennas.
  • SM systems have a rate = M (number of transmit antennas).
    The diversity order depends on the type of
    encoding and receiver (uncoded SM with ML
    decoding has diversity order = N (number of receive
    antennas)).

18
Spatial Multiplexing Receivers
  • Zero Forcing receiver

For ZF receivers: significant increase in noise when the channel is
in a deep fade.
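The noise enhancement of a zero-forcing receiver can be seen in a small 2x2 example. This is a sketch under stated assumptions: the receiver simply applies the inverse of a perfectly known channel matrix, and the two channel matrices and all function names are invented for illustration.

```python
def inverse_2x2(h):
    """Inverse of a 2x2 complex matrix given as nested lists."""
    (a, b), (c, d) = h
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def zf_detect(h, received):
    """Zero forcing: undo the channel by multiplying with its inverse."""
    return apply(inverse_2x2(h), received)


def noise_gain(h):
    """Total output noise power for unit-power white noise on each receive antenna."""
    return sum(abs(w) ** 2 for row in inverse_2x2(h) for w in row)


sent = [1 + 1j, -1 + 1j]
good = [[1.0 + 0j, 0.2j], [0.1 + 0j, 0.9 + 0j]]              # well-conditioned channel
faded = [[1.0 + 0j, 0.99 + 0j], [1.0 + 0j, 1.01 + 0j]]       # nearly singular channel
for name, h in (("well-conditioned", good), ("deep fade", faded)):
    estimate = zf_detect(h, apply(h, sent))
    print(name, [complex(round(e.real, 6), round(e.imag, 6)) for e in estimate],
          "noise gain:", round(noise_gain(h), 1))
```

Both channels are inverted exactly when there is no noise, but the nearly singular channel amplifies receiver noise by several orders of magnitude.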
19
Spatial Multiplexing Receivers
  • MMSE MIMO Decoders
  • Cancels interference and minimizes noise.
  • Minimizes the over all error (mean squared error).

20
Spatial Multiplexing Receivers
  • Zero-Forcing
  • MMSE
  • Successive Interference cancellation receivers
  • Sphere detectors (sub-optimal Maximum Likelihood)

21
Transmit Diversity
  • Space Time Block Code (STBC)
  • 2 Antenna STBC also known as Alamouti Code.
  • Improves BER/SER performance.

22
STBC Decoder
In matrix form the received signal is
Low-complexity decoder: just 2 complex multiplications per
symbol for a 2-antenna system (growing linearly
with block length/number of antennas).
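The 2-antenna Alamouti code and its linear decoder can be sketched as follows. The assumptions are a single receive antenna, a flat channel (h1, h2) that stays constant over the two symbol slots and is known at the receiver, and no noise; the function names and example values are illustrative.

```python
def alamouti_encode(s1, s2):
    """Return the (antenna 1, antenna 2) transmissions for slot 1 and slot 2."""
    return [(s1, s2), (-s2.conjugate(), s1.conjugate())]


def channel(slots, h1, h2):
    """Single receive antenna, flat channel constant over both slots, no noise."""
    return [h1 * a1 + h2 * a2 for a1, a2 in slots]


def alamouti_decode(r1, r2, h1, h2):
    """Linear combining: two complex multiplications per decoded symbol."""
    gain = abs(h1) ** 2 + abs(h2) ** 2
    s1_hat = (h1.conjugate() * r1 + h2 * r2.conjugate()) / gain
    s2_hat = (h2.conjugate() * r1 - h1 * r2.conjugate()) / gain
    return s1_hat, s2_hat


s1, s2 = 1 + 1j, 1 - 1j
h1, h2 = 0.8 - 0.3j, -0.2 + 0.6j
r1, r2 = channel(alamouti_encode(s1, s2), h1, h2)
print(alamouti_decode(r1, r2, h1, h2))
```

Each decoded symbol is scaled by |h1|^2 + |h2|^2 before normalization, so it is only lost when both paths fade together; this is the diversity gain.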
23
Other MIMO schemes
  • Achieving high rate high diversity MIMO systems
    is an area of active research.
  • There are many suboptimal STBC schemes that
    improve the rate but reduce the diversity order.
  • There are also combinations of spatial
    multiplexing and STBC schemes.
  • One such scheme is 2 (or more) Alamoutis in
    parallel.

24
Stacked Alamouti
  • Interference Cancelling STBC
  • 2 Alamoutis in parallel
  • Rate 2 system
  • Diversity order: N(M − K + 1)
  • K co-channel users
  • N transmit antennas per user.
  • M receive antennas
  • Requires N(K − 1) + 1 antennas at the receiver to
    suppress K − 1 interferers.

25
Orthogonal Frequency Division Multiplexing (OFDM)
OFDM divides a frequency selective channel into a
number of flat fading channels
26
OFDM Modulation
  • A QAM symbol is modulated onto each subcarrier
  • IFFT/FFT are used for efficient modulation and
    demodulation

Frequency Domain
27
Combating Multipath
[Figure: constructing the cyclic prefix (CP). An OFDM symbol preceded by its CP, multipath components with maximum delay tmax, and the sampling instant Ts]
  • Sampling at instant Ts all channels experience
    the same channel and there is no ICI
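The effect of the cyclic prefix can be checked with a toy OFDM link. This is a sketch under stated assumptions: a naive DFT stands in for the IFFT/FFT, the channel is a short noiseless FIR filter known at the receiver, and the CP is at least as long as the channel memory. All names and values are illustrative.

```python
import cmath


def dft(x, inverse=False):
    """Naive DFT; the inverse includes the 1/N scaling."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out


def ofdm_modulate(symbols, cp_len):
    """IDFT of the subcarrier symbols, then prepend the last cp_len samples (cp_len >= 1)."""
    time = dft(symbols, inverse=True)
    return time[-cp_len:] + time


def multipath(samples, taps):
    """Linear convolution with the channel taps, truncated to the input length."""
    return [sum(taps[j] * samples[i - j] for j in range(len(taps)) if i - j >= 0)
            for i in range(len(samples))]


def ofdm_demodulate(rx, cp_len, n):
    """Drop the cyclic prefix and take the DFT."""
    return dft(rx[cp_len:cp_len + n])


symbols = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]
taps = [1.0, 0.5, 0.2j]          # channel memory of 2 samples
cp_len = 2                       # CP covers the channel memory
rx = multipath(ofdm_modulate(symbols, cp_len), taps)
per_carrier_gain = dft(taps + [0.0] * (len(symbols) - len(taps)))
estimates = [y / h for y, h in zip(ofdm_demodulate(rx, cp_len, len(symbols)), per_carrier_gain)]
print([complex(round(e.real, 6), round(e.imag, 6)) for e in estimates])
```

With the CP in place the multipath channel acts as a circular convolution, so each subcarrier sees a single complex gain and a one-tap divide recovers the QAM symbol.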

28
MIMO and OFDM
  • MIMO: Multiple Input Multiple Output
    communication system. Employs multiple antennas
    at both transmitter and receiver.
  • OFDM: Orthogonal Frequency Division
    Multiplexing. Breaks up a broadband channel into
    many parallel narrowband channels (subcarriers).
  • MIMO-OFDM: a combination of MIMO and OFDM.
    Appears like many parallel MIMO systems on
    orthogonal subcarriers.

29
MIMO-OFDM System
Each transmitter is an independent OFDM
modulator. The source symbols could be
space-time block coded or just QAM modulated for
spatial multiplexing. Each receiver is an OFDM
demodulator combined with a MIMO decoder to
invert the channel on each subcarrier and extract
the source symbols.
30
Agenda
  • Introduction to Wireless communications
  • Systems design and considerations
  • The wireless environment
  • Link budget
  • MIMO and OFDM Systems
  • High level view of wireless communication systems
  • Mobile WiMax, an example of wireless comm system,
  • Hardware/software partitioning
  • PHY/MAC etc.
  • The Platform FPGA
  • Overview of FPGAs and FPGA tools
  • Building DSP sub-systems on FPGAs
  • Digital baseband
  • FPGA tools and design methodology

31
802.16/802.16e
  • The 802.16 WirelessMAN standard includes
    requirements for operation in
  • Line Of Sight (LOS), 10-66 GHz for fixed wireless
    systems.
  • Non Line Of Sight (NLOS), < 11 GHz for fixed
    wireless systems.
  • 802.16e (Mobile WiMax) adds enhancements for
    mobility in the < 11 GHz licensed and unlicensed
    bands.
  • Operation in mobile mode is limited to licensed
    bands between 2 GHz and 6 GHz.

32
Scalable OFDMA parameters
33
Link Budget
34
Time Division Duplexing
  • 802.16e can be deployed in TDD and FDD
    environments.
  • Initial certification profiles are only for
    TDD.
  • The DL subframe and UL subframe lengths are
    adjustable.
  • TDD assures channel reciprocity.

35
OFDMA Frame Structure
  • DL-MAP: Downlink MAP, downlink allocations
  • UL-MAP: Uplink MAP, uplink allocations
  • FCH: Frame control header, contains
    information about the DL-MAP
36
Data rates for SIMO/MIMO configurations
64-QAM with rate-5/6 CTC
Source: WiMax Forum
37
Baseband Transmission Model
  • OFDM receiver provides estimates of
  • Channel $h_{n,i}(t)$
  • Frequency offset $\Omega_0$
  • Sample timing T'
  • OFDM symbol timing

38
Generic OFDM Transmitter
  • Figure shows a generic MIMO OFDM Tx
  • MIMO not an element of 802.11a, but it is in
    802.11n, 3GPP-LTE and 802.16e

39
OFDM Receiver Architecture
  • Figure illustrates architecture for generic OFDM
    Rx
  • Details will vary as a function of
  • Packet-based versus broadcast transmission
  • Existence of a preamble (or not) in the waveform

40
Agenda
  • Introduction to Wireless communications
  • Systems design and considerations
  • The wireless environment
  • Link budget
  • MIMO and OFDM Systems
  • High level view of wireless communication systems
  • Mobile WiMax, an example of wireless comm system,
  • Hardware/software partitioning
  • PHY/MAC etc.
  • The Platform FPGA
  • Overview of FPGAs and FPGA tools
  • Building DSP sub-systems on FPGAs
  • Digital baseband
  • FPGA tools and design methodology

41
Digital Receiver Architecture: Abstracted
Architecture
  • Common model of abstraction for digital receiver
    is inner/outer receiver

42
Receiver Abstraction and Projection on to
Platform FPGA
FPGA product portfolio Tailored for various
processing Tasks in communications receiver
43
Digital Frontend
  • Digital upconversion (downconversion)
  • Crest factor reduction
  • Digital pre-distortion
44
The Platform
  • Embedded Software
  • MAC (Media Access)
  • Decision oriented tasks
  • CORBA
  • RTOS
  • NBAP
  • SCA (JTRS radios)

EMIF
  • Serial Gigabit
  • OBSAI/CPRI
  • Proprietary serial backplane
  • Inter-chip connectivity

SRIO
  • Logic IO
  • OBSAI/CPRI
  • SRIO
  • AD/DA interface
  • EMIF

45
Virtex-4/5 FPGA Architecture: High-Level View
  • FPGA family with 3 members tailored for specific
    classes of processing
  • SX: DSP
  • LX: Logic centric
  • FX: Full featured
  • Embedded PowerPC hard IP
  • Giga-bit serial connectivity
  • DSP processing tiles DSP48

46
Virtex-5 FPGA Platform
  • 2 slices per CLB, 4 LUTs per slice
  • Can be configured as a shift register
  • Can be configured as distributed memory (RAM)
47
Virtex-4 DSP48 Slice
  • Scalable 500 MHz performance, not possible using
    standard cell libraries and a standard cell design
    flow
  • Pipeline registers enable 500 MHz performance
  • Integrated cascade routing enables scalable
    performance
  • Arithmatica parallel counter: 20% faster
    performance and uses less area
  • Arithmatica AAdder: 20% faster than other
    implementations
48
Pipelined Multiplier
[Figure: pipelined multiplier built from cascaded DSP48 slices. Labels: 18-bit A and B inputs and a 48-bit C input; BCIN/BCOUT and PCIN/PCOUT cascade ports to the adjacent DSP48 tile; 36b product sign extended to 48b; wire shift right by 17b; MS word and LS word of the product P; registers; 3 delay latency (z^-3)]
49
Pipelined Complex 18x18 MPY
[Figure: pipelined complex 18x18 multiplier using four DSP48 slices S1 to S4 (Sn = slice n). Labels: 18-bit inputs Ar, Ai, Br, Bi; 48-bit outputs Pr and Pi; registers; sign extension]
50
Wide Filters At Full Speed Within the Virtex-4
DSP Slice Column
  • Systolic N-tap FIR
  • Scalable N-levels deep implementation
  • N-levels deep at 500MHz performance
  • Uses Integrated Pipeline Registers to Synchronize
    Filter Inputs
  • Utilizes Input and Output Cascade Routing

Build a massively parallel 512-tap FIR filter in a
single device, achieving 256 GMAC/s performance.
An equivalent implementation would consume 444
embedded multipliers and 77,008 LCs, and would
only achieve ½ the performance.
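The systolic N-tap FIR structure can be modeled cycle by cycle to see why the cascade registers keep the filter correct at any depth. This is a simplified behavioral sketch, not the DSP48 itself: it assumes two registers per stage on the input cascade and one register per stage on the partial-sum cascade, and it leaves out the slice's internal multiplier pipeline registers. Function names are illustrative.

```python
def direct_fir(x, h):
    """Reference FIR: y[n] = sum_i h[i] * x[n - i]."""
    return [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(x))]


def systolic_fir(x, h):
    """Cycle-by-cycle model of a systolic FIR with one multiply-add per stage."""
    taps = len(h)
    x_delay = [[0, 0] for _ in range(taps)]   # [1-cycle-old, 2-cycle-old] input of each stage
    p_reg = [0] * taps                        # registered partial sum of each stage
    out = []
    for sample in x:
        # Inputs seen by every stage this cycle, read from the current register state.
        x_in = [sample] + [x_delay[i][1] for i in range(taps - 1)]
        p_in = [0] + [p_reg[i] for i in range(taps - 1)]
        p_out = [p_in[i] + h[i] * x_in[i] for i in range(taps)]
        # Clock edge: all registers update together.
        x_delay = [[x_in[i], x_delay[i][0]] for i in range(taps)]
        p_reg = p_out
        out.append(p_out[-1])
    return out


x = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3]
h = [2, -3, 1, 4]
latency = len(h) - 1
print(systolic_fir(x, h)[latency:])
print(direct_fir(x, h)[:len(x) - latency])
```

In this model the output equals the direct-form result delayed by N - 1 cycles, and every stage only talks to its neighbor, which is what lets the structure scale to hundreds of taps without long routes.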
51
Xilinx FFT IP (4)
  • FFT fully utilizes FPGA arithmetic hardware
    resources
  • FFT viewed as a recursion using a butterfly
    kernel

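[Figure: FFT butterfly kernel. Inputs a and b; complex adder CADD1 produces (a + b); complex adder CADD2 followed by complex multiplier CMPY produces (a − b) e^{−j2πk/N}; phase factors e^{−j2πk/N}]
  • CADD1, CADD2: complex adders
  • CMPY: complex multiplier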
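The view of the FFT as a recursion over a butterfly kernel can be sketched in a few lines. The assumption here is a radix-2 decimation-in-frequency form, where the butterfly outputs are a + b and (a - b) multiplied by the phase factor; the slide's figure does not survive well enough to confirm the exact variant used in the Xilinx core. Function names are illustrative.

```python
import cmath


def butterfly(a, b, k, n):
    """Radix-2 DIF butterfly: two complex adds and one complex multiply."""
    phase_factor = cmath.exp(-2j * cmath.pi * k / n)
    return a + b, (a - b) * phase_factor


def fft_dif(x):
    """Recursive radix-2 decimation-in-frequency FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    sums, diffs = [], []
    for k in range(half):
        s, d = butterfly(x[k], x[k + half], k, n)
        sums.append(s)
        diffs.append(d)
    out = [0j] * n
    out[0::2] = fft_dif(sums)    # even-indexed frequency bins
    out[1::2] = fft_dif(diffs)   # odd-indexed frequency bins
    return out


print([complex(round(v.real, 6), round(v.imag, 6)) for v in fft_dif([1, 0, 0, 0, 0, 0, 0, 0])])
```

A hardware implementation reuses one butterfly (or an array of them) over log2(N) passes instead of recursing, but the arithmetic per butterfly is the same.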

52
Virtex-4 DSP Slice
  • DSP slice key for implementing high-performance
    arithmetic
  • Embedded 18x18 MPY and 48b adder
  • Butterfly phase rotator
  • Cross-addition

53
Butterfly CMPLX MPY
$P_r + jP_i = (A_r + jA_i) \times (B_r + jB_i)$
  • Complex MPY used in FFT butterfly
  • Optimized to employ Virtex-4 DSP Slice
  • 4 and 3 MPY option
  • Complex MPY available as IP module

[Figure: complex multiplier mapped onto DSP Slices 1 to 4, with inputs Ar, Ai, Br, Bi and outputs Pr, Pi]
Available: 6.2i IP Update 2
54
Performance/Parallelism/Area
  • FPGA highly parallel computing machine
  • Achieve performance using functional unit
    parallelism
  • Area/throughput tradeoff delivered via Xilinx IP
    library
  • Butterfly array to produce high-performance FFT
    processor
  • High computation rate using (possibly) hundreds
    of DSP slices
  • Allocate resources as appropriate to meet system
    requirements
  • Large memory bandwidth using multi-port memory
    constructed from BRAMs

Memory read bandwidth = 320 x 36 x 500e6 = 5.76 Tbps
55
FFT Architecture
  • For small number of carriers and modest data
    rates single butterfly (I)FFT is probably
    suitable - Small FPGA footprint

[Figure: single-butterfly FFT architecture. Input data and output data pass through switches to Data RAM 0 and Data RAM 1; an iteration engine with a phase factor ROM operates on the RAMs]
56
Block boundary detection/Fine timing acquisition
F. Tufvesson, O. Edfors, M. Faulkner, Time and
Frequency Synchronization for OFDM using
PN-Sequence Preambles, VTC-1999/Fall, vol 4,
pp.2203-7, New Jersey, 1999.
57
Fine-timing acquisition using a clipped correlator
  • Bank of 10 time-multiplexed correlators
  • Full-precision correlators: 32 embedded
    multipliers, 896 flip-flops
  • Each 1-bit correlator: 10 slices
  • Total for clipped correlator: 589 slices
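The idea behind a clipped (1-bit) correlator can be sketched as follows: only the signs of I and Q are kept, so every multiply reduces to a sign comparison and the correlator needs no embedded multipliers. This is an illustrative model and not the design on the slide; the PN generator, preamble length, payload and offset are all assumptions.

```python
def pn_bits(count, state=0b1011011):
    """Illustrative 7-bit LFSR bit generator (hypothetical PN source)."""
    bits = []
    for _ in range(count):
        bits.append(state & 1)
        feedback = (state ^ (state >> 1)) & 1
        state = (state >> 1) | (feedback << 6)
    return bits


def qpsk(bits):
    """Map bit pairs to QPSK symbols with components of +1 or -1."""
    return [complex(1 - 2 * bits[i], 1 - 2 * bits[i + 1]) for i in range(0, len(bits) - 1, 2)]


def clip(sample):
    """Keep only the sign of I and Q."""
    return (1 if sample.real >= 0 else -1, 1 if sample.imag >= 0 else -1)


def clipped_metric(rx_signs, ref_signs, offset):
    """Squared magnitude of sum(rx * conj(ref)) computed on sign bits only."""
    re = im = 0
    for m, (ref_i, ref_q) in enumerate(ref_signs):
        rx_i, rx_q = rx_signs[offset + m]
        re += rx_i * ref_i + rx_q * ref_q
        im += rx_q * ref_i - rx_i * ref_q
    return re * re + im * im


preamble = qpsk(pn_bits(64))                        # 32-symbol known preamble
payload = qpsk(pn_bits(80, state=0b0110001))        # unrelated surrounding samples
true_offset = 17
rx = payload[:true_offset] + [0.3 * s for s in preamble] + payload[true_offset:]
rx_signs = [clip(s) for s in rx]
ref_signs = [clip(s) for s in preamble]
metrics = [clipped_metric(rx_signs, ref_signs, d) for d in range(len(rx) - len(preamble) + 1)]
print("metric at the true offset:", metrics[true_offset], "largest metric:", max(metrics))
```

Because clipping discards amplitude, the metric at the aligned position reaches its maximum possible value regardless of the received signal level (here the preamble arrives scaled by 0.3).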
58
QRD
  • One of the popular methods of matrix inversion is
    based on QRD.
  • Q is Unitary and R is upper triangular
  • A unitary matrix has a trivial inverse (its conjugate transpose),
  • An upper triangular matrix can be inverted by
    back-substitution

59
Givens Rotations
  • For a 2x1 vector of real numbers
  • For an NxM matrix, repeat the process 2 cells at a
    time.
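A QR decomposition built from Givens rotations can be sketched for real matrices as follows. This is a plain software version for illustration, not the systolic implementation: it zeroes one element at a time by rotating two adjacent rows, and all function names are assumptions.

```python
import math


def givens(a, b):
    """Return (c, s) so that [[c, s], [-s, c]] applied to (a, b) gives (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0
    return a / r, b / r


def rotate_rows(m, upper, lower, c, s):
    """Apply the rotation to two rows of m in place."""
    top, bottom = m[upper], m[lower]
    m[upper] = [c * t + s * b for t, b in zip(top, bottom)]
    m[lower] = [-s * t + c * b for t, b in zip(top, bottom)]


def qr_givens(matrix):
    """Return (Q, R) with Q orthogonal and R upper triangular, matrix = Q R."""
    rows, cols = len(matrix), len(matrix[0])
    r = [[float(v) for v in row] for row in matrix]
    q_t = [[1.0 if i == j else 0.0 for j in range(rows)] for i in range(rows)]
    for col in range(cols):
        for row in range(rows - 1, col, -1):      # zero the column from the bottom up
            c, s = givens(r[row - 1][col], r[row][col])
            rotate_rows(r, row - 1, row, c, s)
            rotate_rows(q_t, row - 1, row, c, s)  # accumulate Q transposed
    q = [list(column) for column in zip(*q_t)]
    return q, r


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


a = [[2.0, -1.0, 3.0], [4.0, 0.5, 1.0], [-2.0, 5.0, 2.0]]
q, r = qr_givens(a)
for row in r:
    print([round(v, 4) for v in row])
```

Once R is available, the inverse follows from the transpose of Q and back-substitution on R, as described on the QRD slide.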

60
Systolic Arrays
  • Structured arrays with identical cells. Usually a
    boundary cell and an internal cell for the
    QRD process.
  • The boundary cell generates the rotations.
  • Internal cell applies the rotations to all the
    cells in the row.
  • The systolic array in this figure can handle any
    matrix up to 3x3.

[Figure: systolic array of boundary cells and internal cells]
61
Triangularization mode
  • For QRD of up to a 3x3 matrix we need 3 boundary
    cells and 3 internal cells.
  • Boundary cells calculate rotation vectors and
    internal cells store them.
  • Data is fed column-wise into the systolic array.
  • This may have to be staggered depending on the
    pipelining delays thru the boundary cell and
    internal cell.

The rotation factors for zeroing out cell A(2,1)
are stored in cell A(1,2), etc.
62
Q-matrix computation mode
first column of Q matrix
second column of Q matrix
third column of Q matrix
63
Agenda
  • Introduction to Wireless communications
  • Systems design and considerations
  • The wireless environment
  • Link budget
  • MIMO and OFDM Systems
  • High level view of wireless communication systems
  • Mobile WiMax, an example of wireless comm system,
  • Hardware/software partitioning
  • PHY/MAC etc.
  • The Platform FPGA
  • Overview of FPGAs and FPGA tools
  • Building DSP sub-systems on FPGAs
  • Digital baseband
  • FPGA tools and design methodology

64
FPGA Tools for DSP Systems Design
  • Higher level tools are raising the level of
    abstraction.
  • Allows non-hardware engineers (algorithm
    designers) to get a first look at hardware.
  • System Generator
  • Simulink to Hardware
  • C-to-Gates tools
  • C or higher level languages to gates

65
System Generator
System Level Modeling Simulation Framework
Work in the language of your problem
66
System Generator Flow
[Figure: DSP development flow. 1. Develop algorithm and system model (Simulink MDL); 2. Automatic code generation; HDL simulation flow; bitstream; download to FPGA]
67
Configurable MIMO-OFDM Transmitter
  • Time-shared FFT across antennas
  • Add cyclic extension/block shaping
  • Pilot insertion and data loading
  • Spatial demultiplexing and interpolation
  • Packetization and configurable STBC encoding
  • Packet controller
  • Resource sharing (folding factor): ratio of system
    clock rate to symbol rate > 8 needed for a 4
    transmit antenna system
68
MIMO Receiver Architecture
Samples processed at sample clock rate
Samples processed at system clock rate
69
Fine-timing acquisition using a clipped correlator
  • Bank of 10 time-multiplexed correlators
  • Full-precision correlators: 32 embedded
    multipliers, 896 flip-flops
  • Each 1-bit correlator: 10 slices
  • Total for clipped correlator: 589 slices
70
MIMO-OFDM Receiver
Packet Detection
Fine Timing Acq
Carrier Frequency Offset Correction
Output FIFO
Cyclic prefix removal
FFT
Channel Estimation
Weight Matrix Computation
MIMO Decoder
71
Channel Estimation
Channel Estimation Pilots for Tx4
Channel Estimation Pilots for Tx1
4x4 Channel Estimation Memory
Input FIFO
Control Signals
72
Packet Detection
Schmidl and Cox algorithm for Packet Detection
and coarse carrier frequency offset
estimation. T. M. Schmidl, D. C. Cox, Low
Overhead Low Complexity Synchronization for
OFDM, ICC 1996, Vol 3, pp 1301-1306.
73
Two-Branch CFO Estimation Using the Schmidl and Cox
Algorithm
Carrier Frequency Offset causes a linearly
increasing rotation in the time domain
Combine the metric from both Antennas
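The two-branch estimate can be sketched as follows, assuming a preamble made of two identical halves (the structure the Schmidl and Cox correlation relies on), flat noiseless channels, and a known sample rate. The preamble contents, channel gains, offset value and function names are all illustrative.

```python
import cmath
import math


def rotate(samples, freq_hz, sample_rate_hz):
    """Multiply by exp(j*2*pi*f*n/fs); a negative f undoes a positive offset."""
    return [s * cmath.exp(2j * math.pi * freq_hz * n / sample_rate_hz)
            for n, s in enumerate(samples)]


def correlation_metric(rx, half_len):
    """Correlate the first half of the preamble with the second half."""
    return sum(rx[m].conjugate() * rx[m + half_len] for m in range(half_len))


def estimate_cfo_hz(branches, half_len, sample_rate_hz):
    """Add the metrics of all receive branches, then convert the angle to Hz."""
    combined = sum(correlation_metric(rx, half_len) for rx in branches)
    return cmath.phase(combined) * sample_rate_hz / (2 * math.pi * half_len)


fs = 1e6
half_len = 32
half = [cmath.exp(1j * math.pi * (0.25 + 0.5 * ((7 * n * n + 3 * n) % 4))) for n in range(half_len)]
preamble = half + half
gains = (0.9 + 0.2j, -0.3 + 0.5j)                       # one flat channel gain per antenna
branches = [rotate([g * s for s in preamble], 1500.0, fs) for g in gains]

estimate = estimate_cfo_hz(branches, half_len, fs)
corrected = rotate(branches[0], -estimate, fs)          # same operation a DDS-based corrector performs
print(f"estimated CFO: {estimate:.3f} Hz")
```

Summing the complex metrics before taking the angle weights each antenna by its received power. The estimate is unambiguous only while the offset stays below fs / (2 * half_len) in magnitude.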
74
Carrier Frequency Offset Estimation
  • Pre-FFT
  • Uses a dedicated preamble or symbol for CFO
    estimation
  • Post-FFT using channel estimation pilots
  • Uses channel estimation training symbols
  • Post-FFT CFO Tracking
  • Needs continuous pilots during payload symbols
  • CFO Estimation using Cyclic Prefix
  • Works well when you have a lengthy cyclic prefix
  • Examples WiMax, 3GPP-LTE, DVB-T/H
  • Does not need preamble or pilot support

75
Pre-FFT Carrier Frequency Offset Estimation
The angle of the correlation metric is
proportional to the Carrier frequency
offset. Right size the number of bits before the
CORDIC operation. CORDIC ATAN from the Xilinx
Math library calculates the angle.
76
Post-FFT CFO Estimation and tracking
CFO causes a linear rotation every sample in the
time domain. CFO causes a constant rotation on
all subcarriers in the frequency domain. This
rotation increases from OFDM symbol to symbol and
can be used to estimate CFO.
77
Carrier Frequency Offset Correction
Direct digital synthesizer (DDS) from the Xilinx
DSP SysGen library.
78
Design methodology issues
  • FPGA tools
  • Where to from here?
  • C-to-gates
  • Higher level design languages to gates
  • Raising the level of abstraction

79
End of Roadmap for the Von Neumann Model
1945-2005: Sequential programming
2005-????: Concurrent programming
6x6 GALS Processor Array
80
Merging Mindsets: Software Design vs. Hardware
Design
Encapsulation, abstraction, portability, re-use
Implementation detail, control logic, interface
glue, concurrency, communication, architecture
  • Events, protocols
  • Ordering
  • Sequential execution

Clocks, signals, timing
Combining the strengths of both paradigms can
bring about a radical improvement in
hardware/software system design productivity.
81
Objective for a New Methodology: reduce design
cost (by a lot)
  • Quality of result (QoR) is not a design goal!
  • Performance, power, BOM cost budgets make QoR a
    design constraint
  • The real objective is to meet the QoR target and
    minimize
  • Non-recurring engineering costs (NRE)
  • Time-to-market (TTM)
  • The new methodology should save on design cost by
    enabling
  • Design of portable, retargetable, composable IP
    blocks
  • Rapid design space exploration and system
    composition

[Figure: QoR (performance, performance/W) versus total design cost (NRE, TTM), comparing the traditional HDL flow with the new methodology; annotated with abstraction, profit, and cost]
82
C or higher level language to Gates
  • There is interest in higher level design
    methodologies, such as C-to-Gates from the design
    community.
  • ESL (Electronic system level) tools/design
    methodologies are being explored.
  • But, extracting all the concurrency from a
    sequential description is not an easy problem.

83
Actor/Dataflow Programming Model
  • A well-known and researched model for concurrent
    systems
  • Edward Lee et al. (UC Berkeley)
  • Arvind et al. (MIT)
  • Broadly applicable to heterogeneous HW/SW systems
  • Actors are described in the CAL language (UC
    Berkeley)
  • Open source simulator available from SourceForge
  • Under consideration as reference model for MPEG

84
Conclusion
  • FPGAs are finding wide use in infrastructure
    communication systems and signal processing
    systems.
  • FPGAs are an efficient choice for exploring VLSI
    architectures.
  • FPGA tools are raising the level of abstraction
    to allow algorithm designers the ability to
    explore h/w architectures without learning h/w
    design tools/languages.

85
Questions?