Title: Non-Linear Codes for Asymmetric Channels, applied to Optical Channels
1Non-Linear Codes for Asymmetric Channels, applied
to Optical Channels
2Outline
- Motivation: Optical Channel, Uncoordinated Multiple Access
- Models and Capacity Calculation
- Basic Model: the OR Channel
- Treating other users as noise
- Capacity loss vs. complexity reduction.
- The Z channel
- The need for non-linear codes
- Optimal ones density
- Non-linear Trellis Coded Modulation (NL-TCM)
- Definition of distance in the Z-Channel
- Characteristics of Trellis Codes
- Design Technique
- Results for 6-user OR-MAC and 100-user OR-MAC
- Concatenation with High-Rate Block Codes
- Results for 6-user OR-MAC
- Conclusions
- Future Work
3Motivation: Optical Channels, Multiple Access
- Optical Channels
- Provide very high data rates, from tens to hundreds of gigabits per second.
- Typically deliver a very low Bit Error Rate (BER).
- Wavelength Division (WDMA) or Time Division (TDMA) are the most common forms of Multiple Access today.
- However, they require considerable coordination.
- Objective
- Uncoordinated access to the channel.
- Apply error correcting codes, in order to achieve the required BER.
- Maximize the rate at feasible complexity for optical speeds.
4Basic Model: The OR Multiple Access Channel (OR-MAC)
- OR Channel model
- Basic model that can describe the multiple-user optical channel with non-coherent combining
- N users transmitting at the same time
- If all users transmit a 0, then a 0 is received
- If even one of them transmits a 1, a 1 is received
- 0 + X = X, 1 + X = 1
[Figure: OR channel diagram with the receiver]
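The OR rule above can be sketched in a few lines of Python. This is only an illustration of the model, not code from the original work; the function names `or_mac` and `simulate_slot` and the example parameters are assumptions made here.

```python
import random

def or_mac(bits):
    """OR-channel output for one slot: 1 if any user sent a 1, else 0."""
    return int(any(bits))

def simulate_slot(num_users, p, rng):
    """Draw one bit per user with ones density p and combine them (hypothetical helper)."""
    bits = [int(rng.random() < p) for _ in range(num_users)]
    return bits, or_mac(bits)

if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed so the run is repeatable
    bits, y = simulate_slot(6, 0.125, rng)
    print(bits, "->", y)
```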
5OR Channel: Theoretical characteristics
- Achievable rate (Capacity)
- The theoretical limits for the MAC were given by Liao and Ahlswede.
- In the case of the OR-MAC, the theoretical capacity region is the triangle of all rate-pairs whose sum does not exceed the maximum possible sum-rate, which is 1.
- This sum-rate can be theoretically achieved by
- Joint Decoding.
- Successive decoding (requires coordination).
- Time-Sharing or Wavelength-sharing (requires coordination).
6Treating other users as noise: the Z-Channel
- Joint Decoding and Successive Decoding are fully efficient in that one useful bit of information is transmitted per time-wavelength slot.
- However, none of these are computationally feasible for optical speeds today.
- A practical alternative is to treat all but a desired user as noise.
- This alternative, while dramatically reducing the decoding complexity, loses up to 30% of full capacity, as we will see next.
- When treating other users as noise in an OR-MAC, each user sees what is called the Z-Channel.
- My research has been focused on the Z-Channel resulting from the OR-MAC when treating other users as noise.
7The Z-Channel
- N users, all transmitting with the same ones density p: P(X=1) = p, P(X=0) = 1-p.
- Focus on a desired user
- If it transmits a 1, a 1 will be received.
- If it transmits a 0, a 0 will be received only if all other N-1 users transmit a 0
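As a small sketch of the description above: a transmitted 0 is flipped to a 1 exactly when at least one of the other N-1 users sends a 1, which happens with probability 1 - (1-p)^(N-1). The symbol `alpha` and the function names are assumptions introduced for this illustration.

```python
import random

def z_crossover(num_users, p):
    """P(receive 1 | desired user sent 0) = 1 - (1 - p)^(N - 1)."""
    return 1.0 - (1.0 - p) ** (num_users - 1)

def z_channel(bit, alpha, rng):
    """Z-channel: a 1 is always received as 1; a 0 becomes 1 with probability alpha."""
    if bit == 1:
        return 1
    return int(rng.random() < alpha)

if __name__ == "__main__":
    alpha = z_crossover(6, 0.125)   # 6 users, ones density 1/8
    print("0 -> 1 crossover probability:", alpha)
```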
8Maximum achievable sum-rate when treating other users as noise
- Information Theory tells us the optimal ones density to transmit for each user.
- When the number of users N tends to infinity, the optimal ones density tends to $1 - (1/2)^{1/N}$, which is also the optimal density for joint decoding.
- In that case, 1 and 0 are perceived with equal probability at the receiver.
- Note that for a large number of users, the optimal ones density becomes very small.
- Surprisingly, the maximum achievable sum-rate is always lower-bounded by ln(2) ≈ 0.6931 and tends to ln(2) when the number of users tends to infinity.
9Comparison of capacities
Optimal ones densities
Users   Joint decoding   Treating others as noise
2       0.293            0.286
6       0.109            0.108
12      0.056            0.056
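The table entries can be checked numerically with a short sketch. It assumes the joint-decoding density is the one that makes the receiver see 0 and 1 equally often, (1-p)^N = 1/2, and that the per-user rate when treating others as noise is the Z-channel mutual information I(X;Y) = h((1-p)^N) - (1-p) h(1 - (1-p)^(N-1)). The function names and the grid search are choices made here, not the author's method.

```python
import math

def h2(x):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def joint_optimal_density(n_users):
    """Density that makes P(Y=0) = (1-p)^N equal to 1/2."""
    return 1.0 - 0.5 ** (1.0 / n_users)

def per_user_rate(p, n_users):
    """I(X;Y) for one user when the other N-1 users are treated as noise."""
    p_y0 = (1.0 - p) ** n_users                 # all N users send 0
    alpha = 1.0 - (1.0 - p) ** (n_users - 1)    # 0 -> 1 crossover probability
    return h2(p_y0) - (1.0 - p) * h2(alpha)

def noise_optimal_density(n_users, steps=20000):
    """Grid search over p in (0, 0.5) for the density maximizing per_user_rate."""
    grid = (0.5 * k / steps for k in range(1, steps))
    return max(grid, key=lambda p: per_user_rate(p, n_users))

if __name__ == "__main__":
    for n in (2, 6, 12):
        p_noise = noise_optimal_density(n)
        print(n, round(joint_optimal_density(n), 3), round(p_noise, 3),
              round(n * per_user_rate(p_noise, n), 4))
```

The last printed column is the sum-rate N times the per-user rate, which can be compared against the ln(2) lower bound.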
10The need for non-linear codes
- Linear codes provide equal density of ones and zeros in their output (p = 0.5).
- Most of the codes thoroughly studied in the literature are linear codes.
- We observed in the previous slide that, for linear codes, the achievable rate tends to zero as the number of users increases.
- As the number of users increases, the optimal ones density tends to zero.
- Non-linear codes with a relatively low density of ones are required to achieve a good rate.
- Only recently has there been work on LDPC codes with arbitrary density of ones. Theoretical bounds are found to prove that these codes are capacity achieving under ML decoding. There is no design technique described for these codes.
- Non-linear Trellis Coded Modulation
- This work introduces a novel design technique for non-linear trellis codes with an arbitrary density of ones.
- To my knowledge, it is the first work that addresses this task.
11Interleaver Division Multiple Access
- One successful approach to uncoordinated multiple access is IDMA.
- Every user has the same channel code, but each user's code bits are interleaved by a randomly drawn interleaver, with very high probability of being unique.
- The receiver is assumed to know the interleaver of the desired user.
- With IDMA in the OR-MAC, a receiver should see the signal from a desired user, corrupted by a memoryless Z-Channel.
- The performance obtained for a 6-user OR-MAC using IDMA and for the corresponding Z-Channel was the same in my simulations.
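The per-user interleaving step can be sketched as below. This is a minimal illustration, assuming each user derives its permutation from a private seed; the seed-based construction and the function names are hypothetical and not taken from the original work.

```python
import random

def make_interleaver(length, seed):
    """Draw a random permutation of range(length) from a per-user seed (assumed scheme)."""
    perm = list(range(length))
    random.Random(seed).shuffle(perm)
    return perm

def interleave(bits, perm):
    """Output position k carries input bit perm[k]."""
    return [bits[i] for i in perm]

def deinterleave(bits, perm):
    """Invert interleave(): put each received bit back at its source index."""
    out = [0] * len(perm)
    for pos, src in enumerate(perm):
        out[src] = bits[pos]
    return out

if __name__ == "__main__":
    code_bits = [1, 0, 0, 0, 1, 0, 0, 0]
    perm = make_interleaver(len(code_bits), seed=42)
    assert deinterleave(interleave(code_bits, perm), perm) == code_bits
```

A receiver that knows the desired user's permutation undoes it before decoding, so the other users' interleaved bits look like memoryless noise.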
12Trellis Codes: Characteristics
- Memory given by a state. In the trellis representation, for each state and each possible input, an output value and the next state are given.
- Generally, the next state and output are given by generator polynomials.
- Initial state: the all-zero state.
- Zero Termination.
- They are NOT capacity achieving
- We are achieving around 30% of full capacity (around 43% of the achievable rate when treating other users as noise)
- Low complexity compared to capacity-achieving codes (Turbo-Codes, LDPC)
- ML decoding: Viterbi Decoding
[Figure: one trellis section, labeled "State at time t" (1100) and "State at time t+1" (0010)]
13Metric for the Z-Channel, for Maximum Likelihood decoding
- Given a received word r, the decoded codeword c will be the one that maximizes $P(c \mid r)$ or, given equally likely codewords, $P(r \mid c)$.
- For the Z-Channel, if a codeword c has a 1 in a position where the received word r has a 0, then $P(r \mid c) = 0$.
- Among the possible transmitted codewords (where there are no 1-to-0 transitions), $P(r \mid c) = \alpha^{N_{01}} (1-\alpha)^{N_{00}}$,
- where $N_{01}$ is the number of 0-to-1 transitions, $N_{00}$ is the number of 0-to-0 transitions, and $\alpha$ is the 0-to-1 crossover probability.
- Note that $N_{00}$ for the possible transmitted codewords is actually the number of zeros in the received word, which is the same for all possible codewords.
- Now, $\alpha < 1$, so the most likely codeword is the one that presents the smallest number of 0-to-1 transitions.
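The decoding rule on this slide can be illustrated with an exhaustive search over a small codebook. This is only a sketch of the rule (discard codewords that would need a 1-to-0 transition, then minimize the number of 0-to-1 transitions); a real decoder would use the Viterbi algorithm, and the toy codebook and function names are assumptions.

```python
def count_transitions(codeword, received):
    """Return (n01, n10): positions with codeword 0 / received 1, and codeword 1 / received 0."""
    n01 = sum(1 for c, r in zip(codeword, received) if c == 0 and r == 1)
    n10 = sum(1 for c, r in zip(codeword, received) if c == 1 and r == 0)
    return n01, n10

def ml_decode(codebook, received):
    """Pick the feasible codeword (no 1->0 transition) with the fewest 0->1 transitions."""
    best, best_n01 = None, None
    for cw in codebook:
        n01, n10 = count_transitions(cw, received)
        if n10 > 0:
            continue                      # impossible on the Z-channel
        if best_n01 is None or n01 < best_n01:
            best, best_n01 = cw, n01
    return best                           # None if no codeword is feasible

if __name__ == "__main__":
    toy_codebook = [(1, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1)]
    print(ml_decode(toy_codebook, (1, 1, 0, 1)))
```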
14Definition of distance for the Z-Channel
- The distance between two codewords measures the likelihood that one transmitted codeword will be wrongly decoded as the other codeword.
- In the Z-Channel, a transmitted 1 will always induce a received 1.
- Define the directional Hamming distance $d_D(c_1, c_2)$ as the number of ones that have to be added to codeword $c_1$ so that all ones of codeword $c_2$ are present in the received word.
- Example
- Now
- A Maximum-Likelihood (ML) decoder will always decode the codeword with larger Hamming weight.
15Definition of distance for the Z-Channel (2)
- For two codewords with different Hamming weight, if the received word contains all ones from both codewords, the one with larger Hamming weight, $c_1$, will be more likely than the codeword with smaller Hamming weight, $c_2$.
- Only if $c_2$ is transmitted will an error be produced in the decoder.
- Then, the directional distance of interest is $d_D(c_2, c_1)$,
- which is the larger of both directional Hamming distances.
- For two codewords with equal Hamming weight, errors can be made in both directions, and both directional Hamming distances are equal, and equal to the maximum of both.
- In any case, the proper pairwise design metric is $d(c_1, c_2) = \max\{d_D(c_1, c_2), d_D(c_2, c_1)\}$.
- And the overall objective is to maximize $d_{\min} = \min_{c_1 \neq c_2} d(c_1, c_2)$.
16Greedy definition of distance
- In a trellis code, the design is made branch-wise: for each state and each input, we assign the next state and the output.
- Because the code is non-linear, the previous definition cannot be applied branch-wise.
- It is impossible to tell from one branch which codeword will have the larger Hamming weight.
- Hence, we have to consider both branch-wise directional Hamming distances.
- The safest branch-wise metric would be the smaller of the two, $\min\{d_D(b_1, b_2), d_D(b_2, b_1)\}$, for branch outputs $b_1$ and $b_2$.
- This is the definition used in our design of NL-TCM.
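The distance definitions from the last three slides can be sketched as follows. The directional distance counts the ones that must be added to the first word to cover the second; the pairwise design metric takes the larger direction. Taking the smaller direction as the "safest" branch-wise metric is an assumption made here, since the formula itself is missing from the slide text, and all function names are hypothetical.

```python
def directional_distance(c1, c2):
    """Ones that must be added to c1 so that every one of c2 is covered."""
    return sum(1 for a, b in zip(c1, c2) if a == 0 and b == 1)

def pairwise_metric(c1, c2):
    """Codeword-level metric: the larger of the two directional distances."""
    return max(directional_distance(c1, c2), directional_distance(c2, c1))

def greedy_branch_metric(b1, b2):
    """Branch-level metric (assumed): the smaller, worst-case direction."""
    return min(directional_distance(b1, b2), directional_distance(b2, b1))

if __name__ == "__main__":
    c1 = (1, 1, 1, 0, 0)
    c2 = (0, 1, 0, 1, 0)
    print(directional_distance(c1, c2), directional_distance(c2, c1))
    print(pairwise_metric(c1, c2), greedy_branch_metric(c1, c2))
```

For two words of equal Hamming weight both directions agree, so the two metrics coincide.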
17Non-linear Trellis Coded Modulation
- Desired density of ones p is given
- Rate of the form 1/n (1 input bit, n output bits).
- S = 2^v states (represented by v bits)
- 2S branches
- Feed-forward encoder with 1 input
- Design
- Assign output values to the 2S branches of the trellis
- Objective: maximize the minimum distance (greedy definition)
- Those outputs have to maintain the desired density of ones p.
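A feed-forward rate-1/n encoder of this kind can be sketched as a shift register plus a lookup table of branch outputs. The state-update convention (shift left, append the input bit), the table layout `outputs[(state, bit)]`, and the toy table in the usage lines are assumptions for illustration; the actual branch outputs are what the design procedure has to choose.

```python
def next_state(state, bit, v):
    """Shift-register update: the state holds the last v input bits."""
    return ((state << 1) | bit) & ((1 << v) - 1)

def encode(bits, v, outputs):
    """Encode with a lookup table: outputs[(state, bit)] is the n-bit branch output."""
    state = 0                               # start in the all-zero state
    coded = []
    for bit in list(bits) + [0] * v:        # v trailing zeros terminate the trellis
        coded.extend(outputs[(state, bit)])
        state = next_state(state, bit, v)
    return coded, state

if __name__ == "__main__":
    v = 2
    # Toy table with n = 3, purely illustrative (not a designed NL-TCM).
    toy = {(s, b): ((s >> 1) & 1, s & 1, b) for s in range(1 << v) for b in (0, 1)}
    coded, final_state = encode([1, 0, 1], v, toy)
    print(coded, final_state)
```

With this convention, states s and s + S/2 reach the same next state for the same input bit, which is the four-branch sub-graph structure used when assigning Hamming weights.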
18Assigning Hamming Weights
- First step: assign Hamming weights to the output of each branch.
- Using any of the definitions of distance given before, codewords with Hamming weights as equal as possible lead to better performance.
- In the case of codewords with different Hamming weights, the worst-case performance will be driven by those codewords with smaller Hamming weight.
- Criterion: assign Hamming weights to the branches that are as similar as possible, while keeping the density of ones as close to the desired p as possible.
19Assigning Hamming Weights
- Consider the following sub-graph
- There are S/2 of these sub-graphs.
- Branches produced by an input bit equal to 0 (or 1) from both states go to the same state.
- Define $w = \lfloor np \rfloor$ and $i = \mathrm{round}(4(np - w))$.
- In this subgroup of four branches, assign a Hamming weight of w+1 to i branches, and a Hamming weight of w to (4-i) branches.
20Assigning Hamming Weights, Examples
- 6-user OR-MAC, desired density of ones is p = 1/8.
- n = 20: w = 2, i = 2
- 2 branches with Hw = 2, 2 with Hw = 3 (p = 1/8).
- n = 18: w = 2, i = 1
- 3 branches with Hw = 2, 1 with Hw = 3 (p = 1/8).
- n = 17: w = 2, i = round(0.5)
- 1 branch with Hw = 3 and 3 with Hw = 2 (p ≈ 0.132), or
- all with Hw = 2 (p = 2/17 ≈ 0.118).
- 100-user OR-MAC,
- n = 400: w = 2, i = 3 (p = 0.006875)
- n = 360: w = 2, i = 2 (p = 0.006944)
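The examples above are consistent with choosing w as the integer part of n·p and i as four times the remaining fraction, rounded. The sketch below assumes exactly that rule (it is inferred from the listed examples, not quoted from the slides), and the function names are hypothetical.

```python
import math

def assign_weights(n, p):
    """Return (w, i): i of every 4 branches get weight w+1, the other 4-i get weight w."""
    w = math.floor(n * p)
    i = round(4 * (n * p - w))
    return w, i

def achieved_density(n, w, i):
    """Ones density of a 4-branch group with i branches of weight w+1 and 4-i of weight w."""
    return (4 * w + i) / (4 * n)

if __name__ == "__main__":
    for n in (20, 18, 17):
        w, i = assign_weights(n, 0.125)
        print(n, w, i, achieved_density(n, w, i))
```

For n = 17 the fraction is exactly 0.5, a tie: Python's `round` resolves it to i = 0 (all branches of weight 2), while rounding up gives the other option listed on the slide.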
21Choosing all branches to have at least distance of 1 between each other
- It would be desirable, if possible, that all branches had a distance of at least 1 between each other.
- In the case where all branches have the same Hamming weight w, we can have up to $\binom{n}{w}$ different branches with a distance of at least 1.
- If $\binom{n}{w} \ge 2S$, it is possible.
- In the case of branches with different Hamming weights, the computation is a little more complicated.
- Two different codewords split at some point in their trellis paths, and their paths will not merge again until at least v+1 trellis sections after the split.
- In case it is possible to have different output values for all branches, then the minimum distance of the code is lower-bounded by $d_{\min} \ge v + 1$.
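For the equal-weight case, the number of available outputs is the number of n-bit words of weight w, and any two distinct words of the same weight are at directional distance at least 1 in both directions. A short sketch of that count and a brute-force check follows; the function names and the state counts in the usage lines are assumptions chosen for illustration.

```python
import math
from itertools import combinations

def max_distinct_outputs(n, w):
    """Number of distinct n-bit words with Hamming weight w."""
    return math.comb(n, w)

def enough_outputs(n, w, num_states):
    """True if every one of the 2S branches can get its own weight-w output."""
    return max_distinct_outputs(n, w) >= 2 * num_states

def all_pairs_separated(n, w):
    """Brute force: every ordered pair of distinct weight-w supports differs by >= 1 one."""
    words = [frozenset(c) for c in combinations(range(n), w)]
    return all(len(b - a) >= 1 for a in words for b in words if a != b)

if __name__ == "__main__":
    print(max_distinct_outputs(20, 2))          # weight-2 words of length 20
    print(enough_outputs(20, 2, num_states=64)) # 64 states is a hypothetical choice
    print(all_pairs_separated(6, 2))
```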
22Choosing all branches to have at least distance of 1 between each other
- In case we need to repeat output values, we can allow the following branches to have the same output value while maintaining the bound.
- Again consider the sub-graph
- Branches in red (blue) can have the same output value without affecting the minimum distance.
- Consider two different paths, one traversing branch A and the other traversing branch B at some trellis section.
- They traverse at least v branches with different output values before that trellis section.
- They traverse at least v branches with different output values after that trellis section.
- Their distance is at least 2v.
23Ungerboeck's rule
- We have already assigned Hamming weights to the branches, and have enumerated all the possible output values in order to have different output values for all branches (allowing some branches to be equal according to the previous slide).
- Up to this point we have
- We can further increase the minimum distance by applying Ungerboeck's rule: maximize the distance between all splits and merges.
- Remember that all output values had at least a Hamming distance of w.
- For every two different codewords, their paths split and merge at least once, and there are at least v-1 branches between the split and the merge.
- Hence
24Extending Ungerboeck's rule
- One can extend Ungerboeck's rule into the trellis.
[Figure: trellis split with branches labeled 0 and 1; label: Maximize split]
25Extending Ungerboeck's rule
- One can extend Ungerboeck's rule into the trellis.
[Figure: trellis sections following a split, branches labeled 0 and 1; label: Maximize]
26Extending Ungerboeck's rule
- One can extend Ungerboeck's rule into the trellis. Note that by maximizing the distance between the 8 branches coming from a split 2 trellis sections before, we are also maximizing the distance within all groups of 4 branches coming from a split in the previous trellis section, and within all splits.
[Figure: trellis sections following a split, branches labeled 0 and 1; label: Maximize]
27Extending Ungerboeck's rule
- The same idea applies for the merges, moving backwards in the trellis.
- If we move h trellis sections forward from a split (including the split), and g sections backwards from a merge (including the merge), the new bound becomes
- Now, we have to compute the maximum possible values of h and g.
28Extending Ungerboeck's rule
- First, let's compute the number of branches that need to have maximum distance between each other to cover h sections from a split, and g sections backwards from a merge.
- From a splitting point of view: $2^h$ branches
- From the merging point of view: $2^g$ branches
- Each branch belongs to one group of $2^h$ and one group of $2^g$
- Thus, each branch has to have maximum distance with other branches.
29Extending Ungerboeck's rule
- Second, we can compute how many branches of maximum distance between each other we can have. Let's denote this number T.
- For branches with equal Hamming weight w,
- In the general case
- The constraints are
-
- Each branch has to belong to a group of $2^h$ and a group of $2^g$
- If we choose all constraints are satisfied
30Designing for a very low desired ones density
- For a low enough desired ones density, all the branches can be chosen to have maximum distance.
- The design becomes straightforward.
- Consider a NL-TCM code with S states and desired density p.
- Denote by M the sum of all the ones from the outputs of all 2S branches.
- Then $M = 2Snp$.
- But if $p \le 1/(2S)$ then $M \le n$.
- It is possible to choose all 2S branches so that there is at most 1 branch that has a 1 in a given position.
- Straightforward design
- Assign Hamming weights to branches
- For each branch, add ones in positions that aren't used in previous branches
- Example 100-user OR-MAC,
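The straightforward design can be sketched as below: walk through the branches and give each one a fresh block of positions, so that no position carries a 1 in more than one branch. The check that the total number of ones fits in n positions, the function name, and the toy sizes are assumptions for illustration (the slide's own example parameters are not reproduced here).

```python
def disjoint_design(n, weights):
    """Give each branch an n-bit output whose ones use positions no earlier branch used."""
    total_ones = sum(weights)               # this is M, the sum of ones over all branches
    if total_ones > n:
        raise ValueError("needs M <= n so that supports can be disjoint")
    outputs, pos = [], 0
    for w in weights:
        word = [0] * n
        for k in range(pos, pos + w):       # next w unused positions
            word[k] = 1
        outputs.append(tuple(word))
        pos += w
    return outputs

if __name__ == "__main__":
    # Toy case: 8 branches (S = 4), weights 3,3,3,2 per group of four, n = 40.
    branch_outputs = disjoint_design(40, [3, 3, 3, 2] * 2)
    column_sums = [sum(col) for col in zip(*branch_outputs)]
    print(max(column_sums))                 # at most one branch has a 1 in any position
```

With disjoint supports, the directional distance from any branch to another equals the full Hamming weight of the target branch, which is the maximum possible.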
31Performance Results
- For all implementations, states were used.
- 6-user OR-MAC
- n = 20: Sum-rate = 0.30
- 2 branches with Hw = 2, 2 with Hw = 3 (p = 1/8).
- h = 3, g = 2
- n = 18: Sum-rate = 1/3
- 3 branches with Hw = 2, 1 with Hw = 3 (p = 1/8).
- h = 2, g = 2
- n = 17: Sum-rate = 0.353
- all with Hw = 2 (p = 2/17 ≈ 0.118).
- h = 2, g = 2
- 100-user OR-MAC,
- n = 400: w = 2, i = 3 (p = 0.006875)
- n = 360: w = 2, i = 2 (p = 0.006944)
- for both cases
32Performance results
- FPGA implementation
- In order to prove that NL-TCM codes are feasible today for optical speeds, a hardware simulation engine was built on the Xilinx Virtex2-Pro 2V20 FPGA.
- Results for the rate-1/20 NL-TCM code are shown next.
- Transfer Bound
- Wen-Yen Weng contributed to this work by computing a Transfer Function Bound for NL-TCM codes.
- It proved to be a very accurate bound, thus providing a fast estimation of the performance of the NL-TCM codes designed in this work.
33Performance Results 6-user OR-MAC
34Performance Results 6-user OR-MAC
35Results: observations
- An error floor can be observed for the FPGA rate-1/20 NL-TCM.
- This is mainly due to the fact that, while theoretically a 1-to-0 transition means an infinite distance, for implementation constraints those transitions are given a value of 20.
- Trace-back depth of 35.
- The BER in all cases is not as low as required.
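The finite penalty mentioned above can be illustrated with a branch metric for a Viterbi decoder. This is a sketch only: the cost of 1 per 0-to-1 transition is an assumption that follows the ML rule of minimizing such transitions, the penalty of 20 for a 1-to-0 transition is the value quoted on this slide, and the function name is hypothetical.

```python
def z_branch_metric(branch_output, received, penalty=20):
    """Cost of explaining `received` by `branch_output` on the Z-channel (lower is better)."""
    metric = 0
    for c, r in zip(branch_output, received):
        if c == 0 and r == 1:
            metric += 1            # 0 -> 1 transition: possible, counted once
        elif c == 1 and r == 0:
            metric += penalty      # 1 -> 0 transition: theoretically infinite, capped here
    return metric

if __name__ == "__main__":
    print(z_branch_metric((1, 0, 0), (1, 1, 0)))
    print(z_branch_metric((1, 1, 0), (1, 0, 0)))
```

Because the penalty is finite, a path with one impossible transition can still beat a path with more than 20 ordinary 0-to-1 transitions, which is one way an error floor can appear.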
36Performance Results: 100-user OR-MAC
Rate    Sum-rate   p          BER
1/360   0.2778     0.006944   0.49837
1/400   0.25       0.006875   0.49489
37Dramatically lowering the BER: Concatenation with Outer Block Code
- Optical systems deliver a very low BER, in our work a was required.
- Using only a NL-TCM, the rate would have to be very low.
- A better solution is found using the fact that when the Viterbi decoding fails, with relatively high probability only a small number of bits are in error.
- Thus, a high-rate block code that can correct a few errors can be attached as an outer code, dramatically lowering the BER.
[Figure: Block-Code Encoder -> NL-TCM Encoder -> Z-Channel -> NL-TCM Decoder -> Block-Code Decoder]
38Reed-Solomon + NL-TCM Results
- A concatenation of the rate-1/20 NL-TCM code with a (255 bytes, 247 bytes) Reed-Solomon code has been tested for the 6-user OR-MAC scenario.
- This RS code has 8 parity bytes and corrects up to 4 erroneous bytes per codeword.
- The resulting rate for each user is (247/255)·(1/20) ≈ 0.0484.
- The results were obtained using a C program to apply the RS code to the FPGA NL-TCM output.
- Although we don't have results for the 100-user case, it may be inferred that a similar BER would be achieved.
Rate     Sum-rate   p       BER
0.0484   0.29       0.125   0.4652
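The rate and sum-rate entries in the table follow from multiplying the outer and inner code rates and then scaling by the number of users. A minimal check, with a hypothetical helper name:

```python
from fractions import Fraction

def concatenated_rate(k_outer, n_outer, n_inner):
    """Per-user rate of an (n_outer, k_outer) outer code followed by a rate-1/n_inner code."""
    return Fraction(k_outer, n_outer) * Fraction(1, n_inner)

if __name__ == "__main__":
    rate = concatenated_rate(247, 255, 20)     # RS(255, 247) with the rate-1/20 NL-TCM
    sum_rate = 6 * rate                        # 6-user OR-MAC
    print(float(rate), float(sum_rate))
```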
39Conclusions
- I developed a novel design technique for non-linear trellis codes that provide a wide range of ones densities.
- These codes have been designed for the Z-Channel that arises in the optical multiple access channel.
- A relatively low ones density is essential for the OR-MAC channel, and asymmetric channels in general.
- An arbitrary number of users is supported, maintaining roughly the same efficiency (around 30%).
- Although these codes are not capacity achieving, a good part of the capacity is achieved, with a suitable BER for optical needs and a complexity feasible for optical speeds with today's technology. An FPGA implementation has been built to prove this fact.
40Future work (1): Capacity achieving codes
- Capacity achieving codes.
- Although they may not be feasible for optical speeds with today's technology, Turbo codes and LDPC codes will be feasible in the near future.
- Part of my immediate future work will be the design of Turbo-like codes with an arbitrary ones density.
- Most common Turbo-like codes are
- Parallel concatenation of convolutional codes
- Serially concatenated convolutional codes.
- The convolutional codes will be replaced by properly designed NL-TCMs.
41Non-linear Turbo-like codes
- Serial concatenation: CC + NL-TCM
- Parallel concatenated NL-TCMs
[Figure: serial concatenation (CC -> Interleaver -> NL-TCM) and parallel concatenation (NL-TCM alongside Interleaver -> NL-TCM)]
42Non-linear Turbo-like codes
- The NL-TCM will not be a feed-forward encoder.
- The design criteria change.
- However, the fundamental ideas hold.
- The fact that the RS + NL-TCM concatenation (hard decisions passed from one decoder to the other) has such a good BER makes the serial concatenation of CC + NL-TCM with soft decoding look promising.
43Future Work (2): More general Channel
- Also, to be more general, I will study the multiple access channel where the 1 + 1 = 0 case has a positive (although very small) probability.
- Treating other users as noise, one user sees a Binary Asymmetric Channel.
- This will change the metric in the Viterbi decoder and the definition of distance used, but shouldn't change the design criteria.