Tinoosh Mohsenin and Bevan M. Baas

1
Split-Row: A Reduced Complexity, High Throughput
Low Density Parity Check (LDPC) Decoder
Architecture
  • Tinoosh Mohsenin and Bevan M. Baas
  • VLSI Computation Lab, ECE Department
  • University of California, Davis

2
Outline
  • Introduction to LDPC Codes
  • Split-Row Decoder Algorithm
  • Error Performance Comparison
  • Decoder Implementation Results
  • Conclusion

3
Error Correction in Communication Systems
Error correction is widely used in most
communication systems.
4
LDPC Codes Applications
  • Standards
    • 10 Gigabit Ethernet (10GBASE-T), 2006
    • Digital Video Broadcasting (DVB-S2), 2005
    • Next generation of WiFi and WiMAX
  • Problems with current LDPC decoders
    • Insufficient memory bandwidth
    • High interconnect complexity

www.ieee802.org/3/an/
5
LDPC Coding
Transmitter
Noisy Channel
Encoded Image
Receiver
Decoded Image
Received Image
Iteration 1
Iteration 14
Modified images from MacKay 2001
6
LDPC Decoding Message Passing Algorithm
  • Performs row and column operations iteratively.
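The iterative row and column operations can be sketched in a few lines of Python. This is a minimal MinSum illustration and not the authors' implementation: the function name, the dictionary message layout, the sign convention (positive means bit 0), and the small 5 x 10 test matrix are all assumptions made for the example. The limit of 15 iterations matches the setting quoted elsewhere in the slides.

```python
from itertools import combinations

def minsum_decode(H, llr, max_iters=15):
    """MinSum message passing sketch. H is a 0/1 matrix (list of rows),
    llr holds channel values (positive means bit 0 is more likely).
    Every row must contain at least two ones."""
    rows = [[j for j, h in enumerate(row) if h] for row in H]
    # beta[(i, j)]: column-to-row message, initialised with the channel value.
    beta = {(i, j): llr[j] for i, cols in enumerate(rows) for j in cols}
    bits = [int(v < 0) for v in llr]
    for _ in range(max_iters):
        # Row operation: sign product and minimum magnitude over the
        # other inputs of the same row.
        alpha = {}
        for i, cols in enumerate(rows):
            for j in cols:
                others = [beta[(i, k)] for k in cols if k != j]
                sign = -1.0 if sum(m < 0 for m in others) % 2 else 1.0
                alpha[(i, j)] = sign * min(abs(m) for m in others)
        # Column operation: channel value plus incoming row messages,
        # leaving out the message on the edge being updated.
        total = list(llr)
        for (i, j), a in alpha.items():
            total[j] += a
        beta = {(i, j): total[j] - a for (i, j), a in alpha.items()}
        bits = [int(t < 0) for t in total]
        # Stop as soon as every parity check is satisfied.
        if all(sum(bits[j] for j in cols) % 2 == 0 for cols in rows):
            return bits, True
    return bits, False

# Hypothetical test matrix: one column per pair of rows, which gives
# row weight 4 and column weight 2.
pairs = list(combinations(range(5), 2))
H = [[int(i in p) for p in pairs] for i in range(5)]
llr = [2.0] * 10
llr[0] = -1.0          # one unreliable, wrong-signed channel value
decoded, converged = minsum_decode(H, llr)
```

In this toy case the all-zero code word is sent and a single channel value has the wrong sign; the row messages outweigh it and the parity checks are satisfied.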

7
Serial Decoders
  • One or a few row and column processing units.
  • Features
    • Simple
    • Small area
    • Small number of memories
  • Disadvantages
    • Low memory bandwidth
    • Low throughput: 100 Kbps to 10 Mbps

8
Full Parallel Decoders
  • Row and column processors are directly mapped
    according to the parity check matrix
  • High throughput
  • Disadvantages
    • Large circuit area
    • High interconnect complexity
  • Example: 2048-bit, 10GBASE-T
    • Row weight = 32, column weight = 6, quantization bits = 5
    • 139 mm² in 0.18 µm CMOS
    • 122,000 long inter-processor wires
    • 1.3 Gbps

9
Outline
  • Introduction to LDPC Codes
  • Split-Row Decoder Algorithm
  • Error Rate Comparison
  • Decoder Implementation Results
  • Conclusion

10
Key Features of Split-Row Decoder
  • Row processing (dominates decoder complexity)
    • Increased parallelism
    • Reduced number of memory accesses
    • Reduced processor complexity
  • Results
    • Smaller decoder area and higher utilization
    • Lower interconnect complexity
    • Higher throughput
    • Simpler hardware implementation

11
Standard vs. Split-Row Decoder
Standard Decoder
Split-Row Decoder
12
Split-Row Algorithm-Mathematical View
  • The magnitude part of the row processor output, α,
    is larger for the Split-Row decoder
  • Normalizing the α values with a scale factor
    S < 1 improves the error performance of the
    Split-Row decoder
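A small sketch can illustrate why the Split-Row magnitudes come out larger. It assumes MinSum row processing, that each half of a row takes the minimum over its own inputs only, and that the sign is still formed across the whole row. The function names, the sample messages, and the value 0.5 used for the scale factor are hypothetical.

```python
def row_minsum(msgs):
    """Standard MinSum row output for each input: sign product and
    minimum magnitude over all other inputs of the row."""
    out = []
    for j in range(len(msgs)):
        others = msgs[:j] + msgs[j + 1:]
        sign = -1.0 if sum(m < 0 for m in others) % 2 else 1.0
        out.append(sign * min(abs(m) for m in others))
    return out

def row_split(msgs, scale=1.0):
    """Split-Row sketch: minimum taken only inside the input's own half,
    sign taken across the whole row, magnitude multiplied by `scale`.
    Each half must hold at least two inputs."""
    half = len(msgs) // 2
    out = []
    for j in range(len(msgs)):
        lo, hi = (0, half) if j < half else (half, len(msgs))
        local = [abs(msgs[k]) for k in range(lo, hi) if k != j]
        others = msgs[:j] + msgs[j + 1:]
        sign = -1.0 if sum(m < 0 for m in others) % 2 else 1.0
        out.append(sign * scale * min(local))
    return out

msgs = [0.5, -2.0, 3.0, 1.5, -4.0, 2.5]   # hypothetical row of weight 6
standard = row_minsum(msgs)
split = row_split(msgs)
scaled = row_split(msgs, scale=0.5)       # hypothetical scale factor S < 1
```

A minimum over fewer values can never be smaller than the minimum over all of them, so every unscaled Split-Row magnitude is at least as large as the standard one. Multiplying by S < 1 pulls the magnitudes back down.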

13
Outline
  • Introduction to LDPC Codes
  • Split-Row Decoder Algorithm
  • Error Performance Comparison
  • Decoder Implementation Results
  • Conclusion

14
Bit Error Rate Performance Comparison
  • Code length: 1536 bits
  • Message length: 1155 bits
  • Row weight: 16
  • Column weight: 4
  • No. of iterations: 15
  • MS: MinSum
  • MS Split-Row: MinSum Split-Row
  • S: Scale factor

0.6 dB (annotation on the BER plot)
15
Bit Error Rate Performance Comparison
  • Code length: 2048 bits
  • Message length: 1723 bits
  • Row weight: 32
  • Column weight: 6
  • No. of iterations: 15
  • MS: MinSum
  • MS Split-Row: MinSum Split-Row
  • S: Scale factor

0.3 dB (annotation on the BER plot)
16
Outline
  • Introduction to LDPC Codes
  • Split-Row Decoder Algorithm
  • Error Rate Comparison
  • Decoder Implementation Results
  • Conclusion

17
A Full-Parallel Decoder Implementation
  • LDPC code example
    • Code length: 1536 bits
    • Message length: 770 bits
    • Row weight: 6
    • Column weight: 3
  • In the Split-Row decoder
    • The number of wires between the two halves is 3% of
      the total wires.
    • Row processors in each half are 2.7 times
      smaller
    • Each row processor in each half is connected to
      only 3 column processors
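A rough count shows where a figure of about 3% for the wires between the halves can come from. The counting model is an assumption and is not taken from the slides: every edge of the code carries one message in each direction, each message is 5 bits wide, and each of the 768 split rows exchanges a single sign bit in each direction between the halves. The function name is hypothetical.

```python
def split_row_wire_fraction(n_cols, col_weight, bits_per_msg, n_rows):
    """Fraction of all message wires that cross between the two halves,
    under the assumed counting model described in the text."""
    edges = n_cols * col_weight
    message_wires = edges * 2 * bits_per_msg   # row-to-column and back
    sign_wires = n_rows * 2                    # one sign bit each way per row
    return sign_wires / message_wires

fraction = split_row_wire_fraction(
    n_cols=1536, col_weight=3, bits_per_msg=5, n_rows=768
)
inputs_per_half_row = 6 // 2   # row weight 6 split over two halves
```

Under these assumptions the fraction is 1536 / 46080, a little over 3%, and each half-row processor sees 3 of the 6 inputs of its row.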

18
Full Parallel Decoder Architecture
0.18 µm CMOS technology, 6 metal layers
  • Split-Row: each half includes
    • 768 row processors
    • 768 column processors

Standard MinSum
19
Split-Row vs. Standard Decoder
(Comparison table not transcribed; surviving column units: mm², MHz, Gbps, mm)
  • 1536-bit (3,6) quasi-cyclic LDPC code
  • Messages are quantized to 5 bits each.
  • Throughput is computed assuming 15 decoding
    iterations.
  • Reported numbers are based on chip implementation
    results in 0.18 µm CMOS

20
Conclusion
  • The Split-Row decoder method provides a significant
    reduction in circuit area
  • Results in
    • Reduced wire interconnect complexity
    • Increased circuit area utilization
    • Increased speed
    • Simpler implementation
  • A good tradeoff between hardware complexity and
    error performance

21
Acknowledgments
  • Intel Corporation
  • UC Micro
  • NSF Grant No. 0430090
  • UCD Faculty Research Grant

22
Message Passing (Row processing)
23
Message Passing (Column processing)
λj is the received information.
24
?1
25
(No Transcript)
26
LDPC Codes
  • An LDPC code is defined by a binary matrix called
    the parity check matrix, H.
  • Rows define parity check equations (constraints)
    between encoded symbols in a code word, and the
    number of columns defines the length of the code.
  • V is a valid code word if H · V^T = 0
  • The decoder in the receiver checks whether the
    condition H · V^T = 0 holds.
  • Example: parity check matrix for a (9, 5) LDPC
    code, row weight = 4, column weight = 2
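The parity-check condition (H times V transposed equals zero, with arithmetic modulo 2) is easy to test in code. The matrix below is a hypothetical 5 x 10 example with row weight 4 and column weight 2, built for illustration. It is not the (9, 5) matrix shown on the slide, which did not survive in the transcript.

```python
from itertools import combinations

def is_codeword(H, v):
    """True when every parity-check equation (row of H) sums to 0 mod 2."""
    return all(sum(h * x for h, x in zip(row, v)) % 2 == 0 for row in H)

# One column per pair of rows: each column has two ones (column weight 2)
# and each row takes part in four pairs (row weight 4).
pairs = list(combinations(range(5), 2))
H = [[int(i in p) for p in pairs] for i in range(5)]

valid = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]   # columns (0,1), (0,2), (1,2) cancel
corrupted = valid.copy()
corrupted[3] ^= 1                         # flip one bit

valid_ok = is_codeword(H, valid)
corrupted_ok = is_codeword(H, corrupted)
```

The single flipped bit breaks the two parity checks that involve it, which is the situation the decoder in the receiver has to detect and correct.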

27
Row and Column Processor Architecture
28
Row/Col Procs. Right
Row/Col Procs. Left
29
(No Transcript)
30
  • Throughput = Clk × Code length / Imax
  • P = C · f · V²
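The throughput relation above (clock rate times code length, divided by the iteration count Imax) can be turned into a one-line helper. It assumes one decoding iteration per clock cycle, which is what the formula implies. The 100 MHz clock is a hypothetical value chosen for the example and is not a result from the slides; the code length of 1536 bits and the 15 iterations are the values quoted in the transcript.

```python
def throughput_bps(clock_hz, code_length_bits, iterations):
    """Decoded bits per second when each iteration takes one clock cycle."""
    return clock_hz * code_length_bits / iterations

example_bps = throughput_bps(
    clock_hz=100e6,          # hypothetical clock rate
    code_length_bits=1536,
    iterations=15,
)
example_gbps = example_bps / 1e9
```

Because the relation is linear in the clock rate, any speed-up from shorter wires carries over directly to throughput for a fixed code length and iteration count.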

31
  • What is the critical path, and how do you make sure
    that the sign is computed correctly?
  • Answer: The critical path is the sign
    computation, which depends on the other side. The
    statistical timing analysis in place and route
    reports the slowest path delay, so it ensures
    that the circuit works correctly.
  • Why does the decoder chip become smaller when
    you split it in half?
  • Answer: First, the size and total number of column
    processors don't change. The main benefit comes
    from the row processors, which shrink by more than
    a factor of two. The reason is that a row processor
    contains several stages of comparators, and
    their number drops by more than half when the
    number of inputs is halved.
  • You mentioned the design is power efficient, but
    you didn't report any power numbers.
  • Answer: For this paper we didn't get power
    numbers, but power can be estimated from the fact
    that most of the energy goes into the wires
    (P = 1/2 · C · f · V²), and we can say it scales
    down linearly, so it's about a 58% reduction.
  • Are there other works close to your design?

32
  • Which applications can tolerate this error
    performance loss?
  • This is a very broad question. It really depends on
    the power budget and how low a BER you need to
    reach.
  • What is the difference between Viterbi and LDPC
    codes?
  • What is the difference between turbo and
    LDPC codes?
  • If you don't know the answer
  • I was not involved in that part of the project, but
    from what I know ...
  • Review the previous works
  • If asked: why is the chip figure not square?
  • If somebody asks: the method you proposed didn't
    decrease the number of wires, so how can you say
    that it decreases the interconnect complexity?
  • You should notice that we are talking about long
    wires. Because when there is a large number of wires
    connecting one

33
  • Hard decision vs. soft
  • In hard-decision decoding, each received symbol is
    thresholded to yield a single received bit as
    input to the decoding algorithm, and the messages
    passed between variable and check nodes are single
    bits only. In soft-decision decoding, multiple bits
    are used to represent each received symbol and
    the messages passed between variable and check
    node
  • How did you compute
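The difference between hard and soft decisions on the received symbols can be sketched as follows. The BPSK mapping (+1 for bit 0), the clipping range of 2.0, the uniform quantizer, and the function names are assumptions made for the example. The 5-bit width matches the quantization quoted in the slides.

```python
def hard_decision(received):
    """One bit per received symbol: threshold at zero."""
    return [int(v < 0) for v in received]

def soft_decision(received, bits=5, max_abs=2.0):
    """Clip each value to [-max_abs, max_abs] and map it uniformly onto
    signed integers that fit in `bits` bits."""
    levels = 2 ** (bits - 1) - 1          # 15 for 5 bits
    quantized = []
    for v in received:
        clipped = max(-max_abs, min(max_abs, v))
        quantized.append(round(clipped / max_abs * levels))
    return quantized

received = [0.9, -0.1, 1.7, -2.4]         # hypothetical channel outputs
hard = hard_decision(received)
soft = soft_decision(received)
```

The hard decision keeps only the sign, so the barely negative value -0.1 looks as certain as -2.4. The soft values keep that reliability information for the message passing steps.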