Implementation of H.264 Based System on Multi-DSPs Board - PowerPoint PPT Presentation

About This Presentation
Title:

Implementation of H.264 Based System on Multi-DSPs Board

Description:

Implementation of H.264 Based System on Multi-DSPs Board 2008.02.13 * – PowerPoint PPT presentation

Slides: 42
Provided by: YiA8
Transcript and Presenter's Notes

Title: Implementation of H.264 Based System on Multi-DSPs Board


1
Implementation of H.264 Based System on
Multi-DSPs Board
  • ???
  • 2008.02.13

2
Outline
  • System description
  • Architecture
  • MEX Board
  • TMS320DM642
  • Communication interface
  • Software development
  • Error resilience

3
Architecture
Sender: PC 1 with MEX Board 1
Capture Frame → H.264 Encode → Send to Network
Receiver: PC 2 with MEX Board 2
Receive from Network → H.264 Decode → Display
4
MEX Board
  • The MEX board is composed of
    • 4 TMS320DM642 DSPs, with their memory, for data stream compression (video/audio)
    • 2 FPGAs for a flexible architecture
    • 8 SAA7113H video chips (ADC)
    • 4 CS4221 stereo audio chips (ADC)

5
MEX Board
Block Diagram of MEX board [1]
6
MEX Board Block Diagram
Block Diagram of MEX board [1]
7
TMS320DM642
  • TMS320DM642
  • Performance 4000-4800 MIPS
  • Two Level Cache
  • L2 256 KB, L1P 16 KB, L1D 16 KB
  • 3 Video Ports
  • 8-Bit McASP
  • Ethernet MAC
  • 32-Bit HPI
  • 66 MHz PCI
  • 64-Bit EMIF

DSP DM642 block diagram [2]
8
TMS320DM642
  • Peripherals to be used
  • Enhanced DMA (EDMA)
  • Video ports (VP0-VP2)
  • Inter-integrated circuit (I2C) bus
  • External memory interface (EMIF)
  • Ethernet media access controller(EMAC)
  • Management data input/output (MDIO)

9
Outline
  • System description
  • Communication interface
  • Host/MEX communication
  • Video capturing/displaying
  • Network transmission
  • Software development
  • Error resilience

10
Host/MEX Communication
MEX setup: start EDMA, unreset the DSP1 FIFO, clear the PCI interrupt, set the DSP FIFO direction, set the FIFO full flag value
MEX sequence:
  • DSP FIFO is reset
  • DSP started, fill memory
  • Initialize transfer
  • DSP-to-PCI transfer request
  • Start transfer
  • Transfer finished
PC setup: set the transfer size, set the PCI FIFO direction, select the DSP data sources, set the transfer destination address, start the PCI FIFO, clear the DSP interrupt
PC sequence:
  • PCI started, wait for interrupt
  • Initialize transfer
  • PCI-to-DSP start transfer request
  • Wait for transfer finished
  • Transfer finished
Data transfer from the 4 DSPs (SDRAM) to PCI [1]
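The request/finished handshake in this diagram can be mimicked on a desktop with two threads. This is only a toy simulation: the names `run_transfer`, `dsp_side` and `pc_side` are hypothetical, the events stand in for the interrupts, and no real MEX driver call is used.

```python
import threading


def run_transfer(payload: bytes) -> bytes:
    """Simulate one DSP-to-PC transfer with a request/finished handshake."""
    request = threading.Event()   # stands in for the DSP-to-PCI transfer request
    finished = threading.Event()  # stands in for "transfer finished"
    fifo = []                     # stands in for the DSP FIFO
    received = bytearray()

    def dsp_side():
        fifo.append(payload)      # DSP fills memory
        request.set()             # DSP raises the transfer request
        finished.wait()           # DSP waits until the host reports completion

    def pc_side():
        request.wait()            # PC waits for the interrupt
        received.extend(fifo.pop())  # PC drains the FIFO
        finished.set()            # PC signals "transfer finished"

    threads = [threading.Thread(target=dsp_side), threading.Thread(target=pc_side)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bytes(received)
```

The point of the sketch is the ordering: the PC never reads before the request is raised, and the DSP never reuses its buffer before the transfer is reported finished.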
11
Video Capture
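Components on the MEX board: DM642 (VP0, VP1, VP2, DMA), video chip SAA7113H (ADC), I2C bus
Signal path: Camera → Video chip SAA7113H (ADC) → DM642 video port → DMA → Raw data
  • Camera output: NTSC analog, 525 lines per frame, 30 frames per second, or PAL analog, 625 lines per frame, 25 frames per second
  • Video chip output: ITU656 digital, for PAL or NTSC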
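A quick calculation gives a feel for the raw data rate the video port and DMA have to sustain. The line and frame counts come from the slide; the active picture sizes (720 samples per line, 480 or 576 active lines) and the 2 bytes per pixel of 4:2:2 sampling are assumptions added here for illustration, and the function names are hypothetical.

```python
def line_rate(lines_per_frame: int, frames_per_second: int) -> int:
    """Total scan lines per second, including blanking lines."""
    return lines_per_frame * frames_per_second


def active_bytes_per_second(width: int, active_lines: int,
                            frames_per_second: int,
                            bytes_per_pixel: int = 2) -> int:
    """Raw bytes per second for the active picture area only (assumed 4:2:2)."""
    return width * active_lines * bytes_per_pixel * frames_per_second


ntsc_lines = line_rate(525, 30)                    # figures from the slide
pal_lines = line_rate(625, 25)
ntsc_bytes = active_bytes_per_second(720, 480, 30)  # assumed active size
pal_bytes = active_bytes_per_second(720, 576, 25)   # assumed active size
```

Under these assumptions both standards land at roughly 20 MB/s of active video per camera, which is why the capture path relies on DMA rather than CPU copies.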
12
TMS320DM642 Video Port [3]
13
Network Architecture
MEX Board 1: DM642 (EMAC, MDIO), PHY LXT971ALC
MEX Board 2: DM642 (EMAC, MDIO), PHY LXT971ALC
The two boards are connected through RJ45.
14
TMS320DM642 EMAC
  • DM642 Networking Using EMAC and MDIO

DM642 Networking [4]
15
Outline
  • System description
  • Communication interface
  • Software development
  • H.264 Codec
  • Optimization
  • Parallelization
  • Memory Issue
  • Error resilience

16
H.264 Encoder Block Diagram
17
H.264 Decoder Block Diagram
18
Optimization on Single Chip
  • Realization and Optimization of DSP Based H.264
    Encoder [5]
  • Optimization of H.264 on DSP platform
  • Code transplant and primary optimization
  • Optimization of the key module
  • Using TI C64x IMGLIB
  • Data scheduling and storage allocation
  • Data scheduling with EDMA
  • Storage allocation (Code section/Data section)

19
Parallelization on Chips
  • One GOP in one DSP
    • Each DSP handles one GOP, e.g. IPPP... or IBBPBB... .
    • There are no dependencies between groups of pictures (GOPs).
  • One frame / one macroblock in one DSP
    • Each DSP handles one frame or one macroblock.
    • There are dependencies between frames and between macroblocks.
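Because GOPs are independent, the GOP-level scheme reduces to handing out frame ranges. A minimal sketch of a round-robin assignment follows; the function name `assign_gops`, the round-robin policy and the GOP size are illustrative assumptions, not taken from the slides.

```python
def assign_gops(num_frames: int, gop_size: int, num_dsps: int = 4) -> dict:
    """Map each DSP index to the (first_frame, last_frame) ranges it encodes."""
    schedule = {dsp: [] for dsp in range(num_dsps)}
    for gop_index, start in enumerate(range(0, num_frames, gop_size)):
        end = min(start + gop_size, num_frames) - 1  # last GOP may be short
        schedule[gop_index % num_dsps].append((start, end))
    return schedule


# Example: 60 frames, 12-frame GOPs, 4 DSPs as on the MEX board.
plan = assign_gops(60, 12)
```

The cost of this scheme is latency and memory: each DSP has to buffer a whole GOP before it can start, which is what makes the finer frame-level and macroblock-level partitions interesting despite their dependencies.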

20
Macroblock Dependencies
  • Data dependencies induced by inter-prediction
  • Motion vector MVcur is predicted from MVA-MVD

Reference frame / Current frame, with the neighbouring motion vectors MVA, MVB, MVC, MVD and the current motion vector MVcur
Data dependencies induced from MV prediction [6]
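The dependency arises because only the difference between MVcur and a predictor built from the neighbours is coded. A simplified sketch, assuming the common case where the predictor is the component-wise median of MVA, MVB and MVC (MVD substituting for MVC when it is unavailable); partition-specific rules and unavailable-neighbour handling are omitted, and the function names are hypothetical.

```python
def median_mv(mv_a, mv_b, mv_c):
    """Component-wise median of three (x, y) motion vectors."""
    xs = sorted(v[0] for v in (mv_a, mv_b, mv_c))
    ys = sorted(v[1] for v in (mv_a, mv_b, mv_c))
    return (xs[1], ys[1])


def mv_difference(mv_cur, predictor):
    """The residual that would actually be entropy coded."""
    return (mv_cur[0] - predictor[0], mv_cur[1] - predictor[1])


predictor = median_mv((4, -2), (1, 0), (3, 5))
residual = mv_difference((5, 1), predictor)
```

Since the predictor needs the final motion vectors of the left and upper neighbours, a macroblock cannot start motion vector coding until those neighbours are done.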
21
Macroblock Dependencies
  • Data dependencies induced by intra-prediction
  • Left, upper-left, upper, and upper-right MBs

Data dependencies induced from intra prediction [6]
22
Macroblock Dependencies
  • Data dependencies induced by deblocking filter
  • Top 4 rows of pixels and leftmost 4 columns

Data dependencies induced from deblocking filter [6]
23
Macroblock Dependencies
  • Possible spatial data dependencies for a
    macroblock

  • Upper-left MB: Intra Pred., MV Pred.
  • Upper MB: Intra Pred., MV Pred., Deblocking Filter
  • Upper-right MB: Intra Pred., MV Pred.
  • Left MB: Intra Pred., MV Pred., Deblocking Filter
  • Current MB
Possible spatial data dependencies for a macroblock [6]
24
Macroblock Dependencies
  • Macroblock Dependencies
  • Data dependencies between frames
  • Data dependencies between MB rows in the same
    frame
  • Data dependencies in the same MB row

25
Wave-front parallelization
  • Partition for MB region

Wave-front of Macro-block Region Partition [7]
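Given dependencies on the left, upper-left, upper and upper-right neighbours, a macroblock at column x and row y can be scheduled at step x + 2y. The sketch below builds that generic wave-front order; it is not necessarily the exact partition used in [7], and `wavefront_steps` is a hypothetical name.

```python
def wavefront_steps(width_mbs: int, height_mbs: int) -> list:
    """Group macroblock coordinates (x, y) by the earliest step they can run."""
    steps = {}
    for y in range(height_mbs):
        for x in range(width_mbs):
            # The upper-right neighbour (x + 1, y - 1) finishes at step
            # x + 2y - 1, so x + 2y is the earliest legal step for (x, y).
            steps.setdefault(x + 2 * y, []).append((x, y))
    return [steps[s] for s in sorted(steps)]


# Example: a tiny picture of 4 x 3 macroblocks.
schedule = wavefront_steps(4, 3)
```

All macroblocks within one step are mutually independent, so they can be distributed over the DSPs; the number of macroblocks per step grows and shrinks along the wave, which limits the achievable speed-up on small pictures.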
26
Wave-front parallelization
  • Partition for frames

Wave-front of Frame Partition [7]
27
Memory Issue
  • Limited memory of DM642
  • Use memory buffer to reduce memory access

DM642 DSP core
  • L1P cache: direct mapped, 16 Kbytes total
  • L1D cache: 2-way set associative, 16 Kbytes total
  • L2 cache/memory: 256 Kbytes total
  • EDMA controller and peripherals
Two-level cache architecture of DM642
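A back-of-the-envelope check shows why on-chip memory is the bottleneck. The 256 Kbyte L2 size is from the slide; the 8-bit YUV 4:2:0 format and the two picture sizes are assumptions chosen for illustration.

```python
L2_BYTES = 256 * 1024  # L2 cache/memory size from the slide


def yuv420_frame_bytes(width: int, height: int) -> int:
    """Bytes in one 8-bit YUV 4:2:0 frame: full-size luma plus two quarter-size chroma planes."""
    return width * height * 3 // 2


def fits_in_l2(width: int, height: int) -> bool:
    return yuv420_frame_bytes(width, height) <= L2_BYTES


cif_bytes = yuv420_frame_bytes(352, 288)  # assumed CIF input
d1_bytes = yuv420_frame_bytes(720, 480)   # assumed NTSC D1 input
```

Even where one frame fits, the encoder also needs reference frames, code and working buffers, so frames have to live in external SDRAM and only small working sets are moved on chip by EDMA.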
28
Memory Issue
  • Memory hierarchy for inter prediction

Memory hierarchy [8]
29
Memory Issue
  • Slice memory buffer for intra prediction and
    deblocking filter

Slice Memory [9]
30
Outline
  • System description
  • Communication interface
  • Software development
  • Error resilience
  • Error-Resilience Tools in H.264/AVC
  • Error resilience of JM source code

31
Error Resilience Tools in H.264/AVC
  • Redundant slices (RSs) [10]
    • For a MB, an encoder can place a redundant representation of the same MB into the same bit stream.
    • e.g. one slice is coded with two different quantization parameters (QPs).
    • If the slice with the low QP is available, the decoder discards the RS; otherwise, the decoder reconstructs the RS.

Slice A (QP1) → Decoder ← Slice A (QP2)
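The decoder-side rule for redundant slices is a simple fallback. A minimal sketch, where `None` marks a lost slice and the function name is hypothetical:

```python
from typing import Optional


def select_slice(primary: Optional[bytes], redundant: Optional[bytes]) -> Optional[bytes]:
    """Pick the slice to reconstruct: primary if it arrived, else the redundant copy."""
    if primary is not None:
        return primary   # the RS is discarded
    return redundant     # may still be None if both copies were lost


chosen = select_slice(None, b"slice-A-high-QP")
```

When both copies are lost the function returns `None`, and the decoder has to fall back to error concealment.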
32
Error Resilience Tools in H.264/AVC
  • Parameter sets [10]
    • Including picture size, entropy coding method, MV resolution, and so on.
  • Sequence parameter set (SPS)
    • Contains all information related to the picture sequence between two IDR (Instantaneous Decoding Refresh) pictures.
  • Picture parameter set (PPS)
    • Contains all information related to all slices in a picture.
  • e.g. multiple copies of an SPS can be sent to increase the chance that one arrives.
  • e.g. SPSs can be sent out-of-band.
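The benefit of repeating an SPS can be estimated with a one-line model. The sketch assumes each copy is lost independently with the same probability, which real networks with burst losses do not guarantee; the function name and the numbers are illustrative.

```python
def arrival_probability(loss_rate: float, copies: int) -> float:
    """Probability that at least one of `copies` independent transmissions arrives."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - loss_rate ** copies


single = arrival_probability(0.1, 1)  # one copy at an assumed 10% loss rate
triple = arrival_probability(0.1, 3)  # three copies at the same loss rate
```

Under this model three copies at 10% loss leave about a 0.1% chance of losing the SPS entirely, at the cost of sending a few dozen extra bytes.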

33
Error Resilience Tools in H.264/AVC
  • Flexible macro-block ordering (FMO) [10]
    • 7 modes
    • The bit overhead depends strongly on the picture format, the content, and the QP.
    • < 5% penalty at QP 16; about 20% on average at QP 28.

6 modes of FMO [10]
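One of the FMO patterns scatters macroblocks over slice groups so that a lost slice leaves every missing macroblock surrounded by received ones. The sketch below is modelled on the dispersed pattern and yields a checkerboard for two groups; the function name is hypothetical and the code is illustrative rather than a transcription of the standard.

```python
def dispersed_map(width_mbs: int, height_mbs: int, num_groups: int = 2) -> list:
    """Return rows of slice-group indices, one per macroblock."""
    return [
        [(x + (y * num_groups) // 2) % num_groups for x in range(width_mbs)]
        for y in range(height_mbs)
    ]


# Example: 4 x 2 macroblocks split over two slice groups.
groups = dispersed_map(4, 2)
```

With two groups, every horizontal and vertical neighbour of a macroblock belongs to the other group, which is what makes spatial concealment effective after a slice loss.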
34
Error Concealment of H.264/AVC
  • Error concealment scheme provided in JM
  • Intra
  • Inter

Error concealment for macro-blocks [11]
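For intra concealment, a lost block can be interpolated from the boundary pixels of its neighbours, with nearer boundaries weighted more heavily. The following is a simplified sketch of that idea, not the exact JM routine; the function name, the weight formula and the assumption that all four neighbours are available are choices made here for illustration.

```python
def conceal_block(top, bottom, left, right):
    """Fill an n x n lost block from the adjacent boundary pixels of four neighbours.

    top/bottom hold the n pixels just above/below the block (indexed by x),
    left/right hold the n pixels just beside it (indexed by y).
    """
    n = len(top)
    block = []
    for y in range(n):
        row = []
        for x in range(n):
            # Distance in pixels from (x, y) to each boundary.
            distances = (y + 1, n - y, x + 1, n - x)
            # Nearer boundaries get larger weights (from n down to 1).
            weights = [n + 1 - d for d in distances]
            values = (top[x], bottom[x], left[y], right[y])
            total = sum(w * v for w, v in zip(weights, values))
            row.append(round(total / sum(weights)))
        block.append(row)
    return block


# Example: dark neighbour above, bright neighbour below, mid-grey on the sides.
patch = conceal_block([0] * 4, [200] * 4, [100] * 4, [100] * 4)
```

Inter concealment works differently: it typically copies a motion-compensated block from the reference frame, using a motion vector borrowed from a neighbouring macroblock.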
35
Future Work
  • Optimize the H.264 codec for real-time operation
  • Implement different concealment methods
  • Propose corresponding error resilience methods

36
Reference
  • [1] VITEC MULTIMEDIA, MEX User Manual, Revision 1.7.
  • [2] Texas Instruments, Incorporated, TMS320C64x DSP Generation Product Bulletin (sprt236).
  • [3] Texas Instruments, Incorporated, TMS320DM64x Video Port to Video Port Communication (spraaf3).
  • [4] Texas Instruments, Incorporated, TMS320C6000 DSP Ethernet Media Access Controller (EMAC) / Management Data Input/Output (MDIO) Module Reference Guide (spru628a).
  • [5] Zhe Wei and Canhui Cai, "Realization and Optimization of DSP Based H.264 Encoder," ISCAS 2006, Circuits and Systems, May 2006.
  • [6] Chen, Y., Li, E., Zhou, X., Ge, S., "Implementation of H.264 Encoder and Decoder on Personal Computers," Journal of Visual Communications and Image Representation 17 (2006).
  • [7] Zhuo Zhao and Ping Liang, "Data partition for wave-front parallelization of H.264 video encoder," 31st IEEE International Conference on Acoustics, Speech, and Signal Processing (2006).
  • [8] Denolf, K., De Vleeschouwer, et al., "Memory centric design of an MPEG-4 video encoder," IEEE Trans. CSVT, Vol. 15, No. 5, pp. 609-619, May 2005.
  • [9] Tsu-Ming Liu et al., "A 125µW, Fully Scalable MPEG-2 and H.264/AVC Video Decoder for Mobile Applications," ISSCC Digest of Technical Papers, pp. 402-403, Feb. 2006.
  • [10] S. Wenger, "H.264/AVC over IP," IEEE Trans. Circuits Syst. Video Technol., vol. 13, pp. 645-656, July 2003.
  • [11] "Non-normative error concealment algorithms," ITU-T VCEG-N62, Sept. 2001.

37
H.264 Partitions
16x16 blocks
8x8 blocks
4x4 blocks
  • Frame partitions
  • Macroblock partitions

38
H.264 Intra-Mode Decision
39
H.264 Intra-Mode Decision
4x4 horizontal
16x16 plane
40
Fast integer/fractional pixel motion estimation
The search covers both small and large motions: the search point that gives the smallest matching error in one step is the starting point of the next step.
Assume the guessed starting point is (0,0).
Around 130 points are searched in this algorithm, so the saving over a full 33x33 search is (33x33-130)/(33x33) ≈ 88%, i.e. close to 90%. If 3 starting points are tried, the saving is around 64%.
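The savings figures can be reproduced directly. The sketch assumes the 33x33 window corresponds to a search range of ±16 pixels and that each starting point costs about 130 evaluations, as stated on the slide; the function name is hypothetical.

```python
def search_saving(search_range: int = 16, points_per_start: int = 130,
                  starts: int = 1) -> float:
    """Fraction of full-search evaluations avoided by the fast search."""
    full = (2 * search_range + 1) ** 2  # 33 x 33 = 1089 candidate positions
    return (full - points_per_start * starts) / full


one_start = search_saving()            # 959 / 1089
three_starts = search_saving(starts=3) # 699 / 1089
```

The two values come out at about 0.88 and 0.64, matching the roughly 90% and 64% quoted for one and three starting points.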

Integer pixel search scheme
41
Fast integer/fractional pixel motion estimation
Start from the best-matching integer point found by the integer motion search:
  1. Search its 1/2-pixel neighbors
  2. Search its 1/4-pixel neighbors
  3. Search its 1/8-pixel neighbors

The optimal point of each step is the search center of the next step.
Fractional pixel search scheme
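The three refinement steps share one loop: evaluate the eight neighbours at the current step size, move to the best candidate, then halve the step. A minimal sketch, in which `refine` and the caller-supplied `cost` function are hypothetical stand-ins for the codec's matching-error computation on interpolated pixels:

```python
from fractions import Fraction

HALF, QUARTER, EIGHTH = Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)


def refine(center, cost, steps=(HALF, QUARTER, EIGHTH)):
    """Refine an integer-pel motion vector through successively finer neighbourhoods."""
    best = center
    for step in steps:
        # The centre itself (dx = dy = 0) stays a candidate at every step.
        candidates = [
            (best[0] + dx * step, best[1] + dy * step)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
        ]
        best = min(candidates, key=cost)
    return best


# Example with a synthetic cost: squared distance to a "true" displacement.
target = (Fraction(11, 8), Fraction(-3, 8))
toy_cost = lambda p: (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2
refined = refine((1, 0), toy_cost)
```

Each step costs at most nine evaluations, so the full refinement adds only a few dozen matching-error computations on top of the integer search.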