Provided by: sud47
Title: Multimedia Processor Design


1
Multimedia Processor Design
  • An Overview of Design Issues

2
Discussion
  • Multimedia in general
  • Design approaches: SIMD and VLIW
  • Evaluation methods and Parameters
  • A focus on SIMD
  • Conclusion

3
Why Multimedia Processors?
  • Video and Audio Compression
  • Image Processing and 3D Graphics
  • Signal Processing
  • Communication, Networking, and the World Wide Web

4
Features of Multimedia Processing..
  • Heavy Computation
  • High Memory Bandwidth
  • Real-Time Processing (e.g., Video Conferencing)
  • Inherent Data and Instruction Parallelism
  • A lot of execution time spent in small loops

5
Approaches..
  • SIMD: Single Instruction, Multiple Data
  • VLIW: Very Long Instruction Word
  • ASICs: Application-Specific Integrated Circuits

6
SIMD?
  • Single Instruction Multiple Data
  • Exploitation of Data Parallelism
  • Multimedia Extensions to General-Purpose Processors (GPPs)
  • Example: MMX for the Pentium II Processor
  • The same operation applied to different data
    simultaneously

7
SIMD: What Is the Idea?
  • Multimedia processing involves smaller data types
    (typically 8 or 16 bit)
  • Packing SMALLER data elements into WIDER data paths
    (typically 32 or 64 bit)

8
ADD Instruction in SIMD
  • 4-way parallelism in 64-bit SIMD
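
A minimal sketch of the 4-way packed ADD named on this slide, assuming four 16-bit lanes in one 64-bit word and wraparound (non-saturating) arithmetic. The helper names `pack`, `unpack`, and `packed_add16` are illustrative and do not come from any real instruction set.

```python
LANES = 4
LANE_BITS = 16
LANE_MASK = (1 << LANE_BITS) - 1

HIGH_BITS = 0x8000_8000_8000_8000  # top bit of every 16-bit lane
LOW_BITS = 0x7FFF_7FFF_7FFF_7FFF   # remaining 15 bits of every lane


def pack(values):
    """Pack four 16-bit values into one 64-bit word, lane 0 in the low bits."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def packed_add16(a, b):
    """Add all four lanes in one pass; carries never cross a lane boundary."""
    # Add the low 15 bits of each lane, then fold in the top bits with XOR.
    partial = (a & LOW_BITS) + (b & LOW_BITS)
    return partial ^ ((a ^ b) & HIGH_BITS)


print(unpack(packed_add16(pack([1, 2, 3, 4]), pack([10, 20, 30, 40]))))
# [11, 22, 33, 44]
```

A single call produces four sums, which is the sense in which one instruction operates on multiple data elements.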
9
Another Example..
10
VLIW: Very Long Instruction Word
  • Exploits Instruction Parallelism
  • Multiple Functional Units in Data Path
  • A Single instruction triggers Multiple
    Operations
  • Dedicated Multimedia Processors
  • Example: Sony PlayStation 2 (3D gaming)
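
As a rough illustration of one instruction triggering multiple operations, the sketch below models a VLIW instruction as a bundle of independent slots, one per functional unit. The bundle format, operation names, and register names are hypothetical and do not correspond to any real VLIW instruction set.

```python
def execute_bundle(regs, bundle):
    """Execute one hypothetical VLIW bundle.

    Every slot reads the register values from before the bundle, and all
    results are written back together, mimicking parallel functional units.
    """
    ops = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "mul": lambda a, b: a * b,
    }
    results = [(dst, ops[op](regs[src1], regs[src2]))
               for op, dst, src1, src2 in bundle]
    updated = dict(regs)
    for dst, value in results:
        updated[dst] = value
    return updated


regs = {"r0": 2, "r1": 3, "r2": 0, "r3": 0, "r4": 0}
bundle = [
    ("add", "r2", "r0", "r1"),  # slot 0: adder
    ("mul", "r3", "r0", "r1"),  # slot 1: multiplier
    ("sub", "r4", "r1", "r0"),  # slot 2: second ALU
]
after = execute_bundle(regs, bundle)
```

In this model it is the compiler's job to fill the slots with operations that do not depend on each other.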

11
What Does a Long Instruction Do?
12
ASICs
  • Application Specific Integrated Circuits
  • Highly Optimized
  • Reduced Flexibility
  • Example: ADV-JP2000 wavelet processor and entropy
    codec
  • Not in much use compared to SIMD and VLIW

13
Parameters for Evaluation
  • Execution time speedup
  • Performance improvement of technique x over y is
    (Execution time of y) / (Execution time of x)
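
A small worked example of the ratio above. The timings are made-up values for illustration, not measurements from the presentation.

```python
def speedup(time_y, time_x):
    """Speedup of technique x over technique y (above 1 means x is faster)."""
    return time_y / time_x


# Hypothetical timings: y takes 12 ms, x takes 4 ms.
print(speedup(12.0, 4.0))  # 3.0
```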

14
Parameters (contd..)
  • IPC: instructions retired per cycle
  • Frequency of Operations
  • Effective Branch prediction
  • Cache hit rates
  • Data size

15
Parameters (contd)
  • Resource and Instruction stream stalls
  • Average basic block size (local parallelism)
  • Floating point operations

16
Testing Methodologies
  • How to evaluate performance?
  • Benchmarks
  • Standard Multimedia Applications
  • Kernels
  • Kernels are operations that dominate overall
    processing time

17
  • Kernels
    • Dot-product, matrix-vector products
    • FIR and IIR filters
    • FFT and DCT
  • Applications
    • Speech compression algorithms
    • Image processing applications (JPEG)
    • 3D graphics - Quake II
    • Video processing (MPEG) - Real Video
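
Two of the kernels listed above, written as plain scalar loops to show why they dominate processing time: each is a tight loop of multiply-accumulate steps. This is a sketch under simple assumptions (integer data, samples before the start of the signal treated as zero), not a tuned implementation.

```python
def dot(a, b):
    """Dot product: one multiply-accumulate per element pair."""
    acc = 0
    for x, y in zip(a, b):
        acc += x * y
    return acc


def fir(samples, taps):
    """FIR filter: y[n] = sum over k of taps[k] * samples[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(taps):
            if n - k >= 0:  # samples before n = 0 count as zero
                acc += h * samples[n - k]
        out.append(acc)
    return out


print(dot([1, 2, 3], [4, 5, 6]))   # 32
print(fir([1, 2, 3, 4], [1, 1]))   # [1, 3, 5, 7]
```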

18
Bottlenecks in SIMD.
  • Most of the data-level parallelism (DLP) available
    to SIMD lies in data accessed within nested loops.
  • As an example of data in multimedia processing,
    consider this:

19
A 2D Data structure in multimedia..
20
The problem..
  • DLP in the outer loops is not exploited
  • Hardware to generate multiple address sequences
    is not complicated, but supporting them with
    general-purpose instruction sets is difficult
    because of their limited addressing modes.
  • GPPs lack efficient support for keeping track of
    multiple indices/strides.
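
To make the index bookkeeping concrete, the sketch below walks a row-major image block by block and produces the flat addresses a kernel would load. The function name and the block-wise traversal are assumptions for illustration. Each address depends on several indices and two strides, and a general-purpose instruction set has to compute these with separate instructions.

```python
def block_addresses(width, height, block):
    """Yield flat row-major addresses, visiting the image block by block.

    Assumes width and height are multiples of block.
    """
    for by in range(0, height, block):        # outer loops: which block
        for bx in range(0, width, block):
            for y in range(block):            # inner loops: inside a block
                row_start = (by + y) * width + bx  # stride of width per row
                for x in range(block):
                    yield row_start + x       # stride of 1 inside a row


print(list(block_addresses(4, 2, 2)))  # [0, 1, 4, 5, 2, 3, 6, 7]
```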

21
Instructions of two types..
  • Computationally useful instructions
    • Example: MAC (multiply-accumulate) used in the
      algorithm
  • Overhead instructions
    • Address generation
    • Branch
    • Address transformation
    • Load/store

22
What dominates??
  • Overhead instructions consume 75% of execution
    time!
  • Peak throughput of SIMD is not achieved.
  • Hardware to process Overhead Instructions has to
    be scaled.

23
A Solution..
  • Scaling SIMD units will not help
  • More parallelism is extracted by scaling the
    NON-SIMD part of the architecture.
  • A dedicated architecture like MediaBreeze, used
    along with SIMD, will yield good results.
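
A back-of-the-envelope model of why scaling only the SIMD units gives little benefit, assuming overhead instructions take 75% of execution time as the previous slide reports. The model is an Amdahl's-law style estimate introduced here for illustration; the scaling factors are hypothetical and the presentation does not give this calculation.

```python
def overall_speedup(overhead_fraction, simd_scale, overhead_scale=1.0):
    """Estimate speedup when each part of the run time is scaled separately.

    overhead_fraction: share of time spent in overhead instructions
    simd_scale:        speedup applied to the useful (SIMD) part
    overhead_scale:    speedup applied to the overhead (non-SIMD) part
    """
    useful_fraction = 1.0 - overhead_fraction
    new_time = (overhead_fraction / overhead_scale
                + useful_fraction / simd_scale)
    return 1.0 / new_time


simd_only = overall_speedup(0.75, simd_scale=4.0)
overhead_only = overall_speedup(0.75, simd_scale=1.0, overhead_scale=4.0)
print(round(simd_only, 2), round(overhead_only, 2))  # 1.23 2.29
```

Under these assumptions, making the SIMD units four times faster yields about 1.23x overall, while the same factor applied to the overhead part yields about 2.29x.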

24
An important issue
  • Architecture and compiler should complement each
    other
  • The compiler should not saturate certain
    architectural resources while rarely using others.

25
A sample analysis: Ratio of Execution Times
26
VLIW or SIMD ??
  • Neither processor is a replacement for the
    other
  • SIMD and VLIW, not SIMD vs. VLIW!
  • In fact, both can be implemented on the same
    processor. Example: ADI's TigerSHARC

27
  • Thank You