Design and Implementation of Multimedia Signal Processing Systems-on-Chip - PowerPoint PPT Presentation

1
Design and Implementation of Multimedia Signal
Processing Systems-on-Chip
  • Yu Hen Hu
  • University of Wisconsin-Madison
  • Dept. of Electrical and Computer Engineering
  • Madison, WI 53706
  • Hu_at_engr.wisc.edu

2
Outline
  • Course Objectives and Outline
  • What is multimedia signal processing?
  • What is Systems-on-Chip (SoC)?
  • Implementation Options and Design issues
  • General purpose (micro) processor (GPP) core
  • Multimedia enhanced extension (Native signal
    processing)
  • Programmable digital signal processors (PDSP)
    core
  • Multimedia signal processors (MSP)
  • Application specific integrated circuit (ASIC) IP
  • Re-configurable IP

3
Course Objectives
  • Provide students with a global view of embedded
    micro-architecture implementation options and
    design methodologies for multimedia signal
    processing applications
  • The focus is on the interaction between the
    algorithm formulation and the underlying
    architecture that implements the algorithm
  • Formulate algorithm to match architecture.
  • Design novel architecture to match algorithm.

4
Course Outline
  • Signal processing algorithm representation: data
    flow graph, dependence graph, signal flow graph,
    iteration bounds
  • Pipelining and parallel processing of signal
    processing algorithms, and algorithm
    transformation: retiming, unfolding, folding
  • Re-configurable computing using field
    programmable gate array (FPGA)
  • Signal processing arithmetic units: distributed
    arithmetic, CORDIC
  • Implementation of video coding standards (MPEG
    and JPEG): DCT and DWT architecture, motion
    estimation architecture, entropy coder
    architecture
  • Implementation of communication algorithms

5
What is Signal?
  • A SIGNAL is a measurement of a physical quantity
    of a certain medium.
  • Examples of signals
  • Visual patterns (written documents, pictures,
    video, gestures, facial expressions)
  • Audio patterns (voice, speech, music)
  • Change patterns of other physical quantities:
    temperature, EM waves, etc.
  • Signals contain INFORMATION!

6
Medium and Modality
  • Medium
  • Physical materials that carry the signal.
  • Examples: paper (visual patterns, handwriting,
    etc.), air (sound pressure, music, voice),
    various video displays (CRT, LCD)
  • Modality
  • Different modes of signals over the same or
    different media.
  • Examples: voice, facial expression, and gesture.

7
What is Signal Processing?
  • Ways to manipulate a signal in its original medium
    or in an abstract representation.
  • Signals can be abstracted as functions of time or
    spatial coordinates.
  • Types of processing
  • Transformation
  • Filtering
  • Detection
  • Estimation
  • Recognition and classification
  • Coding (compression)
  • Synthesis and reproduction
  • Recording, archiving
  • Analyzing, modeling

8
Signal Processing Applications
  • Communications
  • Modulation/Demodulation (modem)
  • Channel estimation, equalization
  • Channel coding
  • Source coding (compression)
  • Imaging
  • Digital camera
  • Scanner
  • HDTV, DVD
  • Audio
  • 3D sound
  • Surround sound
  • Speech
  • Coding
  • Recognition
  • Synthesis
  • Translation
  • Virtual reality, animation
  • Control
  • Hard drive
  • Motor

9
Digital Signal Processing
  • Signals generated by physical phenomena are
    analog in that
  • Their amplitudes are defined over the range of
    real/complex numbers
  • Their domains are continuous in time or space.
  • Processing analog signals requires
    dedicated, special hardware.
  • Digital signal processing concerns processing
    signals using digital computers.
  • A continuous time/space signal must be sampled to
    yield countable signal samples.
  • The real- (or complex-) valued samples must be
    quantized to fit into the internal word length.
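
The two steps named on this slide, sampling and quantization, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 8 kHz sample rate, the 1 kHz test tone, and the 8-bit signed word length are all hypothetical choices, not values from the course.

```python
import math

def sample(signal, sample_rate_hz, duration_s):
    """Evaluate a continuous-time signal at uniformly spaced instants."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [signal(k / sample_rate_hz) for k in range(n_samples)]

def quantize(x, word_length_bits, full_scale=1.0):
    """Round x to a signed two's-complement code of the given word length."""
    levels = 1 << (word_length_bits - 1)        # 128 for an 8-bit word
    code = int(round(x / full_scale * levels))
    return max(-levels, min(levels - 1, code))  # clip to the representable range

# Assumed test signal: a 1 kHz sine, sampled at 8 kHz for one period.
tone = lambda t: math.sin(2 * math.pi * 1000.0 * t)
samples = sample(tone, sample_rate_hz=8000.0, duration_s=0.001)
codes = [quantize(s, word_length_bits=8) for s in samples]
print(codes)
```

Note that the positive peak of the sine does not fit in a signed 8-bit word (the largest code is 127), so it is clipped. This is one small instance of the word-length effects the slide refers to.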

10
Multimedia Signal Processing
  • Digital signal processing applied to
    multimedia/multi-modality applications
  • Movie, visualization, animation
  • Speech, audio
  • Gesture, expression, emotion
  • Transmission, storage of multimedia signals
  • Streaming, wireless video
  • Multimedia database, content based retrieval
  • Security: watermarking

11
Implementation of DSP Systems
  • Platforms
  • Native signal processing (NSP) with general
    purpose processors (GPP)
  • Multimedia extension (MMX) instructions
  • Programmable digital signal processors (PDSP)
  • Media processors
  • Application-Specific Integrated Circuits (ASIC)
  • Re-configurable computing with field-programmable
    gate array (FPGA)
  • Requirements
  • Real time
  • Processing must be done before a pre-specified
    deadline.
  • Streamed numerical data
  • Sequential processing
  • Fast arithmetic processing
  • High throughput
  • Fast data input/output
  • Fast manipulation of data

12
Observations
  • Embedded, low-power multimedia communication
    systems are emerging applications that demand an
    SoC platform-based solution.
  • The high level of integration and complexity of
    SoC requires a close match between the algorithm
    and the architecture.
  • Two issues will be addressed
  • Communication
  • Interface
  • One should design MM/Comm algorithms so that they
    require only local communication and have flexible
    interface requirements.

13
The SoC Edge
Technology Demand and Supply
  • Initially, the development of sustaining
    technology such as the general-purpose µP focused
    on performance (thick line) improvement to meet
    demand (dashed line)
  • After performance surpassed demand,
    disruptive technology such as SoC came in late in
    the game, focusing on
  • Time-to-market
  • Customization
  • Price/performance ratio
  • Power consumption

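[Figure: computing power versus time, with curves for sustaining-technology performance and disruptive technology, demand levels labeled workstation, PC, and embedded, and the disruptive-technology priorities: time-to-market, customization, price, power consumption.]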
M.J. Bass and Clayton Christensen, "The future of the microprocessor business," IEEE Spectrum, April 2002, pp. 34-39.
14
Widening Hw/Sw Gap
  • Hardware
  • Performance improves according to Moore's law
    (exponentially).
  • Cost keeps falling
  • Manufacture
  • Service
  • Design cost increases!
  • Verification
  • Simulation
  • → A new generation of CAD software is desperately
    needed for SoC design.
  • Software
  • Relatively stable
  • Unix: 30 years!
  • Mac OS: 20 years!
  • MS Windows: 20 years!
  • High cost of developing software
  • MS Office costs more than a low-end PC!
  • CAD software always lags behind hardware
    development!
  • → SoC applications must address the software
    compatibility issue.

15
SoC Platforms
  • Platform
  • A platform consists of compatible hardware IPs
    (processor, buses), software IPs (OS,
    applications), design tools (CAD software,
    prototype system, etc.), and technical support
    services to facilitate the development of SoC
    systems.
  • Platform-based design aims to meet the software
    compatibility requirements
  • SoC Platforms
  • Processor centric
  • Use a proven processor core, such as ARM
  • Software compatible
  • Communication centric
  • Use a uniform bus architecture
  • Standardized communication interface
  • Re-configurable
  • Use an FPGA plus a processor core
  • More flexibility in ASIC IP design.

16
A Design Chain of MM SoC
  • The electronic design chain is a supply chain
    management model to manage the complexity of SoC
    design.
  • Each design chain is based on a particular
    platform including a programmable µP core, OS,
    ASIC module IPs, application software, and APIs.
  • Platform examples: Philips Semiconductors'
    Nexperia, Texas Instruments' Open Multimedia
    Applications Platform (OMAP), ARM's PrimeXsys,
    Infineon's MGold Platform, and Intel's XScale
    Architecture.

[Figure: Embedded SoC provider-integrator design chain. Source: Martin, G., and F. Schirrmeister, IEEE Computer, March 2002.]
17
Applications That Demand SoC
  • Multimedia Applications
  • Audio/Video/image codec
  • Graphics, rendering, visualization, virtual
    environment
  • Content analysis
  • Properties of MM Apps
  • Data intensive rather than control intensive
  • Bit operations
  • High-speed, real time operations
  • Continuous rather than intermittent operations
  • Communication applications
  • Software defined radio
  • Base station
  • Wireless LAN (802.11x)
  • Ad hoc network (Bluetooth)
  • Properties of Comm. Apps
  • Bit operations
  • High speed
  • Programmability
  • Portability
  • Low power

18
Multimedia SoC Design Issues
  • Communication
  • Cost of communication increases as feature size
    shrinks
  • Relative delay
  • Signal integrity
  • Overheads of clock buffers, bus drivers, and
    insulation all increase
  • Localized communication is more desirable than
    global communication (e.g., clock)
  • Off-chip communication with an external memory
    sub-system costs much more
  • Interface
  • Due to the proliferation of different platforms,
    there is no single prevailing standard to
    define the interaction between different IPs.
  • Interface incompatibility requires custom design
    of interface modules
  • to convert data format,
  • to rearrange data movement patterns.
  • Sometimes, incompatible IPs cannot be used in
    the same design.

19
Communication Issue
  • Current communication methods
  • Bus
  • shared medium
  • Time shared access
  • Direct connection
  • Switches
  • Used mostly in FPGA or high-performance PDSP
    (e.g., TI C80)
  • Parallel access
  • Direct connection
  • Programmable
  • Routers
  • RAW architecture
  • Network-on-chip
  • Incorporating a layered network strategy (e.g., the
    7-layer OSI model) to manage the complexity of
    communication.
  • Similar to
  • wide area network
  • Parallel processor interconnection network
  • SoC distinct characteristics
  • On-chip communication,
  • Delay sensitive, power, etc.

20
Interface Issues
  • IP designers must make assumptions about the data
    input/output patterns and behaviors.
  • Since there is no standard available, interfacing
    can be very challenging.
  • Interface problems
  • Incompatible data format
  • Incompatible timing
  • Incompatible data organization
  • Etc.
  • Possible solutions
  • Standardization
  • May limit innovation and performance
  • Re-configurable interface
  • Automated configuration of the necessary
    interface, based on a description of the interface
    requirements of the IPs being connected.

21
Evolution of Micro-Processor
  • Micro-processors implement a central processing
    unit on a single chip.
  • Performance improved from 1 MFLOPS (1983) to
    1 GFLOPS or above
  • Word length (number of bits for registers, data
    bus, address space, etc.) increased from 4 bits to
    64 bits today.
  • Clock frequency increased from 100 kHz to 1 GHz
  • Number of transistors increased from 1K to 50M
  • Power consumption increased much more slowly with
    the use of lower supply voltage: 5 V dropped to 1.5 V

22
Native Signal Processing
  • Use a GPP to perform signal processing tasks with
    no additional hardware.
  • Examples: soft modem, soft DVD player, soft MPEG
    player.
  • Reduces hardware cost!
  • May not be feasible for extremely high-throughput
    tasks.
  • Interferes with other tasks, as the GPP is tied up
    with NSP tasks.
  • MMX (multimedia extension instructions): special
    instructions for accelerating multimedia tasks.
  • May share the same data path with other
    instructions, or work on special hardware modules.
  • Make use of sub-word parallelism to improve
    numerical calculation speed.
  • Implement DSP-specific arithmetic operations, e.g.,
    saturation arithmetic ops.
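
To make sub-word parallelism and saturation arithmetic concrete, the following Python sketch models a packed add of four unsigned 8-bit lanes held in one 32-bit word. It only imitates the semantics in software: a multimedia instruction set would process all lanes in a single instruction, and the function names and the 4-lane, 8-bit layout are assumptions chosen for illustration.

```python
def saturating_add_u8(a, b):
    """Add two unsigned 8-bit values, clamping at 255 instead of wrapping."""
    return min(a + b, 0xFF)

def packed_add_u8(word_a, word_b, saturate=True):
    """Add four 8-bit lanes packed into a 32-bit word, lane by lane."""
    result = 0
    for lane in range(4):
        shift = 8 * lane
        a = (word_a >> shift) & 0xFF   # extract lane from the first operand
        b = (word_b >> shift) & 0xFF   # extract lane from the second operand
        s = saturating_add_u8(a, b) if saturate else (a + b) & 0xFF
        result |= s << shift           # no carry leaks into the next lane
    return result

pixels = 0xF08010FF
offset = 0x20800501
print(hex(packed_add_u8(pixels, offset)))                  # saturating
print(hex(packed_add_u8(pixels, offset, saturate=False)))  # wrap-around
```

With saturation, a bright pixel that overflows stays at the maximum value. With wrap-around it would turn dark, which is why saturating operations suit image and audio data.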

23
ASIC: Application-Specific ICs
  • Custom or semi-custom IC chips or chip sets
    developed for specific functions.
  • Suitable for high-volume, low-cost production.
  • Examples: MPEG codec, 3D graphics chip, etc.
  • ASICs became popular due to the availability of IC
    foundry services. Fab-less design houses turn
    innovative designs into profitable chip sets using
    CAD tools.
  • Design automation is a key enabling technology to
    facilitate a fast design cycle and a shorter
    time-to-market delay.

24
Programmable Digital Signal Processors (PDSPs)
  • Micro-processors designed for signal processing
    applications.
  • Special hardware support for
  • Multiply-and-Accumulate (MAC) ops
  • Saturation arithmetic ops
  • Zero-overhead loop ops
  • Dedicated data I/O ports
  • Complex address calculation and memory access
  • Real time clock and other embedded processing
    supports.
  • PDSPs were developed to fill a market segment
    between GPP and ASIC
  • GPP: flexible, but slow
  • ASIC: fast, but inflexible
  • As VLSI technology improves, the role of the PDSP
    has changed over time.
  • Cost: design, sales, maintenance/upgrade
  • Performance

25
Multimedia Signal Processors
  • Specialized PDSPs designed for multimedia
    applications
  • Features
  • Multi-processing system with a GPP core plus
    multiple function modules
  • VLIW-like instructions to promote instruction
    level parallelism (ILP)
  • Dedicated I/O and memory management units.
  • Main applications
  • Video signal processing, MPEG, H.324, H.263, etc.
  • 3D surround sound
  • Graphic engine for 3D rendering

26
Re-configurable Computing using FPGA
  • FPGA (field-programmable gate array) is a
    derivative of PLD (programmable logic devices).
  • FPGAs are hardware configurable: they behave
    differently under different configurations.
  • Slower than ASIC, but faster than PDSP.
  • Once configured, it behaves like an ASIC module.
  • Uses of FPGA
  • Rapid prototyping: runs at a fraction of ASIC
    speed, without fab delay.
  • Hardware accelerator: uses the same hardware to
    realize different function modules to save
    hardware
  • Low-quantity system deployment

27
Characteristics and Impact of VLSI
  • Characteristics
  • High density
  • Reduced feature size: 0.25 µm → 0.16 µm
  • % of wire/routing area increases
  • Low power/high speed
  • Decreased operating voltage: 1.8 V → 1 V
  • Increased clock frequency: 500 MHz → 1 GHz
  • High complexity
  • Increased transistor count: 10M transistors and
    higher
  • Shortened time-to-market delay: 6-12 months
  • The term VLSI (Very Large Scale Integration) was
    coined in the late 1970s.
  • Usage of VLSI
  • Micro-processor
  • General purpose
  • Programmable DSP
  • Embedded µ-controller
  • Application-specific ICs
  • Field-Programmable Gate Array (FPGA)
  • Impacts
  • Design methodology
  • Performance
  • Power

28
Design Issues
  • Given a DSP application, which implementation
    option should be chosen?
  • For a particular implementation option, how can
    an optimal design be achieved? Optimal in terms of
    what criteria?
  • Software design
  • NSP/MMX, PDSP/MSP
  • Algorithms are implemented as programs.
  • Often still requires manual programming at the
    assembly level
  • Hardware design
  • ASIC, FPGA
  • Algorithms are directly implemented in hardware
    modules.
  • S/H co-design: system-level design methodology.

29
Design Process Model
  • Design is the process that links algorithm to
    implementation
  • Algorithm
  • Operations
  • Dependency between operations determines a
    partial ordering of execution
  • Can be specified as a dependence graph
  • Implementation
  • Assignment: each operation can be realized with
  • One or more instructions (software)
  • One or more function modules (hardware)
  • Scheduling: dependence relations and resource
    constraints lead to a schedule.
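
As a rough illustration of how a dependence graph induces a schedule, the Python sketch below computes an as-soon-as-possible (ASAP) schedule. It is a simplified stand-in, not the course's method: it assumes every operation takes one time step, the graph is acyclic, and resources are unlimited, so only the dependence relations constrain the result. The operation names in the example graph are hypothetical.

```python
def asap_schedule(dependencies):
    """Assign each operation the earliest time step allowed by its predecessors.

    `dependencies` maps an operation name to the list of operations it needs.
    Assumes unit-time operations, an acyclic graph, and unlimited resources.
    """
    step = {}

    def earliest(op):
        if op not in step:
            preds = dependencies.get(op, [])
            # An operation can start one step after its latest predecessor.
            step[op] = 1 + max((earliest(p) for p in preds), default=0)
        return step[op]

    for op in dependencies:
        earliest(op)
    return step

# Hypothetical graph: three multiplications feeding a chain of additions.
graph = {
    "mul1": [], "mul2": [], "mul3": [],
    "add1": ["mul1"],
    "add2": ["add1", "mul2"],
    "add3": ["add2", "mul3"],
}
schedule = asap_schedule(graph)
print(schedule)
```

The multiplications are independent and can all run in step 1, while the additions form a chain that fixes the minimum schedule length. Adding a resource constraint, such as a single multiplier, would push some multiplications to later steps.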

30
A Design Example
  • Consider the algorithm
  • Program
  • y(0) = 0
  • For k = 1 to n Do
  • y(k) = y(k-1) + a(k)*x(k)
  • End
  • y = y(n)
  • Operations
  • Multiplication
  • Addition
  • Dependency
  • y(k) depends on y(k-1)
  • Dependence Graph

[Figure: dependence graph. A chain of multiply-add nodes from y(0) to y(n), with inputs a(1) x(1), a(2) x(2), ..., a(n) x(n).]
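
The program on this slide can be written as a short runnable sketch. It reads the recurrence as y(k) = y(k-1) + a(k)*x(k), which is the assumed meaning given that the listed operations are multiplication and addition. The function name is hypothetical.

```python
def multiply_accumulate(a, x):
    """Compute y(n) = a(1)*x(1) + ... + a(n)*x(n) by the slide's recurrence."""
    if len(a) != len(x):
        raise ValueError("a and x must have the same length")
    y = 0                        # y(0) = 0
    for a_k, x_k in zip(a, x):   # k = 1 .. n
        y = y + a_k * x_k        # y(k) = y(k-1) + a(k)*x(k)
    return y                     # y = y(n)

print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

Each pass through the loop performs one multiplication and one dependent addition, which is the multiply-and-accumulate pattern that programmable DSPs support in hardware.
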
31
Design Example (cont'd)
  • Software Implementation
  • Map each * op. to a MUL instruction, and each
    + op. to an ADD instruction.
  • Allocate memory space for a(k), x(k), and
    y(k)
  • Schedule the operations by sequentially executing
    y(1) = a(1)*x(1), y(2) = y(1) + a(2)*x(2), etc.
  • Note that each instruction is still to be
    implemented in hardware.
  • Hardware Implementation
  • Map each * op. to a multiplier, and each + op. to
    an adder.
  • Interconnect them according to the dependence
    graph

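[Figure: hardware implementation. Multipliers and adders interconnected as in the dependence graph, from y(0) to y(n), with inputs a(1) x(1), a(2) x(2), ..., a(n) x(n).]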
32
Observations
  • Eventually, an implementation is realized with
    hardware.
  • However, by using the same hardware to realize
    different operations at different times
    (scheduling), we have a software program!
  • Bottom line: hardware/software co-design. There
    is a continuum between hardware and software
    implementation.
  • A design must explore both simultaneously to
    achieve the best performance/cost trade-off.

33
A Theme
  • Matching hardware to algorithm
  • Hardware architecture must match the
    characteristics of the algorithm.
  • Example: an ASIC architecture is designed to
    implement a specific algorithm, and hence can
    achieve superior performance.
  • Formulate algorithm to match hardware
  • Algorithms must be formulated so that they can
    best exploit the potential of the architecture.
  • Example: GPP and PDSP architectures are fixed. One
    must formulate the algorithm properly to achieve
    the best performance, e.g., to minimize the number
    of operations.

34
Algorithm Reformulation
  • Matching algorithm to architectural features
  • Similar to optimizing assembly code
  • Exploiting equivalence between different
    operations
  • Reformulation methods
  • Equivalent ordering of execution
  • (ab)c = a(bc)
  • Equivalent operation with a particular
    representation
  • a*2 is the same as left-shifting a by 1 bit in
    binary representation
  • Algorithmic level equivalence
  • Different filter structures implementing the same
    specification!
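
Two of the reformulation methods above can be demonstrated in Python. The sketch shows multiplication by 2 as a 1-bit left shift, and associativity (shown here for addition, as an assumed illustration) used to regroup a sum into a tree so that independent additions could run in parallel. The function names are hypothetical.

```python
def times_two_by_shift(a):
    """Multiply an integer by 2 using a 1-bit left shift."""
    return a << 1

def sum_sequential(values):
    """((v0 + v1) + v2) + ... : a chain of dependent additions."""
    total = 0
    for v in values:
        total += v
    return total

def sum_tree(values):
    """Pairwise (tree) grouping of the same additions; depth is about log2(n)."""
    values = list(values)
    if not values:
        return 0
    while len(values) > 1:
        # Additions within one level are independent of each other.
        paired = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            paired.append(values[-1])  # odd element passes through unchanged
        values = paired
    return values[0]

data = [1, 2, 3, 4, 5, 6, 7]
print(times_two_by_shift(21), sum_sequential(data), sum_tree(data))
```

For integers both orderings give identical results. With floating-point or saturating arithmetic, regrouping can change the rounding or clipping behavior, so the equivalence has to be checked against the target number representation.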

35
Algorithm Reformulation (2)
  • Exploiting parallelism
  • Regular iterative algorithms and loop
    reformulation
  • Well studied in parallel compiler technology
  • Signal flow/Data flow representation
  • Suitable for specification of pipelined
    parallelism

36
Mapping Algorithm to Architecture
  • Scheduling and Assignment Problem
  • Resources: hardware modules and time slots
  • Demands: operations (algorithm) and throughput
  • Constrained optimization problem
  • Minimize resources (objective function) to meet
    demands (constraints)
  • For regular iterative algorithms and regular
    processor arrays → algebraic mapping.

37
Mapping Algorithms to Architectures
  • Irregular multi-processor architecture
  • Linear programming
  • Heuristic methods
  • Algorithm reformulation for recursions.
  • Instruction level parallelism
  • MMX instruction programming
  • Related to optimizing compilation.

38
Arithmetic
  • CORDIC
  • Compute elementary functions
  • Distributed arithmetic
  • ROM based implementation
  • Redundant representation
  • Eliminate carry propagation
  • Residue number system

39
Low Power Design
  • Device level low power design
  • Logic level low power design
  • Architectural level low power design
  • Algorithmic level low power design