C64x DSP in Embedded Systems - PowerPoint PPT Presentation

1
C64x DSP in Embedded Systems
  • Matthias Kassner
  • OMAP FAE
  • Texas Instruments

2
History of TI DSPs
  • 1982 First TI DSP
  • TMS320C10
  • 20MHz / 5MIPS (58,000 transistors)
  • 1990 First TI TMS320C50 DSP
  • 60MHz / 30MIPS (1 Million transistors)
  • More than two billion C5x family DSPs have been
    sold to date
  • 1997 First C62 and C67 DSPs
  • 200MHz / 1600MIPS
  • 2001 First C64x DSP
  • 600MHz / 4800MIPS
  • 2004 First 1GHz C64x DSP
  • Used for high performance DSP applications
  • 2006 First C64x DSP in OMAP
  • 330MHz
  • Used as low power multimedia accelerator

3
TMS320C10 Architecture
4
The Basic DSP Algorithm
  • Coefficients a_k, input samples x_k, results y_n
  • $y_n = \sum_k a_k x_k$
  • Example with four terms: $y_0 = a_0 x_0 + a_1 x_1 + a_2 x_2 + a_3 x_3$
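
The sum of products above can be sketched in C as a multiply-accumulate loop. This is only an illustration: the function name `sum_of_products`, the 16-bit inputs and the 32-bit accumulator are assumptions of this sketch and do not come from the slides, and overflow handling is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of products: y = a[0]*x[0] + a[1]*x[1] + ... + a[n-1]*x[n-1].
 * Assumption of this sketch: 16-bit coefficients and samples,
 * accumulated in 32 bits without overflow checking. */
int32_t sum_of_products(const int16_t *a, const int16_t *x, size_t n)
{
    int32_t y = 0;
    for (size_t k = 0; k < n; k++) {
        /* One multiply-accumulate (MAC) per term of the sum. */
        y += (int32_t)a[k] * x[k];
    }
    return y;
}
```

With a = {1, 2, 3, 4} and x = {5, 6, 7, 8}, the four-term case gives 5 + 12 + 21 + 32 = 70.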
5
C64x DSP Hardware
6
C64x DSP Core
7
C64x Functional Units
  • C64x DSP core uses four different functional
    units
  • Each unit provides specific capabilities
  • Many instructions can be executed on more than
    one unit

8
Pipeline - Basics
  • Processor instructions typically require several
    consecutive activities
  • Fetch program word from memory
  • Decode (determine from the program word what to
    do)
  • Execute the command
  • Write (store the result in a register or in
    memory)
  • Problem: Each of these activities takes at least
    one cycle
  • Execution of every instruction would take at
    least four cycles
  • Idea: Start executing the next instruction
    before the current instruction has completed
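
The effect of overlapping instructions can be estimated with a simple cycle count. The sketch below assumes an ideal pipeline in which every stage takes exactly one cycle and nothing ever stalls. The function names are hypothetical and used only for illustration.

```c
/* Cycles needed when every instruction passes through all stages
 * before the next instruction is started. */
unsigned long cycles_unpipelined(unsigned long instructions, unsigned stages)
{
    return instructions * stages;
}

/* Cycles needed in an ideal pipeline (assumption: one cycle per stage,
 * no stalls, no branches). The first instruction needs `stages` cycles,
 * every further instruction completes one cycle after its predecessor. */
unsigned long cycles_pipelined(unsigned long instructions, unsigned stages)
{
    if (instructions == 0) {
        return 0;
    }
    return stages + (instructions - 1);
}
```

For 100 instructions and the four stages listed above, this gives 400 cycles without pipelining and 103 cycles with it.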

9
Pipeline - Basics
10
Pipeline Advantages
  • Big improvement in performance for linear program
    sequences
  • Improvement increases with the number of pipeline
    stages
  • Better hardware partitioning
  • Many smaller hardware blocks instead of one big
    block
  • Better performance with slow operations
  • Pipeline stages can be added for slow operations
    such as memory accesses

11
Pipeline Disadvantages
  • Pipeline must be full to provide the advantages
  • Pipeline latency (time to execute the first
    instruction) equals the pipeline length
  • Operations such as branches can result in an
    empty pipeline
  • Pipeline can introduce so-called pipeline hazards
  • Result of an instruction is needed by the next
    instruction before it is available
  • Protected Pipelines automatically wait for the
    result to be available
  • Pipeline protection complexity increases with
    pipeline length
  • Most high-performance processors (such as C64x)
    use unprotected pipelines
  • Pipeline can result in resource conflicts
  • Different stages might try to access the same
    resources (e.g. Memories)
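
The difference between a protected and an unprotected pipeline can be illustrated with a toy model. This is a minimal sketch, not a description of the real C64x pipeline: the register file size, the one-cycle extra latency and all names (`pipe_model`, `issue`, `read_reg`, `read_after_write`) are assumptions made for this example.

```c
#define NUM_REGS    4
#define MAX_PENDING 8

typedef struct {
    int reg;          /* destination register */
    int value;        /* result that will be written */
    int ready_cycle;  /* first cycle in which the result is visible */
} pending_write;

typedef struct {
    int regs[NUM_REGS];
    pending_write pending[MAX_PENDING];
    int n_pending;
    int cycle;
} pipe_model;

/* Move results whose latency has elapsed into the register file. */
void commit_ready(pipe_model *p)
{
    int kept = 0;
    for (int i = 0; i < p->n_pending; i++) {
        if (p->pending[i].ready_cycle <= p->cycle) {
            p->regs[p->pending[i].reg] = p->pending[i].value;
        } else {
            p->pending[kept++] = p->pending[i];
        }
    }
    p->n_pending = kept;
}

/* Advance the model by one cycle. */
void next_cycle(pipe_model *p)
{
    p->cycle++;
    commit_ready(p);
}

/* Issue an instruction whose result arrives `delay` cycles later than
 * that of a single-cycle instruction, then move on to the next cycle. */
void issue(pipe_model *p, int reg, int value, int delay)
{
    if (p->n_pending < MAX_PENDING) {
        pending_write w = { reg, value, p->cycle + 1 + delay };
        p->pending[p->n_pending++] = w;
    }
    next_cycle(p);
}

int has_pending(const pipe_model *p, int reg)
{
    for (int i = 0; i < p->n_pending; i++) {
        if (p->pending[i].reg == reg) {
            return 1;
        }
    }
    return 0;
}

/* Read a source register. A protected pipeline stalls until the
 * outstanding result has arrived; an unprotected one reads whatever
 * is in the register at that moment. */
int read_reg(pipe_model *p, int reg, int protected_pipeline)
{
    while (protected_pipeline && has_pending(p, reg)) {
        next_cycle(p);  /* stall cycle */
    }
    return p->regs[reg];
}

/* r0 holds 1. One instruction writes 42 to r0 with one cycle of extra
 * latency, and the very next instruction reads r0. */
int read_after_write(int protected_pipeline)
{
    pipe_model p = { .regs = { 1, 0, 0, 0 } };
    issue(&p, 0, 42, 1);
    return read_reg(&p, 0, protected_pipeline);
}
```

In this model `read_after_write(0)` returns the stale value 1, while `read_after_write(1)` stalls for one cycle and returns 42. On an unprotected pipeline, the programmer or the code generation tools have to schedule instructions so that such early reads do not happen.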

12
Scalar, Super-scalar and VLIW Architectures
13
OMAP
14
Application Processors
Communications Processor
Applications Processor
User Interface
Air Interface
  • Real-time media processing
  • RTOS
  • Layers 1-3 of Comm Modem
  • Non real-time appl control
  • Advanced O/S
  • User Interface

15
OMAP TI Application Processor Family
OMAP1710
OMAP2420
OMAP3430
OMAP2430
Multimedia Processor (high-end)
Smartphone Processor (with Modem)
OMAP850
OMAPV1030
OMAPV1230
OMAP-DM290
OMAP-DM299
OMAP-DM510
Multimedia Accelerators
16
OMAP3430 High-level Architecture
17
C64x Integration into OMAP3430
  • Block diagram of the IVA22 Megacell (bus widths
    shown between blocks: 32b, 64b, 4 x 64b, 256b)
  • C64x DSP Core with A Registers and B Registers,
    each side with M, L, D and S units
  • L1 Program Memory Cache, L1 Data Memory Cache,
    L2 Unified Memory Cache, Extended Memory
    Controller
  • IVA Interrupt Controller (48 Interrupts, NMI),
    IVA DMA (20 DMA Requests), IVA MMU, Local
    Resource Controller, Config, Host Interface
  • Outside the megacell: PRCM (1 Wake-up Event),
    ARM Interrupt Controller, OMAP3430 Modules,
    OMAP3430 System Interconnect
18
DSP Operating System
  • DSP/BIOS

19
What is DSP/BIOS?
  • DSP/BIOS is a modular DSP Operating System
  • Deterministic scheduler
  • Low footprint (only uses needed modules)
  • Low latency
  • Used in products of most handset manufacturers
  • Supports all TI DSPs
  • Easy to use with graphical configuration
    interface
  • Free of charge, no royalties

20
DSP/BIOS Modules
21
Example - Interrupt Vector Setup
22
Real-time Analysis
Message Logs

CPU Load
Thread Statistical Information
Execution Graph (Software Logic Analyzer)
23
Debug Tools
24
Debug Basics
25
How does debugging work?
  • Debugging requires data and command exchange
    between the Host (PC) and the Target processor
    (DSP).
  • Data Exchange enables
  • Program Download
  • Data manipulation (Register Setting, Memory
    read/write...)
  • Command Exchange enables
  • Execution control (Breakpoints, Watchpoints,
    Single Step...)
  • Processor State control (Reset, Restart, Run,
    Halt...)
  • Two basic options exist for this data exchange
  • Use a combination of
  • Target Hardware for physical interfacing and
  • Target Software for execution control
  • → Bootloader
  • Use dedicated Target Hardware only → JTAG
    interface

26
The JTAG Interface
  • Developed to test devices that are already
    soldered onto boards
  • Old Problem: How to access the device pins?
  • New Solution: Boundary Scan Buffer
  • Boundary Scan Buffers
  • Critical signals are not routed directly from the
    core to the pins
  • Signals go through special Boundary Scan Buffers
  • Buffer state mirrors the signal level on the
    signal line
  • Buffer state can be set and read
  • Pins can be disconnected from core
  • No interaction with external circuitry
  • Internal signals can be set / read independently
    of external world

27
JTAG Scan Chain
  • Inputs / outputs of the buffers are not brought
    out individually
  • They are chained together like a shift register
  • Required Signals
  • Input
  • Output
  • Clock
  • Control
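
The chaining of the buffers can be sketched as a shift register in C. In IEEE 1149.1 the four signals are called TDI (input), TDO (output), TCK (clock) and TMS (control). The sketch below models only the data path: one call to `scan_shift` stands for one clock. The chain length of 8 and all type and function names are assumptions made for illustration; the control state machine driven by TMS is not modelled.

```c
#define CHAIN_LEN 8  /* number of boundary scan cells, chosen arbitrarily */

typedef struct {
    unsigned char cell[CHAIN_LEN];  /* one bit per boundary scan cell */
} scan_chain;

/* One clock: the input bit enters cell 0, every cell passes its bit on
 * to the next cell, and the bit of the last cell appears at the output. */
int scan_shift(scan_chain *c, int in_bit)
{
    int out_bit = c->cell[CHAIN_LEN - 1];
    for (int i = CHAIN_LEN - 1; i > 0; i--) {
        c->cell[i] = c->cell[i - 1];
    }
    c->cell[0] = (unsigned char)(in_bit & 1);
    return out_bit;
}

/* Shift a complete pattern through the chain. Bit i of `new_bits` is
 * shifted in on clock i, and bit i of the return value is the bit that
 * was shifted out on clock i. The old cell contents are read while the
 * new contents are loaded. */
unsigned scan_exchange(scan_chain *c, unsigned new_bits)
{
    unsigned captured = 0;
    for (int i = 0; i < CHAIN_LEN; i++) {
        int out_bit = scan_shift(c, (int)((new_bits >> i) & 1u));
        captured |= (unsigned)out_bit << i;
    }
    return captured;
}
```

Loading 0xA5 into an all-zero chain returns 0, and a following exchange with any other pattern returns 0xA5. This is how a debugger can set and read all buffer states through a single input and a single output pin.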

28
Debug Solutions
29
TI Code Composer Studio
  • CCS is an integrated development environment for
    DSP and ARM processors
  • It integrates
  • Editor
  • Code Generation Tools (Compiler, Assembler,
    Linker)
  • DSP/BIOS Operating System Tools
  • Debugger with Breakpoint, Probepoint Capability
  • Real Time Data Exchange between Host and Target
  • It is flexible
  • Can be extended with user-written Plug-ins
  • Standardized API
  • Will soon move to Eclipse environment

30
CCS IDE
Menu Bar
Icon Bars
Source Code Editor
Project View
Output Windows
Message View
Status Bar
31
Lauterbach Debugger
  • Lauterbach is the industry standard ARM debugger
  • Fast, efficient and stable
  • Very light-weight, fast and fully-customizable
    GUI
  • Rock-solid
  • It is nearly impossible to get a Lauterbach to
    crash
  • Fast JTAG access
  • Download speed in excess of 1000 Kbytes per
    second to ARM
  • Successful and reliable
  • Best-selling debug tool set in the world
  • Approximately 50,000 systems in use world wide
  • LB has been evaluated by all major telecoms
    outside of China
  • LB is the tool of choice in almost all of them
    (so I heard)

32
TRACE32 GUI
33
Demo
  • OMAP2430 Video Decoding by C64