Title: Architecture Description Language Driven Design Space Exploration in the Presence of Coprocessors
1 Architecture Description Language Driven Design Space Exploration in the Presence of Coprocessors
Prabhat Mishra, Nikil Dutt, Alex Nicolau
ACES Lab, Center for Embedded Computer Systems, University of California, Irvine, USA
Frédéric Rousseau
TIMA Lab, System Level Synthesis Group, University of Grenoble, France
2 Motivation
- Given
  - Application program (code)
- Objective
  - To find the best architecture: processor core? memory? coprocessor?
- Needed
  - Framework to specify the architecture and to quickly generate the toolkit
3 Embedded System Design
- What kind of system?
  - Coprocessor inside or outside of the processor core
- What kind of problems do we deal with?
  - compare coprocessor models
  - compare coprocessor vs. functional unit
  - memory architecture
  - optimize memory access
- How to implement a specific functionality?
  - using a coprocessor (in or out of the processor core)
  - implementing this functionality in software
  - adding a functional unit in the processor
  - modifying the existing functional unit
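4 The flow in our approach
[Figure: exploration flow. A Processor Core, Memory Subsystem, and Coprocessor are drawn from the Processor, Memory, and Coprocessor IP Libraries and described in the EXPRESSION ADL. The EXPRESS Compiler compiles the Application into an object program, the SIMPRESS Simulator runs it, and Feedback from the results guides the next choice of architecture.]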
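The feedback loop of this flow can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the library entries, their numbers, and `estimate_cycles` are hypothetical stand-ins (a toy cost model takes the place of compiling with EXPRESS and simulating with SIMPRESS), and exhaustive enumeration is used here for simplicity rather than because the slides prescribe it.

```python
from itertools import product

# Hypothetical IP libraries: names and numbers are illustrative only.
PROCESSOR_CORES = {"vliw_4way": 4, "vliw_8way": 8}        # issue width
MEMORY_SUBSYSTEMS = {"small_cache": 6, "large_cache": 3}  # avg. access latency
COPROCESSORS = {"none": 1.0, "vector_mult": 0.5}          # share of work left on the core


def estimate_cycles(core, memory, copro, work=1000):
    """Toy stand-in for 'compile, then simulate' (NOT the real tools).

    Cycles grow with memory latency and shrink with issue width
    and with the amount of work offloaded to the coprocessor.
    """
    issue_width = PROCESSOR_CORES[core]
    latency = MEMORY_SUBSYSTEMS[memory]
    remaining = COPROCESSORS[copro]
    return work * remaining * latency / issue_width


def explore():
    """Evaluate every (core, memory, coprocessor) triple; return the best one."""
    results = {
        config: estimate_cycles(*config)
        for config in product(PROCESSOR_CORES, MEMORY_SUBSYSTEMS, COPROCESSORS)
    }
    best = min(results, key=results.get)
    return best, results[best]


best_config, best_cycles = explore()
print(best_config, best_cycles)
```

In an actual exploration the evaluation step would be a compile-and-simulate run on the real application, and the designer would typically use the feedback to prune the space instead of enumerating it.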
5 Experiments
We performed co-exploration of the TI C6201 architecture in the presence of a coprocessor, using DSPStone benchmarks.
- Objective
  - to show that a quick design space exploration is possible
- Example
  - performance impact of using a coprocessor to support vector multiplication
- Benchmark used
  - DSPStone fixed-point benchmarks with multiplications
6 TI C62x Architecture with Coprocessor
7 ADL description of the TI C6211 with Coprocessor
The TI C6211 is an 8-way VLIW DSP:
- 4 fetch stages
- 2 decode stages
- 8 functional units
- 1 coprocessor
ADL description:
(PIPELINE PG PS PW DP DC Execute)
(Execute (ALTERNATE COPRO L1 S1 M1 D1 D2 M2 S2 L2))
(COPRO (PIPELINE CP_1 EMIF_1 Copro CP_2 EMIF_2))
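To make the nesting of this description explicit, the sketch below reads it as a set of parenthesized lists. It is a minimal s-expression reader written for illustration, not the EXPRESSION toolchain's parser; the function names are hypothetical, and the input is the description from this slide with the ALTERNATE keyword spelled out and the parentheses balanced.

```python
def tokenize(text):
    """Split the text into '(' / ')' and symbol tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one parenthesized form (or one symbol) from the front of `tokens`."""
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return node
    return token


def parse_all(text):
    """Parse every top-level form in `text` into nested Python lists."""
    tokens = tokenize(text)
    forms = []
    while tokens:
        forms.append(parse(tokens))
    return forms


ADL = """
(PIPELINE PG PS PW DP DC Execute)
(Execute (ALTERNATE COPRO L1 S1 M1 D1 D2 M2 S2 L2))
(COPRO (PIPELINE CP_1 EMIF_1 Copro CP_2 EMIF_2))
"""

forms = parse_all(ADL)
top_pipeline = forms[0][1:]      # stages of the main pipeline
execute_units = forms[1][1][1:]  # alternatives inside Execute
copro_stages = forms[2][1][1:]   # stages of the coprocessor pipeline

print(top_pipeline)
print(execute_units)
print(copro_stages)
```

Read this way, the coprocessor is one alternative of the Execute stage alongside the eight functional units, and it is itself a five-stage pipeline.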
8 Coprocessor description
- CP_1: decodes the instruction and determines the size of the input data and its starting address in the main memory
- EMIF_1: asks the DMA to read the data from the main memory (and save it in the MemoryCopro)
- COPRO: performs the computations
- CP_2: determines the size of the results and the starting address in the main memory where they are to be stored
- EMIF_2: requests the DMA to read the values from the MemoryCopro and write them to the main memory
- MemoryCopro: coprocessor local memory
- DMA: transfers data between the MemoryCopro and the main memory
[Figure: coprocessor pipeline CP_1, EMIF_1, COPRO, CP_2, EMIF_2, with the DMA and the MemoryCopro]
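The data movement through these stages can be traced with a small functional model. This is a sketch under stated assumptions: memories are plain Python lists of words, the local-memory layout (a, then b, then z) and the function names are hypothetical, and no timing is modelled.

```python
def dma_copy(src, src_addr, dst, dst_addr, size):
    """DMA: copy `size` words from one memory to another."""
    dst[dst_addr:dst_addr + size] = src[src_addr:src_addr + size]


def run_vect_mult(main_memory, addr_a, addr_b, addr_z, n):
    """Walk one vector multiplication through the five coprocessor stages."""
    memory_copro = [0] * (3 * n)  # assumed local layout: a | b | z

    # CP_1: decode; determine input size and starting addresses.
    input_size = n

    # EMIF_1: ask the DMA to load both operands into the local memory.
    dma_copy(main_memory, addr_a, memory_copro, 0, input_size)
    dma_copy(main_memory, addr_b, memory_copro, n, input_size)

    # COPRO: element-wise multiplication inside the local memory.
    for i in range(n):
        memory_copro[2 * n + i] = memory_copro[i] * memory_copro[n + i]

    # CP_2: determine result size and destination address.
    result_size = n

    # EMIF_2: ask the DMA to write the results back to main memory.
    dma_copy(memory_copro, 2 * n, main_memory, addr_z, result_size)


main_memory = [1, 2, 3, 4, 5, 6, 0, 0, 0]  # a at 0, b at 3, z at 6
run_vect_mult(main_memory, addr_a=0, addr_b=3, addr_z=6, n=3)
print(main_memory[6:9])
```

The processor core only sees main memory before and after the call; all intermediate traffic stays between the DMA and the coprocessor's local memory.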
9 Experiments: equivalent program
// Instruction for the coprocessor
VectMult(a, b, z, n)
// Multiplication in the application program
for (i = 0; i < n; i++) z[i] = a[i] * b[i];
// Translated pseudo assembly that can run on the TI C62x
    MOV i, 0
L5: LOAD x, mem_address(a[i])
    LOAD y, mem_address(b[i])
    MUL t, x, y
    STORE t, mem_address(z[i])
    INC i
    LT cc, i, n
    IF cc L5
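As a sanity check on the equivalence, the sketch below mirrors the pseudo assembly one instruction at a time in Python and compares it with a direct element-wise product. The function names are hypothetical, and the counter tallies executed pseudo-instructions only; it is not a cycle count for the VLIW machine, which can issue several of these in parallel. Like the assembly, the loop tests its condition at the bottom, so n >= 1 is assumed.

```python
def vect_mult_reference(a, b):
    """What the application loop computes: z[i] = a[i] * b[i]."""
    return [x * y for x, y in zip(a, b)]


def vect_mult_pseudo_asm(a, b):
    """Mirror the pseudo assembly; return (z, executed instruction count).

    Assumes len(a) == len(b) >= 1, since the loop body runs before the test.
    """
    n = len(a)
    z = [0] * n
    executed = 0

    i = 0                  # MOV i, 0
    executed += 1
    while True:            # L5:
        x = a[i]           # LOAD x, mem_address(a[i])
        y = b[i]           # LOAD y, mem_address(b[i])
        t = x * y          # MUL t, x, y
        z[i] = t           # STORE t, mem_address(z[i])
        i += 1             # INC i
        cc = i < n         # LT cc, i, n
        executed += 7      # the six instructions above plus the branch below
        if not cc:         # IF cc L5
            break
    return z, executed


a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
z, executed = vect_mult_pseudo_asm(a, b)
print(z, executed)
```

The software loop executes 1 + 7n pseudo-instructions for n elements, whereas the coprocessor version is a single VectMult instruction from the core's point of view.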
10 Performance with or without coprocessor
11 Performance analysis for Convolution benchmark with coprocessor