Title: Efficient Code Synthesis from Extended Dataflow Graphs for Multimedia Applications
1Efficient Code Synthesis from Extended Dataflow
Graphs for Multimedia Applications
- Hyunok Oh and Soonhoi Ha
- Seoul National University
- Korea
2Outline
- Introduction
- Buffer minimization techniques
- Extended dataflow
- FRDF (Fractional rate dataflow)
- Buffer sharing
- Conclusion
3Code Synthesis from Dataflow Graph
Dataflow program graph
H.263 Encoder
4Software Synthesis from Dataflow Model
- Why?
- DSP applications
- COSSAP, GRAPE, SPW
- Design reuse
- IP blocks
- Why not?
- Limited expression power
- Resource
- Memory
5Software Synthesis Example
SDF (Synchronous Dataflow)
[Figure: SDF graph with nodes A, B, C, D; the sample rates on the edges are 1 or 2]
Schedule: 2(A) C B 2(D)
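As a rough illustration of what code synthesized from a looped schedule looks like, the sketch below turns the schedule 2(A)CB2(D) into plain C loops. The node bodies are placeholders that only record their firing; the names fire_A to fire_D, run_schedule and the trace buffer are assumptions made for this example, not output of the actual synthesis tool.

```c
#include <assert.h>
#include <string.h>

#define TRACE_MAX 16

static char trace[TRACE_MAX];   /* firing order, one letter per firing */
static int  trace_len;

/* Clear the recorded firing order. */
void reset_trace(void) {
    memset(trace, 0, sizeof trace);
    trace_len = 0;
}

static void fire(char node) {
    assert(trace_len < TRACE_MAX - 1);
    trace[trace_len++] = node;
    trace[trace_len] = '\0';
}

/* Placeholder node bodies; real code would read and write edge buffers. */
static void fire_A(void) { fire('A'); }
static void fire_B(void) { fire('B'); }
static void fire_C(void) { fire('C'); }
static void fire_D(void) { fire('D'); }

/* One iteration of the schedule 2(A)CB2(D). */
void run_schedule(void) {
    for (int i = 0; i < 2; i++) fire_A();
    fire_C();
    fire_B();
    for (int i = 0; i < 2; i++) fire_D();
}

const char *schedule_trace(void) { return trace; }
```

Calling reset_trace() and then run_schedule() leaves the firing order of one iteration in schedule_trace().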
6Memory Size Example
H.263 Encoder
[Figure: function blocks DCT, Zigzag, Q; each is a well-defined and optimized function module]
                           Code Size   Data Size
Reference Code (TMN 2.0)   47KB        361KB
7Previous Works
- Optimal scheduling
- Minimize code and data memory size
- Minimize data memory size by looping
- Buffer sharing
- Compiler field
- Lifetime analysis
8Problem and Solutions
- Problem: large buffer size
- Composite data types in multimedia applications (frame, macro block) raise the buffer requirement
- Solutions
- Extend dataflow
- Buffer sharing (buffer reuse)
9Outline
- Introduction
- Buffer minimization techniques
- Extended dataflow
- Buffer sharing
- Conclusion
10SDF Specification
H.263 Encoder in SDF
[Figure: SDF graph with nodes Read From Device, Motion Estimation, Distributor, Macro Block Encoding, Variable Length Coding, Macro Block Decoding, Motion Compensation, Write To Device; the sample rates on the edges are 1 or 99]
11Experiment of H.263 Encoder
                           Buffer size
Reference Code (TMN 2.0)   361KB
SDF                        686KB
12Motivating Example
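SDF
[Figure: ME -> D -> EN. ME reads the current frame and the previous frame; the edge data sizes are 176x144 (one frame) and 99x(16x16) (the macro blocks of one frame); all sample rates are 1 except one rate of 99]
Schedule: (ME, D) 99(EN)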
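The buffer cost of this SDF schedule can be checked with a little arithmetic. The sketch below assumes one byte per pixel (luminance only) and reads the figure as one 176x144 frame on the ME output and 99 macro blocks of 16x16 on the D output; both are assumptions of this example, as are the function names.

```c
#include <assert.h>

enum { FRAME_W = 176, FRAME_H = 144, MB = 16 };

/* Number of MB x MB macro blocks in a w x h frame (dimensions must divide evenly). */
int macroblocks_per_frame(int w, int h) {
    assert(w % MB == 0 && h % MB == 0);
    return (w / MB) * (h / MB);
}

/* Bytes buffered on the ME output edge: one whole frame per firing. */
int sdf_frame_edge_bytes(void) {
    return FRAME_W * FRAME_H;
}

/* Bytes buffered on the D output edge: all macro blocks of a frame at once. */
int sdf_macroblock_edge_bytes(void) {
    return macroblocks_per_frame(FRAME_W, FRAME_H) * MB * MB;
}
```

Under these assumptions both edges hold a full frame's worth of data (25344 bytes each) before EN starts consuming, which is the overhead the following slides set out to remove.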
13Observation
- ME (motion estimation) node in hand-optimized code
- Need not produce the frame-size output at once
- Generates output samples in units of one macro block, for short latency and minimal buffer memory

    for (i = 0; i < 99; i++) {
        MotionEstimation(motion_vector, macroblock, currFrame, prevFrame);
        DCT(macroblock);
        Quantization(macroblock);
    }
14Fractional Rate Dataflow
FRDF
[Figure: ME -> EN; the sample rates shown are 1, and each sample on the ME -> EN edge is one 16x16 macro block]
Schedule: 99(ME, EN)
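A minimal sketch of the code shape implied by the schedule 99(ME,EN): ME produces one macro block (1/99 of a frame) per firing and EN consumes it immediately, so a single 16x16 buffer is enough on the edge. The stand-in bodies me_fire and en_fire, the one-byte-per-pixel frame layout and the checksum return value are assumptions for illustration only.

```c
#include <assert.h>
#include <string.h>

enum { FRAME_W = 176, FRAME_H = 144, MB = 16,
       MB_PER_ROW = FRAME_W / MB,
       MB_COUNT = (FRAME_W / MB) * (FRAME_H / MB) };

/* Stand-in for ME: copy macro block k of the frame into mb. */
static void me_fire(const unsigned char *frame, int k, unsigned char mb[MB * MB]) {
    int x0 = (k % MB_PER_ROW) * MB;
    int y0 = (k / MB_PER_ROW) * MB;
    assert(k >= 0 && k < MB_COUNT);
    for (int y = 0; y < MB; y++)
        memcpy(mb + y * MB, frame + (y0 + y) * FRAME_W + x0, MB);
}

/* Stand-in for EN: consume one macro block; here it only sums the pixels. */
static unsigned long en_fire(const unsigned char mb[MB * MB]) {
    unsigned long sum = 0;
    for (int i = 0; i < MB * MB; i++) sum += mb[i];
    return sum;
}

/* Schedule 99(ME,EN): the 16x16 edge buffer is reused in every iteration. */
unsigned long encode_frame(const unsigned char *frame) {
    unsigned char mb[MB * MB];      /* the only ME -> EN buffer: 256 bytes */
    unsigned long total = 0;
    for (int k = 0; k < MB_COUNT; k++) {
        me_fire(frame, k, mb);
        total += en_fire(mb);
    }
    return total;
}
```

Compared with the SDF schedule, the edge between ME and EN shrinks from a frame-sized buffer to one macro block.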
15Experiment
H.263 Encoder in FRDF
[Figure: FRDF graph with nodes Read From Device, Motion Estimation, Macro Block Encoding, Variable Length Coding, Macro Block Decoding, Motion Compensation, Write To Device; the sample rates on the edges are 1 or the fractional rate 1/99]
16Experiment of H.263 Encoder
                          Reference Code (TMN 2.0)   SDF     FRDF
Buffer size               361KB                      686KB   225KB
No. of frame-type data    5                          8       3
- refer to LCTES/SCOPES 2002
17Outline
- Introduction
- Buffer minimization techniques
- Extended dataflow
- Buffer sharing
- Problem definition
- Buffer sharing with same-size samples
- Buffer sharing with different-size samples
- Conclusion
18Motivating Example
Simplified H.263 Encoder
[Figure: dataflow graph with nodes ME, Trans, InvTrans; all sample rates are 1]
19Previous Approach
[Figure: the simplified H.263 encoder (ME, Trans, InvTrans) with one buffer per edge, a, b and c, and the buffer lifetime chart of a, b, c over ME, Trans, InvT]
3 buffers: 3 x (176x144) = 76KB
20Our Approach
Pointer buffer, data buffer
[Figure: global data buffers g1 and g2, accessed by ME, Trans and InvTrans through local pointer buffers]
2 global buffers, 6 local buffers: 2 x (176x144) + 6 x 4 = 51KB
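The two totals can be reproduced with a short calculation. The sketch assumes one byte per pixel and 4-byte pointers, which is what the figures 3x(176x144) and 6x4 on the slides suggest; the function names are hypothetical.

```c
#include <assert.h>

enum { FRAME_BYTES = 176 * 144, POINTER_BYTES = 4 };

/* Previous approach: one frame-sized data buffer per edge. */
int separate_buffers_bytes(int edges) {
    assert(edges >= 0);
    return edges * FRAME_BYTES;
}

/* Proposed approach: shared global data buffers plus pointer-sized local buffers. */
int shared_buffers_bytes(int global_buffers, int local_pointers) {
    assert(global_buffers >= 0 && local_pointers >= 0);
    return global_buffers * FRAME_BYTES + local_pointers * POINTER_BYTES;
}
```

separate_buffers_bytes(3) gives 76032 bytes (about 76KB) and shared_buffers_bytes(2, 6) gives 50712 bytes (about 51KB), so the pointer overhead is negligible next to the saved frame buffer.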
21Buffer Sharing Problem
- Determine both the local buffer sizes and the global buffer sizes
- Objective: minimize the sum of them
- Given: program graph
- Given: schedule
- Sample lifetime chart
22Proposed Heuristic Step
global data buffer size
23Outline
- Introduction
- Buffer minimization techniques
- Extended dataflow
- Buffer sharing
- Problem definition
- Buffer sharing with same-size samples
- Buffer sharing with different-size samples
- Conclusion
24Sample Lifetime Chart
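Simplified H.263 Encoder
[Figure: graph with nodes A, B, C and edges a, b, c; all sample rates are 1]
Sample lifetime chart
[Figure: lifetimes of the samples s(c,1), s(a,1), s(b,1), s(c,2) plotted against the firings of A, B, C]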
25Subproblem 1: Global Data Buffer Minimization
Interval scheduling: take out the sample lifetime with the earliest start time
[Figure: sample lifetime chart over A, B, C, before and after mapping the samples to global buffers]
g(0): s(c,1), s(b,1)
g(1): s(a,1), s(c,2)
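One common way to realize interval scheduling is a first-fit pass over the lifetimes sorted by start time. The sketch below is that generic greedy, not necessarily the authors' exact procedure; the Lifetime struct, the half-open time steps and the MAX_BUFFERS limit are assumptions of this example.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_BUFFERS 16

typedef struct {
    int id;      /* caller-chosen sample identifier */
    int start;   /* first time step at which the sample is live */
    int end;     /* first time step at which it is dead (half-open) */
    int buffer;  /* output: index of the assigned global buffer */
} Lifetime;

/* Order by start time; break ties by end time to keep the result deterministic. */
static int by_start(const void *x, const void *y) {
    const Lifetime *a = x, *b = y;
    if (a->start != b->start) return (a->start > b->start) - (a->start < b->start);
    return (a->end > b->end) - (a->end < b->end);
}

/* Sorts s by start time, fills in s[i].buffer, returns the number of buffers used. */
int assign_global_buffers(Lifetime *s, int n) {
    int free_at[MAX_BUFFERS];   /* time at which each buffer becomes free */
    int used = 0;
    assert(n >= 0);
    qsort(s, (size_t)n, sizeof *s, by_start);
    for (int i = 0; i < n; i++) {
        int b = 0;
        while (b < used && free_at[b] > s[i].start) b++;   /* first free buffer */
        assert(b < MAX_BUFFERS);
        if (b == used) used++;                              /* open a new buffer */
        free_at[b] = s[i].end;
        s[i].buffer = b;
    }
    return used;
}
```

With hypothetical lifetimes s(c,1) = [0,1), s(a,1) = [0,2), s(b,1) = [1,3) and s(c,2) = [2,3), the pass uses two buffers and places s(c,1) and s(b,1) in g(0), s(a,1) and s(c,2) in g(1).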
26Subproblem 2: Local Buffer Size Determination
[Figure: ME, Trans and InvTrans bound to the global buffers g1 and g2, shown once with dynamic binding and once with static binding; all sample rates are 1]
Dynamic binding: local buffer size = the maximum number of live samples at any time
Static binding: local buffer size = ?
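The rule "maximum number of live samples at any time" can be evaluated directly from the lifetimes of the samples on one edge. The following is a small sketch under the assumption that lifetimes are given as half-open integer intervals; the Interval type and the example times are hypothetical.

```c
#include <assert.h>

typedef struct { int start, end; } Interval;  /* live during [start, end) */

/* Maximum number of intervals that are live at the same time step. */
int max_live_samples(const Interval *s, int n) {
    int best = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        /* The maximum overlap is always reached at the start of some interval. */
        int live = 0;
        for (int j = 0; j < n; j++)
            if (s[j].start <= s[i].start && s[i].start < s[j].end) live++;
        if (live > best) best = live;
    }
    return best;
}
```

For an edge whose samples never overlap in time, such as lifetimes [0,1), [2,4), [5,7), the function returns 1, so one pointer suffices for that edge under this rule.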
27Subproblem 3: Repetition Period
[Figure: global buffer allocation over the firings of A, B, C]
g(0): s(c,1), s(b,1)
g(1): s(a,1), s(c,2)
28Code Generation with Static Binding
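    struct Frame { unsigned char pixel[176 * 144]; };   /* assumed frame layout */
    enum { max_iteration = 10 };                        /* assumed iteration count */

    struct Frame g[2];                                  /* Global buffer */

    int main(void) {
        /* Local buffer: pointers into the global buffer */
        struct Frame *a[2] = {g + 1, g}, *b[2] = {g, g + 1}, *c[2] = {g, g + 1};
        int in_A = 0, out_A = 0, in_B = 0, out_B = 0, in_C = 0, out_C = 1;
        for (int i = 0; i < max_iteration; i++) {
            // A's code. Use c[in_A] and a[out_A].
            in_A = (in_A + 1) % 2; out_A = (out_A + 1) % 2;   /* Buffer offset update */
            // B's code. Use a[in_B] and b[out_B].
            in_B = (in_B + 1) % 2; out_B = (out_B + 1) % 2;
            // C's code. Use b[in_C] and c[out_C].
            in_C = (in_C + 1) % 2; out_C = (out_C + 1) % 2;
        }
        return 0;
    }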
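29Code Generation with Dynamic Binding
    struct Frame { unsigned char pixel[176 * 144]; };   /* assumed frame layout */
    enum { max_iteration = 10 };                        /* assumed iteration count */

    struct Frame g[2];                                  /* Global buffer */

    int main(void) {
        /* Local buffer: a and b are single pointers, c keeps two */
        struct Frame *a, *b, *c[2] = {g, g + 1};
        int in_A = 0, out_C = 1;
        for (int i = 0; i < max_iteration; i++) {
            a = c[(i + 1) % 2];                         /* Pointer update */
            // A's code. Use c[in_A] and a.
            in_A = (in_A + 1) % 2;                      /* Buffer offset update */
            b = c[i % 2];                               /* Pointer update */
            // B's code. Use a and b.
            // C's code. Use b and c[out_C].
            out_C = (out_C + 1) % 2;
        }
        return 0;
    }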
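To see that the two code generation styles address the same global buffers, the sketch below replays the index arithmetic of the static and dynamic variants and records which global buffer (0 for g[0], 1 for g[1]) each edge uses in every iteration. The Use struct and both function names are assumptions made for this check; the table contents and update rules follow the two code slides.

```c
#include <assert.h>

/* Global-buffer indices touched in one iteration. */
typedef struct { int c_in, a, b, c_out; } Use;

/* Static binding: fixed pointer tables, rotating offsets. */
void static_binding(Use *use, int iterations) {
    const int a[2] = {1, 0}, b[2] = {0, 1}, c[2] = {0, 1};
    int in_A = 0, out_A = 0, out_B = 0, out_C = 1;
    assert(iterations >= 0);
    for (int i = 0; i < iterations; i++) {
        use[i].c_in = c[in_A];
        use[i].a = a[out_A];
        use[i].b = b[out_B];
        use[i].c_out = c[out_C];
        in_A = (in_A + 1) % 2;   out_A = (out_A + 1) % 2;
        out_B = (out_B + 1) % 2; out_C = (out_C + 1) % 2;
    }
}

/* Dynamic binding: a and b are single pointers re-bound in every iteration. */
void dynamic_binding(Use *use, int iterations) {
    const int c[2] = {0, 1};
    int in_A = 0, out_C = 1;
    assert(iterations >= 0);
    for (int i = 0; i < iterations; i++) {
        int a = c[(i + 1) % 2];
        int b = c[i % 2];
        use[i].c_in = c[in_A];
        use[i].a = a;
        use[i].b = b;
        use[i].c_out = c[out_C];
        in_A = (in_A + 1) % 2;
        out_C = (out_C + 1) % 2;
    }
}
```

Both variants alternate between the two global buffers with a period of two iterations; the dynamic variant trades four of the six pointers for two pointer assignments per iteration.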
30Outline
- Introduction
- Buffer minimization techniques
- Extended dataflow
- Buffer sharing
- Problem definition
- Buffer sharing with same-size samples
- Buffer sharing with different-size samples
- Conclusion
31Global Buffer Minimization
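[Figure: dataflow graph with nodes A to F and edges a to f whose samples have different sizes (sample rates 100, 99 and 1), together with the lifetime chart of the samples a to f]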
32Repetition Period of Sample Lifetime Patterns
[Figure: sample lifetime pattern after interval allocation; labels on the slide: offset, x, y, h, t]
33Experiments of Buffer Sharing
Example                   JPEG     MP3     Simplified H.263   H.263
# of samples              6        336     3                  1804
No sharing                1536B    36KB    111KB              659KB
Sharing of same size      512B     23KB    111KB              510KB
Sharing w/o separation    512B     11KB    111KB              510KB
Sharing with separation   -        -       74KB               396KB
34Synthesized Code vs. Reference Code
H.263 Encoder
                             Buffer size
Reference Code (TMN 2.0)     361KB
SDF                          686KB
FRDF                         225KB
FRDF + buffer sharing        219KB
35Conclusion
- Efficient code synthesis techniques from dataflow graphs for multimedia applications
- Extend existing dataflow to FRDF
- Share buffers based on local and global buffer separation
- Buffer sharing
- Global data buffer minimization
- Repetition period of sample lifetime patterns
- Local pointer buffer determination
- Reduces the memory requirement by 37% compared with the reference code
36Appendix
http://peace.snu.ac.kr/research/peace