Memory Efficient Software Synthesis from Dataflow Graph

Wonyong Sung, Junedong Kim, Soonhoi Ha. Codesign and Parallel Processing Lab., Seoul National University


1
Memory Efficient Software Synthesis from Dataflow
Graph
  • Wonyong Sung, Junedong Kim, Soonhoi Ha
  • Codesign and Parallel Processing Lab.
  • Seoul National University

2
Contents
  • Introduction
  • Code Generation from Block Diagram Specification
  • Synchronous Data Flow and Single Appearance
    Schedule
  • Proposed Strategies
  • Optimization 1: code sharing optimization
  • Optimization 2: minimize buffer requirement
  • Experiments
  • Conclusions

3
Introduction
  • Motivations
  • Embedded systems have a limited amount of memory
  • Large program: memory cost, performance
    penalty, power consumption
  • New trend in software development: high-level
    design methodology
  • growing complexity, fast design turn-around
    time, limited budget, etc.
  • Goal of Research
  • Reduce the code and data size of automatically
    generated software
  • In an automatic software synthesis environment
  • Specification: dataflow graph with
    SDF (Synchronous DataFlow) semantics

4
Software Synthesis from SDF graph
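[Figure: SDF graph with four nodes A, B, C, D; the edge rates shown are 1, 3, 2, 2, 3, 1, 2, 2]
Possible schedules:
  • AABCABACDABABCD
  • (6A)(4B)(3C)(2D): Single Appearance Schedule (SAS)
  • (2(3A2B))(3C)(2D): Single Appearance Schedule (SAS)
Code for (6A)(4B)(3C)(2D):
    main() {
        for (i = 0; i < 6; i++) { A }
        for (i = 0; i < 4; i++) { B }
        for (i = 0; i < 3; i++) { C }
        for (i = 0; i < 2; i++) { D }
    }
Code for (2(3A2B))(3C)(2D):
    main() {
        for (i = 0; i < 2; i++) {
            for (j = 0; j < 3; j++) { A }
            for (j = 0; j < 2; j++) { B }
        }
        for (i = 0; i < 3; i++) { C }
        for (i = 0; i < 2; i++) { D }
    }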
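As a quick sanity check on this slide, the sketch below counts actor firings in the flat schedule and in the nested single appearance schedule and lets the two be compared. The function names `count_flat` and `count_nested_sas` are illustrative and not part of the presentation; actor bodies are replaced by counter increments.

```c
#include <string.h>

/* Count how often each actor A..D fires in a flat schedule string. */
void count_flat(const char *schedule, int counts[4]) {
    memset(counts, 0, 4 * sizeof(int));
    for (const char *p = schedule; *p != '\0'; p++) {
        if (*p >= 'A' && *p <= 'D') {
            counts[*p - 'A']++;
        }
    }
}

/* Mirror of the loop nest for (2(3A2B))(3C)(2D); each actor body is
   replaced by a counter increment. */
void count_nested_sas(int counts[4]) {
    int i, j;
    memset(counts, 0, 4 * sizeof(int));
    for (i = 0; i < 2; i++) {
        for (j = 0; j < 3; j++) counts[0]++;  /* A */
        for (j = 0; j < 2; j++) counts[1]++;  /* B */
    }
    for (i = 0; i < 3; i++) counts[2]++;      /* C */
    for (i = 0; i < 2; i++) counts[3]++;      /* D */
}
```

Both functions should report 6, 4, 3 and 2 firings for A, B, C and D. The schedules differ only in firing order and in how often each actor's code appears in the generated program.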
5
Previous Efforts
  • Single Appearance Schedule (SAS): APGAN, RPMC
  • by Bhattacharyya et al. in the Ptolemy group
  • SAS guarantees the minimum code size (without
    code sharing)
  • APGAN and RPMC are heuristics to find a
    data-minimized SAS
  • ILP formulation for data memory minimization
  • by Ritz et al. in the Meyr group
  • flat single appearance schedule, sharing of data
    buffers
  • Rate-optimal compile-time schedule
  • by Govindarajan et al. in the Gao group
  • tried to minimize the buffer requirement using
    linear programming
  • An algorithm to compute the smallest data buffer
    size
  • by Ade et al. in the GRAPE group

6
Proposed Strategies
  • Coding style
  • not tied to one coding style: hybrid approach
  • generated code is a mixture of inlined code and
    functions
  • Optimization 1: Code Sharing
  • Multiple instances of the same kernel are treated as
    different nodes in a SAS
  • Code sharing optimization has a gain (block size)
    and a cost (context size)
  • Optimization 2: Schedule Adjustment
  • give up the single appearance schedule to reduce the
    data size
  • (1) represent schedule information with the BTLC
    data structure
  • (2) find possible locations for adjustment
  • (3) adjust the schedule

7
Flowchart of Optimization Procedure
  • Get SAS schedule (RPMC, APGAN)
  • Code sharing optimization (inputs: code-block size, context size)
  • Schedule adjustment
  • C code generation
8
Example of Code Sharing (CD2DAT)
[Figure: CD2DAT graph with blocks ramp, sine, fir1, fir2, fir3, fir4 and xgraph]
Code before sharing:
    for (int i = 0; i < 2; i++) {
        /* code for fir1 */
        out ... tap ... input[i] ...
    }
    /* code for fir2 */
    ..
Code after sharing:
    for (int i = 0; i < 2; i++) { fir(1); }
    for (int i = 0; i < 3; i++) { fir(2); }
    void fir(int context) { ... context_FIR[context].out ... }
Context definition:
    typedef struct {
        double out;
        int output_ofs;
        int output_bs;
        int output_nx;
        ...
        double decimation;
        double tap;
    } context_FIR;
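To make the before/after contrast concrete, here is a minimal runnable sketch of one kernel shared by two FIR instances. It is not the code the tool generates: the struct is reduced to three fields, and the names `context_FIR_table`, `num_taps`, `taps1`, `taps2` and the `input` parameter are assumptions made for illustration. Contexts are indexed from 0 here, whereas the slide calls `fir(1)` and `fir(2)`.

```c
#include <stddef.h>

/* Hypothetical per-instance context, reduced to what this sketch needs. */
typedef struct {
    const double *tap;  /* filter coefficients of this instance */
    size_t num_taps;    /* number of coefficients */
    double out;         /* most recent output sample */
} context_FIR;

static const double taps1[] = {0.5, 0.5};
static const double taps2[] = {0.25, 0.5, 0.25};

/* One context per FIR instance in the graph. */
context_FIR context_FIR_table[2] = {
    {taps1, 2, 0.0},
    {taps2, 3, 0.0},
};

/* Shared kernel: every access to instance data goes through the context. */
void fir(int context, const double *input) {
    context_FIR *c = &context_FIR_table[context];
    double acc = 0.0;
    for (size_t i = 0; i < c->num_taps; i++) {
        acc += c->tap[i] * input[i];
    }
    c->out = acc;
}
```

The loop body exists once in the program regardless of how many FIR instances the graph contains. The price is the extra indirection through `c`, which is the overhead the next slide measures.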
9
Code Size Overhead (in Sparc/Solaris)
Without context (4 bytes):
    .. value ..
    ldd [%fp - 336], %o0
With context (40 bytes):
    .. (context_CGCRamp[context].value) ..
    sethi %hi(0x20800), %o1
    ld [%o1 + 0x3c8], %o0
    mov %o0, %o2
    sll %o2, 2, %o1
    add %o1, %o0, %o1
    sll %o1, 3, %o0
    add %fp, -424, %o1
    add %o1, %o0, %o2
    ld [%o2 + 0x1c], %o0
    ldd [%o0], %o2
Reference overhead: 36 bytes!
10
Optimization 1 Code sharing
  • Multiple instances of the same kernel have their
    own contexts
  • Kernel code should be transformed into a
    shared-version function
  • Shared version
  • references are made only through the context variable
  • Gain and cost of sharing
  • Gain = (instances - 1) × (code block size)
  • Cost = (instances) × (context variable size)
    + (code block overhead)
  • Code sharing is performed only when the gain is
    larger than the cost
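The gain/cost rule on this slide can be written as a small predicate. This is a sketch under one assumption: the operators lost from the slide are a multiplication and an addition, so that gain is (instances - 1) times the block size and cost is instances times the context size plus the overhead added to the shared block. Function and parameter names are illustrative; all sizes are in bytes.

```c
/* Gain: all but one copy of the kernel's code block disappears. */
long sharing_gain(long instances, long block_size) {
    return (instances - 1) * block_size;
}

/* Cost: one context per instance plus the overhead added to the shared block. */
long sharing_cost(long instances, long context_size, long block_overhead) {
    return instances * context_size + block_overhead;
}

/* Share only when the gain is strictly larger than the cost. */
int should_share(long instances, long block_size,
                 long context_size, long block_overhead) {
    return sharing_gain(instances, block_size) >
           sharing_cost(instances, context_size, block_overhead);
}
```

With made-up numbers, four instances of a 400-byte block with 40-byte contexts and 360 bytes of overhead give a gain of 1200 against a cost of 520, so sharing pays off. Two instances of a 100-byte block with the same contexts and 72 bytes of overhead give 100 against 152, so the kernel stays inlined.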

11
Decision Formula
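(The Greek symbols of the original slide are lost in this transcript; the letters below are stand-ins.)
(1) $\Delta$: code sharing overhead, $\Delta = \Delta_{context} + \Delta_{reference}$
(2) $\Delta_{context} = \sum_{p_i \in ports} \sigma(p_i)$, where $\sigma(x) = 3\,\mathrm{sizeof(int)} + \mathrm{sizeof(pointer)}$
(3) $\Delta_{reference} = \sum_{t \in \{S, C, AS, AP\}} \rho(t) \times \omega(t)$, where $\rho(t)$ is the reference count, $\omega(t)$ the unit overhead, and $t$ the type of reference
(4) $\beta$: code block size
(5) $\eta$: number of instances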
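The two overhead terms of this slide can be sketched in C as follows. The reading assumed here is that the context overhead is a sum over ports of 3 × sizeof(int) + sizeof(pointer), and the reference overhead is a sum over the four reference types S, C, AS, AP of reference count times unit overhead. The enum and function names are illustrative, and the meaning of the four type labels is not spelled out in the transcript.

```c
#include <stddef.h>

/* The four reference types named on the slide. */
enum { REF_S, REF_C, REF_AS, REF_AP, REF_TYPES };

/* Context bytes contributed by one port: three ints plus one pointer. */
size_t port_context_size(void) {
    return 3 * sizeof(int) + sizeof(void *);
}

/* Context overhead: sum of the per-port sizes over all ports of the kernel. */
size_t context_overhead(size_t num_ports) {
    return num_ports * port_context_size();
}

/* Reference overhead: sum over reference types of count times unit overhead. */
size_t reference_overhead(const size_t count[REF_TYPES],
                          const size_t unit[REF_TYPES]) {
    size_t total = 0;
    for (int t = 0; t < REF_TYPES; t++) {
        total += count[t] * unit[t];
    }
    return total;
}
```

The unit overheads are target-specific and would have to be measured per reference type, in the way the 36-byte figure was obtained for one kind of reference on Sparc/Solaris.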
12
Optimization 2 Adjusting SAS
  • Adjusting the Single Appearance Schedule
  • 2(7A3B)5C → buffer requirement 51
  • 2(7A3B2C)C → buffer requirement 39
  • give up the single appearance schedule
  • BTLC (Binary Tree with Leaf Chain)

[Figure: BTLC for the schedule 2(7A3B)5C. Each node carries a triple (input, inside, output); triples shown: 6,0,0; 0,0,21; 21,0,15; 7,0,5; 0,0,3]
13
Computation of Buffer Requirements
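[Figure: buffer requirement computation on the BTLC; labels shown: 2, 7, 3, 7, 5, 3, nodes A and B, buffer sizes 21 and 30]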
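The buffer figures in this part of the deck can be reproduced by simulating a looped schedule and recording the peak token count on each edge. The sketch below assumes a chain A -> B -> C whose rates are read off the leaf triples (A outputs 3, B inputs 7 and outputs 5, C inputs 6), one unshared buffer per edge, well-formed schedule strings, and loop counts of at least 1. It is a plain simulation, not the BTLC-based computation of the presentation, and all names are illustrative.

```c
#include <ctype.h>
#include <stddef.h>

enum { NUM_ACTORS = 3, NUM_EDGES = 2 };
/* Assumed rates for the chain A -> B -> C (index 0 = A, 1 = B, 2 = C). */
static const int consume[NUM_ACTORS] = {0, 7, 6};
static const int produce[NUM_ACTORS] = {3, 5, 0};

typedef struct {
    int tokens[NUM_EDGES];  /* current fill: edge 0 = A->B, edge 1 = B->C */
    int peak[NUM_EDGES];    /* largest fill seen so far */
} BufferState;

/* Fire one actor: consume from its input edge, produce on its output edge. */
static void fire(BufferState *s, int actor) {
    if (actor > 0) s->tokens[actor - 1] -= consume[actor];
    if (actor < NUM_EDGES) {
        s->tokens[actor] += produce[actor];
        if (s->tokens[actor] > s->peak[actor]) s->peak[actor] = s->tokens[actor];
    }
}

/* Execute schedule text from *pos up to the matching ')' or end of string. */
static void run(const char *text, size_t *pos, BufferState *s) {
    while (text[*pos] != '\0' && text[*pos] != ')') {
        int count = 0, has_count = 0;
        while (isdigit((unsigned char)text[*pos])) {
            count = count * 10 + (text[*pos] - '0');
            (*pos)++;
            has_count = 1;
        }
        if (!has_count) count = 1;          /* bare actor or group fires once */
        if (text[*pos] == '(') {
            size_t body = *pos + 1, end = body;
            for (int i = 0; i < count; i++) {
                end = body;                 /* replay the loop body */
                run(text, &end, s);
            }
            *pos = (text[end] == ')') ? end + 1 : end;
        } else {
            for (int i = 0; i < count; i++) fire(s, text[*pos] - 'A');
            (*pos)++;
        }
    }
}

/* Total buffer requirement: sum of the per-edge peaks. */
int buffer_requirement(const char *schedule) {
    BufferState s = {{0, 0}, {0, 0}};
    size_t pos = 0;
    run(schedule, &pos, &s);
    return s.peak[0] + s.peak[1];
}
```

Under these assumptions `buffer_requirement("2(7A3B)5C")` gives 21 + 30 = 51 and `buffer_requirement("2(7A3B2C)C")` gives 21 + 18 = 39, which agrees with the two values quoted for these schedules.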
14
Flowchart of Schedule Adjustment
  • SAS schedule
  • Construct BTLC
  • Compute buffer requirement
  • Find candidate for adjustment
  • Candidate found? yes: adjust schedule (split a chain)
  • Candidate found? no: done, code generation
15
Splitting A Chain
  • Finding the split candidate
  • the chain with the largest buffer requirement
  • in this example, BC is selected
  • Schedule after splitting
  • 2(7A3B2C)C
  • In general, for a schedule that has two clusters,
    (a C_a)(b C_b), where a and b are loop counts, the
    new schedule is defined as
  • $a\,(C_a\,\lfloor b/a \rfloor\,C_b)\,(b \bmod a)\,C_b$, if $a < b$
  • $(a \bmod b)\,C_a\;b\,(\lfloor a/b \rfloor\,C_a\,C_b)$, otherwise

[Figure: BTLC for the schedule 2(7A3B)5C with the split point marked between the chains of size 30 and 21. Triples shown: 0,30,0; 30,0,0; 0,21,30; 21,0,15; 0,0,21; 6,0,0; 0,0,3; 7,0,5]
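The general splitting rule can be sketched as a function that returns the loop counts of the new schedule. Two assumptions are made: a and b are positive, and the "otherwise" branch places floor(a/b) firings of C_a inside the loop, since that is the choice that preserves the total number of firings. The struct and function names are illustrative.

```c
/* New schedule shape: prefix_ca C_a, then loop x (inner_ca C_a, inner_cb C_b),
   then suffix_cb C_b. */
typedef struct {
    int prefix_ca;  /* firings of C_a placed before the loop */
    int loop;       /* iteration count of the new loop */
    int inner_ca;   /* firings of C_a per loop iteration */
    int inner_cb;   /* firings of C_b per loop iteration */
    int suffix_cb;  /* firings of C_b placed after the loop */
} SplitSchedule;

/* Split the two-cluster schedule (a C_a)(b C_b); assumes a > 0 and b > 0. */
SplitSchedule split_chain(int a, int b) {
    SplitSchedule s;
    if (a < b) {
        /* a (C_a (b/a) C_b) (b % a) C_b */
        s.prefix_ca = 0;
        s.loop = a;
        s.inner_ca = 1;
        s.inner_cb = b / a;
        s.suffix_cb = b % a;
    } else {
        /* (a % b) C_a  b ((a/b) C_a C_b) */
        s.prefix_ca = a % b;
        s.loop = b;
        s.inner_ca = a / b;
        s.inner_cb = 1;
        s.suffix_cb = 0;
    }
    return s;
}
```

For the example of this slide, C_a = (7A3B) with a = 2 and C_b = C with b = 5, so `split_chain(2, 5)` yields a loop of 2 iterations with one C_a and two C_b each, followed by one trailing C_b: 2(7A3B2C)C.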
16
Decision Formula
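[Figure: BTLC of the adjusted schedule. Each node carries a triple (input, inside, output) and each cluster its value W. Triples shown: 0,6,0; 0,12,6; 6,0,0; 12,0,0; 0,21,15; 21,0,15; 0,0,21; 7,0,5; 0,0,3]
  • Cluster W: value of the cluster
  • New schedule: 2(7A3B2C)C
  • Gain: 12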
17
Experiment CD2DAT
18
Experimental Result
Program size after each optimization

                       CD2DAT    Filter Bank
  SAS                  13672     28512
  Code Sharing         12768     22024
  Schedule Adjustment  12296     22024

Memory behavior of CD2DAT in ARM7

                       Fetches     Miss
  SAS                  17098177    57189
  Code Sharing         17573923    52867
  Schedule Adjustment  17499386    54331
19
Conclusion
  • Our Environment
  • PeaCE: Ptolemy extension as Codesign Environment
  • Optimization Techniques in Software Synthesis
  • For automatic code generation from dataflow graphs
  • Joint minimization of code and data size
  • Selective application of code sharing and schedule
    adjustment to SAS
  • Future work
  • Clustering multiple fine-grained nodes into a
    large one
  • increases the chance of code sharing
  • Buffer sharing
  • further reduces the buffer size and increases the
    cache effect

20
Thank You!