Efficient Software Performance Estimation Methods for Hardware/Software Codesign - PowerPoint PPT Presentation

Provided by: yan50
1
Efficient Software Performance Estimation Methods
for Hardware/Software Codesign
  • Kei Suzuki
  • Alberto Sangiovanni-Vincentelli
  • Presenter: Yanmei Li

2
Introduction
  • One of the most important purposes of hw/sw
    codesign is to find the optimum hw/sw partition
    of a system level specification under particular
    criteria
  • Criteria
  • Performance (speed, or the number of clock cycles)
  • Cost (number of components, die size, or code
    size)
  • Estimation
  • At a lower abstraction level: easy and accurate,
    but long design iteration time
  • At a higher abstraction level: reduces the
    exploration time
  • Estimation plays an important role in synthesis
    and optimization

3
Software Performance Estimation
  • Cost of a mixed hw/sw system based on a standard
    micro-processor depends on the hw size
  • Solution: implement a given functionality as a
    program on the microprocessor
  • Problem: a software implementation often fails to
    meet the performance requirement
  • Tradeoff: implement the critical portion of the
    program in hardware
  • Software performance estimation is the key

4
POLIS System
  • CFSMs (Codesign Finite State Machines)
  • Does not discriminate between hw and sw
  • Estimation provides preliminary timing
    information and also a measure for hw/sw
    partitioning
  • A partitioning process takes place to identify
    the candidate components for sw implementation
  • S-Graph (Software graph)
  • To optimize the trade-off between the performance
    and the code size of the final implementation
  • Estimation is helpful for s-graph optimization
    and sw module scheduling

5
Related Work
  • Software performance depends on the structure of
    the software program as well as on the components
    of the target system
  • The structure of the software program is more
    difficult to estimate as the abstraction level
    rises
  • Most of the results are from the object code
    level which is the lowest level of abstraction,
    and are concerned with software that has a
    limited structure
  • A number of approaches have been proposed
  • A simple prediction method
  • Statistical methods

6
Abstraction Models in POLIS
  • CFSM
  • HW: mapped into an abstract hardware
    description format, and synthesized into a
    combinational circuit and a set of latches
  • SW: translated into a data structure called an
    s-graph

7
Abstraction Models in POLIS
  • S-Graph
  • A DAG(directed acyclic graph) with one source
    node and one sink node
  • Represent the control flow of a given behavior
  • Four types of node: BEGIN, END, TEST, ASSIGN

8
S-Graph
  • Semantics
  • Start with the BEGIN node
  • Traverse each node along its edge, until reaching
    the END node
  • At a TEST node, select one corresponding child
    with the value of the associated predicate P(V)
  • At an ASSIGN node, assign the value of the
    associated function A(V) to the output variable z
  • Translate an s-graph into a C program
  • Traverse the graph in a depth-first manner
  • TEST node: if (or switch) statement
  • ASSIGN node: assignment statement
  • The resulting C program has the same structure
    as the s-graph
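
The traversal semantics above can be illustrated with a minimal sketch. This is not POLIS code: the dictionary-based node encoding, the field names (`next`, `children`, `predicate`, `function`, `var`), and the example graph are all assumptions made for illustration.

```python
# Hypothetical node encoding; the slides do not show POLIS's real data structures.
def run_sgraph(nodes, start, inputs):
    """Traverse an s-graph from BEGIN to END and return the assigned outputs."""
    outputs = {}
    current = start
    while True:
        node = nodes[current]
        kind = node["type"]
        if kind == "END":
            return outputs
        if kind == "BEGIN":
            current = node["next"]
        elif kind == "TEST":
            # The value of the predicate P(V) selects which child to follow.
            current = node["children"][node["predicate"](inputs)]
        elif kind == "ASSIGN":
            # Assign the value of A(V) to the node's output variable.
            outputs[node["var"]] = node["function"](inputs)
            current = node["next"]
        else:
            raise ValueError(f"unknown node type: {kind}")


# Illustrative graph: roughly "if (a > 0) z = a + 1; else z = 0;"
example = {
    "begin": {"type": "BEGIN", "next": "t1"},
    "t1": {"type": "TEST", "predicate": lambda v: v["a"] > 0,
           "children": {True: "a1", False: "a2"}},
    "a1": {"type": "ASSIGN", "var": "z",
           "function": lambda v: v["a"] + 1, "next": "end"},
    "a2": {"type": "ASSIGN", "var": "z",
           "function": lambda v: 0, "next": "end"},
    "end": {"type": "END"},
}

result = run_sgraph(example, "begin", {"a": 2})
```

Because the graph is acyclic, every run visits each node at most once, which is what makes a per-node cost model workable.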

9
Performance Estimation Methods
  • Modeling the target system
  • The structure of C code generated by POLIS
  • Function() (1)
  • Initialization of local variables (assignment
    statements) (2)
  • Structure of mixed if or switch statements and
    assignment statements (3)
  • Return (4)

10
Modeling the Target System
  • Execution time
  • $T = T_{pp} + k \cdot T_{init} + T_{struct}$
  • Code size
  • $S = S_{pp} + k \cdot S_{init} + S_{struct}$
  • Tpp (Spp): cost of entering and exiting the
    function, parts (1) and (4)
  • Tinit (Sinit): cost of initializing one local
    variable, part (2)
  • k is the number of local variables
  • Tstruct (Sstruct): cost of the structure of mixed
    conditional statements generated from TEST nodes
    and assignment statements generated from ASSIGN
    nodes, part (3)

11
Modeling the Target System(cont.)
  • Tpp, Spp, Tinit, and Sinit are constants that can
    be determined beforehand
  • $T_{struct} = \sum_i P_i \, C_t(\mathrm{node\_type\_of}(i), \mathrm{variable\_type\_of}(i))$
  • $S_{struct} = \sum_i C_s(\mathrm{node\_type\_of}(i), \mathrm{variable\_type\_of}(i))$
  • $P_i = 1$ if node i is on a path, otherwise $P_i = 0$
  • Ct and Cs can be obtained by using simple
    benchmark programs containing a mix of the C
    statements that appear in the generated C
    programs, and analyzing the execution time and
    code size of those programs on the target
    compiler and the target CPU
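
As a worked illustration of the model, the sketch below evaluates T and S for one path through a tiny graph. The cost tables `CT` and `CS`, the variable-type labels, and all numeric values are made up for the example; they are not measured parameters from the presentation.

```python
# Illustrative cost tables indexed by (node type, variable type).
# The numbers are invented, not benchmark results.
CT = {("TEST", "event"): 4, ("TEST", "multi"): 6,
      ("ASSIGN", "const"): 3, ("ASSIGN", "var"): 5}
CS = {("TEST", "event"): 6, ("TEST", "multi"): 8,
      ("ASSIGN", "const"): 4, ("ASSIGN", "var"): 6}


def estimate(nodes, path, k, t_pp, t_init, s_pp, s_init):
    """Return (T, S) for one execution path through the s-graph."""
    on_path = set(path)
    # Tstruct: only nodes on the path contribute (P_i = 1 for those nodes).
    t_struct = sum(CT[(n["type"], n["var"])]
                   for i, n in nodes.items() if i in on_path)
    # Sstruct: every node contributes code, whether or not it is executed.
    s_struct = sum(CS[(n["type"], n["var"])] for n in nodes.values())
    return t_pp + k * t_init + t_struct, s_pp + k * s_init + s_struct


nodes = {
    1: {"type": "TEST", "var": "event"},
    2: {"type": "ASSIGN", "var": "const"},
    3: {"type": "ASSIGN", "var": "var"},
}

# Path 1 -> 2 with two local variables:
# T = 10 + 2*2 + (4 + 3) = 21, S = 12 + 2*3 + (6 + 4 + 6) = 34
t, s = estimate(nodes, path=[1, 2], k=2, t_pp=10, t_init=2, s_pp=12, s_init=3)
```

The asymmetry between the two sums is the point: execution time depends on which path is taken, while code size counts every node once.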

12
Benchmark Model
  • Four attributes to characterize a system
  • Name of the parameter set, a name for a unit of
    execution time, a name for a unit of code size,
    and the size of an integer variable
  • Seventeen cost parameters to model the execution
    time, and fifteen cost parameters to model the
    code size
  • A TEST node with an event-type variable, a
    multi-valued variable with a bit mask, or a
    multi-valued variable
  • An ASSIGN node with an event-type variable, one
    that assigns a constant to a variable, or one
    that assigns one variable to another
  • Pre-processing and post-processing
  • A branch operation
  • Initialization of a local variable
  • Average execution time and size for pre-defined
    software library functions
  • The size of pointers
  • The size of integer variables

13
S-graph Level Estimation
  • Property
  • Property 1. Each node in an S-graph has a
    one-to-one correspondence with only a few
    statements in the synthesized C code
  • Property 2. The form of each statement is
    determined by the type of corresponding node
  • Property 3. The S-graph is a DAG, hence it does
    not include loops in its structure
  • Each node/edge is weighted according to cost
    parameters pre-calculated in a pre-processing step

14
S-graph Level Estimation
  • Algorithm SGtrace(sgi)
  • if (sgi == NULL) return C(0, ∞, 0)
  • if (sgi has been visited)
  •   return (pre-calculated Ci(max_time, min_time, 0)
      associated with sgi)
  • Ci = initialize(max_time = 0, min_time = ∞,
    code_size = 0)
  • for each child sgj of sgi
  •   Cij = SGtrace(sgj) + edge cost for edge eij
  •   if (Cij.max_time > Ci.max_time)
  •     Ci.max_time = Cij.max_time
  •   if (Cij.min_time < Ci.min_time)
  •     Ci.min_time = Cij.min_time
  •   Ci.code_size += Cij.code_size
  • Ci += node cost for node sgi
  • return (Ci)
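
The SGtrace recursion above can be sketched in Python as follows. This is an illustration under stated assumptions, not the POLIS implementation: the `Cost` dataclass, the dictionary-based graph encoding, and the example costs are hypothetical, and a sink node is assumed to cost exactly its own node cost (the slide does not spell out that base case).

```python
import math
from dataclasses import dataclass


@dataclass
class Cost:
    max_time: float
    min_time: float
    code_size: float


def sg_trace(node, children, node_cost, edge_cost, memo=None):
    """Return max/min execution time and code size of the subgraph at node."""
    if memo is None:
        memo = {}
    if node in memo:
        # Already visited: reuse the timing, but add no code size,
        # so shared subgraphs are counted only once.
        c = memo[node]
        return Cost(c.max_time, c.min_time, 0)
    t, s = node_cost[node]
    kids = children.get(node, [])
    if not kids:
        # Assumption: a sink (END) node costs just its own node cost.
        result = Cost(t, t, s)
    else:
        acc = Cost(0, math.inf, 0)
        for kid in kids:
            sub = sg_trace(kid, children, node_cost, edge_cost, memo)
            et, es = edge_cost.get((node, kid), (0, 0))
            acc.max_time = max(acc.max_time, sub.max_time + et)
            acc.min_time = min(acc.min_time, sub.min_time + et)
            acc.code_size += sub.code_size + es
        result = Cost(acc.max_time + t, acc.min_time + t, acc.code_size + s)
    memo[node] = result
    return result


# Illustrative graph; costs are (time, size) pairs with made-up values.
children = {"begin": ["t"], "t": ["a1", "a2"],
            "a1": ["end"], "a2": ["end"], "end": []}
node_cost = {"begin": (0, 0), "t": (4, 6), "a1": (3, 4),
             "a2": (5, 6), "end": (1, 2)}
edge_cost = {("t", "a1"): (1, 2)}

total = sg_trace("begin", children, node_cost, edge_cost)
```

In this example the END node is reached through both branches, and the memo table makes its code size count once, so the total size equals the sum of all node and edge sizes (20 units here).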

15
S-graph Level Estimation
  • Computational complexity: O(E), where E is the
    number of edges
  • Average execution time
  • $C_{ave} = \sum_{i,j} P_{ij} \, (C_t(\mathrm{node\_type\_of}(i), \mathrm{variable\_type\_of}(i)) + C_e(i,j))$
  • $P_{ij}$ is the probability of executing node i and
    going to node j
  • $C_e(i,j)$ is the edge cost for edge eij
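
A small numeric sketch of the average-time formula follows. The edge probabilities, node times, and edge times are invented for illustration, and the function name `average_time` is hypothetical.

```python
def average_time(edge_prob, node_time, edge_time):
    """Sum P_ij * (Ct(i) + Ce(i, j)) over all edges (i, j)."""
    return sum(p * (node_time[i] + edge_time.get((i, j), 0.0))
               for (i, j), p in edge_prob.items())


# P_ij: probability of executing node i and then going to node j (made up).
edge_prob = {
    ("begin", "t"): 1.0,
    ("t", "a1"): 0.25,
    ("t", "a2"): 0.75,
    ("a1", "end"): 0.25,
    ("a2", "end"): 0.75,
}
node_time = {"begin": 0, "t": 4, "a1": 3, "a2": 5, "end": 1}
edge_time = {("t", "a1"): 1}

# 1.0*0 + 0.25*(4+1) + 0.75*4 + 0.25*3 + 0.75*5 = 8.75
c_ave = average_time(edge_prob, node_time, edge_time)
```

Note that the formula as written sums over outgoing edges, so a sink node with no outgoing edge (here `end`) contributes nothing to the total in this sketch.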

16
CFSM Level Estimation
  • Is much more difficult since a CFSM model does
    not closely reflect the code structure
  • MDDs are used to represent the transition
    relation function of a CFSM (a node represents a
    multi-valued variable; variable ordering is
    important)
  • The estimation algorithm of the MDD is based on
    the assumption that the maximum(minimum) cost
    path in an MDD is usually the maximum (minimum)
    cost path in the s-graph that is generated from
    the MDD
  • Also based on a recursive DFS traversal algorithm
  • There is no relation between the code size and
    the number of MDD nodes

17
Experimental Results(1)
18
Experimental Results(2)
19
Experimental Results(3)
  • Compared to an assembly-level analysis
  • S-graph (Table 1)
  • The differences in the maximum execution time are
    within (-10%, +10%)
  • The differences in the minimum execution time are
    within (-20%, +20%)
  • The differences in code size are within
    (-20%, +20%)
  • CFSM (Table 2)
  • The differences in the maximum execution time are
    within (-10%, +25%)
  • The differences in the minimum execution time are
    within (-20%, +20%)

20
Conclusions
  • S-graph level method
  • provides accurate estimates for all analyses:
    the maximum and minimum execution time, and code
    size.
  • It is a useful technique for optimization in
    software synthesis because of its accuracy.
  • CFSM level method
  • is less accurate than the s-graph estimation, but
    it is still accurate enough when estimating the
    maximum and minimum execution time.
  • is important for automatic partitioning of CFSMs
    into hardware and software parts, and also for
    scheduler generation.

21
Conclusions
  • Two software performance estimation methods for
    use with the POLIS hardware/software codesign
    system are proposed in this paper.
  • S-graph level method
  • CFSM level method
  • The experimental results showed that the accuracy
    of both proposed methods is high enough for use
    in the POLIS system.