Design and Synthesis of Image Processing Systems using Reconfigurable Dataflow Graphs

1
Design and Synthesis of Image Processing Systems
using Reconfigurable Dataflow Graphs
  • Mainak Sen and Shuvra S. Bhattacharyya
  • Department of Electrical and Computer
    Engineering, and Institute for Advanced Computer
    Studies, University of Maryland at College Park
  • Maryland DSPCAD Research Group
    http://www.ece.umd.edu/DSPCAD/home/dspcad.htm

November 22, 2005, Leiden University, The
Netherlands
2
Outline
  • Dataflow-based model of computation for modeling
    the behavior of DSP applications
  • Decidable dataflow models
  • Example use of decidable dataflow as a model of
    computation for modeling the mapping of
    (decidable) dataflow behaviors onto embedded
    multiprocessors
  • Structured reconfiguration of dataflow graphs
  • Examples of meta-modeling techniques that can be
    classified as structured, reconfigurable dataflow
  • Parameterized dataflow and its application to SDF
  • Homogeneous-parameterized dataflow and its
    application to SDF and CSDF
  • Experiments on a gesture recognition application
  • Summary

3
Dataflow-based design for DSP (Example from
Agilent ADS tool)
4
DSP-oriented Dataflow Models of Computation
  • Used widely in design tools for DSP
  • Application is modeled as a directed graph
  • Nodes (actors) represent functions
  • Edges represent communication channels between
    functions
  • Nodes produce and consume data from edges
  • Edges buffer data in FIFO (first-in first-out)
    fashion
  • Data-driven execution model
  • A node can execute whenever it has sufficient
    data on its input edges
  • The order in which nodes execute is not part of
    the specification
  • The order is typically determined by the
    compiler, the hardware, or both
  • Iterative execution
  • Body of loop to be iterated a large or infinite
    number of times
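The data-driven execution rule above can be illustrated with a minimal Python sketch. The `Actor` class, the `run` helper, and the two example actors (`pair_sum`, `scale`) are hypothetical names chosen for illustration; they are not taken from any DSP design tool. Edges are FIFO queues, and an actor fires whenever every input edge holds enough tokens, so the firing order is found at run time rather than written into the specification.

```python
from collections import deque


class Actor:
    """A dataflow node; `inputs` is a list of (edge, tokens consumed per firing)."""

    def __init__(self, name, fn, inputs, outputs):
        self.name = name
        self.fn = fn            # maps consumed token lists to produced token lists
        self.inputs = inputs
        self.outputs = outputs  # list of FIFO edges written by this actor

    def can_fire(self):
        # Data-driven rule: enough tokens on every input edge.
        return all(len(edge) >= rate for edge, rate in self.inputs)

    def fire(self):
        consumed = [[edge.popleft() for _ in range(rate)]
                    for edge, rate in self.inputs]
        produced = self.fn(*consumed)
        for edge, tokens in zip(self.outputs, produced):
            edge.extend(tokens)


def run(actors):
    """Fire any enabled actor until no actor can fire; return the firing order."""
    order = []
    progress = True
    while progress:
        progress = False
        for actor in actors:
            if actor.can_fire():
                actor.fire()
                order.append(actor.name)
                progress = True
    return order


# Edges are FIFO queues; e1 is preloaded with six input samples.
e1, e2, e3 = deque([1, 2, 3, 4, 5, 6]), deque(), deque()
pair_sum = Actor("pair_sum", lambda xs: [[sum(xs)]], [(e1, 2)], [e2])
scale = Actor("scale", lambda xs: [[10 * xs[0]]], [(e2, 1)], [e3])

# The list order is deliberately "wrong"; availability of data decides who fires.
order = run([scale, pair_sum])
result = list(e3)
```

Here `pair_sum` consumes two tokens per firing and produces one, so the sketch also shows a simple multirate edge.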

5
Dataflow Features and Advantages
  • Exposes coarse-grain parallelism.
  • Exposes high-level structure that facilitates
    analysis, verification, and optimization.
  • Captures multi-rate behavior.
  • Complementary to ongoing advances in DSP compiler
    technology for procedural languages, such as C
    and MATLAB.
  • Encourages desirable software engineering
    practices: modularity and code reuse
  • Amenable also to aspect-oriented design.
  • Intuitive to DSP algorithm designers: signal flow
    graphs.

6
Evolution of Dataflow Models for DSP
  • Synchronous dataflow: static multirate behavior
  • Agilent ADS, Cadence SPW, etc.
  • Well-behaved dataflow: schemas for bounded
    dynamics
  • Boolean/integer dataflow: Turing-complete models
  • Multidimensional synchronous dataflow: image and
    video
  • Scalable synchronous dataflow: block processing
  • Synopsys COSSAP
  • Cyclo-static dataflow: phased behavior
  • Synopsys El Greco, Eonic Systems Virtuoso
    Synchro, System Canvas
  • Bounded dynamic dataflow: bounded dynamics
  • The processing graph method: reconfigurable
    dynamic DF
  • US Naval Research Laboratory, MCCI Autocoding
    Toolset
  • Parameterized dataflow: dynamically reconfigurable
    static DF
  • Blocked dataflow: image and video in terms of
    reconfigurable dataflow

7
Modeling Design Space
(Third dimension: simplicity and intuitive appeal)
8
Decidable Dataflow Models
  • Modeling flow for representing static flowgraph
    behavior
  • Cyclo-static dataflow (CSDF), multiphase modeling
  • Synchronous dataflow (SDF), multirate modeling
  • Homogeneous synchronous dataflow (HSDF)
  • Acyclic homogeneous synchronous dataflow (task
    graphs)
  • These are in decreasing order of generality
  • Designs represented in the more general models
    can be converted to equivalent representations in
    the less general ones
  • e.g., CSDF → SDF → HSDF → task graph
  • HSDF: each actor (graph node) produces/consumes
    exactly one data value to/from each incident
    output/input edge
  • Suitable for exposing parallelism
  • Not the best model for minimizing memory
    requirements
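One reason SDF is decidable is that the number of firings of each actor per graph iteration (the repetitions vector) follows from the balance equations q[src] × produced = q[snk] × consumed on every edge. The sketch below solves them for a connected graph; the function name, the edge format, and the three-actor example are assumptions made for illustration. In the usual SDF-to-HSDF expansion, actor a is replicated q[a] times, which suggests why HSDF exposes parallelism but can be costly in memory.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetitions_vector(actors, edges):
    """Solve the SDF balance equations for a connected graph.

    edges: list of (src, tokens produced, snk, tokens consumed).
    Returns the smallest positive integer firing counts.
    """
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:  # propagate relative firing rates along edges
        changed = False
        for src, p, snk, c in edges:
            if src in rate and snk not in rate:
                rate[snk] = rate[src] * p / c
                changed = True
            elif snk in rate and src not in rate:
                rate[src] = rate[snk] * c / p
                changed = True
    for src, p, snk, c in edges:  # every edge must balance
        if rate[src] * p != rate[snk] * c:
            raise ValueError("inconsistent sample rates: no valid schedule")
    # Scale the rational rates to the smallest integer solution.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rate.values()), 1)
    q = {a: int(r * scale) for a, r in rate.items()}
    g = reduce(gcd, q.values())
    return {a: n // g for a, n in q.items()}


# Hypothetical chain: A --(2,3)--> B --(1,2)--> C
q = repetitions_vector(["A", "B", "C"],
                       [("A", 2, "B", 3), ("B", 1, "C", 2)])
hsdf_nodes = sum(q.values())  # node count after expanding each actor q[a] times
```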

9
Synthesis Techniques for Decidable Models
  • Static scheduling: low overhead, predictability
  • Performance analysis through synchronization
    graphs
  • Loop scheduling
  • Implicit repetition in the dataflow graph
    (through changes in sample rate) needs to be
    translated into explicit repetition in the form
    of loops on the execution target.
  • Complex design space exists for such translation
  • Complementary to procedural language techniques
    for nested loop compilation
  • Loop scheduling techniques
  • Simulation speedup (minimization of scheduling
    complexity)
  • Code/data minimization
  • Hierarchical parallel scheduling
  • Block processing
  • Task scheduling for latency/throughput
    optimization
  • Probabilistic design: exploiting tolerances to
    deadline misses
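As a rough illustration of the loop scheduling trade-off, the sketch below expands a looped schedule into explicit firings and measures the peak number of tokens on each edge. The schedule encoding (actor names or `(count, body)` tuples), the helper names, and the example graph are assumptions for illustration only. It compares a single-appearance schedule, which keeps code compact, with an interleaved one that needs a smaller buffer.

```python
def expand(schedule):
    """Flatten a looped schedule; items are actor names or (count, [items])."""
    flat = []
    for item in schedule:
        if isinstance(item, tuple):
            count, body = item
            flat.extend(expand(body) * count)
        else:
            flat.append(item)
    return flat


def peak_buffers(firings, edges):
    """edges: {(src, snk): (produced, consumed)}; returns peak tokens per edge."""
    tokens = {e: 0 for e in edges}
    peak = dict(tokens)
    for actor in firings:
        for (src, snk), (p, c) in edges.items():
            if snk == actor:  # consume inputs first
                if tokens[(src, snk)] < c:
                    raise ValueError(f"{actor} fired without enough input tokens")
                tokens[(src, snk)] -= c
        for (src, snk), (p, c) in edges.items():
            if src == actor:  # then produce outputs
                tokens[(src, snk)] += p
                peak[(src, snk)] = max(peak[(src, snk)], tokens[(src, snk)])
    return peak


# Hypothetical graph: A --(2,3)--> B --(1,2)--> C, repetitions (3, 2, 1)
edges = {("A", "B"): (2, 3), ("B", "C"): (1, 2)}
single_appearance = [(3, ["A"]), (2, ["B"]), "C"]   # (3A)(2B)C
interleaved = [(2, ["A"]), "B", "A", "B", "C"]      # (2A)B A B C

peak_sas = peak_buffers(expand(single_appearance), edges)
peak_int = peak_buffers(expand(interleaved), edges)
```

Both schedules fire each actor the same number of times; they differ only in how much code (actor appearances) and data (buffer space) they need.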

10
Example: Intermediate representations for
synthesis from decidable dataflow models
  • Consider a decidable dataflow behavior that is to
    be implemented on a self-timed, embedded
    multiprocessor
  • Natural way to implement DSP multiprocessors from
    decidable dataflow
  • Actor assignment and ordering are performed
    statically
  • Invocation (dispatch) of actors is performed
    dynamically, through synchronization
  • Candidate mappings of the behavior onto the
    architecture can be represented through an
    intermediate representation that also has
    decidable dataflow semantics
  • This representation is useful for understanding
    the performance, communication overhead, and
    synchronization structure associated with the
    candidate mapping
  • Facilitates the separation of communication and
    synchronization functionality
  • This is a useful modeling methodology for design
    space exploration

11
Interprocessor Communication Graph (Gipc)
Self-timed schedule and its IPC graph
12
The synchronization graph Gs
  • Derived from the interprocessor communication
    graph
  • Synchronization edges are distinguished from
    interprocessor communication (IPC) edges
  • Synchronization edges represent precedence
    constraints that are enforced by synchronization
    protocols
  • IPC edges represent data transfers
  • Interprocessor connections
  • Coincident synchronization and IPC edges →
    communication together with synchronization
    protocol (conventional approach)
  • IPC edge only → communication without synch.
    protocol
  • Synchronization edge only → synchronization
    protocol only

13
Applications of Synchronization Graphs
  • Simulation
  • Throughput estimation through cycle mean
    analysis
  • Removal of redundant synchronizations
  • Resynchronization
  • Conversion to more efficient synchronization
    protocols (strongly connected synchronization
    graphs)
  • Statically determining and minimizing the sizes
    of interprocessor communication buffers
  • All are post-processing methods that can be
    applied to improve a wide range of existing task
    graph scheduling techniques on a wide range of
    multiprocessor architectures.
  • These techniques benefit from good execution
    time estimates, but do not depend on exact
    execution time values to deliver useful results.
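Throughput estimation through cycle mean analysis can be sketched as follows: for each directed cycle, divide the total execution time of its actors by the total delay (initial tokens) on its edges, and take the maximum over all cycles; the reciprocal of this maximum cycle mean bounds the iteration rate. The code below enumerates simple cycles by brute force, which is only reasonable for small graphs. The graph, the integer execution times, and the function name are hypothetical.

```python
from fractions import Fraction


def max_cycle_mean(exec_time, edges):
    """exec_time: {node: integer time}; edges: {(u, v): delay on the edge}.

    Enumerates simple cycles by depth-first search. Each cycle is explored
    once, starting from its smallest node name.
    """
    succ = {}
    for (u, v) in edges:
        succ.setdefault(u, []).append(v)
    best = None

    def dfs(start, node, path, time, delay):
        nonlocal best
        for nxt in succ.get(node, []):
            d = delay + edges[(node, nxt)]
            if nxt == start:
                if d == 0:
                    raise ValueError("delay-free cycle: the graph deadlocks")
                mean = Fraction(time, d)
                if best is None or mean > best:
                    best = mean
            elif nxt not in path and nxt > start:
                dfs(start, nxt, path | {nxt}, time + exec_time[nxt], d)

    for s in sorted(exec_time):
        dfs(s, s, {s}, exec_time[s], 0)
    return best


# Hypothetical synchronization graph with two cycles: A-B-A and B-C-D-B.
exec_time = {"A": 3, "B": 2, "C": 4, "D": 1}
edges = {("A", "B"): 0, ("B", "A"): 1,
         ("B", "C"): 0, ("C", "D"): 0, ("D", "B"): 2}

mcm = max_cycle_mean(exec_time, edges)   # cycle A-B-A: (3 + 2) / 1
throughput_bound = 1 / mcm               # iterations per time unit
```

Production tools typically use polynomial-time algorithms for this computation; the exhaustive search here is chosen only for readability.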

14
Beyond Decidable Models
  • Limited expressive power: DSP applications
    increasingly employ high-level dynamics in their
    behavior
  • User interface functionality
  • Mode changes
  • Adaptive algorithms
  • Reconfiguration of processing resources/parameters
  • However, key subsystems still exhibit large
    amounts of quasi-static structure, that is, structure
    that stays fixed across significant windows of
    time.
  • Various dynamic dataflow models have been
    proposed that address the limitation above by
    abandoning most or all restrictions related to
    decidable dataflow
  • However, these methods are correspondingly
    limited in their ability to exploit the
    quasi-static structure described above

15
Parameterized Dataflow: Structured Control of
Dynamic Parameters
  • The key discipline that is imposed on
    reconfiguration is that each subsystem must have
    a consistent view of each of its actors
    (hierarchical or primitive) throughout any given
    iteration of that subsystem.

16
Parameterized Dataflow
  • Hierarchical modeling
  • Parameterized DF subsystem is composed of 3
    parameterized DF graphs
  • init, subinit, body
  • Subsystem parameters
  • configured in init/subinit, used in body
  • Dynamically reconfigurable

[Figure: a parent graph containing a subsystem with parameters n, ...; the init and subinit graphs write n, and the body graph reads n.]
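A loose Python sketch of the init/subinit/body structure is given below. It is not the formal parameterized dataflow semantics: the class, the method names (`reconfigure`, `invoke`), the second parameter `offset`, and the choice to run init between invocations are all assumptions made for illustration. The point it tries to capture is that parameters are written only by init and subinit, and the body reads a snapshot that stays fixed for the whole invocation.

```python
class ParameterizedSubsystem:
    """Toy subsystem: init and subinit write parameters, body only reads them."""

    def __init__(self, init, subinit, body):
        self.init, self.subinit, self.body = init, subinit, body
        self.params = {}

    def reconfigure(self):
        # init graph: sets parameters such as the block size n.
        self.params.update(self.init())

    def invoke(self, tokens):
        # subinit graph: may set further parameters from the input data.
        self.params.update(self.subinit(tokens))
        frozen = dict(self.params)  # consistent view for this whole invocation
        return self.body(frozen, tokens)


block_sizes = iter([2, 3])  # hypothetical sequence of configurations


def init():
    return {"n": next(block_sizes)}


def subinit(tokens):
    return {"offset": min(tokens)}


def body(p, tokens):
    # Consumes n tokens per output token: block sums with the offset removed.
    n, off = p["n"], p["offset"]
    return [sum(tokens[k:k + n]) - n * off for k in range(0, len(tokens), n)]


sub = ParameterizedSubsystem(init, subinit, body)
sub.reconfigure()
first = sub.invoke([1, 2, 3, 4])             # runs with n = 2
sub.reconfigure()
second = sub.invoke([2, 4, 6, 8, 10, 12])    # runs with n = 3
```

Between the two invocations the body's consumption rate changes from 2 to 3, but within each invocation it behaves like a static dataflow graph.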
17
Meta-modeling with parameterized dataflow
  • Parameterized dataflow can be applied to any
    dataflow model of computation (base model) to
    augment that model with dynamic reconfiguration
    capabilities in a structured way
  • Provides for efficient quasi-static scheduling
  • Enables execution to be viewed in terms of a
    sequence of dataflow graphs in the base model
  • Parameterized dataflow applied to base model XYZ
    gives parameterized XYZ
  • Examples of parameterized dataflow models of
    computation that we are developing and
    experimenting with
  • parameterized synchronous dataflow (PSDF)
  • parameterized cyclo-static dataflow (PCSDF)

18
Parameterized Synchronous Dataflow (PSDF)
  • Local synchrony conditions can be formulated
    and checked in a quasi-static fashion to ensure
    that bounded token production and consumption
    along with bounded delays lead to bounded memory
    requirements overall.
  • This is not true of unstructured dynamic dataflow
    models, such as general dynamic dataflow, Boolean
    dataflow, and bounded dynamic dataflow
  • Techniques for construction of streamlined looped
    schedules for synchronous dataflow graphs have
    natural and efficient extensions to the
    construction of parameterized looped schedules
    for PSDF graphs.

19
PSDF Example: CD to DAT Conversion
initChild
repeat 5 times {
  fire setFac  /* sets i1, d1, i2, d2, i3, d3, i4, d4 */
  int _g1 = gcd(i1, d2)
  int _g2 = gcd((i2 x i1)/_g1, d3)
  int _g3 = gcd((i3 x i2 x i1)/(_g2 x _g1), d4)
  repeat (d4/_g3) times {
    repeat (d3/_g2) times {
      repeat (d2/_g1) times {
        repeat (d1) times { fire CD }
        fire PF1
      }
      repeat (i1/_g1) times { fire PF2 }
    }
    repeat ((i2 x i1)/(_g2 x _g1)) times { fire PF3 }
  }
  repeat ((i3 x i2 x i1)/(_g3 x _g2 x _g1)) times {
    fire PF4
    repeat (i4) times { fire DAT }
  }
}

[Figure: PSDF specification with parameters i1, d1, ..., i4, d4. The init graph contains actor setFac, which sets the parameters. The body graph is the chain CD, PF1, PF2, PF3, PF4, DAT, with edge rates 1, d1, i1, d2, i2, d3, i3, d4, i4, and 1.]
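The parameterized looped schedule for the CD-to-DAT body can be checked with a short Python sketch. It assumes the loop nesting implied by the token rates (CD produces 1 token, each PFk consumes dk and produces ik, DAT consumes 1) and uses the factor pairs (2,1), (4,3), (4,7), (5,7), a common factorization of the 160:147 rate change, as illustrative values for what setFac might set. The function names are hypothetical.

```python
from math import gcd


def cd_dat_firings(i, d):
    """Expand the parameterized looped schedule for given rate parameters."""
    i1, i2, i3, i4 = i
    d1, d2, d3, d4 = d
    g1 = gcd(i1, d2)
    g2 = gcd(i2 * i1 // g1, d3)
    g3 = gcd(i3 * i2 * i1 // (g2 * g1), d4)
    out = []
    for _ in range(d4 // g3):
        for _ in range(d3 // g2):
            for _ in range(d2 // g1):
                out += ["CD"] * d1 + ["PF1"]
            out += ["PF2"] * (i1 // g1)
        out += ["PF3"] * (i2 * i1 // (g2 * g1))
    for _ in range(i3 * i2 * i1 // (g3 * g2 * g1)):
        out += ["PF4"] + ["DAT"] * i4
    return out


def simulate(firings, i, d):
    """Check that no actor fires without data; return leftover tokens per edge."""
    chain = ["CD", "PF1", "PF2", "PF3", "PF4", "DAT"]
    produce = dict(zip(chain, [1, *i]))   # tokens written per firing
    consume = dict(zip(chain, [0, *d, 1]))  # tokens read per firing
    buf = [0] * 5                          # buf[k] sits between chain[k] and chain[k+1]
    for actor in firings:
        k = chain.index(actor)
        if k > 0:
            if buf[k - 1] < consume[actor]:
                raise ValueError(f"{actor} fired without enough input tokens")
            buf[k - 1] -= consume[actor]
        if k < 5:
            buf[k] += produce[actor]
    return buf


i, d = (2, 4, 4, 5), (1, 3, 7, 7)   # assumed values; 160:147 overall
firings = cd_dat_firings(i, d)
leftover = simulate(firings, i, d)
counts = {a: firings.count(a) for a in ("CD", "PF1", "PF2", "PF3", "PF4", "DAT")}
```

With these values the schedule reads 147 CD samples and writes 160 DAT samples per iteration, and every buffer returns to empty, so the iteration can repeat with bounded memory. A different parameter setting changes only the loop counts, not the schedule structure.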
20
PSDF Example: Speech Compression
21
PCSDF Version of Speech Compression
22
Outline
  • Dataflow-based model of computation for modeling
    the behavior of DSP applications
  • Decidable dataflow models
  • Example use of decidable dataflow as a model of
    computation for modeling the mapping of
    (decidable) dataflow behaviors onto embedded
    multiprocessors
  • Structured reconfiguration of dataflow graphs
  • Examples of meta-modeling techniques that can be
    classified as structured, reconfigurable dataflow
  • Parameterized dataflow and its application to SDF
  • Homogeneous-parameterized dataflow and its
    application to SDF and CSDF
  • Experiments on a gesture recognition application
  • Summary

23
Homogeneous Parameterized Dataflow
(HPDF)
  • Parameterized dataflow model that can
    encapsulate the dynamicity of an application.
  • Meta-modeling technique: hierarchical actors can
    have any other underlying dataflow model (SDF,
    CSDF, PSDF, etc.)
  • Data production and consumption rates, though
    dynamic, are equal across an edge for a large
    number of applications, hence the name
    homogeneous.
  • Reconfiguration can be performed without
    introducing hierarchy when more natural to do so
    (advantage over parameterized dataflow).
  • Parameterized dataflow is a more powerful
    technique and thus can be used to represent a
    wider set of applications.

24
Applications
  • Applications with dynamic run-time data and
    aggregated final-stage processes are modeled
    especially well by HPDF over SDF semantics.
  • Many applications in image and speech processing
    seem well suited for our model.
  • We applied the model to two applications:
  • A real-time video processing algorithm
    for a smart camera developed at Princeton
  • A face detection algorithm developed at
    CFAR labs at UMD.

25
Application characteristics
  • This structure seems to be abundant in many
    audio/video applications.
  • Our HPDF model is a natural fit for applications
    with the above structure.

26
Gesture recognition algorithm
  • Real-time video processing for gesture
    recognition.
  • Does low-level (red oval) and high-level
    processing.
  • Low-level processing recognizes body parts and
    identifies movements.
  • High-level processing recognizes actions.
  • We concentrate on low-level processing.

Ref: W. Wolf, B. Ozer, and T. Lv. Smart cameras as
embedded systems. IEEE Computer Magazine,
35(9):48-53, September 2002.
27
HPDF model of gesture recognition algorithm
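[Figure: HPDF model of the gesture recognition algorithm (Ptolemy II implementation). Two edges carry dynamic data, with matching rates n/n and p/p at their endpoints, feeding an aggregating final stage.]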
28
Modeling with HPDF/CSDF
[Figure: HPDF/CSDF model with actors VIDEO INPUT, REGION EXTRACTION, CONTOUR FOLLOWING, ELLIPSE FITTING, and MATCH. The input is annotated "phases pixels s"; edge annotations include (s 1), (Xi, Yi), (I 0, I ki), (n 1), and p (pi 1, qi 0). One annotation reads: p phases with 1 token and (n-p) phases with 0 token production.]
29
Integrating HPDF and CSDF
  • Number of phases in a fundamental period can vary
    dynamically.
  • Number of tokens produced or consumed in a given
    phase can also vary dynamically.
  • HPDF constraint: the total number of tokens
    produced by a source actor of a given edge in a
    given invocation (a fundamental period) must
    equal the total number of tokens consumed by the
    sink in its corresponding invocation.
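The HPDF constraint on a CSDF edge reduces to comparing two sums over one invocation, however many phases each side uses. The sketch below uses a hypothetical helper, `hpdf_edge_ok`, and made-up phase lists; the first case mirrors the pattern of p phases that produce 1 token and (n-p) phases that produce 0 tokens, with a sink that consumes all p tokens in a single phase.

```python
def hpdf_edge_ok(produced_per_phase, consumed_per_phase):
    """True if the source's total production over one invocation equals
    the sink's total consumption over its corresponding invocation."""
    return sum(produced_per_phase) == sum(consumed_per_phase)


# n = 5 source phases, p = 3 of them produce a token; sink takes p = 3 at once.
case_a = hpdf_edge_ok([1, 0, 1, 1, 0], [3])

# Phase counts may differ on the two sides as long as the totals agree.
case_b = hpdf_edge_ok([0, 4], [1, 1, 1, 1])

# Violation: the sink would leave one token behind in every invocation.
case_c = hpdf_edge_ok([1, 1], [1])
```

Because the totals match per invocation, no tokens accumulate on the edge from one invocation to the next, even though the phase counts and per-phase rates can change at run time.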

30
Finer granularity and Input modeling
  • Each frame has 384x240 pixels, so we model the
    input as a CSDF actor with s = 92160 phases.
  • The model captures the pixel-level parallelism
    present in Region.
  • It also captures the frame-level parallelism
    through the number of phases in Input (s).

31
Modeling dynamicity - Contour
  • 2 phases for Contour
  • The first phase scans until it finds a contour.
  • Output: 0 tokens
  • The second phase follows this contour and all the
    overlapping ones.
  • Output: ki tokens; each token is a list of
    pixels from a contour
  • Homogeneous condition remains
  • s

32
Scheduling
  • VRCEM
  • (s V)(s R)(2I C)(n E)M
  • (s VR)(2I C)(n E)M
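Reading (k X) as k consecutive firings of X, and V, R, C, E, M as the video input, region extraction, contour following, ellipse fitting, and match actors, the two looped schedules can be compared with a small sketch. The function names and parameter values are illustrative, and the sketch assumes that each firing of V produces one token that one firing of R consumes.

```python
from collections import Counter


def unfused(s, I, n):
    """(s V)(s R)(2I C)(n E)M"""
    return ["V"] * s + ["R"] * s + ["C"] * (2 * I) + ["E"] * n + ["M"]


def fused(s, I, n):
    """(s VR)(2I C)(n E)M"""
    return ["V", "R"] * s + ["C"] * (2 * I) + ["E"] * n + ["M"]


def peak_tokens_v_to_r(firings):
    """Peak number of tokens queued on the V-to-R edge under the 1:1 assumption."""
    tokens = peak = 0
    for actor in firings:
        if actor == "V":
            tokens += 1
            peak = max(peak, tokens)
        elif actor == "R":
            tokens -= 1
    return peak


s, I, n = 6, 2, 3   # tiny values for illustration; I and n vary at run time
a, b = unfused(s, I, n), fused(s, I, n)
same_counts = Counter(a) == Counter(b)
peak_a, peak_b = peak_tokens_v_to_r(a), peak_tokens_v_to_r(b)
```

Both schedules fire every actor the same number of times; merging the V and R loops lowers the buffering needed between them from s tokens to one under the stated assumption.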

33
Results
  • We applied HPDF to successfully model a face
    detection algorithm also.
  • We developed a TI DSP implementation of the HPDF
    model of the gesture recognition algorithm.
  • The application was run on a TMS320C64xx fixed
    point processor.
  • When implemented with our HPDF model, the
    runtime was 21405671 cycles.
  • With a 40ns cycle period, execution time for the
    application was 0.86 sec.

34
Results (contd.)
  • Scheduling overhead was minimal because a
    highly streamlined quasi-static schedule was
    obtained.
  • Worst-case buffer size was 642 Kb when the input
    images were 384x240 pixels. HPDF modeling
    suggested buffer reuse between the edges.
  • The original C code had a runtime of 27741882
    cycles; execution time was 1.11 sec with the same
    clock period of 40 ns.
  • HPDF improved runtime by 23%.
  • Efficient hardware code generation is being
    looked into using hardware synthesis framework
    developed in our research group.
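The reported figures can be cross-checked with a few lines of arithmetic, using only the cycle counts and the 40 ns cycle period quoted in the results. The variable names are illustrative.

```python
CYCLE_NS = 40                 # cycle period from the results, in nanoseconds
hpdf_cycles = 21_405_671      # runtime with the HPDF-based implementation
c_cycles = 27_741_882         # runtime of the original C code

hpdf_seconds = hpdf_cycles * CYCLE_NS / 1e9
c_seconds = c_cycles * CYCLE_NS / 1e9
improvement_pct = 100 * (c_cycles - hpdf_cycles) / c_cycles

print(f"HPDF: {hpdf_seconds:.2f} s, original C: {c_seconds:.2f} s, "
      f"improvement: {improvement_pct:.1f}%")
```

The improvement is measured relative to the original C runtime, which is how the rounded 23% figure arises.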

35
Summary
  • Dataflow-based models of computation are
    attractive for modeling the behavior of DSP
    applications
  • Decidable dataflow models are useful for exposing
    and exploiting static structure in synthesis
    tools for DSP
  • Decidable dataflow models in conjunction with
    structured reconfigurable techniques allow for
    efficient handling of application dynamics
  • Examples of structured, reconfigurable dataflow
    techniques that we discussed
  • Parameterized dataflow and its application to SDF
  • Homogeneous-parameterized dataflow and its
    application to SDF and CSDF
  • Experiments on a gesture recognition application
  • Other examples include dynamic configuration of
    graph topologies, and blocked dataflow modeling.

36
References
  • B. Bhattacharya and S. S. Bhattacharyya.
    Parameterized dataflow modeling for DSP systems.
    IEEE Transactions on Signal Processing,
    49(10):2408-2421, October 2001.
  • S. S. Bhattacharyya, R. Leupers, and P. Marwedel.
    Software synthesis and code generation for DSP.
    IEEE Transactions on Circuits and Systems II:
    Analog and Digital Signal Processing,
    47(9):849-875, September 2000.
  • G. Bilsen, M. Engels, R. Lauwereins, and J. A.
    Peperstraete. Cyclo-static dataflow. IEEE
    Transactions on Signal Processing, 44(2):397-408,
    February 1996.
  • D. Ko and S. S. Bhattacharyya. Dynamic
    configuration of dataflow graph topology for DSP
    system design. In Proceedings of the
    International Conference on Acoustics, Speech,
    and Signal Processing, pages V-69-V-72,
    Philadelphia, Pennsylvania, March 2005.
  • E. A. Lee and D. G. Messerschmitt. Static
    scheduling of synchronous dataflow programs for
    digital signal processing. IEEE Transactions on
    Computers, February 1987.
  • S. Neuendorffer and E. Lee. Hierarchical
    reconfiguration of dataflow models. In
    Proceedings of the International Conference on
    Formal Methods and Models for Codesign, June
    2004.
  • M. Sen, S. S. Bhattacharyya, T. Lv, and W. Wolf.
    Modeling image processing systems with
    homogeneous parameterized dataflow graphs. In
    Proceedings of the International Conference on
    Acoustics, Speech, and Signal Processing, pages
    V-133-V-136, Philadelphia, Pennsylvania, March
    2005.