A systematic approach to exploring embedded system architectures at multiple abstraction levels

1
A systematic approach to exploring embedded
system architectures at multiple abstraction
levels
Pimentel, A.D.; Erbas, C.; Polstra, S.
Informatics Inst., Amsterdam Univ., Netherlands
IEEE Transactions on Computers, Feb. 2006,
Volume 55, Issue 2, Pages 99-112
Presenter: Fu-Ching Yang
2
Outline
  • Abstract
  • Introduction
  • Related work
  • System-level design space exploration
  • Application modeling
  • Architecture modeling
  • Issues of mapping
  • Architecture model refinement through trace
    transformation
  • Case study: Motion-JPEG
  • Conclusion

3
Abstract
  • The sheer complexity of today's embedded systems
    forces designers to start with modeling and
    simulating system components and their
    interactions in the very early design stages. It
    is therefore imperative to have good tools for
    exploring a wide range of design choices,
    especially during the early design stages, where
    the design space is at its largest. This paper
    presents an overview of the Sesame framework,
    which provides high-level modeling and simulation
    methods and tools for system-level performance
    evaluation and exploration of heterogeneous
    embedded systems. More specifically, we describe
    Sesame's modeling methodology and trajectory. It
    takes a designer systematically along the path
    from selecting candidate architectures, using
    analytical modeling and multiobjective
    optimization, to simulating these candidate
    architectures with our system-level simulation
    environment. This simulation environment
    subsequently allows for architectural exploration
    at different levels of abstraction while
    maintaining high-level and architecture-independent
    application specifications. We illustrate all
    these aspects using a case study in which we
    traverse Sesame's exploration trajectory for a
    motion-JPEG encoder application.

4
Introduction
  • Background
  • SoC-based embedded systems often have a
    heterogeneous system architecture
  • Programmable processors + dedicated hardware blocks
  • What's the problem?
  • Traditional design methods fall short for the
    design of these systems
  • They cannot deal with the systems' complexity and
    flexibility
  • Solution
  • System level design
  • Modeling and simulating system components and
    their interactions in the early design stage.
  • Minimize the modeling effort and optimize
    simulation speed

5
Overview of the Sesame system
  • Basic principle
  • Platform architectures
  • Separation of concerns
  • High level modeling and simulation
  • Work to do
  • Selecting candidate architectures
  • Analytical modeling
  • Multi-objective optimization

6
Related work (cont.)
  • There are several architecture exploration
    environments, such as (Metro)Polis, Mescal, MESH,
    and Milan.
  • Mapping a behavioral application specification to
    an architecture specification
  • What does Sesame improve?
  • A stricter separation of application modeling and
    architecture modeling
  • architecture-independent application models
  • application-independent architecture models
  • a mapping step that relates these models for
    trace-driven cosimulation.

7
Related work
  • In the domain of hardware/software codesign of
    embedded systems, multiobjective optimization
    studies have been performed extensively for
    system-level synthesis (e.g., [20], [21], [22])
    and platform configuration.
  • In this paper, this is not their primary
    consideration

8
System-level design space exploration
  • Y-chart design methodology
  • Separate application models and architecture
    (performance) models
  • An explicit mapping step to map application tasks
    onto architecture resources.
  • Application model
  • Describes the functional behavior of an application
    in a timing- and architecture-independent manner
  • Architecture model
  • Defines architecture resources and captures their
    performance constraints
  • Essential in this methodology is that an
    application model is independent of architecture
    model
  • Application models can be reused in the
    exploration cycle
  • Gradual refinement of architecture performance
    models

9
Application modeling (cont.)
  • Use the Kahn Process Network (KPN) model
  • This model is natural for describing the streams
    of data samples in a signal processing system.
  • producers are not blocked because queues are of
    infinite length
  • Consumers are blocked when they attempt to get
    data from an empty input channel.
  • This model is determinate
  • The results of the computation do not depend on
    the firing order of the processes
  • Node: a process in the application
  • Directed edge: a channel between processes
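
The blocking-read, non-blocking-write behavior described above can be sketched with threads and unbounded queues. This is a minimal illustration, not Sesame code: the three processes, the DONE sentinel, and run_network are hypothetical names chosen for the example.

```python
import queue
import threading

DONE = object()  # hypothetical end-of-stream sentinel, not part of the KPN model


def producer(out_ch, count):
    # Writes never block: queue.Queue() with no maxsize is unbounded.
    for value in range(count):
        out_ch.put(value)
    out_ch.put(DONE)


def doubler(in_ch, out_ch):
    while True:
        token = in_ch.get()  # blocks while the input channel is empty
        if token is DONE:
            out_ch.put(DONE)
            return
        out_ch.put(2 * token)


def consumer(in_ch, results):
    while True:
        token = in_ch.get()  # blocks while the input channel is empty
        if token is DONE:
            return
        results.append(token)


def run_network(count):
    a_to_b = queue.Queue()
    b_to_c = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=producer, args=(a_to_b, count)),
        threading.Thread(target=doubler, args=(a_to_b, b_to_c)),
        threading.Thread(target=consumer, args=(b_to_c, results)),
    ]
    # Start in reverse order: the result does not depend on scheduling.
    for thread in reversed(threads):
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Starting the consumer before the producer changes nothing in the output, which is the determinacy property the slide refers to.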

[Figure: process network with Process A, Process B, and Process C]
10
Application modeling
  • Describe application model in YML
  • For rapid creation and modification
  • Each node has two characteristics
  • Computation requirement
  • The workload imposed by the node onto a particular
    component in the architecture model
  • This is done by instrumenting the code with
    annotations that describe the application's
    computational and communication actions
  • To generate traces of application events for the
    architecture model
  • Coarse-grained events: read, write, and execute
  • Allele set
  • The processors that the node can be mapped onto
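
As a rough sketch of what such an instrumented process produces, the following builds a trace of coarse-grained read, execute, and write events. The TraceEvent type, the channel names, and the "dct" operation are assumptions made for illustration; they are not Sesame's actual annotation API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    kind: str    # "read", "execute", or "write"
    detail: str  # channel name for read/write, operation name for execute


def block_process_trace(num_blocks):
    """Trace of a hypothetical process that handles num_blocks data blocks."""
    trace = []
    for _ in range(num_blocks):
        trace.append(TraceEvent("read", "blocks_in"))    # communication event
        trace.append(TraceEvent("execute", "dct"))       # computation event
        trace.append(TraceEvent("write", "blocks_out"))  # communication event
    return trace
```

The trace carries no timing information; timing is attached later by the architecture model.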

[Figure: process network with Process A, Process B, and Process C]
11
Architecture modeling
  • Implemented using either Pearl or SystemC
  • Each processor in an architecture model
  • processing capacity
  • power consumption
  • fixed cost.
  • Operates at the transaction level
  • Simulates the performance consequences of the
    computation and communication events generated by
    an application model
  • Only accounts for architectural performance
    constraints
  • Does not model functional behavior
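
A performance-only component of this kind can be sketched as a cycle counter that consumes trace events. The class name, the latency table, and the cycle figures below are illustrative assumptions; the real models are written in Pearl or SystemC.

```python
class ProcessingComponent:
    """Accumulates latency for incoming events; computes no functional result."""

    def __init__(self, name, execute_cycles, transfer_cycles):
        self.name = name
        self.execute_cycles = execute_cycles    # operation name -> cycles
        self.transfer_cycles = transfer_cycles  # cycles per read or write
        self.elapsed = 0

    def simulate(self, kind, operation=None):
        if kind == "execute":
            self.elapsed += self.execute_cycles[operation]
        elif kind in ("read", "write"):
            self.elapsed += self.transfer_cycles
        else:
            raise ValueError(f"unknown event kind: {kind}")
        return self.elapsed


def replay(trace, component):
    """Feed (kind, operation) pairs to the component and return total cycles."""
    for kind, operation in trace:
        component.simulate(kind, operation)
    return component.elapsed
```

For example, replaying two read/execute/write rounds on a component with a 100-cycle "dct" and 10-cycle transfers yields 240 cycles.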

12
Issues of mapping (cont.)
  • Kahn processes map to virtual processors in the
    mapping layer
  • Kahn channels map to FIFO buffers
  • The mapping layer is automatically generated
  • A virtual processor in the mapping layer reads in
    an application trace from a Kahn process via a
    trace event queue and dispatches the events to a
    processing component in the architecture model.
  • The mapping of a virtual processor onto a
    processing component in the architecture model is
    freely adjustable.
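
The role of a virtual processor can be sketched as a small dispatcher that drains a trace event queue into whichever component it is currently mapped onto. All class and method names here are hypothetical and only illustrate the idea of an adjustable mapping.

```python
from collections import deque


class Component:
    """Stand-in for a processing component in the architecture model."""

    def __init__(self, name):
        self.name = name
        self.handled = []

    def handle(self, event):
        self.handled.append(event)


class VirtualProcessor:
    """Reads application events from a queue and forwards them to its target."""

    def __init__(self, trace_events):
        self.queue = deque(trace_events)  # trace event queue from one Kahn process
        self.target = None

    def map_to(self, component):
        # The mapping can be changed without touching the application trace.
        self.target = component

    def dispatch_all(self):
        if self.target is None:
            raise RuntimeError("virtual processor is not mapped")
        while self.queue:
            self.target.handle(self.queue.popleft())
```

Remapping is a single call to map_to; the application trace itself is unchanged, which is what allows different mappings to be explored with the same application model.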

13
Issues of mapping (cont.)
  • The enumeration of all possible mappings grows
    exponentially
  • It is an MMPN problem
  • The Multiprocessor Mappings of Process Networks
    problem
  • g_i(x) are the constraints
  • Each Kahn node has to be mapped onto a single
    processor
  • Each channel in the application model has to be
    mapped onto a processor or a memory
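
To make the exponential growth concrete, the sketch below counts and enumerates process-to-processor mappings under the single-processor constraint. The process names and allele sets are made up for illustration, and channel mapping is left out for brevity.

```python
from itertools import product
from math import prod


def count_mappings(allele_sets):
    """Number of mappings when each process picks one processor from its allele set."""
    return prod(len(choices) for choices in allele_sets.values())


def enumerate_mappings(allele_sets):
    """Yield every mapping as a dict: process name -> processor name."""
    names = list(allele_sets)
    for choice in product(*(allele_sets[name] for name in names)):
        yield dict(zip(names, choice))


# Hypothetical allele sets: the processors each process may be mapped onto.
EXAMPLE = {
    "A": ["PE-0", "PE-1"],
    "B": ["PE-1"],
    "C": ["PE-0", "PE-1", "PE-2"],
}
```

With n processes that can each run on any of p processors, the count is p to the power n; ten processes on four processors already give 1,048,576 mappings, which is why exhaustive enumeration does not scale.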

14
Issues of mapping
  • They use Strength Pareto Evolutionary Algorithm
    (SPEA2) to find a set of approximated
    Pareto-optimal mapping solutions
  • E. Zitzler, M. Laumanns, and L. Thiele, "SPEA2:
    Improving the Strength Pareto Evolutionary
    Algorithm for Multiobjective Optimization,"
    Evolutionary Methods for Design, Optimisation,
    and Control, pp. 95-100, Barcelona: CIMNE, 2002.
  • Each mapping solution is represented by an
    individual encoding: a chromosome in which the
    genes encode the values of parameters.
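
The chromosome encoding and the notion of a nondominated (Pareto-optimal) solution can be illustrated on a toy problem. This sketch is not SPEA2: it enumerates all chromosomes and filters them by Pareto dominance. The workloads, processor speeds, costs, and the two objectives (maximum processing time, total cost) are assumed values for illustration only.

```python
from itertools import product

WORKLOAD = [4, 2, 6]  # hypothetical work units of processes 0..2
SPEED = [1, 2]        # hypothetical work units per time unit, per processor
COST = [1, 3]         # hypothetical fixed cost per processor


def evaluate(chromosome):
    """Gene i holds the processor index for process i. Returns (max time, cost)."""
    load = [0.0] * len(SPEED)
    for process, processor in enumerate(chromosome):
        load[processor] += WORKLOAD[process] / SPEED[processor]
    used = set(chromosome)
    return (max(load), sum(COST[p] for p in used))


def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front():
    """Map each nondominated chromosome to its objective vector (minimization)."""
    scored = {c: evaluate(c) for c in product(range(len(SPEED)), repeat=len(WORKLOAD))}
    values = set(scored.values())
    return {
        c: v for c, v in scored.items()
        if not any(dominates(other, v) for other in values)
    }
```

An evolutionary algorithm such as SPEA2 approximates this front without enumerating every chromosome, which matters once the mapping space becomes large.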

15
Architecture model refinement through trace
transformation
  • Refinement of application events is denoted using
    trace transformations
  • Left-hand side: the coarse-grained application
    events that need to be refined
  • Right-hand side: the resulting
    architecture-level events

cd: check-data   ld: load-data    sr: signal-room
cr: check-room   st: store-data   sd: signal-data
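
A trace transformation can be sketched as a rewrite table applied event by event. The transformation rules themselves were shown in a figure that did not survive in this transcript, so the table below is an assumption inferred from the legend: a read (R) expands to cd, ld, sr, a write (W) expands to cr, st, sd, and an execute (E) is left unchanged.

```python
# Assumed rewrite rules, inferred from the legend above (not taken from the slide).
REFINEMENT_RULES = {
    "R": ["cd", "ld", "sr"],  # check-data, load-data, signal-room
    "E": ["E"],               # execute events are passed through
    "W": ["cr", "st", "sd"],  # check-room, store-data, signal-data
}


def refine(trace):
    """Replace each coarse-grained event with its architecture-level events."""
    refined = []
    for event in trace:
        refined.extend(REFINEMENT_RULES[event])
    return refined
```

Under these assumed rules, refine(["R", "E", "W"]) gives cd, ld, sr, E, cr, st, sd.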
16
Architecture model refinement through trace
transformation
  • For example, an application process that
  • Reads a block of data from an input buffer
  • Performs some computation on it
  • Writes the results to an output buffer
  • R -> E -> W
  • Assume the hardware has no local memory

Check whether there is room in the output buffer
The input buffer must remain available until the
processing component has finished operating on it
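
The two remarks above amount to ordering constraints on the refined trace: room must be checked before execution starts, and room on the input side may only be signaled after execution ends. The sketch below checks a candidate ordering against those constraints. Both the per-event rules and the candidate ordering are assumptions for illustration; the exact transformation from the slide's figure is not in this transcript.

```python
# Assumed per-event rules (R -> cd, ld, sr; W -> cr, st, sd), used as a baseline.
PER_EVENT_RULES = {"R": ["cd", "ld", "sr"], "E": ["E"], "W": ["cr", "st", "sd"]}

# One hypothetical ordering for a component without local memory.
CANDIDATE = ["cd", "cr", "ld", "E", "st", "sr", "sd"]


def refine_event_by_event(trace):
    """Naive refinement that expands each event independently."""
    return [low for event in trace for low in PER_EVENT_RULES[event]]


def respects_no_local_memory(refined):
    """cr must come before E, and sr must come after E."""
    execute_at = refined.index("E")
    return refined.index("cr") < execute_at < refined.index("sr")
```

The naive event-by-event expansion fails the check, because it signals room before executing and checks the output buffer only afterwards. The candidate ordering contains the same events and satisfies both constraints.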
17
Event refinement using dataflow graphs
  • SDF: Synchronous Data Flow
  • It performs the actual event refinement
  • IDF: Integer-controlled Data Flow
  • To model repetitions and branching conditions
  • Each Kahn process in the application model has
    an IDF graph at the mapping layer
  • The IDF graph is embedded in the corresponding
    virtual processor
  • The IDF graphs are executable as the actors have
    an execution mechanism
  • Firing rules specify when an actor can fire.
  • When firing an actor
  • Consume the required tokens from its input token
    channels
  • Produce a specified number of tokens on its
    output channels.
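
The firing mechanism can be sketched with a small actor class whose firing rule is a fixed token count per input channel. The class, the channel names, and the rates are illustrative assumptions, not Sesame's implementation.

```python
class Actor:
    """Dataflow actor with fixed consumption and production rates."""

    def __init__(self, name, consume, produce):
        self.name = name
        self.consume = consume  # input channel -> tokens needed per firing
        self.produce = produce  # output channel -> tokens emitted per firing

    def can_fire(self, tokens):
        # Firing rule: every input channel holds enough tokens.
        return all(tokens.get(ch, 0) >= n for ch, n in self.consume.items())

    def fire(self, tokens):
        if not self.can_fire(tokens):
            raise RuntimeError(f"{self.name} cannot fire")
        for ch, n in self.consume.items():
            tokens[ch] -= n
        for ch, n in self.produce.items():
            tokens[ch] = tokens.get(ch, 0) + n
```

For example, an actor that needs one token on "events" and one on "room" cannot fire while "room" is empty, and firing it moves one token to its output channel.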

18
Example of SDF
  • Two Kahn application processes act as a
    producer-consumer pair communicating pixel blocks
  • The cr actor fires when it receives a W(rite)
    application event
  • SDF actors can be coupled (i.e., mapped) to
    architecture model components.
  • A firing SDF actor may send a token to the
    architecture model to initiate the simulation of
    an event
  • The SDF actor in question is then blocked until
    it receives an acknowledgment token from the
    architecture model indicating that the
    performance consequences of the event have been
    simulated.

[Figure: producer and consumer SDF graphs; the delay of the channel models a FIFO buffer of b elements]
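
The effect of a b-element buffer can be sketched by counting tokens. The assumption here is that the channel delay corresponds to b initial "room" tokens that the producer side consumes on each write and the consumer side returns on each read. The function below is a deliberately simplified model: events that cannot fire are reported as stalled rather than retried.

```python
def run_producer_consumer(events, b):
    """events: sequence of 'W' (producer write) and 'R' (consumer read)."""
    room_tokens = b  # initial tokens (delay) modeling b free buffer slots
    data_tokens = 0  # filled buffer slots
    fired, stalled = [], []
    for index, event in enumerate(events):
        if event not in ("W", "R"):
            raise ValueError(f"unknown event: {event}")
        if event == "W" and room_tokens > 0:
            room_tokens -= 1  # check-room succeeds, a slot is taken
            data_tokens += 1  # signal-data after the store
            fired.append(index)
        elif event == "R" and data_tokens > 0:
            data_tokens -= 1  # check-data succeeds
            room_tokens += 1  # signal-room after the load
            fired.append(index)
        else:
            stalled.append(index)
    return fired, stalled
```

With b = 2, a third consecutive write stalls until a read has returned a room token.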
19
Example of IDF
  • In the IDF graphs, scheduling information of
    actors is not incorporated into the graph
    definition, but is explicitly supplied by a
    scheduler.

20
Event refinement using dataflow graphs
  • Communication refinement is accomplished by
    simply replacing SDF actors with refined ones
  • Allowing for evaluating the performance of
    different communication behaviors at the
    architecture level
  • The application model remains unaffected.

21
Case study: M-JPEG
  • Objective: to find promising instances of this
    platform that allow a good mapping of the M-JPEG
    application
  • Application model of the Motion-JPEG encoder

22
Case study: M-JPEG
Platform architecture model
Processor and Memory Characteristics
Processor Characteristics
23
Mapping result
  • The nondominated front is obtained by plotting
    the 17 nondominated solutions that were found by
    SPEA2 in a single run.
  • Takes about 5 seconds on a 2.8GHz Pentium-4
    machine
  • For a larger system with 26 processes and 75
    channels
  • Takes about 25 seconds on a 2.8GHz Pentium-4
    machine

Parameters of SPEA2
Population size: 100
Number of generations: 1000
Mutation probability: 0.5
Bit mutation probability: 0.01
Crossover probability: 0.8
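
The parameters above can be collected in a small configuration object and used by generic variation operators. This is a sketch, not the SPEA2 implementation used in the paper. In particular, it assumes that the mutation probability selects whether an individual is mutated at all and that the bit mutation probability then applies to each bit; the slide does not say how the two are combined.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Spea2Params:
    population_size: int = 100
    generations: int = 1000
    mutation_probability: float = 0.5       # assumed: per individual
    bit_mutation_probability: float = 0.01  # assumed: per bit of a selected individual
    crossover_probability: float = 0.8


def one_point_crossover(a, b, params, rng):
    """Swap the tails of two equal-length bit lists with the configured probability."""
    if len(a) < 2 or rng.random() >= params.crossover_probability:
        return a[:], b[:]
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(bits, params, rng):
    """Flip individual bits of a selected individual; returns a new list."""
    if rng.random() >= params.mutation_probability:
        return bits[:]
    return [
        bit ^ 1 if rng.random() < params.bit_mutation_probability else bit
        for bit in bits
    ]
```

Passing an explicit random.Random instance keeps runs reproducible, which is convenient when comparing fronts from different parameter settings.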
24
Further investigation
  • Select two nondominated solutions
  • The estimated cycle count
  • They also mapped various application models onto
    models of existing architecture implementation
  • To compare the timing of actual implementation
    with their estimation
  • The errors of their estimations are < 5%

Solution 1: PE-1, PE-2, PE-3
Solution 2: PE-0, PE-1
25
Further investigation
  • To model more implementation details of DCT at
    architecture level
  • Refine the PE that the DCT task is mapped onto
  • Luminance blocks need to be preshifted before a
    DCT is performed.

26
The result of refinement
  • Solution 1 has a balanced system
  • Performance of solution 2 is limited by less
    powerful PE-0

Solution 1: PE-1, PE-2, PE-3
Solution 2: PE-0, PE-1
27
Try to improve PE-0 performance
  • Allow the PE to execute preshift and 2D-DCT in
    parallel
  • Implementation-5 simply reduces the execution
    latency
  • Implementation-6 refines the preshift and 2D-DCT
    operations

28
Conclusion
  • Modeling and simulation methods and tools for
    system-level performance evaluation and
    exploration of heterogeneous SoC-based embedded
    media systems.
  • Uses analytical modeling and multiobjective
    optimization to select candidate architectures
  • Bridges the abstraction gap between application
    and architecture models
  • architectural exploration at different levels of
    abstraction

29
Genetic Algorithm (2)