Title: An Architecture Exploration Framework for Embedded DSP Systems
1. An Architecture Exploration Framework for Embedded DSP Systems
- Presenter: Ahmed Elhossini
- ENG6530: Reconfigurable Computing
- School of Engineering, University of Guelph
2. Outline
- Motivations and Problem Definition
- Background
- Framework Overview
- Searching the Design Space
- Evaluation and Validation
- Architecture Exploration Flow
- Conclusions
- Present Work
3. Motivations
- The emergence of the System-on-Chip concept introduced many challenges for the designer.
- A tool that helps the designer estimate the design requirements is needed.
4. Problem Definition
- Given a software application, explore the design space formed by the application and a core library to define:
- Hardware architecture
- Computation (processors, cores, etc.)
- Communication (bus interfaces, FIFOs, etc.)
- Partitioning/Mapping
5. Problem Definition
- Find M_i, C_i, B_ij to meet A_i of S_i.
- The problem is formulated as a multi-objective optimization problem.
6. Sample SOE Architecture
7. Architecture Exploration
9. Methodologies
- Orthogonalization of concerns is an important concept in architecture exploration.
- Separation of concerns involves the following:
- Separating function and architecture.
- Separating communication and computation.
10. The Y Chart
11. Requirements and Challenges of Architecture Exploration
- Modeling
- Application Modeling
- Architecture Modeling
- Mapping
- Performance Measurement
- System Update (Searching the Design Space)
- Software
- Architecture
- Mapping
12. Architecture Exploration
13. Architecture Exploration Support
- Both the architecture and the application need to be represented in some form that models their different characteristics.
- Different representations at different abstraction levels are used to model applications and architectures.
- Co-simulation of both models is used to evaluate different implementations.
14. Architecture Modelling
- Architecture Description Languages (ADLs) are used to model the architecture at a high level of abstraction.
- Hardware Description Languages (HDLs) are used to model the architecture at different levels of abstraction.
- Instruction set models are used to model the behaviour of the processor.
- When a micro-architecture template is used as the implementation platform, models and tools for simulation are available.
15. Application Modelling
- Applications are usually specified in a high-level language, so compilers and interpreters are required.
- Kahn Process Networks (KPN) model the application as concurrent processes communicating through unbounded, unidirectional, point-to-point FIFO channels.
- The Ptolemy framework provides a good environment for modelling different types of applications with different models of computation.
- Directed task graphs.
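The KPN bullet above can be illustrated with a small sketch. It is not taken from the framework: the process names (`source`, `scale`, `sink`), the `None` end-of-stream marker, and the use of threads are all assumptions made for illustration. Each process blocks when reading an empty channel, and each channel is an unbounded, one-way FIFO with a single writer and a single reader.

```python
import threading
from queue import Queue  # Queue() with no maxsize is unbounded; get() blocks when empty


def source(out_ch, samples):
    """Producer process: writes every sample, then an end-of-stream marker."""
    for s in samples:
        out_ch.put(s)
    out_ch.put(None)  # None as end marker is an assumption of this sketch


def scale(in_ch, out_ch, gain):
    """Filter process: blocking read, multiply, write."""
    for x in iter(in_ch.get, None):  # read until the end marker arrives
        out_ch.put(gain * x)
    out_ch.put(None)


def sink(in_ch, result):
    """Consumer process: collects everything it reads."""
    for x in iter(in_ch.get, None):
        result.append(x)


def run_network(samples, gain=2):
    a, b = Queue(), Queue()  # two point-to-point channels
    result = []
    procs = [
        threading.Thread(target=source, args=(a, samples)),
        threading.Thread(target=scale, args=(a, b, gain)),
        threading.Thread(target=sink, args=(b, result)),
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return result
```

Because reads block and every channel has exactly one reader, the output does not depend on how the threads are interleaved, which is the property that makes KPNs attractive for modelling streaming DSP applications.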
16. Evaluation Techniques
17. Evaluation Techniques
- Accurate simulation of the system
- Gives an accurate evaluation at the cost of a long evaluation time.
- Trace-driven simulation
- An initial program run extracts all memory accesses and stores them in a trace, which is then used for performance estimation.
- It is more efficient for estimating the performance of the memory subsystem.
- Statistical simulation
- The statistical simulator does not execute the program in a precise order; instead, it simulates a statistical profile of the program.
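As a rough illustration of trace-driven simulation, the sketch below replays a recorded list of memory addresses through a direct-mapped cache model and counts hits and misses. The cache organisation, the parameter names, and the sample trace are assumptions chosen for brevity, not details from the slides.

```python
def simulate_direct_mapped(trace, num_lines=4, line_bytes=16):
    """Replay an address trace through a direct-mapped cache model.

    Returns (hits, misses). The program itself is never re-executed;
    only the recorded addresses are used.
    """
    tags = [None] * num_lines  # one stored tag per cache line
    hits = 0
    for addr in trace:
        block = addr // line_bytes   # memory block holding this address
        index = block % num_lines    # cache line the block maps to
        tag = block // num_lines     # identifies which block occupies the line
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag        # miss: load the block, evicting the old one
    return hits, len(trace) - hits


# Hypothetical trace captured from an initial program run.
example_trace = [0, 4, 8, 64, 0, 16]
hits, misses = simulate_direct_mapped(example_trace)
```

The same trace can be replayed against many cache configurations, which is why this style of evaluation is cheap for the memory subsystem.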
18. Evaluation Techniques
- Analytical evaluation
- Analytical models are developed for the different components and used for performance estimation.
- The accuracy of analytical evaluation depends on the accuracy of the developed analytical model.
- Intelligent approaches
- Some research is directed toward the use of intelligent approaches for the evaluation of embedded systems.
- Fuzzy logic and neural networks are examples of such methods.
19. Searching the Design Space
20. Searching the Design Space
- Exhaustive search covers all possible solutions in the design space.
- It has the cost of a long search time.
- Local search covers a portion of the design space to find a local, sub-optimal solution.
- Advantage: short search time.
- Disadvantage: high probability of getting trapped in local minima.
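The trade-off above can be seen on a toy one-dimensional design space. In this sketch the cost table and the neighbourhood (adjacent indices) are made up for illustration; exhaustive search always finds the global minimum, while hill climbing stops at the first point with no better neighbour.

```python
def exhaustive_search(cost):
    """Evaluate every design point and return the index of the cheapest one."""
    return min(range(len(cost)), key=cost.__getitem__)


def local_search(cost, start):
    """Hill climbing: move to the best adjacent point until none is better."""
    current = start
    while True:
        neighbours = [i for i in (current - 1, current + 1) if 0 <= i < len(cost)]
        best = min(neighbours, key=cost.__getitem__)
        if cost[best] >= cost[current]:
            return current  # no improving neighbour: a (possibly local) minimum
        current = best


# Hypothetical cost of seven candidate designs.
costs = [5, 3, 4, 6, 2, 1, 7]
global_best = exhaustive_search(costs)   # index 5
trapped = local_search(costs, start=0)   # stops at index 1, a local minimum
```

Starting the local search at index 3 instead reaches the global minimum, which shows how strongly the result depends on the starting point.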
21. Mathematical Meta-Heuristics
- A meta-heuristic approach based on genetic algorithms and particle swarm optimization was developed.
- This approach is designed for multi-objective optimization.
- This approach was verified using a set of multi-objective problems found in the literature.
- The results were prepared for publication in an evolutionary computation journal.
23. Framework Overview
24. Architecture Exploration Flow
- Input: application model.
- Output: near-optimal hardware architecture to implement the given application.
- Targets:
- Maximize the performance
- Minimize the power consumption
- Minimize the area (resources)
- Maximize the flexibility
25. Application Model (Directed Task Graph)
26. Chromosome Representation
- The candidate architecture is represented as a vector of integers.
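The slides do not spell out the encoding, so the following is only one plausible reading: gene i holds the index of the library core that task i is mapped to. The task names and the core library contents are hypothetical.

```python
# Hypothetical core library and task list; neither comes from the slides.
CORE_LIBRARY = ["RISC processor", "DSP core", "FFT accelerator"]
TASKS = ["read", "filter", "fft", "write"]


def decode(chromosome):
    """Turn an integer vector into a task-to-core mapping.

    Assumed encoding: chromosome[i] is the index in CORE_LIBRARY of the
    core that executes TASKS[i].
    """
    if len(chromosome) != len(TASKS):
        raise ValueError("chromosome needs one gene per task")
    return {task: CORE_LIBRARY[gene] for task, gene in zip(TASKS, chromosome)}


candidate = decode([0, 1, 2, 0])
```

An integer vector like this can be handled directly by GA crossover and mutation, and by PSO once positions are rounded to valid indices.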
28. Genetic Algorithms
- A meta-heuristic approach.
- Inspired by natural selection and the evolution of living species.
- Based on three basic operations:
- Selection
- Crossover
- Mutation
- Problem representation: individuals
29. Genetic Algorithms
30. Genetic Algorithms
- Good exploration capabilities.
- Weak exploitation capabilities.
- The selection operation depends on the fitness of each individual.
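The three basic operations can be sketched for integer-vector individuals as follows. The specific operator choices (tournament selection, one-point crossover, uniform random mutation) and all parameter values are assumptions for illustration; the slides do not say which variants the framework uses.

```python
import random


def tournament_select(population, fitness, rng, k=2):
    """Selection: pick k individuals at random and keep the fittest."""
    return max(rng.sample(population, k), key=fitness)


def one_point_crossover(a, b, rng):
    """Crossover: swap the tails of two parents at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(individual, rng, rate, max_gene):
    """Mutation: replace each gene with a random value with probability rate."""
    return [rng.randint(0, max_gene) if rng.random() < rate else g
            for g in individual]


def evolve(population, fitness, generations, rng, rate=0.1, max_gene=3):
    """Run a simple generational GA and return the fittest final individual."""
    for _ in range(generations):
        next_gen = []
        while len(next_gen) < len(population):
            p1 = tournament_select(population, fitness, rng)
            p2 = tournament_select(population, fitness, rng)
            for child in one_point_crossover(p1, p2, rng):
                next_gen.append(mutate(child, rng, rate, max_gene))
        population = next_gen[:len(population)]
    return max(population, key=fitness)


rng = random.Random(0)
start = [[rng.randint(0, 3) for _ in range(6)] for _ in range(20)]
best = evolve(start, fitness=sum, generations=20, rng=rng)  # toy fitness: gene sum
```

Selection is the only step that looks at fitness, which matches the last bullet above; crossover and mutation supply the exploration.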
31. Particle Swarm Optimization (PSO)
- PSO is a robust stochastic optimization technique based on the movement and intelligence of swarms.
- PSO applies the concept of social interaction to problem solving.
- It was developed in 1995 by James Kennedy (social psychologist) and Russell Eberhart (electrical engineer).
- It uses a number of agents (particles) that constitute a swarm moving around in the search space looking for the best solution.
- Each particle is treated as a point in an N-dimensional space that adjusts its flight according to its own flying experience as well as the flying experience of other particles.
32. Particle Swarm Optimization (PSO)
[Figure: particle position update in the x-y plane]
- s^k: current searching point.
- s^(k+1): modified searching point.
- v^k: current velocity.
- v^(k+1): modified velocity.
- v_pbest: velocity based on pbest.
- v_gbest: velocity based on gbest.
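The quantities in the figure legend correspond to the commonly used PSO update, v^(k+1) = w·v^k + c1·r1·(pbest − s^k) + c2·r2·(gbest − s^k) and s^(k+1) = s^k + v^(k+1). The sketch below implements that textbook form; the inertia weight `w`, the coefficients `c1` and `c2`, and the sphere test function are assumptions, not values from the presentation.

```python
import random


def update_particle(s, v, pbest, gbest, r1, r2, w=0.7, c1=1.5, c2=1.5):
    """One PSO step: returns the modified position s^(k+1) and velocity v^(k+1)."""
    v_new = [w * vi + c1 * r1 * (pb - si) + c2 * r2 * (gb - si)
             for si, vi, pb, gb in zip(s, v, pbest, gbest)]
    s_new = [si + vi for si, vi in zip(s, v_new)]
    return s_new, v_new


def pso(cost, dim, n_particles, iterations, rng):
    """Minimise cost over [-5, 5]^dim start points; returns the best point found."""
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # each particle's best position so far
    gbest = min(pbest, key=cost)[:]          # best position seen by the swarm
    for _ in range(iterations):
        for i in range(n_particles):
            pos[i], vel[i] = update_particle(
                pos[i], vel[i], pbest[i], gbest, rng.random(), rng.random())
            if cost(pos[i]) < cost(pbest[i]):
                pbest[i] = pos[i][:]
                if cost(pbest[i]) < cost(gbest):
                    gbest = pbest[i][:]
    return gbest


def sphere(x):
    """Toy cost function with its minimum at the origin."""
    return sum(xi * xi for xi in x)


best = pso(sphere, dim=2, n_particles=10, iterations=50, rng=random.Random(0))
```

The pbest term pulls each particle toward its own experience and the gbest term toward the swarm's experience, which is the behaviour described on the previous slide.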
33. Initialization: Positions and Velocities
http//www.cems.uwe.ac.uk/jsmith/ci/pso/PSO20min
i20tutorial.ppt
34. Neighbourhoods
- Geographical
- Social
http//www.cems.uwe.ac.uk/jsmith/ci/pso/PSO20min
i20tutorial.ppt
35. Particle Swarm Optimization
- Good exploitation capabilities.
- The exploration capability of PSO depends on the particle movement parameters and the design space.
- The selection of gbest and pbest depends on the fitness of the different particles.
36. Hybrid GA-PSO
- Combines the advantages of both PSO and GA.
- Three hybrid forms are introduced and tested.
- A common representation was used for both algorithms.
37. Hybrid Representation
38. Hybrid GA-PSO-1
- In this algorithm, both PSO and GA operations are used at the same time.
- This combination gains the advantages of both algorithms.
- PSO operations can be used to perform a local search around each solution.
39. Hybrid GA-PSO-2
- In this combination, GA is used to explore the design space in the first phase of the search.
- PSO is then used to exploit the solutions found by GA.
40. Hybrid GA-PSO-3
- PSO is used for exploration.
- GA is used for exploitation.
41. Multi-Objective Optimization
- Finding the optimum solution that meets different conflicting constraints by varying different parameters.
- An exact solution could be found by exhaustive search, but this could be very time-consuming.
- Exhaustive search can be sped up by studying parameter dependencies and clustering parameters.
42. Multi-Objective Optimization (Pareto Front)
- Defines the set of solutions that covers all the trade-offs between the different objectives.
- A solution is said to be Pareto optimal if it is part of the Pareto front.
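A short sketch of how the Pareto front can be extracted from a set of evaluated solutions. It assumes every objective is minimized; the sample points (for example, power versus execution time) are invented for illustration.

```python
def dominates(a, b):
    """a dominates b if it is no worse in every objective and better in at least one.

    Assumes all objectives are to be minimized.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (power, time) values for five candidate architectures.
candidates = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
front = pareto_front(candidates)  # (3, 3) and (4, 4) are dominated by (2, 2)
```

Objectives that are maximized, such as performance or flexibility, can be handled by negating them before the comparison.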
43. Multi-Objective Optimization
- Heuristic methods can be used to find a Pareto-optimal solution.
- Several heuristic methods exist for multi-objective optimization:
- Random Search Pareto (RSP).
- Pareto Simulated Annealing (PSA).
- Pareto Reactive Tabu Search (PRTS).
- Genetic algorithms.
44. Multi-Objective Genetic Algorithms
- Two widely used multi-objective genetic algorithms:
- Strength Pareto Evolutionary Algorithm (SPEA).
- Non-dominated Sorting Genetic Algorithm (NSGA).
- SPEA uses a fine-grained but expensive computation scheme.
- NSGA uses a coarse-grained and less expensive computation scheme.
45. Multi-Objective Optimization
46. Strength Pareto Algorithm
- A fitness value is assigned to each individual.
- This value represents the strength of the individual in the multi-objective space:
- How many individuals it dominates.
- The density of solutions near that individual.
- An external archive is used to store the best individuals found so far.
- The Strength Pareto algorithm is used for both GA and PSO.
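The dominance part of the fitness can be sketched in the style of SPEA2: each individual's strength is the number of individuals it dominates, and its raw fitness is the sum of the strengths of the individuals that dominate it (lower is better, 0 means non-dominated). Whether the framework uses exactly this formulation is an assumption, and the density term and archive update are left out to keep the sketch short.

```python
def dominates(a, b):
    """Minimization: a is no worse everywhere and strictly better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def strength_fitness(points):
    """Return (strength, raw_fitness) lists aligned with points.

    strength[i]    = how many points i dominates.
    raw_fitness[i] = sum of the strengths of all points that dominate i.
    """
    strength = [sum(dominates(p, q) for q in points) for p in points]
    raw_fitness = [
        sum(strength[j] for j, q in enumerate(points) if dominates(q, p))
        for p in points
    ]
    return strength, raw_fitness


# Hypothetical objective vectors (both objectives minimized).
objs = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
strength, raw = strength_fitness(objs)
```

Individuals with a raw fitness of 0 are the candidates for the external archive; a density estimate would then break ties among them.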
47. Strength Pareto Algorithm
48. Testing the Meta-Heuristic Search Engine
- Standard benchmarks for multi-objective optimization are used.
- The search engine was also used to solve the scheduling problem (John Huisman):
- A task scheduler is used to estimate the number of cycles required for a specific application.
- The search engine is used to search the design space for an optimal schedule.
- The task scheduler is used for performance estimation.
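A minimal sketch of how a task scheduler can turn a task graph and a task-to-processor mapping into a cycle estimate. This is not the scheduler referenced on the slide: the greedy scheduling policy, the data layout, and the omission of communication cost are all simplifying assumptions.

```python
def estimate_cycles(tasks, deps, mapping):
    """Estimate total cycles for a mapped task graph.

    tasks:   {task: cycles needed}
    deps:    {task: [predecessor tasks]}
    mapping: {task: processor id}
    A task starts once its predecessors have finished and its processor is free.
    Communication cost between processors is ignored in this sketch.
    """
    finish = {}      # finish time of each scheduled task
    proc_free = {}   # time at which each processor becomes idle
    remaining = set(tasks)
    while remaining:
        ready = sorted(t for t in remaining
                       if all(p in finish for p in deps.get(t, [])))
        if not ready:
            raise ValueError("dependency cycle in task graph")
        for t in ready:
            start = max([proc_free.get(mapping[t], 0)]
                        + [finish[p] for p in deps.get(t, [])])
            finish[t] = start + tasks[t]
            proc_free[mapping[t]] = finish[t]
            remaining.remove(t)
    return max(finish.values())


# Hypothetical four-task graph: a feeds b and c, which both feed d.
tasks = {"a": 2, "b": 3, "c": 4, "d": 1}
deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
one_cpu = estimate_cycles(tasks, deps, {"a": 0, "b": 0, "c": 0, "d": 0})
two_cpu = estimate_cycles(tasks, deps, {"a": 0, "b": 0, "c": 1, "d": 0})
```

A search engine can call such an estimator as its fitness function, comparing mappings by the cycle count they produce.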
49. Scheduling for Embedded Systems
50. Testing the Search Engine
52. Core Library
- The core library contains information about:
- Processing cores
- Communication channels and buses
- Components are defined using the following model:
53. Performance Estimation
- Performance estimation is the task of estimating the performance measures for a given architecture.
- This estimation relies on models of the cores that make up the architecture (power, performance, etc.).
- Models provided by core vendors are based on different standards.
- Each vendor provides a different estimation tool at a different level of abstraction.
54. Performance Estimation
55. Modeling of Embedded DSP Cores
- Each architecture is formed from a different set of cores and configurations.
- Different benchmarks are available.
- Each benchmark/architecture combination will be simulated (cycle-accurate/RTL), and performance measures will be extracted for each combination.
- For each core, its own performance measures for each architecture will be extracted.
- For each core, the extracted performance measures will be used to create a model for that core.
- An ANN will be used to build the model from the extracted information.
56. Analytical Model
57. Modeling of Embedded DSP Cores
58. Artificial Neurons
Neurons work by processing information. They receive and provide information in the form of spikes.
[Figure: the McCulloch-Pitts model. Inputs x1, x2, x3, ..., xn-1, xn are weighted by w1, w2, w3, ..., wn-1, wn and combined into the output y.]
59. Artificial Neurons
Nonlinear generalization of the McCulloch-Pitts neuron:
y is the neuron's output, x is the vector of inputs, and w is the vector of synaptic weights.
Examples: sigmoidal neuron, Gaussian neuron.
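The formulas for the two example neurons did not survive in the slide text, so the sketch below uses their common textbook forms: a sigmoidal neuron computes y = 1 / (1 + exp(−(w·x + b))) and a Gaussian neuron computes y = exp(−‖x − w‖² / (2a²)). The `bias` and `width` parameters are assumptions of this sketch.

```python
import math


def sigmoidal_neuron(x, w, bias=0.0):
    """Weighted sum of the inputs squashed into (0, 1) by the logistic function."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def gaussian_neuron(x, w, width=1.0):
    """Response peaks at 1 when x equals w and decays with distance from w."""
    dist2 = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return math.exp(-dist2 / (2.0 * width ** 2))


y_sig = sigmoidal_neuron([0.0, 0.0], [1.0, 1.0])   # zero input sum gives 0.5
y_gauss = gaussian_neuron([1.0, 2.0], [1.0, 2.0])  # input at the centre gives 1.0
```

In both cases x is the input vector and w the synaptic weight vector, as on the slide; only the nonlinearity applied to them differs.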
60. Artificial Neural Networks
[Figure: a network with an input layer, a hidden layer, and an output layer]
An artificial neural network is composed of many artificial neurons that are linked together according to a specific network architecture. The objective of the neural network is to transform the inputs into meaningful outputs.
61. Neural Network Tasks
- control
- classification
- prediction
- approximation
These can be reformulated in general as FUNCTION APPROXIMATION tasks.
Approximation: given a set of values of a function g(x), build a neural network that approximates the g(x) values for any input x.
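As a minimal illustration of function approximation, the sketch below trains a single linear neuron y = w·x + b by gradient descent on the mean squared error. The target function g(x) = 2x + 1, the learning rate, and the epoch count are assumptions chosen so the example converges; a real core model would use more inputs and a hidden layer.

```python
def train_linear_neuron(samples, lr=0.02, epochs=5000):
    """Fit y = w*x + b to (x, g(x)) samples by batch gradient descent on MSE."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradients of mean((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical training set sampled from g(x) = 2x + 1.
training_set = [(x, 2 * x + 1) for x in range(5)]
w, b = train_linear_neuron(training_set)
prediction = w * 10 + b  # estimate g(10), an input not in the training set
```

The trained neuron generalizes to unseen inputs because the target is itself linear; nonlinear targets need the nonlinear neurons described earlier.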
62. ANN Modeling Initial Results (Linear)
63. ANN Modeling Initial Results (Linear)
64. Modeling Example
66. Architecture Exploration Flow
- Manually create a call graph from profiling results.
- This graph contains information about the different software blocks.
- It also contains the relations between the blocks.
67. VTune Profiling Results
68. Architecture Exploration Flow
- Using a graph editor, call graphs can be modified and user constraints can be added.
- Graph information is stored in two different formats.
69. Graph Editor
70. Graph Editor
71. Architecture Exploration Flow
- The Library Editor module enables adding, removing, and modifying components in the working library.
- Library information is stored in two different files.
72. Library Editor
73. Architecture Exploration Flow
- The GA-PSO module uses the graph information, the core library, and a parameter file to search the design space of the problem.
74. Results File
75. Architecture Exploration Flow
- The results obtained by searching the design space are decoded using the graph and library information.
- The Display Result module is used to display the results.
76. Generate Different Results
77. Generate Different Results
79. Conclusions
- Architecture exploration is important given the new trends in embedded systems design.
- Many challenges face the designer during architecture exploration:
- Modeling of both architecture and application.
- Searching the design space.
- Evaluation of generated architectures.
- Meta-heuristic methods are efficient for searching the design space in architecture exploration.
- Analytical and statistical methods for the evaluation of embedded systems give a good performance estimate in a reasonable time.
80. Conclusions
- A framework is introduced for architecture exploration.
- The flow starts with modeling a DSP application and ends with a set of architectures to implement the application.
- The quality of the framework was verified as follows:
- The performance of the search engine was verified by testing known benchmarks that cover different levels of complexity.
- The use of ANNs to model different cores ensures the accuracy of the performance estimation.
81. Conclusions
- A hybrid PSO-GA algorithm based on the Strength Pareto algorithm is used to search the design space.
- The evaluation of the solutions was improved using different techniques:
- The estimation of the performance is now based on the scheduling of tasks between different processors.
- An ANN-based power estimator is being developed.
- An analytical model is combined with the two previous techniques.
- A GUI has been developed to enhance the user-framework interaction.
82. Thank You!