Optimization of Parallel Task Execution on the Adaptive Reconfigurable Group Organized Computing System

1
Optimization of Parallel Task Execution on the
Adaptive Reconfigurable Group Organized Computing
System
Presenter: Lev Kirischian
Department of Electrical and Computer Engineering,
Ryerson Polytechnic University, Toronto, Ontario, Canada
2
Application of parallel computing systems for
data-flow tasks
  • Digital signal processing (DSP)
  • High-performance control
  • Data acquisition
  • Digital communication and broadcasting
  • Cryptography and data security
  • Process modeling and simulation.

3
Presentation of a data-flow task in the form of a
data-flow graph

[Figure: data-flow graph leading from Data In through macro-operators MO 1 to MO n to Data Out.]
MO 1 to MO n are macro-operators, e.g. digital
filtering, FFT, matrix scaling, etc.
4
Correspondence between task and computing system
architecture
  • If the data-flow task is processed on a
    conventional SISD architecture, the processing
    time often cannot satisfy the specification
    requirements.
  • If the task is processed on SIMD or MIMD
    architectures, the cost-effectiveness of these
    parallel computers strongly depends on the task
    algorithm or data structure.
  • One possible way to meet the required
    cost-performance is to develop a custom
    computing system whose architecture
    covers the data-flow graph of the task.

5
Limitations of custom computing systems
with fixed architecture
1. Decrease of performance if the task algorithm or data structure changes
2. No possibility for further modernization
3. High cost for multi-task or multi-mode custom computing systems
6
One possible solution: reconfigurable parallel computing systems
1. Ability to custom-configure each processing (functional) unit for a specific macro-operator
2. Ability to custom-configure the information links between functional units
These features allow hardware customization
for any data-flow graph, and reconfiguration once
task processing is completed.
7
Example of FPGA-based system with architecture
configured for the data-flow task
8
Concept of the Group Processor in the reconfigurable
computing system
  • Group Processor (GP): a group of computing
    resources dedicated to the task and configured
    to reflect the task requirements.

9
Group Processor life cycle
1. In the GP, links and functional units are configured before task processing.
2. The GP performs the task for as long as necessary, without interruption or time sharing with any other task.
3. After task completion, all resources included in the GP can be reconfigured for any other task.
10
The concept of Reconfigurable Group Organized
computing system
[Figure: block diagram with the following components: Data Stream, Input / Output data bus, three I/O ports, three Functional Units (FU), three Reconfigurable Interface Modules (RIM), Virtual Bus, Configuration Bus, Host PC.]
11
Parallel processing of different tasks on the
separated Group Processors
[Figure: functional units FU 1 to FU 4, each with an I/O port, on a shared Virtual Bus and grouped into GP 1 (for Task 1), GP 2 and GP 3. Labeled data paths: Data in 1, Data out 1, Data in 2, Data out 2, Data out 3.]
12
Concept of adaptation of the Group Processor
architecture on the task
  • Architecture-to-task adaptation for the GP is the
    selection of a resource configuration which:
  • satisfies all requirements for task processing
    (e.g. performance, data throughput,
    reliability, etc.)
  • requires minimal hardware (i.e. logic gates)

[Figure: diagram with Data in, two Memory blocks, Multiplier, Adder and Filter, drawn along a TIME axis marked T0, T1, T2.]
13
Virtual Hardware Objects - the resource base of
reconfigurable computing system
  • For FPGA-based systems, all architecture
    components (resources) can be represented as
    Virtual Hardware Objects (VHOs) described in one
    of the hardware description languages (for
    example VHDL or AHDL).
  • Each resource can be represented in different
    variants Ri,j, where i indicates the type of
    resource (adder, multiplier, interface module,
    etc.) and j indicates the variant of the resource
    used in the architecture (for example
    8-bit adder, 16-bit adder, etc.).
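
The Ri,j indexing can be pictured as a small library keyed by resource type. The following is only an illustrative sketch: the `VHOVariant` record, its fields and the `library` dictionary are hypothetical names, not part of the presented system, and a real VHO would be an HDL description rather than a Python object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VHOVariant:
    """One variant R[i][j] of a resource (hypothetical record layout)."""
    resource_type: str  # i: the type of resource, e.g. "adder"
    variant: int        # j: index of the variant within that type
    description: str    # e.g. "8-bit adder"


# Library keyed by resource type; each entry lists the variants of that type.
library = {
    "adder": [
        VHOVariant("adder", 1, "8-bit adder"),
        VHOVariant("adder", 2, "16-bit adder"),
    ],
}


def variant_count(lib, resource_type):
    """Number of variants available for one resource type."""
    return len(lib[resource_type])
```

Here `variant_count(library, "adder")` gives the number of adder variants available to the architecture generator.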

14
Concept of Architecture Configuration Graph (ACG)
[Figure: example Architecture Configuration Graph with Multiplier and Adder nodes connected through Bus nodes, numbered 1 to 12.]
15
Architecture Configuration Graph arrangement
Partial arrangement of the architecture graph requires
two procedures:
1. Local arrangement
2. Hierarchical arrangement
Local arrangement orders the variants of each type of
system resource.
[Figure: Adder variants ordered by processing time, 40 ns then 20 ns.]
16
Hierarchical arrangement of system resources
Arrangement criterion:
$$K(R_i) = \frac{T_{max}(R_i) - T_{min}(R_i)}{m_i - 1}$$
[Figure: two Architecture Configuration Graphs built from Multiplier variants (80 ns, 40 ns, 20 ns) and Adder variants (40 ns, 20 ns), each with leaves 1 to 6. Leaf processing times in ns are 120, 80, 60, 100, 60, 40 in one graph and 120, 100, 80, 60, 60, 40 in the other.]
$$K(\mathrm{Mult}) = \frac{120 - 60}{3 - 1} = 30 \qquad K(\mathrm{Adder}) = \frac{120 - 100}{2 - 1} = 20$$
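
The criterion K(Ri) = (Tmax(Ri) - Tmin(Ri)) / (mi - 1) and the two leaf orderings can be reproduced with a short sketch. It rests on assumptions that the slide does not state explicitly: the task time is the sum of the multiplier and adder times, Tmax and Tmin are measured by varying one resource while the other stays at its slowest variant, and the resource with the larger K is placed higher in the hierarchy. The function names are hypothetical.

```python
from itertools import product

# Processing times in ns for each variant, read off the slide's figure and
# already locally arranged from slowest to fastest.
variants = {
    "Multiplier": [80, 40, 20],
    "Adder": [40, 20],
}


def arrangement_criterion(name, variants):
    """K(R_i) = (T_max(R_i) - T_min(R_i)) / (m_i - 1).

    Assumption: task time is the sum of resource times, and every resource
    other than `name` is held at its slowest variant. Needs m_i >= 2.
    """
    others = sum(max(times) for r, times in variants.items() if r != name)
    task_times = [others + t for t in variants[name]]
    m = len(task_times)
    return (max(task_times) - min(task_times)) / (m - 1)


def leaves(order, variants):
    """Task time of every leaf when resources are nested in `order`."""
    return [sum(combo) for combo in product(*(variants[r] for r in order))]


k = {name: arrangement_criterion(name, variants) for name in variants}
# Assumed rule: larger K goes higher in the hierarchy.
order = sorted(variants, key=k.get, reverse=True)
```

Under these assumptions the sketch gives K = 30 for the multiplier and K = 20 for the adder, matching the slide. `leaves(order, variants)` gives 120, 100, 80, 60, 60, 40, while nesting the adder above the multiplier gives the unsorted sequence 120, 80, 60, 100, 60, 40.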
17
Selection of Group Processor architecture based
on the arranged ACG
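Required processing time for the task Y = A·X + B
is T
[Figure: arranged ACG with Multiplier variants (80 ns, 40 ns, 20 ns) and Adder variants (40 ns, 20 ns). Leaves 1 to 6 have processing times 120, 100, 80, 60, 60, 40 ns, with the required performance marked against them.]
GP-architecture: Multiplier (2), Adder (1)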
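
Once the leaves of the ACG are in non-increasing order of processing time, a leaf that meets the required time T can be located by binary search. The slides do not spell out the search procedure, so the sketch below is one plausible reading rather than the presented method. It also assumes that slower variants need less hardware, so the slowest leaf that still meets T is preferred. The function name and the T values in the usage note are hypothetical.

```python
def select_leaf(times_desc, required_time):
    """Return the 1-based index of the slowest leaf with time <= required_time.

    `times_desc` must be in non-increasing order. Returns None when even the
    fastest leaf is too slow.
    """
    lo, hi = 0, len(times_desc)
    while lo < hi:
        mid = (lo + hi) // 2
        if times_desc[mid] <= required_time:
            hi = mid       # this leaf is fast enough; look for a slower one
        else:
            lo = mid + 1   # too slow; move toward faster leaves
    return lo + 1 if lo < len(times_desc) else None


leaf_times = [120, 100, 80, 60, 60, 40]  # ns, leaves 1..6 from the slide
```

For example, with a hypothetical requirement of T = 70 ns, `select_leaf(leaf_times, 70)` picks leaf 4 (60 ns), and a requirement of 30 ns returns `None` because no leaf is fast enough.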
18
Number of experiments for GP-architecture selection
$$N(GP_{opt}) = (n + 1) + \log_2(m_1 m_2 \cdots m_n)$$
n - number of resources (VHO) included in the architecture of the Group Processor
m_i - number of variants of each type of resource
Example: if n = 16 and m_1 = m_2 = ... = m_n = 32, the total number of experiments (task runs on the estimated GP-architecture) is
$$N(GP_{opt}) = 16 + 1 + 16 \cdot 5 = 97$$
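
The example can be checked numerically. The sketch below reads the slide's formula as N(GP opt) = (n + 1) + log2(m1 · m2 · ... · mn), which is the reading consistent with the worked result of 97. The function names are hypothetical, and the exhaustive count is added only for comparison.

```python
import math


def experiments_needed(variant_counts):
    """N(GP_opt) = (n + 1) + log2(m_1 * m_2 * ... * m_n)."""
    n = len(variant_counts)
    # log2 of a product equals the sum of the individual log2 terms.
    return (n + 1) + sum(math.log2(m) for m in variant_counts)


def exhaustive_configurations(variant_counts):
    """Number of distinct architectures if every combination were tried."""
    return math.prod(variant_counts)


example = [32] * 16  # n = 16 resources, 32 variants each
```

`experiments_needed(example)` evaluates to 97, whereas `exhaustive_configurations(example)` is 32^16 = 2^80, which shows why the arranged graph matters.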
19
Self-adaptation mechanism for FPGA-based
reconfigurable data-flow computing systems
[Figure: block diagram with the following components: Host PC, Data Source, Architecture Generator, Configuration Bus, Library of Virtual Hardware Objects, Reconfigurable Platform, Architecture Selector, Performance Analyzer.]
20
First prototype of Adaptive Reconfigurable Group
Organized (ARGO) computing platform
21
Data Flow Graph for DVB MPEG2 processing
[Figure: data-flow graph with the following nodes]
  • Input data stream (MPEG 2)
  • Synchro-signal detection
  • PCR detection
  • Null-packet analysis and removal
  • Reference frequency
  • Output frequency adjustment
  • PCR re-stamping
  • Output MPEG 2 data stream
22
Architecture selection time for 6-mode DVB MPEG 2
stream processor
1. Average time for each architecture configuration: 7.18 ms
2. Average time for GP-architecture selection (for a specific mode): 175.6 ms
3. Total time for architecture selection for all modes: 1.054 s
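
The three figures are mutually consistent, as a quick check shows. The ratio of selection time to configuration time is only a rough estimate of how many configurations are tried per mode, under the assumption that selection time is dominated by reconfiguration. The variable names are hypothetical.

```python
config_time_ms = 7.18       # average time for one architecture configuration
selection_time_ms = 175.6   # average GP-architecture selection time per mode
modes = 6                   # 6-mode DVB MPEG 2 stream processor

# Total selection time over all modes, converted from ms to seconds.
total_s = modes * selection_time_ms / 1000

# Rough number of configurations per selection (assumption: selection time
# is dominated by reconfiguration).
configs_per_selection = selection_time_ms / config_time_ms
```

The total comes to about 1.054 s, matching the slide, and the ratio is between 24 and 25 configurations per mode.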
23
Hardware implementation of DVB MPEG 2 stream
processor for modes 1 and 4
[Figure: block diagram]
  • Input data: MPEG 2 stream, into the FU 1 in-port (8 bit)
  • FU 1: Synchro-signal detection, PCR detection,
    Null-packet analysis and removal
  • Virtual bus (16 lines)
  • FU 2: Output frequency adjustment, Reference
    frequency, PCR re-stamping
  • FU 2 out-port: Output MPEG 2 data stream
24
Hardware implementation of DVB MPEG 2 stream
processor for modes 2, 3, 5 and 6
[Figure: block diagram]
  • Input data: MPEG 2 stream, into the FU 1 in-port (8 bit)
  • FU 1: Synchro-signal detection, PCR detection,
    Null-packet analysis and removal
  • Virtual bus (16 lines)
  • FU 2: Reference frequency, Output frequency
    adjustment
  • FU 3: PCR re-stamping
  • FU 3 out-port: Output MPEG 2 data stream
25
Summary
1. The Adaptive Reconfigurable Group Organized (ARGO) parallel computing system is an FPGA-based configurable system able to adapt to the task algorithm / data structure.
2. The ARGO system allows parallel processing of different data-flow tasks on dynamically configured Group Processors (GPs), where each GP-architecture configuration corresponds to the algorithm / data specifics of the task assigned to that processor.
3. These principles allow the development of cost-effective parallel computing systems with programmable performance and reliability, at minimum cost in hardware components and development time.