Instruction Generation For Hybrid Reconfigurable Architectures

1
Instruction Generation For Hybrid Reconfigurable
Architectures

Philip Brisk, Adam Kaplan, Ryan Kastner, Majid
Sarrafzadeh
Computer Science Department, UCLA
ECE Department, UCSB
October 11, 2002
CASES
Grenoble, France

2
Outline

What is Instruction Generation?
Related Work
Sequential and Parallel Templates
The Algorithm
Experimental Setup
Experimental Results
Conclusion and Future Work

3
Instruction Generation

Given a set of applications, what computations
should be customized?

[Figure: an Application Specific Instruction set Processor (ALU, register bank, control, customized macros) alongside a customized (hard/soft) macro in a PLD]

Main objective: find complex, commonly occurring
computation patterns.
Look for computational patterns at the
instruction level.
Basic operations are add, multiply, shift, etc.

4
Customization and Performance

A customized instruction must offer some
measurable performance increase.

In this work, we identify two types of
customized instructions and quantify the
performance benefit that each offers.

Sequential Instructions:
Savings could come from either instruction fetch
reduction or datapath optimization (e.g.
ADD-ADD converted to a 3-input adder).

Parallel Instructions:
Given multiple ALUs and data paths, allow
data-independent instructions to be computed
simultaneously.

5
Problem Definition

Determining customized functionality reduces
to regularity extraction.
Regularity extraction: find common
sub-structures (templates) in one graph or a
collection of graphs.
Each application can be specified by a collection
of control data flow graphs (CDFGs).
Templates are implemented as customized
instructions.
Related problem: instruction selection.

6
What Is Instruction Generation?
The Instruction Selection Problem
R1 ← M[fp + a]
R2 ← Ti * 4
R1 ← R1 + R2
R2 ← FP + X
M[R1] ← M[R2]
Templates given as inputs. How do we determine
templates?
7
What Is Instruction Generation?
The Alternative: Instruction Generation

Reconfigurable architectures allow us to rethink
the assumptions underlying our notion of
instruction selection.
The target machine language can be changed by
reconfiguring the FPGA to implement new
instructions.
This presents new challenges for mapping IR to
machine language.
We propose a scheme by which this mapping could
be obtained at compile time.

8
What Is Instruction Generation?
Instruction Generation: Applications to CAD and
Embedded System Design

Template Generation plays a role in the
interaction between compilation and high-level
synthesis.
Each template corresponds to a resource which
must be provided by the underlying architecture.
A high-level synthesis tool can then allocate
resources and schedule the operations on these
resources.
This work investigates the latency-area tradeoff
created by instruction generation.

9
Related Work

Similar techniques have proven beneficial in
reducing area and increasing performance for the
PipeRench architecture (Goldstein et al. 2000).
Corazao et al. have shown that well-matched,
regular templates can have a significant positive
impact on critical path delay and clock speed.
Kastner et al. (ICCAD02) formulated an algorithm
for template matching as well as template
generation for hybrid reconfigurable systems.

10
Our Model of Computation: Control Data Flow Graphs

if (cond1) bb1();
else bb2();
bb3();
switch (test1) {
  case c1: bb4(); break;
  case c2: bb5(); break;
  case c3: bb6(); break;
}
bb7();

(bb = basic block)
11
Instruction Generation

The basic idea: an iterative process in which we
examine dataflow graphs and cluster combinations
of nodes that occur frequently.

Ideally, we want large templates that occur
often.

Sequential Template Generation: identifies
templates where the IR operations have data
dependencies between them.

Parallel Template Generation: identifies
dataflow operations that may be scheduled in
parallel.

12
Sequential Template Generation

Algorithm designed by Kastner et al. (ICCAD 2001).

Basic idea is to examine each edge in the DFG.
The type of edge can be represented by an ordered
pair consisting of the starting and ending node
types.

Maintain a count for each edge type.

Cluster the most frequently occurring edge type by
replacing both vertices (head and tail) with a
super-vertex that keeps the original vertices in
an internal DAG.
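
One round of the edge-counting and clustering steps described above can be sketched in Python. This is only an illustrative sketch, not the authors' implementation: the function names, the dictionary-based graph representation, and the "u+v" naming of super-vertices are all assumptions. It also skips a check a real implementation would likely need, namely that merging two vertices does not create a cycle.

```python
from collections import Counter

def most_frequent_edge_type(op, edges):
    """Return the most common (source type, destination type) pair, or None."""
    counts = Counter((op[u], op[v]) for u, v in edges)
    return counts.most_common(1)[0][0] if counts else None

def cluster_edge_type(op, edges, edge_type):
    """Merge non-overlapping edges of edge_type into super-vertices.

    op maps node id -> operation type; edges is a list of (src, dst) pairs.
    Returns (new_op, new_edges, internal), where internal maps each
    super-vertex to the original edge it contains (its internal DAG).
    Assumption: no acyclicity check is performed after merging.
    """
    merged = {}    # original node -> super-vertex id
    internal = {}  # super-vertex id -> (src, dst)
    for u, v in edges:
        if (op[u], op[v]) == edge_type and u not in merged and v not in merged:
            sv = f"{u}+{v}"  # hypothetical naming scheme for super-vertices
            merged[u] = merged[v] = sv
            internal[sv] = (u, v)
    # Unmerged nodes keep their type; super-vertices get a combined type.
    new_op = {n: t for n, t in op.items() if n not in merged}
    for sv in internal:
        new_op[sv] = f"{edge_type[0]}-{edge_type[1]}"
    # Redirect edges to super-vertices, dropping internal and duplicate edges.
    new_edges = []
    for u, v in edges:
        a, b = merged.get(u, u), merged.get(v, v)
        if a != b and (a, b) not in new_edges:
            new_edges.append((a, b))
    return new_op, new_edges, internal
```

Repeating these two calls until a stopping condition holds gives the iterative process; a vertex that already belongs to a super-vertex in the current round is skipped, so templates in one round never overlap.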

13
Sequential Template Generation
14
Parallel Template Generation

Instead of examining DFG edges, we must determine
whether pairs of computations can be scheduled in
parallel.
We introduce a data structure called the
All-Pairs Common Slack Graph (APCSG) to help us
with this analysis.
APCSG edges are placed between nodes that could
possibly be scheduled together.
Two nodes can be scheduled at the same time if
they share common slack between them.

15
All Pairs Common Slack Graph (APCSG)

Common Slack: the total number of time steps in
which two operations x and y could both be
scheduled by some scheduling heuristic.
APCSG: an undirected graph.
Nodes correspond to operations.
Edges represent the common slack between every
pair of operations.
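
One possible way to build such a graph is sketched below in Python. The slides do not say which scheduling heuristic defines the slack, so this sketch makes several assumptions: every operation takes one time step, each operation's window runs from its ASAP step to its ALAP step, common slack is the number of steps shared by two windows, and edges are added only between operations with no dependency path between them. The function names are hypothetical.

```python
from itertools import combinations

def topo_order(nodes, edges):
    """Kahn's algorithm. Returns (topological order, successor lists)."""
    indeg = {n: 0 for n in nodes}
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    ready = [n for n in nodes if indeg[n] == 0]
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                ready.append(m)
    return order, succ

def build_apcsg(nodes, edges):
    """Return {(x, y): common slack} for a non-empty DAG of unit-latency ops."""
    order, succ = topo_order(nodes, edges)
    # ASAP: earliest step at which each operation can start.
    asap = {n: 0 for n in nodes}
    for n in order:
        for m in succ[n]:
            asap[m] = max(asap[m], asap[n] + 1)
    # ALAP: latest start step that keeps the critical-path length.
    last = max(asap.values())
    alap = {n: last for n in nodes}
    for n in reversed(order):
        for m in succ[n]:
            alap[n] = min(alap[n], alap[m] - 1)
    # Descendant sets, used to rule out data-dependent pairs.
    reach = {n: set() for n in nodes}
    for n in reversed(order):
        for m in succ[n]:
            reach[n] |= {m} | reach[m]
    apcsg = {}
    for x, y in combinations(nodes, 2):
        if y in reach[x] or x in reach[y]:
            continue  # dependent operations cannot share a time step
        overlap = min(alap[x], alap[y]) - max(asap[x], asap[y]) + 1
        if overlap > 0:
            apcsg[(x, y)] = overlap
    return apcsg
```

For the DAG with edges a→b, b→c, d→c, operation d can run in step 0 or 1, so it shares one step with a and one with b, and those are the only two APCSG edges.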

16
All-Pairs Common Slack Graph (Example)
17
Parallel Template Generation Algorithm
1. Given a labeled digraph G(V, E)
2. T is a set of template types
3. T ← ∅
4. while not stop_conditions_met(G)
   I. APCSG ← create_apcsg(G)
   II. T ← determine_template_candidates(APCSG)
   III. cluster_vertices(G, T)
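
The control flow of this loop can be sketched as a small Python driver that takes the four steps as callables. The step names come from the slide; everything else is an assumption made for illustration: the callable signatures, accumulating the templates from every iteration in one set, and the early exit when no candidates are found (a safeguard that is not on the slide).

```python
def parallel_template_generation(graph, create_apcsg,
                                 determine_template_candidates,
                                 cluster_vertices, stop_conditions_met):
    """Run the clustering loop and return (clustered graph, templates).

    Each argument after graph is a caller-supplied function:
      create_apcsg(graph) -> apcsg
      determine_template_candidates(apcsg) -> iterable of hashable templates
      cluster_vertices(graph, candidates) -> new graph
      stop_conditions_met(graph) -> bool
    """
    templates = set()                          # step 3: T starts empty
    while not stop_conditions_met(graph):      # step 4
        apcsg = create_apcsg(graph)            # step I
        candidates = list(determine_template_candidates(apcsg))  # step II
        if not candidates:
            break  # assumption: nothing left to cluster, so stop early
        graph = cluster_vertices(graph, candidates)               # step III
        templates |= set(candidates)
    return graph, templates
```

Passing the steps in as functions keeps the loop independent of how the graph and the APCSG are represented.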
18
Parallel Template Generation
19
Stopping Conditions

So when should we stop clustering a graph?

Aside from pragmatic arguments, a correct
stopping condition is essential if we are to
prove that our template generation algorithm is
optimal with respect to some criterion.

20
Stopping Criteria We Have Considered

Percentage of nodes covered
Number of nodes left in the graph
Ratio of the number of nodes in the graph before
and after clustering
Number of unique template types exceeds a given
threshold
Templates exceed a given size
Percentage of overall slack lost in the graph
over an iteration

Stopping Criteria We Have Used

Template sizes are restricted to be < 5 nodes
total.
The algorithm stops when the total number of
nodes is less than half of what was started
with...
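
The two criteria that were actually used are simple enough to write down directly. The Python sketch below assumes that a template's size is counted in original (unclustered) operations and that a merge is judged by the combined size of the two pieces being merged; the function and constant names are hypothetical.

```python
MAX_TEMPLATE_NODES = 5  # templates must stay below this many nodes

def template_allowed(size_a, size_b):
    """True if merging two pieces keeps the template under 5 nodes."""
    return size_a + size_b < MAX_TEMPLATE_NODES

def stop_conditions_met(current_nodes, initial_nodes):
    """True once the graph has fewer than half of its original nodes."""
    return 2 * current_nodes < initial_nodes
```

For example, merging two 2-node templates is allowed (4 nodes), merging a 2-node and a 3-node template is not (5 nodes), and a graph that started with 10 nodes stops clustering once 4 or fewer remain.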

21
Scheduling Constraints
[Figure: scheduler diagram showing ALU1 and clock steps 1 and 2]

Essentially, we have scheduled our operations at
the compiler level. What kind of job did we do?
22
Measuring The Damage

Length of schedule:
The latency of all the operations.
Ideally, we want it short.
We must measure the schedule length of the
resulting clustered DAGs:
Original, non-clustered DAG
Sequential templates only
Sequential and parallel templates
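
To make "length of schedule" concrete, here is a minimal Python sketch that measures it with a greedy list scheduler. This is a stand-in technique chosen for illustration and is not the scheduling algorithm used in the experiments. It assumes the input is a DAG, that there is at least one functional unit, that all units are identical, and that each node takes one time step unless the hypothetical latency map says otherwise.

```python
def schedule_length(nodes, edges, num_units, latency=None):
    """Return the number of time steps a greedy list schedule needs.

    nodes: list of node ids (priority = position in the list)
    edges: list of (src, dst) dependencies forming a DAG
    num_units: number of identical functional units (must be >= 1)
    latency: optional {node: steps}; missing nodes take 1 step
    """
    latency = latency or {}
    preds = {n: [] for n in nodes}
    for u, v in edges:
        preds[v].append(u)
    finish = {}                    # node -> step at which its result is ready
    busy_until = [0] * num_units   # step at which each unit becomes free
    step = 0
    while len(finish) < len(nodes):
        for unit in range(num_units):
            if busy_until[unit] > step:
                continue  # this unit is still executing an earlier node
            for n in nodes:
                # A node is ready when every predecessor has finished.
                ready = n not in finish and all(
                    p in finish and finish[p] <= step for p in preds[n])
                if ready:
                    finish[n] = step + latency.get(n, 1)
                    busy_until[unit] = finish[n]
                    break
        step += 1
    return max(finish.values(), default=0)
```

Running the same function on the original DAG and on each clustered version, with a template represented as a single node, gives the three schedule lengths to compare.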

23
Experimental Setup
COMPILER IR (SUIF)
Sequential Template Generation Algorithm
Data Flow Graph and DAG Generation from a CDFG
pass
CO - COMPILER
A High Level Synthesis Tool Using A
Locally-Optimal Geometric Scheduling Algorithm
24
Benchmarks

CONVOLUTION: Image convolution algorithm.
DeCSS: Algorithm for breaking DVD encryption.
DES: The cryptographic symmetric encryption
standard for over 20 years.
Rijndael (AES): The new Advanced Encryption
Standard.

25
Experimental Procedure

First, we compiled the program to the SUIF IR
using the front end built by The Portland Group
and Stanford University.
Next, we converted the SUIF IR to CDFG form.
Then, we performed template generation on each
basic block for each program.
We selected 4 large dataflow graphs from each
program to schedule and evaluate our results.
We scheduled the dataflow graphs following
template generation and compared them to the
original graphs.

26
Results
27
Conclusion And Future Work

The sequential template generation algorithm can
be expanded to accommodate parallel templates.
Parallel template generation reduces latency at
the expense of slack and area.
In the future, we plan to repeat these
experiments with a more realistic architecture
description and with the ability to cross-schedule
parallel instructions.
We also plan to explore compiler transformations,
such as function inlining, to extract even more
regularity and to determine a more global view of
the program.
