Effective Compilation Support for Variable Instruction Set Architecture

Last modified by: Fred Chow
Document presentation format: On-screen Show

Slides: 25

Provided by: ncs104

Learn more at: https://arcb.csc.ncsu.edu

1
Effective Compilation Support for Variable
Instruction Set Architecture

Jack Liu
Timothy Kong
Fred Chow
Cognigine Corp.
www.cognigine.com

2
Outline

VISC Architecture
Compile-time Configurable Code Generation
Managing the Dictionary
Concluding Remarks

3
Configurable Computing

Motivation
Higher performance
processor and instruction set customized to type
of application
Lower hardware cost
non-essential features excluded
Shorter time-to-market

4
Variable Instruction Set Architecture (VISC Architecture™)

A new approach to configurable computing
Fixed processor hardware
Many types of operations provided
Numerous instruction variants (CISC-style)
Per-program instruction set tailoring during
compile time

5
Background of this work

Cognigine CGN16100 Network Processor
Single-chip, fully programmable network processor
Processing cores
16 Re-configurable Communications Units (RCU)
processor cores
VISC architecture
4 64-bit parallel execution units
Multi-threaded
512 KB on-chip memory (text and data)

6
VISC Architecture™

Dictionary (instruction set for current program)
Dictionary entry: 32-bit, 2 operations; 64-bit, 4 operations; 128-bit, 8 operations
256 entries
Instruction
opcode opnd0 opnd1 opnd2 opnd3
opcode: 8-bit

7
Motivation for VISC Architecture

Efficient way to encode/decode the many operation
variants with different addressing modes
Not all used in each program
High instruction encoding density
Small opcode bit count
Operands shared among multiple operations
Simplified control logic for VLIW-style ILP
Up to 8 operations per cycle
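
A minimal sketch of how a small opcode can stand for a wide group of operations. Only the 8-bit opcode, the 256-entry dictionary and the four operand fields come from the slides. The class names, the `decode` helper and the sample values are hypothetical, and how operands are routed to the individual operations is not modeled.

```python
from dataclasses import dataclass

DICT_SIZE = 256  # an 8-bit opcode can select one of 256 dictionary entries


@dataclass(frozen=True)
class DictEntry:
    """Hypothetical model of a dictionary entry: operations issued together."""
    ops: tuple  # e.g. ("add", "xor", "sub", "nop")


@dataclass(frozen=True)
class Instruction:
    """Hypothetical model of an instruction: opcode plus operand fields."""
    opcode: int
    operands: tuple


def decode(instr, dictionary):
    """Look up the operations that an instruction's opcode stands for."""
    if not 0 <= instr.opcode < DICT_SIZE:
        raise ValueError("opcode does not fit in 8 bits")
    if len(instr.operands) > 4:
        raise ValueError("at most four operand fields (opnd0..opnd3)")
    entry = dictionary[instr.opcode]
    return entry.ops, instr.operands


# Illustrative values only: one dictionary entry and one instruction using it.
dictionary = {3: DictEntry(ops=("add", "xor", "sub", "nop"))}
ops, operands = decode(Instruction(3, ("8(p1)", "w70", "0x55", "0xf8")), dictionary)
```

The operation names are stored once in the dictionary, so each instruction carries only the opcode and its operands.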

8
Operation Specification

In Dictionary Entry (only specified once)
Operation name
Operation variants
Signed and unsigned
Operand and result sizes: 8-bit, 16-bit, 32-bit, 64-bit
Support different sizes among operand(s) or result
Vector: 64v8, 64v16, 64v32, 32v8, 32v16
Data path to each operand/result
In Instruction
Operand encoding formats
Actual operands

9
RCU Architecture

5-Stage Pipeline
4-way multi-threaded
Hardware RSF synchronization

128-bit reconfigurable address path
256-bit reconfigurable data path

10
Roles of Compiler for VISC Architecture

Determine the instruction set to store in the
dictionary for best execution-time performance
Generate optimized code sequence based on that
instruction set
Cater to various hardware limitations
Dictionary limit
Data path constraints
Dictionary and Instruction encoding constraints

11
New Compilation Approach: Configurable Code
Generation

Exact form of generated instructions decided in
the last instruction scheduling phase
Direct result of instruction compaction based on
what is allowed by the hardware

12
Compiler Implementation Method

Retarget SGI Pro64 (Open64) compiler to an
Abstract Machine
Code generator operates on an Abstract Operation
Representation
Code generation optimizations left intact
Add new Instruction and Dictionary Finalization
(IDF) phase as post-pass
IDF Phase 1
Instruction scheduling and folding
Abstract operations converted to target code
sequence
IDF Phase 2
Output VISC instructions and dictionary entries

13
Compiler Phase Structure

C
GNU / Pro64™ Front-end
WHIRL Optimizer
Pro64™ Back-end
Code Generator
IDF
Assembly Program: Instructions, Dictionary

14
Abstract Operation Representation (AOR)

Each operation corresponds to a micro-operation
in the core execution units
RISC-like formats
r1 = op r2, r3
r2 = load <offset>(<base>)
store r2 <offset>(<base>)
r1 = loadimm <imm>
Optimizations in AOR reflected in final code
No pre-disposition of compiler to any specific
instruction format

15
Multiple AOR ops can be combined into a single
target operation

Operations taking immediate operand
AOR: r2 = move <imm>; r3 = add r1, r2
Target: r3 = addi r1, <imm>
Operations supporting memory operands
AOR: r2 = load 4(sp); r3 = add r1, r2
Target: r3 = add r1, 4(sp)
Post-increment/decrement memory operations
AOR: r2 = load 0(r1); r1 = addi r1, 4
Target: r2 = load 0(r1), with r1 post-incremented by 4
Branches on condition codes
AOR: r1 = add r2, r3; . . . ; compare (r1 != 0); br.z label
Target: r1 = add r2, r3; br.z label (only if
immediately after)
Others
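
The immediate-operand case can be sketched as a small peephole pass. This is an illustration only, not the IDF implementation: the `Op` tuple and `fold_immediates` are hypothetical names, and the pass assumes the moved register has no other use, which a real compiler would have to check.

```python
from collections import namedtuple

# Hypothetical AOR operation: destination register, operation name, sources.
Op = namedtuple("Op", ["dest", "name", "srcs"])


def fold_immediates(ops):
    """Fold `rX = move <imm>` into a directly following `add` that reads rX.

    Assumes rX has no other use; a real pass would verify that first.
    """
    out = []
    i = 0
    while i < len(ops):
        cur = ops[i]
        nxt = ops[i + 1] if i + 1 < len(ops) else None
        if (cur.name == "move" and nxt is not None and nxt.name == "add"
                and nxt.srcs[1] == cur.dest and nxt.srcs[0] != cur.dest):
            # The pair collapses into one add-immediate operation.
            out.append(Op(nxt.dest, "addi", (nxt.srcs[0], cur.srcs[0])))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


aor = [Op("r2", "move", (5,)), Op("r3", "add", ("r1", "r2"))]
folded = fold_immediates(aor)  # a single addi of r1 and the immediate 5
```

The other cases on this slide (memory operands, post-increment, condition codes) follow the same pattern with different match conditions.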

16
IDF Approach

Instruction scheduling plus the following tasks
Instruction folding
Opcode selection
Modelling of irregular hardware constraints
Modelling of encoding constraints
Monitoring of states of condition codes and
transient registers
Keeping track of dictionary contents
Use enumeration (branch and bound) approach

17
Example of IDF Processing

Input
w80 = move 0x55
w91 = move 0xf8
w70 = add w70, w80
w71 = xor w92, w80
w90 = sub w92, w91
store 8(p1) w90

Dictionary
3: add xor sub nop

Instruction
op3 8(p1) w70 0x55 0xf8

move and store instructions subsumed
w71, w92 mapped to transient registers

18
IDF Scheduling Algorithm

Input: sequence of operations in BB
Estimate initial bound_sch
Search for schedule with length < bound_sch
no: bound_sch = bound_sch + 1, then search again
yes: schedule found

To speed up the search
Shrink solution space by
Coming up with high initial bound_sch
Pruning useless search paths continuously
Tight hardware constraints help
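
A toy sketch of enumeration under a schedule-length bound. It assumes the bound is raised by one whenever the search fails, which is one reading of the flowchart. The machine model (a fixed number of operations per instruction, dependences only) and all function names are hypothetical. The real IDF also models folding, encoding constraints and dictionary contents.

```python
from math import ceil


def critical_path(ops, deps):
    """Longest dependence chain; ops must be listed in dependence order."""
    depth = {}
    for op in ops:
        depth[op] = 1 + max((depth[p] for p in deps.get(op, ())), default=0)
    return max(depth.values(), default=0)


def search(ops, deps, width, bound):
    """Enumerate placements; return a schedule shorter than bound, or None."""
    slot = {}                 # op -> index of the instruction holding it
    fill = [0] * (bound - 1)  # number of ops placed in each instruction

    def place(i):
        if i == len(ops):
            return True
        op = ops[i]
        earliest = 1 + max((slot[p] for p in deps.get(op, ())), default=-1)
        for s in range(earliest, bound - 1):  # later slots are pruned
            if fill[s] < width:
                slot[op] = s
                fill[s] += 1
                if place(i + 1):
                    return True
                fill[s] -= 1
                del slot[op]
        return False

    return dict(slot) if place(0) else None


def schedule(ops, deps, width):
    """Start from an estimated bound and raise it until a schedule exists."""
    bound = max(critical_path(ops, deps), ceil(len(ops) / width)) + 1
    while True:
        found = search(ops, deps, width, bound)
        if found is not None:
            return found, 1 + max(found.values(), default=-1)
        bound += 1


# b, c and d all depend on a; two operations fit in one instruction.
slots, length = schedule(["a", "b", "c", "d"],
                         {"b": ("a",), "c": ("a",), "d": ("a",)}, 2)
```

In this example the initial estimate allows two instructions, the search fails, and the next bound yields a three-instruction schedule. A tighter starting estimate means fewer failed searches.
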
19
Managing the Dictionary

Dictionary usage increases due to
Program size: more variety of operations
High ILP: more combinations of operations
Library code linked in
Currently, dictionary contents fixed for each
executable
Role of linker
Merge dictionary entries with identical contents
across files/libraries
Error message on dictionary overflow
Role of compiler
Maximize dictionary entry re-use

20
Dictionary Compilation

Strategy
Keep track of existing dictionary entries during
compilation
Extract dictionary entries from
Libraries and .s files being linked
.o files compiled before current file
Example: cc a.c b.o c.s
Maintain table of existing dictionary entries
Add to table as new entries are generated
Re-use existing dictionary entries
Bias scheduling towards dictionary conservation
as dictionary fills up
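
The table of existing entries can be sketched as follows. The class and method names are hypothetical. The 256-entry capacity comes from the architecture slide, and `fill_ratio` stands in for whatever measure the scheduler actually uses to bias towards dictionary conservation.

```python
class DictionaryFull(Exception):
    """Raised when a new entry is needed but every slot is taken."""


class DictionaryTable:
    """Hypothetical table of dictionary entries known during compilation."""

    def __init__(self, capacity=256, existing=()):
        self.capacity = capacity
        self.index = {}         # entry contents -> dictionary slot
        for entry in existing:  # e.g. entries extracted from b.o and c.s
            self.lookup_or_add(entry)

    def lookup_or_add(self, entry):
        """Return the slot for entry, re-using an identical existing one."""
        entry = tuple(entry)
        if entry in self.index:
            return self.index[entry]
        if len(self.index) >= self.capacity:
            raise DictionaryFull("dictionary overflow")
        self.index[entry] = len(self.index)
        return self.index[entry]

    def fill_ratio(self):
        """Fraction of the dictionary in use."""
        return len(self.index) / self.capacity


table = DictionaryTable(existing=[("add", "xor", "sub", "nop")])
slot_a = table.lookup_or_add(("add", "xor", "sub", "nop"))  # re-used entry
slot_b = table.lookup_or_add(("load", "add"))               # new entry
```

Keying the table on entry contents makes re-use a single lookup, and the same comparison would serve a linker merging identical entries across files.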

21
User Control of Dictionary Compilation

Best program performance demands a near-full
dictionary
When the dictionary overflows, the program must be re-compiled
Provide user control mechanisms
Trade-off between dictionary consumption and
program performance
Command line option: -CG:dict_usage=n, with n from 0 to 10
Embedded in code: pragma dict_usage n
dict_usage is the dictionary budget guideline for IDF
Low dict_usage
Fewer new dictionary entries created
Low ILP
High dict_usage
Tighter instruction schedule
More dictionary entries created

22
IDF Support of dict_usage

Additional search goal: bound_dict
Number of new dictionary entries allowed for
current BB
Automatically adjusted lower as more pre-existing
entries accumulate
When bound_dict is reached during enumeration,
creating a new dictionary entry is disallowed
(unless it holds a single operation)
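
A minimal sketch of the per-block budget check. The single-operation exception follows the slide. The function names are hypothetical, and the linear scaling in `adjusted_bound` is purely an assumption, since the slides do not say how the bound is lowered.

```python
def adjusted_bound(base_bound, used_entries, capacity=256):
    """Hypothetical scaling: shrink the budget as the dictionary fills up."""
    return base_bound * (capacity - used_entries) // capacity


def may_create_entry(num_ops, new_entries_in_bb, bound_dict):
    """Decide whether the enumeration may create one more dictionary entry.

    num_ops: operations in the candidate entry
    new_entries_in_bb: entries already created for the current basic block
    bound_dict: number of new entries allowed for the current basic block
    """
    if num_ops == 1:
        return True  # single-operation entries stay allowed
    return new_entries_in_bb < bound_dict


budget = adjusted_bound(4, used_entries=128)        # half-full dictionary
allowed = may_create_entry(3, new_entries_in_bb=2, bound_dict=budget)
```

With the budget exhausted, the search can still fall back to one operation per instruction, so a schedule always exists.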

23
Experimental Results

Summary (with dict_usage=10)
ILP from IDF scheduling: 1.38 ops per instruction
ILP from relaxed scheduling: 1.51 ops per
instruction
23% of all subsumable operations subsumed
Each dictionary entry referred to by 2.63
instructions (statically)
Scheduling via enumeration: 100 times slower than
one-pass schedulers
Compilation time: 1 to 2 minutes per program

24
Concluding Remarks

VISC approach most suitable for embedded
processors
Limited program size
Dictionary space less of an issue
Slow compilation tolerable
CISC-style instructions enable small code size
Compilation support key to deploying applications
on VISC
Very hard to write in assembly language
Advanced optimizations performed by compiler
Dictionary managed by compiler with user hints
Compile-time configurable code generation enables
RISC compilation techniques to generate CISC
output
