CRISP template architecture - PowerPoint PPT Presentation

Provided by: FranciscoB150
Transcript and Presenter's Notes

Title: CRISP template architecture


1
CRISP template architecture
  • Francisco Barat
  • Murali Jayapala
  • Pieter Op de Beeck

2
Motivation
  • Target application domain: MPEG-4 applications
  • Compute-intensive applications
  • Many different algorithms
  • Different levels of quality
  • Standards evolving in time
  • Enable research on reconfigurable instruction set
    processors
  • Unifying model

3
CRISP definition
  • CRISP: Configurable Reconfigurable Instruction Set Processor
  • Design-time configurable
  • Parameterized architecture (CPU core, memory)
  • E.g. PlayDoh from HP Labs, ARC cores
  • Reconfigurable instruction set
  • Instruction set is not 100% fixed at design time
  • E.g. Chimaera, OneChip

4
What is a Reconfigurable Instruction Set Processor?
  • The instruction set of the processor can be changed
    from one application to another
  • Enabled by memory-based decoders: changing the decoder memory
    changes the instruction set

5
A Simple Example
[Figure: datapath with a register file, a decoder, an adder and a multiplier. Labels: Register Select, Opcode, Interconnect configuration, Enable o/p (adder), Enable o/p (multiplier); decoder contents 1010101010101011101001010101001]
  • Instruction set
  • Add R1, R2
  • Mult R1, R2

6
A Simple Example
[Figure: datapath with a register file, a decoder, an adder and a multiplier. Labels: Register Select, Opcode, Interconnect configuration, Enable o/p (adder), Enable o/p (multiplier); decoder contents 1010101010101011101001010101001]
  • Instruction set
  • INCA R1, R2 (R1R2 R2)
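
The idea behind the two slides above can be sketched in a few lines of Python: the datapath (an adder and a multiplier) stays the same, and only the contents of the decoder memory change. This is a minimal illustration, not the authors' implementation. The names `ControlWord`, `MemoryBasedDecoder` and `execute`, and the `FUSED` instruction with its multiply-then-add behaviour, are assumptions made for this sketch; `FUSED` is not meant to reproduce the INCA instruction from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlWord:
    """Control signals one decoder entry drives (illustrative names)."""
    enable_adder: bool
    enable_multiplier: bool


class MemoryBasedDecoder:
    """Decoder whose opcode-to-control mapping lives in a writable memory."""

    def __init__(self):
        self.memory = {}  # mnemonic -> ControlWord

    def load(self, table):
        # Reconfiguration step: overwrite the decoder memory contents.
        self.memory = dict(table)

    def decode(self, mnemonic):
        return self.memory[mnemonic]


def execute(decoder, mnemonic, a, b):
    """Run one instruction on a toy datapath with an adder and a multiplier."""
    cw = decoder.decode(mnemonic)
    result = a
    if cw.enable_multiplier:
        result = result * b
    if cw.enable_adder:
        result = result + b
    return result


decoder = MemoryBasedDecoder()
decoder.load({
    "ADD": ControlWord(enable_adder=True, enable_multiplier=False),
    "MULT": ControlWord(enable_adder=False, enable_multiplier=True),
})
print(execute(decoder, "ADD", 3, 4))   # 7
print(execute(decoder, "MULT", 3, 4))  # 12

# Same hardware, new decoder memory: a hypothetical fused instruction.
decoder.load({"FUSED": ControlWord(enable_adder=True, enable_multiplier=True)})
print(execute(decoder, "FUSED", 3, 4))  # (3 * 4) + 4 = 16
```

After the second `load`, the old mnemonics are gone from the decoder memory, which mirrors an instruction set that differs from one application to the next.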

7
CRISP, Design Time Configurable
[Figure: the CRISP template and its instances CRISP1, CRISP2, ..., CRISPn. Other labels: TI C62 (VLIW), Remarc, Chimaera]
  • Design Space
  • Reconfigurable instruction set
  • VLIW

8
CRISP Configurable Parameter Hierarchy
CRISP
  Memory Cluster
    Memory
    Decoder
      Decoder Memory
    Intra-cluster Interconnect
  Global Interconnect
  CPU Core
    Data Path
      Inter-cluster Interconnect
      Cluster
        Intra-cluster Interconnect
        Functional Unit
          Inter-segment Interconnect
          Segment
            Processing Element
            Intra-segment Interconnect
        Register File
9
A CRISP Multiprocessor System
[Figure: memory hierarchy of a system with two CPU cores. Labels: Main Memory, L2 Cache, L1 I Cache (x2), L1 Loop Cache, L1 D Cache, CPU core (x2)]
10
Example of an Instance
11
CPU Core (Datapath)
12
CPU Core (Datapath)
[Figure: three datapath clusters (Datapath Cluster 1-3) with register files (Register File 1-3) and functional units FU1-FU6]
13
Parameters CPU Core (Datapath)
  • Datapath Clusters (n)
  • Interconnect between the clusters (type)
  • Data Memory ports (n)
  • (un)protected pipeline, predicated VLIW (yes/no)
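
One way to picture these design-time parameters is as a small configuration record, one per CRISP instance. The sketch below is only an illustration: the class name `DatapathParams`, its field names, the validation rules and the example values (three clusters, a "bus" interconnect, two memory ports) are assumptions, not part of the original template definition.

```python
from dataclasses import dataclass


@dataclass
class DatapathParams:
    """Design-time parameters of the CPU core datapath (field names assumed)."""
    n_clusters: int                  # Datapath clusters (n)
    inter_cluster_interconnect: str  # Interconnect between the clusters (type)
    n_data_memory_ports: int         # Data memory ports (n)
    protected_pipeline: bool         # (un)protected pipeline
    predicated: bool                 # predicated VLIW (yes/no)

    def validate(self):
        # Sanity checks chosen for this sketch only.
        if self.n_clusters < 1:
            raise ValueError("at least one datapath cluster is required")
        if self.n_data_memory_ports < 0:
            raise ValueError("port count cannot be negative")
        return self


# A hypothetical instance of the template.
crisp1 = DatapathParams(3, "bus", 2, protected_pipeline=True, predicated=False).validate()
print(crisp1)
```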

14
Register File Model
[Figure: register file with three write ports (data O1-O3, addresses Write A1-A3) and two read ports (data R1-R2, addresses Read A1-A2)]
15
Functional Unit Model
[Figure: functional unit with input ports for operands (O1-O3) and an opcode, optional memory ports (M1, M2), and output ports for results (R1, R2)]
16
Parameters Functional Unit
  • Within a Functional Unit
  • Processing elements (PEs)
  • Interconnect Routing between PEs
  • allows spatial computation
  • For Large Functional Units
  • Segments (n)
  • PE (type, granularity)
  • Intra Segment Interconnect (type)
  • Inter Segment Interconnect (type)

17
Functional Unit Internal Model
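[Figure: internal model of a functional unit. Processing elements (PE1, PE2, PE3) are grouped into segments. Labels: Segment border, Functional unit border, Intra Segment Interconnect, Inter Segment Interconnect, opcode, inputs, outputs, memory ports]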
18
Some Examples of Functional Units
  • One segment
  • PE: ALU
  • No interconnect
  • One segment
  • PEs: LUTs
  • Interconnect
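
The functional unit parameters (segments, PE type and granularity, intra- and inter-segment interconnect) and the two examples above can be written down as nested records. This is a hedged sketch: the class names, the reading of "granularity" as a bit width, and the concrete values (a 32-bit ALU, sixteen 1-bit LUTs, a "crossbar" interconnect) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProcessingElement:
    kind: str         # PE type, e.g. "ALU", "LUT", "multiplier"
    granularity: int  # operand width in bits (assumed interpretation)


@dataclass
class Segment:
    pes: List[ProcessingElement]
    intra_segment_interconnect: str  # type label, e.g. "none"


@dataclass
class FunctionalUnit:
    segments: List[Segment]
    inter_segment_interconnect: str = "none"

    def pe_count(self):
        # Total number of processing elements across all segments.
        return sum(len(s.pes) for s in self.segments)


# Example 1: one segment, one ALU, no interconnect.
alu_fu = FunctionalUnit([Segment([ProcessingElement("ALU", 32)], "none")])

# Example 2: one segment of LUTs joined by an interconnect.
lut_fu = FunctionalUnit(
    [Segment([ProcessingElement("LUT", 1) for _ in range(16)], "crossbar")]
)
print(alu_fu.pe_count(), lut_fu.pe_count())  # 1 16
```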

19
Types of processing elements
  • Some examples
  • Registers
  • Computation elements
  • ALU
  • multiplier
  • LUT
  • subword shuffler
  • Constant generators

20
Memory
21
CRISP Configurable Parameter Hierarchy
CRISP
  Memory Cluster
  Global Interconnect
  CPU Core
    Data Path
      Inter-cluster Interconnect
      Cluster
        Intra-cluster Interconnect
        Functional Unit
          Inter-segment Interconnect
          Segment
            Processing Element
            Intra-segment Interconnect
        Register File
22
A CRISP Multiprocessor System
[Figure: memory hierarchy of a system with two CPU cores. Labels: Main Memory, L2 Cache, L1 I Cache (x2), L1 Loop Cache, L1 D Cache, CPU core (x2)]
23
Memory Cluster
[Figure: two memory clusters. Memory Cluster 1: Memory, Decoder, Memory Selection Mechanism. Memory Cluster 2: Memory (x2), Decoder (x2), Memory Selection Mechanism]
24
Parameters Memory Cluster
  • Within each Memory Cluster
  • Memories (n)
  • Decoders (n)
  • Memory Cluster Interconnect
  • Input connected to memory/decoder
  • Output connected to decoder/memory
  • Memory Selection Mechanism
  • Address translators

25
Parameters Memory
  • For each Memory
  • Size (width x height)
  • Internal Memory Organization
  • Local Memory Control
  • Loop Control
  • Cache Replacement Policies
  • FIFO control

26
Parameters Decoder
  • Types
  • Fixed Decoding
  • Memory Based
  • Size (width x height)
  • Hybrid
  • Combination of both
  • Software

[Figure: decoder organizations. Labels: n bits (input), Multiplexer, Decoder Memory, Fixed Decoder, m bits (output). Panels: Fixed Decoding, Memory-based Decoding, Hybrid]
  • Variation in n can be large
  • Variation in m can be large
  • Variation in n is small
  • Variation in m is small
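
A hybrid decoder combines a fixed decoder and a decoder memory behind a multiplexer. The sketch below shows one possible arrangement and is not taken from the slides: the rule that the top opcode bit selects the memory path, the 5-bit opcode width, the toy `fixed_decode` mapping and the table contents are all assumptions made for illustration.

```python
N_BITS = 5                       # opcode width n (assumed)
SELECT_MASK = 1 << (N_BITS - 1)  # top bit acts as the multiplexer select (assumed)


def fixed_decode(opcode):
    """Hard-wired decoding: a fixed function of the opcode (toy mapping)."""
    return opcode & 0b1111


class HybridDecoder:
    """Multiplexes between a fixed decoder and a writable decoder memory."""

    def __init__(self, decoder_memory):
        self.decoder_memory = dict(decoder_memory)  # opcode -> control word

    def decode(self, opcode):
        if opcode & SELECT_MASK:
            # Reconfigurable part: look the control word up in memory.
            return self.decoder_memory[opcode]
        # Fixed part: decoding cannot change after design time.
        return fixed_decode(opcode)


dec = HybridDecoder({0b10001: 0b10100110})
print(bin(dec.decode(0b00011)))  # fixed path: 0b11
print(bin(dec.decode(0b10001)))  # memory path: 0b10100110
```

Only the opcodes routed to the decoder memory can be redefined per application; the rest keep their hard-wired meaning.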

27
An Example (Program Memory)
  • Cluster
  • Cluster 1: 1 output (8b)
  • Memory: 1 input (4b), 1 output (4b); SDRAM, 1204b x 4b
  • Decoder: input (4b), output (8b); fixed
  • Cluster 2: 2 inputs (4b, 4b), 2 outputs (8b, 8b)
  • Memory 1: 1 input (4b), 1 output (4b); cache
  • Memory 2: 1 input (4b), 1 output (4b); cache
  • Decoder 1: input (4b), output (8b); fixed
  • Decoder 2: input (4b), output (8b); fixed
  • Loop control
  • Cluster 3: 3 inputs (4b, 4b, 4b, 4b), 4 outputs (8b, 4b, 6b, 8b)
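
The first two clusters of this example can be captured as data and checked for width consistency: each 4-bit memory output feeds a decoder with a 4-bit input, and the decoders supply the cluster's 8-bit outputs. The sketch is illustrative only; the class names and the rule that memory i is paired with decoder i are assumptions, and Cluster 3 is left out because the slide does not list its memories and decoders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Port:
    in_bits: int
    out_bits: int


@dataclass
class MemoryCluster:
    name: str
    memories: List[Port]
    decoders: List[Port]

    def check_widths(self):
        """Each memory output must match the input width of its paired decoder."""
        # Pairing memory i with decoder i is an assumption of this sketch.
        return all(m.out_bits == d.in_bits
                   for m, d in zip(self.memories, self.decoders))

    def output_widths(self):
        # The cluster outputs are taken to be the decoder outputs.
        return [d.out_bits for d in self.decoders]


cluster1 = MemoryCluster("Cluster 1", [Port(4, 4)], [Port(4, 8)])
cluster2 = MemoryCluster("Cluster 2",
                         [Port(4, 4), Port(4, 4)],
                         [Port(4, 8), Port(4, 8)])

for c in (cluster1, cluster2):
    print(c.name, c.check_widths(), c.output_widths())
```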

28
Summary of the Memory parameters
29
Methodology
30
Conclusions
  • New template to do research
  • Complete decoder hierarchy
  • Final remark
  • Good, isn't it?

31
Motivation
  • Power consumption due to Instruction/Data traffic
  • Significant in
  • Current Processors (VLIW)
  • FPGAs: very, very long control word
    (configuration word: V2LIW or even RLIW)
  • More significant when scaled
  • More Functional Units in parallel
  • To address this
  • Control Memory Exploration