Transcript and Presenter's Notes

Title: A Dataflow Approach to Design Low Power Control Paths in CGRAs


1
A Dataflow Approach to Design Low Power Control
Paths in CGRAs
  • Hyunchul Park, Yongjun Park, and Scott Mahlke
  • University of Michigan

2
Coarse-Grained Reconfigurable Architecture (CGRA)
  • Array of PEs connected in a mesh-like
    interconnect
  • High throughput with a large number of resources
  • Distributed hardware offers low cost/power
    consumption
  • High flexibility with dynamic reconfiguration

3
CGRA: An Attractive Alternative to ASICs
  • Suitable for running multimedia applications for
    future embedded systems
  • High throughput, low power consumption, high
    flexibility
  • Morphosys: 8x8 array with RISC processor
  • SiliconHive: hierarchical systolic array
  • ADRES: 4x4 array with tightly coupled VLIW

(Figures: Morphosys, SiliconHive, and ADRES arrays; e.g., Viterbi at 80 Mbps, H.264 at 30 fps, 50-60 MOps/mW)
4
Control Power Explosion
(Figure: a single PE and its instruction word)
  • Large number of configuration signals
  • Distributed interconnect, many resources to control
  • Nearly 1,000 bits each cycle
  • No code compression technique developed for CGRAs
  • Fully decoded instructions are stored in memory
  • The control path consumes 45% of total power

5
Code Compression
  • Huffman encoding
  • High efficiency, but a sequential decoding process
  • Dictionary-based
  • Recurring patterns stored in a dictionary
  • Not many patterns found in CGRAs
  • Instruction-level code compression
  • No-op compression (Itanium, DSPs)
  • Only 17% of instructions are no-ops in CGRAs

6
Fine-grain Code Compression
  • Compress unused fields rather than the whole instruction (sketched below)
  • Opcode, MUX selection, register address
  • Only 35% of fields contain valid information
  • The instruction format must be stored in memory
  • i.e., information about which fields are present
  • Significant overhead: 172 bits (20%) for a 4x4 CGRA
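
A minimal sketch of the idea in Python; the field names, widths, and packing scheme are illustrative assumptions, not the paper's actual encoding:

```python
# Hypothetical field set for one PE instruction; widths are made up.
FIELDS = [("opcode", 4), ("mux_sel", 3), ("reg_addr", 5), ("const", 8)]

def compress(instr: dict) -> tuple[int, int, int]:
    """Pack only the fields the operation uses, plus a presence mask
    (the 'instruction format' that must also be stored)."""
    mask, packed, nbits = 0, 0, 0
    for i, (name, width) in enumerate(FIELDS):
        if name in instr:                      # unused fields are skipped
            mask |= 1 << i
            packed = (packed << width) | (instr[name] & ((1 << width) - 1))
            nbits += width
    return mask, packed, nbits

# An op that needs no constant: the 8-bit const field is simply absent.
mask, bits, n = compress({"opcode": 3, "mux_sel": 1, "reg_addr": 17})
print(f"format={mask:04b}, payload={n} of {sum(w for _, w in FIELDS)} bits")
```

The catch the slide points out: the presence mask itself must live in memory, and across a 4x4 CGRA those masks add up to 172 bits (20%) per cycle.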

7
Dynamic Instruction Format Discovery
(Figure: FU computes dest <- src0, src1; RF performs a register write)
  • Resources need configuration only when data flows through them
  • The instruction format can be discovered by looking at the data flow
  • The token network from dataflow machines can be utilized
  • A token is 1 bit of information indicating incoming data in the next cycle
  • Each PE observes incoming tokens and determines the instruction format (sketched below)
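
A toy model of the discovery step in Python; the port and field names are assumptions for illustration:

```python
def discover_format(tokens: dict[str, bool]) -> list[str]:
    """From the 1-bit tokens seen this cycle, decide which instruction
    fields this PE must fetch for next cycle's operation."""
    fields = []
    if any(tokens.values()):
        fields.append("opcode")              # data is coming: the FU will fire
    for port, arriving in tokens.items():
        if arriving:
            fields.append(f"mux_sel[{port}]")   # select each arriving input
    return fields

# A token on src0 only: fetch the opcode and one MUX selector, nothing else.
print(discover_format({"src0": True, "src1": False}))
```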

8
Dynamic Configuration of PEs
(Figure: dataflow graph, mapping, and resulting configuration)
  • Each cycle, tokens are sent to the consuming PEs
  • Consuming resources collect incoming tokens,
    discover instruction formats, and fetch only
    necessary instruction fields
  • Next cycle, resources can execute the scheduled
    operations

9
Token Generation
  • Tokens are generated at the beginning of dataflow: live-in nodes in RFs
  • Each RF read port needs token generation info: 26 read ports in a 4x4 CGRA
  • 26 bits for token generation vs. 172 bits for instruction format (worked out below)
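
Worked out: one generation bit per read port means 26 bits of per-cycle control metadata instead of the 172-bit explicit format, roughly 15% of the original cost (about a 6.6x reduction).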

10
Token Network
  • Token network between the datapath and the decoder
  • No instruction format in memory, only token generation info
  • Adds 1 cycle between the IF and EX stages
  • Created by cloning the datapath: a 1-bit interconnect with the same topology
  • Each resource is translated to a token-processing module
  • Encode dest fields, not src fields

11
Register File Token Module
(Figure: RF token module with token_gen memory, token receivers, and token senders)
  • Write port MUXes are converted to token receivers, which determine the selection bits
  • Read ports are converted to token senders; tokens are initially generated here
  • Token generation information is stored in a separate memory (sketched below)
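
A minimal sketch of the read-port side, assuming a packed word of per-port generation bits (the layout is my assumption):

```python
def rf_read_port_tokens(token_gen_word: int, num_ports: int) -> list[bool]:
    """One token_gen bit per RF read port, fetched from the separate
    token-generation memory; set bits inject tokens into the network."""
    return [bool((token_gen_word >> p) & 1) for p in range(num_ports)]

# token_gen word 0b0101 on a 4-port RF: ports 0 and 2 start dataflows.
print(rf_read_port_tokens(0b0101, 4))   # [True, False, True, False]
```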

12
FU Token Module
  • Input MUXes are converted to token receivers
  • Opcode processor
  • Fetches the opcode field only if necessary
  • Determines the token type (data/pred) and latency (sketched below)
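
A sketch of the opcode-processor behavior; the latency table and token types are illustrative assumptions:

```python
LATENCY = {"add": 1, "mul": 2, "cmp": 1}     # assumed per-opcode latencies

def fu_token_step(tokens_in: list[bool], fetch_opcode):
    """Fetch the opcode field only when input tokens announce data,
    then emit a typed token carrying the operation's latency."""
    if not any(tokens_in):
        return None                          # idle FU: no opcode fetch at all
    opcode = fetch_opcode()                  # the field is read only here
    kind = "pred" if opcode == "cmp" else "data"
    return {"type": kind, "latency": LATENCY[opcode]}

print(fu_token_step([True, False], lambda: "mul"))  # {'type': 'data', 'latency': 2}
```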

13
System Overview
(Figure: system overview, including the datapath)
14
Experimental Setup
  • Target multimedia applications for embedded
    systems
  • Modulo scheduling for compute intensive loops in
    3D graphics, AAC decoder, AVC decoder (214 loops)
  • Three different control path designs
  • baseline: fully decoded instructions
  • static: fine-grain code compression with the instruction format stored in memory
  • token: fine-grain code compression with the token network

15
Code Size / Performance
  • Fine-grain code compression increases code efficiency
  • The token network further improves code efficiency
  • Performance degradation: sharing of fields and allowing only 2 dests

16
Power / Area
  • SRAM read power is greatly reduced with the token network
  • Introducing the token network slightly increases power and area
  • The area overhead can be mitigated by the reduced SRAM area
  • Hardware overhead for the token network is minimal

17
Staging Predicates Optimization
  • Modulo scheduled loops
  • Prolog (filling the pipeline)
  • Kernel code (steady state)
  • Epilog (draining the pipeline)
  • Only the kernel code is stored in memory
  • Staging predicates control the prolog/epilog phases (sketched below)

(Figure: overlapped execution of iterations i0, i1, i2 with initiation interval II)
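A small Python model of how staging predicates gate stages during fill and drain; the stage and iteration counts are made up:

```python
STAGES, ITERATIONS = 3, 4        # assumed: 3 pipeline stages, 4 iterations

# The same kernel issues every II cycles; a per-stage predicate masks
# stages off while the pipeline fills (prolog) and drains (epilog).
for t in range(ITERATIONS + STAGES - 1):
    active = [0 <= t - s < ITERATIONS for s in range(STAGES)]
    print(f"cycle {t}: stage predicates = {active}")
# Early cycles disable late stages (prolog), late cycles disable early
# stages (epilog); in steady state all predicates are true (kernel).
```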
18
Migrating Staging Predicate
  • Staging predicate: control information, not data dependent
  • 10% of configurations are used for routing staging predicates
  • Move staging predicates into the control path
  • Increases the token by 1 bit: the staging predicate
  • Only top nodes are guarded
  • The staging predicate flows along with the tokens
  • Benefits: code size reduction and performance increase

(Figure: stages 0-3, showing data and staging predicate flow)
19
Code Size / Performance
  • Code size reduction of 9%
  • Migrating staging predicates improves performance by 7%
  • 5% increase over the baseline

20
Power / Area
  • Power/area of the token network increase due to the valid bit
  • The reduced code size decreases SRAM power/area
  • The overall overhead for migrating staging predicates is minimal

21
Overall Power
(Chart: overall power of 226.4 mW for the baseline vs. 170.0 mW with the token network)
  • System power measured for a kernel loop in AVC
  • Introducing the token network reduces overall system power by 25% while achieving a 5% performance gain

22
Conclusion
  • Fine-grain code compression is a good fit for CGRAs
  • The token network can eliminate the instruction format overhead
  • Dynamic discovery of the instruction format
  • Small overhead (< 3%)
  • Migrating staging predicates to the token network improves performance
  • Applicable to other highly distributed architectures

23
Questions?
24
Token Sender
  • Each output port of a resource is converted into a token sender
  • FU outputs, routing MUX outputs, register file read ports
  • Sends tokens only to the consumers specified in the dest fields
  • Allowing only two destinations per output potentially limits performance (sketched below)
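
A sketch of the sender's fan-out rule, assuming consumers are addressed by simple indices (my encoding, not the paper's):

```python
def send_tokens(dest_fields: list, num_consumers: int) -> list[bool]:
    """Raise a token line only for the consumers named in the dest
    fields; the hardware caps each output at two destinations."""
    out = [False] * num_consumers
    for d in dest_fields[:2]:            # at most two dests per output
        if d is not None:
            out[d] = True
    return out

print(send_tokens([3, 0], 6))   # tokens reach consumers 0 and 3 only
```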

25
Token Receiver
  • Input MUXes are converted to token receivers
  • Dest fields are stored in memory, not src fields
  • MUX selection bits are determined from the incoming token position (sketched below)
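
A sketch of recovering the selection bits from token position (my formulation):

```python
def mux_select(token_ports: list[bool]):
    """The MUX selector is not stored anywhere: it is the index of the
    input port whose token arrived, since producers encode dests only."""
    hits = [i for i, t in enumerate(token_ports) if t]
    return hits[0] if hits else None     # None: this MUX is idle this cycle

print(mux_select([False, True, False, False]))   # select input port 1
```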

26
Dynamic Instruction Format Discovery
(Identical to slide 7.)
27
Who Generates Tokens?
  • Tokens are generated at the start of dataflow: live-ins
  • Tokens terminate when they enter a register file
  • Tokens terminated in register files can be regenerated: the read ports of register files generate tokens
  • Token generation information for RF read ports is stored separately: 26 read ports in a 4x4 CGRA

28
Reducing Decoder Complexity
(Figure: configuration memory partitioned into multiple MEM blocks, each with its own decoder, fed by the token network)
  • Partitioning the configuration memory and decoder
  • Trade-off between the number of memories and decoder complexity
  • Design space exploration for memory partitioning: which fields are stored in the same memory?
  • Sharing of field entries in the memory for under-utilized fields

29
Memory Partitioning
  • Bundle fields with the same type: field width uniformity
  • Design space exploration result for a 4x4 CGRA (table below)
  • Sharing degree = total entries / total fields (worked example after the table)
  • Reduces decoder complexity by 33% over naïve partitioning
  • Sharing incurs less than 1% performance degradation

type       fields   memories   entries   total entries   sharing degree
opcode     16       2          8         16              1.0
dest       96       8          8         64              0.75
const      16       2          6         12              0.75
reg addr   48       4          6         24              0.5
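Worked example from the table: the const fields share 2 memories x 6 entries = 12 total entries across 16 fields, for a sharing degree of 12/16 = 0.75.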