Low Energy Clustered Instruction Fetch and Split Loop Cache Architecture - PowerPoint PPT Presentation


Transcript and Presenter's Notes


1
Low Energy Clustered Instruction Fetch and Split
Loop Cache Architecture
  • Murali Jayapala, Francisco Barat, Pieter Op de Beeck
  • ESAT/ACCA, K.U.Leuven, Belgium
  • Francky Catthoor, Rudy Lauwereins
  • IMEC, Leuven, Belgium
2
Problem
  • Context
  • Embedded Processor Architectures
  • Multimedia Applications
  • Low Power Embedded Systems
  • Power Breakdown
  • 43% of power in memory
  • StrongARM SA-110: a 160-MHz, 32-b, 0.5-W CMOS ARM
    processor
  • 40% of power in internal memory
  • C6x, Texas Instruments Inc.

3
Problem
Significant power consumption in the instruction
memory hierarchy
(Block diagram: the I-cache and instruction decode stage are active
every instruction cycle, feeding the datapath.)
4
Outline
  • The Approach
  • Loop Cache
  • Contributions
  • Variance of power among different schemes is
    not well analysed
  • Splitting
  • Results
  • Conclusion

5
The Approach
  • Small sized loops in the application
  • A fraction α of execution cycles is spent in the loops
  • Employ Loop cache
  • Approach 1
  • Approach 2

6
Approach 1
  • No Change in Pipeline
  • Cache Control
  • Hardware
  • Automatic loop detection logic
  • M-CORE, L. H. Lee et al.
  • Software
  • Explicit placing of cache data
  • N. Bellas et al

E = C · [ α·P_L0 + (1 − α)·P_L1 + P_D ]
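The Approach-1 energy model (the L0 loop cache serves the loop fraction α of cycles, the L1 I-cache serves the rest, and decode is active every cycle) can be evaluated numerically. A minimal sketch — every power value and the loop fraction below are illustrative assumptions, not figures from the slides:

```python
# Approach 1 energy model: E = C * (a*P_L0 + (1 - a)*P_L1 + P_D)
# The L0 loop cache serves the fraction `a` of cycles spent in loops,
# the L1 I-cache serves the remaining cycles, and the decode stage
# stays active every cycle (no pipeline change).

def energy_approach1(C, a, P_L0, P_L1, P_D):
    """Total instruction-fetch energy over C cycles (illustrative units)."""
    return C * (a * P_L0 + (1 - a) * P_L1 + P_D)

# Illustrative numbers only: a small loop-cache access is assumed
# much cheaper than an L1 I-cache access.
C, a = 1_000_000, 0.8
P_L0, P_L1, P_D = 0.1, 1.0, 0.5
print(energy_approach1(C, a, P_L0, P_L1, P_D))
```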
7
Approach 2
  • Reduces pipeline stages
  • Cache Control
  • Hardware
  • Automatic loop detection logic
  • R. Bajwa et al
  • TI Inc, T. Anderson et al
  • Software
  • -

E = C · [ α·P_L0 + (1 − α)·P_L1 + (1 − α)·P_D ]
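In Approach 2 the loop cache sits after decode, so decode power is also gated by (1 − α); the saving over Approach 1 is exactly C·α·P_D. A sketch with the same illustrative (assumed) numbers as before:

```python
# Approach 2 gates the decode stage as well: decoded instructions are
# replayed from the loop cache, so decode power is only paid in the
# (1 - a) fraction of cycles outside loops.
#   E1 = C * (a*P_L0 + (1 - a)*P_L1 + P_D)
#   E2 = C * (a*P_L0 + (1 - a)*P_L1 + (1 - a)*P_D)

def energy_approach1(C, a, P_L0, P_L1, P_D):
    return C * (a * P_L0 + (1 - a) * P_L1 + P_D)

def energy_approach2(C, a, P_L0, P_L1, P_D):
    return C * (a * P_L0 + (1 - a) * P_L1 + (1 - a) * P_D)

# Illustrative numbers only; the difference equals C * a * P_D.
C, a, P_L0, P_L1, P_D = 1_000_000, 0.8, 0.1, 1.0, 0.5
saving = (energy_approach1(C, a, P_L0, P_L1, P_D)
          - energy_approach2(C, a, P_L0, P_L1, P_D))
print(saving)  # equals C * a * P_D
```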
8
Energy analysis
9
Extensions: Split Loop Caches
  • Clusters of instructions in loops, with different
    code fields activated
  • Address Calculations
  • Index calculations
  • Arithmetic and Logic Computations
  • Computations on array elements
  • Σᵢ P_small,i ≤ P_large
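The intuition behind splitting can be sketched numerically: a monolithic loop cache drives the full wide instruction line every cycle, while split per-cluster caches only drive the code fields active in the current loop. A toy model — the energy-per-bit value, line width, and cluster split below are assumptions for illustration only:

```python
# Toy model: per-access fetch energy proportional to bits read
# (an assumption for illustration, not a circuit-level model).

E_PER_BIT = 0.01  # illustrative energy per bit read

def fetch_energy(active_bits):
    return E_PER_BIT * active_bits

# Assume a 256-bit wide line split into four 64-bit clusters
# (e.g. address calc, index calc, ALU, array ops); in this loop
# only two clusters are active.
active = [True, True, False, False]
monolithic = fetch_energy(256)  # full line read every cycle
split = sum(fetch_energy(64) for is_active in active if is_active)
print(monolithic, split)  # the split fetch reads only the active fields
```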

10
Extensions 1
  • No Change in Pipeline
  • Cache Control
  • Software
  • Explicit placing of cache data through
    instructions

E = C · [ Σᵢ (αᵢ·P_L0,i + P_D,i) + (1 − α)·P_L1 ]
11
Extensions 2
  • Reduces pipeline stages
  • Cache Control
  • Software
  • Explicit placing of cache data through
    instructions
  • Local Controller
  • Demand cache to feed cache
  • Support for control constructs

E = C · [ Σᵢ (αᵢ·P_L0,i + (1 − α)·P_D,i) + (1 − α)·P_L1 ]
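The two split-cache models differ only in whether per-cluster decode power is gated outside loops. A sketch evaluating both — cluster count, activity fractions αᵢ, and all power values are assumed for illustration:

```python
# Split loop cache energy models (illustrative units).
# Extension 1 (no pipeline change, decode always active per cluster):
#   E = C * ( sum_i (a_i*P_L0i + P_Di)           + (1 - a)*P_L1 )
# Extension 2 (decode gated outside loops):
#   E = C * ( sum_i (a_i*P_L0i + (1 - a)*P_Di)   + (1 - a)*P_L1 )

def energy_ext1(C, a, a_i, P_L0i, P_Di, P_L1):
    return C * (sum(ai * p0 + pd for ai, p0, pd in zip(a_i, P_L0i, P_Di))
                + (1 - a) * P_L1)

def energy_ext2(C, a, a_i, P_L0i, P_Di, P_L1):
    return C * (sum(ai * p0 + (1 - a) * pd for ai, p0, pd in zip(a_i, P_L0i, P_Di))
                + (1 - a) * P_L1)

# Assumed numbers: two clusters, each active in a different share of cycles.
C, a = 1_000_000, 0.8
a_i = [0.8, 0.4]        # per-cluster activity fractions
P_L0i = [0.05, 0.05]    # small per-cluster loop caches
P_Di = [0.25, 0.25]     # per-cluster decode power
P_L1 = 1.0
print(energy_ext1(C, a, a_i, P_L0i, P_Di, P_L1))
print(energy_ext2(C, a, a_i, P_L0i, P_Di, P_L1))
```

Under these assumed numbers, gating decode (Extension 2) cuts the modelled energy roughly in half relative to Extension 1.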

12
Splitting: Vertical
13
Results
14
Conclusion
  • Split loop cache architectures
  • Further Power/Energy reduction
  • Enables more compiler oriented optimizations
  • Splitting
  • Basic blocks
  • Emphasis on vertical splitting of the code,
    at the granularity of
  • Loops
  • Instructions