Improving the Efficiency of Memory Partitioning by Address Clustering

1
Improving the Efficiency of Memory Partitioning
by Address Clustering
  • Alberto Macii, Enrico Macii, Massimo Poncino
  • Proceedings of the Design, Automation and Test in
    Europe Conference and Exhibition
  • Presenter: Hung Yu Chen

2
Abstract
  • Memory partitioning is an effective approach to
    memory energy optimization in embedded systems.
    Spatial locality of the memory address profile is
    the key property that partitioning exploits to
    determine an efficient multi-bank memory
    architecture. This paper presents an approach,
    called address clustering, for increasing the
    locality of a given memory access profile, and thus
    improving the efficiency of partitioning. Results
    obtained on several embedded applications running
    on an ARM7 core show average energy reductions of
    25% (maximum 57%) w.r.t. a partitioned memory
    architecture synthesized without resorting to
    address clustering.

3
Outline
  • What's the problem?
  • Memory Energy
  • Memory Partitioning
  • Address Clustering
  • Experimental Results
  • Conclusions

4
What's the problem?
  • Modern SoC platforms usually contain one or more
    processors.
  • The gap between processor and memory speed keeps
    increasing.
  • Various types of on-chip embedded memories
    provide shorter latencies and wider
    interfaces.
  • Problem: the ubiquity of embedded memories makes
    them the largest contributor to the overall
    energy budget of a chip.

5
Memory Energy
  • Model: $E_{mem} = \sum_{i=1}^{N} Cost(i)$
  • N: number of accesses during the computation.
  • Cost(i): cost of an access, determined by the memory
    organization and by the cost of the physical
    access given by the technology.
  • Memory energy optimization:
  • Reducing Cost(i): build a low-energy memory
    architecture.
  • Reducing N: modify the memory access pattern.
  • Both.
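
A minimal sketch of the energy model above. The names `memory_energy` and `two_bank_cost`, the two-bank split at address 256, and the per-access costs are hypothetical illustrations, not values from the presentation.

```python
from typing import Callable, Iterable


def memory_energy(trace: Iterable[int], cost_of: Callable[[int], float]) -> float:
    """Sum Cost(i) over every access i in the trace (E_mem in the model)."""
    return sum(cost_of(address) for address in trace)


def two_bank_cost(address: int) -> float:
    """Hypothetical cost function: a small bank that is cheaper per access."""
    return 1.0 if address < 256 else 4.0  # arbitrary energy units


trace = [0, 4, 8, 300, 4, 0]  # illustrative access trace (N = 6 accesses)
total = memory_energy(trace, two_bank_cost)
```

In this toy setting, lowering the per-access cost corresponds to "reducing Cost(i)", while shortening `trace` corresponds to "reducing N".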

6
Memory Partitioning
  • memory partitioning technique.

7
Memory Partitioning (cont.)
  • Figure 1-a: the whole address space of the
    application is mapped to a single SRAM memory array.
  • Figure 1-b: a dynamic access profile.
  • Figure 1-c: the partitioned memory.
  • Note that we need to account for the power
    consumed in the entire partitioned memory system.

8
Address Clustering-Example
  • MPEG Decoding application for ARM7 core
  • Instruction stream

9
Address Clustering-Example (cont.)
  • Figure 2 shows:
  • Total number of addresses: 31,233 (ranging from 0
    to 124,892).
  • The memory cut has 1,952 rows × 512 columns.
  • Energy consumed: 170 mJ (44.4 million total reads).
  • Memory partitioning:
  • Three memory blocks of sizes
    736×256, 696×512, 892×512.
  • Energy consumed: 96 mJ (inclusive of the overhead).
  • 43.5% energy reduction.
  • The 696×512 block serves the majority (82%) of the
    memory accesses (36 million out of 44.4 million).

10
Address Clustering-Example (cont.)
  • Figure 3: clustered address profile of an MPEG
    decoder.
  • Two memory blocks of sizes 212×128 and 1900×512.
  • Energy consumed: 42 mJ (an additional 56% of energy
    saved).
  • 99% of the memory accesses (43.99 million out of
    44.4 million).
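
The percentages quoted in this example can be checked from the three energy figures given on the slides (170 mJ, 96 mJ, 42 mJ). The helper name `reduction_percent` is an illustrative assumption; the last value, the saving of the clustered design relative to the monolithic memory, is derived here and is not stated in the presentation.

```python
def reduction_percent(before_mj: float, after_mj: float) -> float:
    """Relative energy saving, in percent, going from before_mj to after_mj."""
    return 100.0 * (before_mj - after_mj) / before_mj


E_MONOLITHIC = 170.0   # mJ, single memory array
E_PARTITIONED = 96.0   # mJ, three blocks, overhead included
E_CLUSTERED = 42.0     # mJ, two blocks after address clustering

partition_gain = reduction_percent(E_MONOLITHIC, E_PARTITIONED)   # about 43.5
clustering_gain = reduction_percent(E_PARTITIONED, E_CLUSTERED)   # about 56
overall_gain = reduction_percent(E_MONOLITHIC, E_CLUSTERED)       # derived, about 75
```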

11
Address Clustering-Problem
  • Find a relocation of a proper subset of the
    address space that:
  • Maximizes the locality of the dynamic trace.
  • Minimizes the energy consumption of the memory
    architecture.
  • Cost metrics:
  • Dynamic access profile: $C = \{c_0, c_1, \ldots, c_{N-1}\}$
  • $D(C,W) = \max_i (S_i)$, $i = 0, 1, \ldots, N-W$
  • $S_i = \sum_{j=0}^{W-1} c_{i+j}$, where W is the size
    of a sliding window.
  • $d(C,W) = D(C,W) / Tot$
  • $Tot = \sum_{i=0}^{N-1} c_i$
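
A minimal sketch of the cost metric, assuming the access profile is given as a list of non-negative access counts indexed by address location. The function names and the sample profile are hypothetical.

```python
from typing import Sequence


def max_window_sum(profile: Sequence[int], window: int) -> int:
    """D(C, W): the largest access count covered by W consecutive locations."""
    if not 0 < window <= len(profile):
        raise ValueError("window must satisfy 0 < W <= N")
    current = sum(profile[:window])          # S_0
    best = current
    for i in range(1, len(profile) - window + 1):
        # Slide the window by one: drop c_{i-1}, add c_{i+W-1}.
        current += profile[i + window - 1] - profile[i - 1]
        best = max(best, current)
    return best


def locality(profile: Sequence[int], window: int) -> float:
    """d(C, W) = D(C, W) / Tot; assumes non-negative counts and Tot > 0."""
    return max_window_sum(profile, window) / sum(profile)


profile = [1, 0, 7, 9, 0, 2, 1]   # hypothetical access counts c_0..c_6
d_2 = locality(profile, 2)        # best window covers locations 2 and 3
```

A value of d(C,W) close to 1 means that a single window of W locations captures almost all accesses, which is the situation partitioning can exploit.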

12
Address Clustering-Problem (cont.)
  • Figure 4 shows the values of d(C,W) for W = 32,
    64, 128, 256, 512, for the profile of Figure 2.

13
Address Clustering-Exploration
  • High-level pseudo-code.
  • Explore: finds a good value of W.

14
Address Clustering-Clustering Algorithm
  • Cluster returns a modified trace whose first M
    locations contain the M most visited addresses.
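
The slide's own pseudo-code is not reproduced in this transcript. The following is only a sketch of one way to obtain the stated result, assuming the profile is a list of access counts indexed by location and that relocation is done by swapping pairs of locations; the function name `cluster` and the tie-breaking rule are assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def cluster(profile: Sequence[int], m: int) -> Tuple[List[int], Dict[int, int]]:
    """Return (clustered profile, relocation map R) for the M hottest locations."""
    n = len(profile)
    m = min(m, n)
    # The M most visited locations; ties are broken by lower address (assumption).
    hot = sorted(range(n), key=lambda a: (-profile[a], a))[:m]
    hot_set = set(hot)
    # Hot locations already inside [0, M) stay where they are.
    movers = [a for a in hot if a >= m]
    # Slots in [0, M) that currently hold a non-hot location.
    free_slots = [s for s in range(m) if s not in hot_set]
    relocation: Dict[int, int] = {}
    for src, dst in zip(movers, free_slots):
        relocation[src] = dst     # hot location moves into the cluster
        relocation[dst] = src     # displaced location takes its old place
    clustered = list(profile)
    for src, dst in relocation.items():
        clustered[dst] = profile[src]
    return clustered, relocation


profile = [1, 0, 7, 9, 0, 2, 1]            # hypothetical access counts
clustered, relocation = cluster(profile, 2)
```

Because each move is a swap, the relocation map touches at most 2M locations and every other location keeps its original address.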

15
Address Clustering-Encoder
  • Hardware encoder:
  • The swap of address pairs -> 2M clustered addresses.
  • f(X) is a function that indicates whether X belongs
    to the set of 2M clustered addresses.
  • Clustered address of X: R(X).
  • 32-input combinational network.
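
A software stand-in for the encoder's behavior, as a sketch only. It assumes that addresses outside the relocated set pass through unchanged, which the slide implies but does not state; `make_encoder` and the sample relocation table are hypothetical.

```python
from typing import Callable, Dict


def make_encoder(relocation: Dict[int, int]) -> Callable[[int], int]:
    """Build an address-translation function from a relocation table R."""
    def encode(x: int) -> int:
        x &= 0xFFFFFFFF                  # model the 32-bit address input
        # f(X): is X one of the (at most 2M) relocated addresses?
        if x in relocation:
            return relocation[x]         # R(X)
        return x                         # assumed pass-through for all other addresses
    return encode


# Hypothetical relocation table: addresses 0x40 and 0x00 swap places.
encode = make_encoder({0x40: 0x00, 0x00: 0x40})
hot_target = encode(0x40)
untouched = encode(0x44)
```

In hardware this lookup is a combinational network rather than a table; the dictionary here only models its input/output behavior.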

16
Experimental Results
  • Some benchmarks are taken from the Ptolemy
    distribution; others come from the MediaBench
    suite.
  • Platform: ARM software development kit.
  • Table 1:
  • Addr: total number of distinct addresses.
  • Emono: the energy of the monolithic memory that
    contains all the data/instructions.
  • Epartitioned: total memory energy of a
    partitioned memory architecture.
  • M = 256, 512, 1024: memory partitioning combined
    with address clustering.

17
Experimental Results (cont.)

18
Experimental Results (cont.)
  • Original vs. Clustering (Energy)

19
Encoder Overhead Analysis
  • Encoders have been synthesized with Synopsys
    Design Compiler on a 0.25 µm technology by
    STMicroelectronics.
  • Power figures (Figure 8) are obtained with
    Synopsys Power Compiler.
  • The variation of the energy figures across the
    various applications is relatively small:
  • The complexity of the decoder is basically
    independent of the set of addresses that are
    clustered.
  • The switching activity of the address lines is
    very similar for all benchmarks.

20
Encoder Overhead Analysis (cont.)
  • 16K memory, which dissipates about 375 mW.
  • Frequency of 150 MHz.
  • Power: 7.5 mW for M = 1024.

21
Conclusions
  • The energy reduction achievable by memory
    partitioning can be improved appreciably
    by increasing the locality of the trace.
  • The paper proposes an architectural solution, called
    address clustering.
  • Experimental results were obtained on a set of typical
    embedded applications running on an ARM-based system.
  • Address clustering reduces the energy
    consumption of a partitioned memory architecture
    by 25% on average (maximum 57%) with respect to
    partitioning driven by the original trace.