Lecture 11: Low-Power Static RAM Architectures

1
Lecture 11: Low-Power Static RAM Architectures
  • SRAM organization and MOS SRAM cell
  • Banked SRAM organization
  • Reducing bit line voltage swing
  • Reducing write driver power and sense amplifier
    power
  • Low core supply voltage techniques
  • Summary
  • Michael L. Bushnell
  • CAIP Center and WINLAB
  • ECE Dept., Rutgers U., Piscataway, NJ

2
Introduction
  • Portable computing requires low-power design
  • System-on-a-Chip technology integrates memory
    onto same chip as processors
  • Memory power consumption is a major problem
  • Focus on SRAM power reduction, since SRAM is a
    major source of power loss
  • DRAM design benefits from low-power SRAM decoder

3
SRAM Organization
4
Problem
  • Activating a word line for a row causes all
    columns in that row to be active
  • Switches way too much power
  • Must focus on activating only those columns in
    the row that are actually being read/written

5
MOS SRAM Cell
  • Bi-stable inverter loop
  • CMOS usually used for low-power design
  • 6T SRAM cell
  • However, nMOS SRAM has less area
  • 4T SRAM cell
  • nMOS may not use much more power than CMOS
  • If properly designed

6
Cell Transfer Characteristic
7
4T nMOS SRAM Implementation
  • High-valued Rs made from undoped polysilicon
  • Extremely compact: fewer transistors, no n-wells
  • Extremely small currents required

8
4T Cell Design
  • Higher R
  • Reduces current consumption
  • Lower standby power consumption
  • Must get correct ratio of pass transistors T1/T2
    to pull-down transistors T3/T4
  • Ratio called the aspect ratio
  • Choose so that internal cell voltages never rise
    enough to upset cell state
  • Energy needed to switch bit-line capacitances >
    energy to switch cell

9
6T SRAM Cell
  • Resistors replaced by minimum size pFETs

10
6T Cell Design Considerations
  • Supply current drawn is limited to stable state
    transistor leakage currents
  • Inverters have large nMOS width compared to pMOS
    width
  • Inverter switching threshold close to nMOS Vtn
  • Aspect ratio design similar to that for 4T cell

11
SRAM Power Reduction
  • Reduce power (in general) by
  • Lowering switched capacitance
  • Lowering voltage swing
  • Lowering activity factor
  • Lowering operation frequency
  • Easiest to lower voltage swing in memory on
    bit/word lines
  • Limited by
  • Inability to resolve small voltage differentials
    at adequate speed
  • Increasing soft bit error rates and degraded
    signal integrity

12
Banked SRAM Organization
  • Reduces switched capacitance, reduces power,
    increases speed
  • In an R x C SRAM, any access to a row enables
    R rows and all bit lines
  • Ccell is the individual cell capacitance
  • Causes R x C x Ccell capacitance to be switched
  • Solution: split memory into B banks
  • Only 1 bank enabled for an access, not all banks
  • Switched capacitance now R x C x Ccell / B
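
The banking arithmetic above can be sketched in a few lines of Python. The array size and the per-cell capacitance below are illustrative assumptions, not values from the lecture, and `switched_capacitance` is a hypothetical helper name.

```python
def switched_capacitance(rows, cols, c_cell, banks=1):
    """Capacitance switched per access: R x C x Ccell / B."""
    if banks < 1:
        raise ValueError("banks must be >= 1")
    return rows * cols * c_cell / banks

# Assumed example: 256 x 256 array, 2 fF per cell (illustrative only).
R, C, C_CELL = 256, 256, 2e-15
unbanked = switched_capacitance(R, C, C_CELL)
banked = switched_capacitance(R, C, C_CELL, banks=8)
print(f"unbanked: {unbanked:.3e} F, 8 banks: {banked:.3e} F")
```

With B = 8 the switched capacitance per access drops by a factor of 8, under the simplifying assumption that bank-select overhead is ignored.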

13
SRAM Bank
14
Divided Word Lines
  • Applies banking only to word lines
  • Divided into groups, only 1 of which is enabled
  • Need gate and local driver for each group
  • Severely reduces driven C on word line
  • Limitation is that it adds word line driving delay

15
Reduced Bit Line Voltage Swing
  • Can end sense amplifier read operation as soon as
    differential voltage detection is complete
  • Saves fraction of power needed to accomplish read
  • ΔV = bit line voltage swing
  • Vcore = core supply voltage
  • r = fraction of operations that are reads
  • f = frequency of core operations
  • Read power = ½ × Ceff × Vcore × ΔV × r × f
  • Reducing ΔV often fails: it increases noise
    sensitivity and sense amp complexity, and
    reduces RAM performance
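
As a rough illustration of the read power expression on this slide, the sketch below compares a full-swing read with a reduced-swing read. All numeric values (effective capacitance, supply, swing, read fraction, frequency) are assumptions chosen for the example, and `read_power` is a hypothetical helper.

```python
def read_power(c_eff, v_core, delta_v, read_fraction, freq):
    """Read power = 1/2 * Ceff * Vcore * dV * r * f (watts)."""
    return 0.5 * c_eff * v_core * delta_v * read_fraction * freq

# Assumed values: 1 pF effective capacitance, 1.8 V core,
# half of all operations are reads, 100 MHz core operation rate.
C_EFF, V_CORE, R_FRAC, FREQ = 1e-12, 1.8, 0.5, 100e6

full_swing = read_power(C_EFF, V_CORE, V_CORE, R_FRAC, FREQ)  # dV = Vcore
reduced = read_power(C_EFF, V_CORE, 0.2, R_FRAC, FREQ)        # dV = 0.2 V
print(f"full swing: {full_swing:.2e} W, reduced swing: {reduced:.2e} W")
```

Because read power is linear in the bit line swing, cutting the swing from 1.8 V to 0.2 V cuts this term by a factor of 9 in the sketch; the noise and sensing costs listed on the slide are not modeled.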

16
Early Word Line Termination
  • Reduces bit line swings

17
Pulsed Word Lines
  • Enable word lines only for the precise time
    needed to develop bit cell voltage discharge
  • Use pulse generator that gates word line and
    sense amplifier
  • Need margin for worst-case pulse width
  • Must estimate actual RAM access time

18
Self-Timed RAM Core
  • Different rows have different access speeds
  • Row closest to sense amps is fastest
  • Columns closest to word line drivers enabled
    first
  • Tailor pulse width to RAM access time
  • Use dummy column to time signal flow
  • Forced to known state by shorting one internal
    node
  • Set SR flip-flop to trigger word line
  • By the time the dummy column sense amp output
    goes high, the rest of the columns have been
    sensed
  • Dummy column sense amp resets SR flip-flop, turns
    off word line
  • Dummy column adds insignificant chip area/power
    overhead
  • Called word line kill circuit

19
Dummy Column for Self-Timing
20
Bit Line Precharge Voltage
  • Two methods
  • Use MOS device as static load: enhancement nMOS,
    depletion nMOS, or standard pMOS
  • Use precharger transistor: no static power
    consumed
  • Needs more power to drive clocked precharger
  • Lower precharge V lowers power consumption (less
    bit line voltage swing)
  • Enhancement nMOS most effective
    (precharge to VCC − Vtn)
  • Optimal precharge voltage: VCC / 2

21
Problems with Lower Precharge V
  • Read forces SRAM internal nodes towards bit line
    voltage
  • Bit line precharges to VCC − Vtp
  • Forces cell internal nodes low; counteracted by
    weak cell pMOS
  • Read may destroy old cell data
  • Avoid this problem - use different bit line
    voltages for read/write

22
Write Driver Power Reduction
  • Reduce power in word line decoders and drivers
  • Write line driver only drives 1 word line at a
    time
  • Small contribution to overall power
  • Want fast row decoding
  • NAND decoder changes word line output of only 1
    row: slower
  • NOR decoder changes word line output of all but 1
    row: faster, but very bad for power
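
A back-of-the-envelope sketch of the NAND versus NOR comparison above: it counts how many word line outputs change per access and converts that to switching energy. The word line capacitance and supply voltage are assumptions for illustration, and the model ignores everything except the word line outputs themselves.

```python
def word_lines_switched(address_bits, style):
    """Number of word line outputs that change on one access."""
    rows = 2 ** address_bits
    if style == "nand":
        return 1            # only the selected row changes
    if style == "nor":
        return rows - 1     # every row except the selected one changes
    raise ValueError("style must be 'nand' or 'nor'")

def word_line_energy(address_bits, style, c_wl=100e-15, vdd=1.8):
    """Switching energy per access, assuming C * V^2 per toggled line."""
    return word_lines_switched(address_bits, style) * c_wl * vdd ** 2

nand_lines = word_lines_switched(8, "nand")
nor_lines = word_lines_switched(8, "nor")
print(nand_lines, nor_lines)
```

For 8 address bits (256 rows) the NOR style toggles 255 word line outputs per access against 1 for the NAND style, which is why the slide calls it very bad for power.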

23
Domino NAND Decoder
24
NOR Decoder
25
Improve NAND Decoder Speed
  • Do not decode A address lines into 1 of 2^A word
    lines
  • Split decoding process
  • Decode A1 < A address lines
  • Use 2^A1 lines to activate one of second stage
    decoders
  • Second stage decodes A − A1 lines into
    2^(A − A1) word lines
  • Get total of 2^A1 x 2^(A − A1) = 2^A lines
  • Recursively repeat to get a tree of intermediate
    decoders; extreme is to decode 1 address
    line/stage
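
The split decoding scheme above can be checked functionally with a short sketch. It models only the logic (one-hot outputs), not the circuit delay or power, and the function names are hypothetical.

```python
def flat_decode(addr, a_bits):
    """Single-stage decode: A address bits to 1-of-2^A word lines."""
    return [int(i == addr) for i in range(2 ** a_bits)]

def two_stage_decode(addr, a_bits, a1_bits):
    """Decode the top A1 bits first, then A - A1 bits in the enabled block."""
    if not 0 < a1_bits < a_bits:
        raise ValueError("need 0 < A1 < A")
    low_bits = a_bits - a1_bits
    hi = addr >> low_bits                  # selects the second-stage decoder
    lo = addr & ((1 << low_bits) - 1)      # selects the line inside it
    first_stage = [int(i == hi) for i in range(2 ** a1_bits)]
    word_lines = []
    for enable in first_stage:             # one second-stage decoder each
        for j in range(2 ** low_bits):
            word_lines.append(enable & int(j == lo))
    return word_lines

# 2^2 x 2^4 = 2^6 word lines, identical to the flat decoder's output.
same = all(two_stage_decode(a, 6, 2) == flat_decode(a, 6) for a in range(64))
print(same)
```

Only one second-stage decoder is enabled per access, so the two-stage version drives the same single word line as the flat decoder while each stage has fewer inputs.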

26
Multistage NAND Decoder
27
Sense Amp Power Reduction
  • Larger currents improve speed
  • Becomes significant fraction of total power
  • Have a sense amplifier enable signal
  • Power reduction
  • Limit current by enabling sense amp for minimum
    needed period
  • Use self-timed RAM core
  • Use sense amp that automatically cuts off after
    sensing
  • Set SR flip-flop to enable sense amps; once the
    dummy sense amp has finished, it resets the SR
    flip-flop and turns off sense amplifier enable
  • Alternative: shape tail current of amp by
    activating pull-down transistors of differential
    amps in sequence

28
Differential Sense Amp
29
Differential Charge Sense Amp
30
Self-Timed Sense Amp
31
Self-Latching Sense Amp
  • Self-latching sense amp automatically limits
    currents after sense
  • Cross-coupled amplifying inverter loop
  • Extra transistors transfer bit line voltages to
    inverter loop

32
Latched Sense Amp
33
Low Core Voltage from Single Supply
  • Memory core: square-law relationship for both
    standby and dynamic power with respect to core
    voltage
  • Commodity RAMs have a single external supply
    voltage
  • Step this down to get lower core voltage
  • Sakata method: achieve ½ core supply voltage
  • Place 2 identical DRAM cores in series
  • If average power consumption is fairly constant,
    the series connection acts as a potential
    divider
  • Top and bottom core supplies: VCC / 2
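
A minimal numeric sketch of the two ideas on this slide: the square-law dependence of power on core voltage, and the potential divider formed by two series cores. Each core is modeled as an equivalent resistance, which is a simplifying assumption that only holds when average consumption is fairly constant; the supply and resistance values are illustrative.

```python
def relative_power(v_core, v_ref):
    """Square-law scaling: power at v_core relative to power at v_ref."""
    return (v_core / v_ref) ** 2

def series_core_voltages(vcc, r_top, r_bottom):
    """Voltage across each of two series cores, modeled as resistors."""
    current = vcc / (r_top + r_bottom)
    return current * r_top, current * r_bottom

VCC = 3.3  # assumed external supply, volts
quarter = relative_power(VCC / 2, VCC)
v_top, v_bottom = series_core_voltages(VCC, 100.0, 100.0)
print(quarter, v_top, v_bottom)
```

Halving the core voltage gives one quarter of the power per core under the square law, and identical cores split VCC evenly; unequal equivalent resistances would shift the midpoint away from VCC / 2.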

34
Voltage Supply Step-Down Circuits
  • Step-down circuit design with low DC voltage and
    significant current drain is hard
  • Cannot use inductors and pulse width control
    circuits
  • Implement by charging N series capacitors from
    VCC
  • Then achieve parallel capacitor connection by
    opening/closing switches
  • Steps voltage down to VCC / N
  • Run at high rate, to get smooth supply waveform
  • Need near-ideal switches and capacitors, which
    are hard to get in CMOS
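
The series-charge, parallel-discharge scheme above can be sketched with ideal components. The sketch assumes N equal capacitors that start each cycle fully discharged, lossless switches, and illustrative values for VCC and C; real CMOS switches and capacitors deviate from this, as the slide notes.

```python
def series_charge_phase(vcc, n, c):
    """Charge N equal series capacitors from VCC (ideal, from 0 V)."""
    v_each = vcc / n                 # equal caps share VCC equally
    q_from_supply = (c / n) * vcc    # series capacitance is C / N
    return v_each, q_from_supply

def parallel_discharge_phase(v_each, n, c):
    """Reconnect the N capacitors in parallel: output voltage and charge."""
    return v_each, n * c * v_each    # parallel capacitance is N * C

VCC, N, C = 3.3, 3, 1e-9             # assumed example values
v_each, q_in = series_charge_phase(VCC, N, C)
v_out, q_out = parallel_discharge_phase(v_each, N, C)
print(f"Vout = {v_out:.2f} V, charge gain = {q_out / q_in:.1f}x")
```

In this idealized model the output sits at VCC / N while the charge available to the load per cycle is N times the charge drawn from the supply.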

35
Summary
  • Switched capacitance reduction techniques
  • Reduces power
  • Also improves performance
  • Banked SRAM Organization
  • Divided word lines
  • Change decoder design (useful for DRAMs, too)
  • Voltage swing reduction techniques
  • Early word line cutoff
  • Reduced bit line swings
  • Self-timed RAM core