Architecture Exploration For Ambient Energy Harvesting Nonvolatile Processors

Author: Gautam Das Govardan

Transcript and Presenter's Notes


1
Architecture Exploration For Ambient Energy
Harvesting Nonvolatile Processors
2
Introduction
  • A future powered by technology that harvests ambient
    energy sources
  • Battery-free systems
  • Ambient energy sources:
  • Solar energy
  • Wi-Fi and radio frequency (RF) energy
  • Motion energy (piezoelectric devices)
  • E.g., a wirelessly powered smart contact lens

3
Application Categories
  • Applications vary in complexity, throughput
    constraints and computational demands.
  • Based on their demand for nonvolatility, applications are
    categorized into:
  • Signal detection and sensing: detection and
    relaying. E.g., UV radiation, blood pressure, blood
    sugar level, temperature
  • Signal detection and analysis: computation is
    carried out to analyze the signal for
    diagnosis. E.g., wearable EEG/ECG
  • Signal prediction: predicts a future pattern. E.g.,
    wearable systems that warn against seizures
  • Ambient energy sources are unreliable. Category 1
    is easier to implement
  • Categories 2 and 3 require QoS (work must be completed
    within a fixed time)

4
Energy Harvesting System Structure
  • Energy Harvesting and Management
  • Determines the entire power used for signal
    sensing, processing and transmission
  • Digital Signal Processor: more about it
    later
  • I/O interface and analog RF frontend
  • Digital interfaces, antennas, etc.

5
Processor Design: Volatile vs. Nonvolatile
  • Volatile processor with
    periodic checkpointing
  • Forced rollback to the previously
    checkpointed state
  • NV processor enables more
    complex state-dependent
    signal processing that tolerates
    power source insufficiency and
    unreliability
  • NV processor consumes more
    power for reads and writes

6
Architectural Exploration
  • Parameters to be analyzed
  • Number of pipeline stages
  • Data to be backed up
  • Frequency of backup
  • Assumptions
  • MIPS ISA
  • Clock frequency: 8 kHz, limited by the strength of the
    Wi-Fi signal used
  • Instruction memory (ROM) and ICache (SRAM, NVM)
  • Data memory (nonvolatile) and DCache (NV
    write-back)

7
Non-pipelined Configuration (NP)
  • Entire state of the processor can be
    characterized by a single instruction state
  • Program Counter (PC): identifies the instruction being
    executed and needs to be stored
  • Register File (RegFile): a volatile RegFile is
    energy efficient because it is used frequently, with a
    large number of reads and writes
  • Tradeoff between energy consumed in backing up
    and recovering data and the overall performance
  • Which data to save? When to save? 3 policies
  • Backup Every Cycle (BEC)
  • On Demand All Backup (ODAB)
  • On Demand Selective Backup (ODSB)

8
NV Backup Every Cycle (BEC)
  • Employs an NVM RegFile in spite of the significant energy
    penalty; otherwise both a volatile and a nonvolatile copy
    would need to be updated every cycle
  • PC and few registers in RegFile written every
    cycle
  • Instructions like StoreWord and Jump do not
    require RegFile write
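
The BEC write pattern above can be sketched as a count of NVM writes over an instruction trace. This is a minimal illustration, not the authors' simulator: it assumes one instruction per cycle, one PC write per cycle, and at most one RegFile write per instruction. The function name and the mnemonic set are hypothetical; only StoreWord and Jump are taken from the slide.

```python
# Mnemonics that do not write the RegFile (from the slide: StoreWord and Jump).
# The mnemonic spellings are an assumption for illustration.
NO_REGFILE_WRITE = {"sw", "j"}


def bec_nvm_writes(trace):
    """Return (pc_writes, regfile_writes) for a list of instruction mnemonics.

    Assumes one instruction per cycle: the PC is written to NVM every cycle,
    and the NVM RegFile is written once unless the instruction is a store or jump.
    """
    pc_writes = len(trace)
    regfile_writes = sum(1 for op in trace if op not in NO_REGFILE_WRITE)
    return pc_writes, regfile_writes
```

For example, a four-instruction trace with one store and one jump writes the PC four times but the RegFile only twice.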

9
NV On Demand All Backup (ODAB)
  • All RegFile entries to be backed up in the event
    of reduced power state
  • If input power < preset threshold, the power warning
    signal is activated
  • Control unit backs up PC and resets atomic flag
  • Upon power restore, energy is accumulated in the
    capacitor
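
As a rough sketch of the ODAB flow described on this slide, the toy model below raises a warning when input power falls below a threshold and then copies the PC and every RegFile entry to nonvolatile storage. The class, its field names, the 32-entry RegFile and the microwatt unit are assumptions made for illustration; the atomic flag and capacitor charging are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class OdabCore:
    """Toy non-pipelined core with On Demand All Backup (illustrative only)."""
    threshold_uw: float                                   # hypothetical threshold
    pc: int = 0
    regfile: list = field(default_factory=lambda: [0] * 32)     # volatile
    nv_pc: int = 0                                        # nonvolatile copy of PC
    nv_regfile: list = field(default_factory=lambda: [0] * 32)  # nonvolatile copy

    def power_warning(self, input_power_uw: float) -> bool:
        """True when input power is below the preset threshold."""
        return input_power_uw < self.threshold_uw

    def backup(self) -> int:
        """Copy the PC and all registers to NVM; return words written."""
        self.nv_pc = self.pc
        self.nv_regfile = list(self.regfile)
        return 1 + len(self.regfile)

    def restore(self) -> None:
        """Reload volatile state from the nonvolatile copies."""
        self.pc = self.nv_pc
        self.regfile = list(self.nv_regfile)
```

Note that `backup()` always writes every register, which is the cost that the selective policy on the next slide tries to avoid.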

10
NV On Demand Selective Backup (ODSB)
  • Synchronous power warning signal ensures that the
    instruction at the current PC finishes executing and
    writing back. PC + 4 is stored to avoid re-execution
  • A change flag identifies whether a register has been
    written
  • Control unit does not generate addresses for
    unchanged data
  • Reduces backup time and energy penalty
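
The change-flag idea can be sketched as a register file that tracks which entries were written since the last backup and copies only those. This is a hedged illustration: the class name, the 32-entry size and the method names are hypothetical, and `resume_pc` assumes 4-byte MIPS instructions so that the stored address is that of the next instruction.

```python
class OdsbRegFile:
    """Register file with per-register change flags (illustrative sketch)."""

    def __init__(self, n=32):
        self.regs = [0] * n          # volatile registers
        self.changed = [False] * n   # change flag per register
        self.nv = [0] * n            # nonvolatile backup copy

    def write(self, idx, value):
        self.regs[idx] = value
        self.changed[idx] = True     # mark as written since last backup

    def backup(self):
        """Back up only changed registers; return the number of NVM writes."""
        written = 0
        for i, dirty in enumerate(self.changed):
            if dirty:
                self.nv[i] = self.regs[i]
                self.changed[i] = False
                written += 1
        return written

    def restore(self):
        self.regs = list(self.nv)


def resume_pc(current_pc):
    """Address stored at backup so the finished instruction is not re-executed."""
    return current_pc + 4
```

A second backup with no intervening writes performs zero NVM writes, which is where the time and energy savings come from.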

11
Simulation Results And Comparison
  • Total area is similar across policies, as the NVM cache
    and backup blocks are much bigger than the logic
  • BEC has the lowest peak frequency due to frequent
    backups
  • Recovery time: time from activation of the Energy OK
    signal to the time all backup operations are
    complete
  • ODSB backup time < ODAB backup time

12
Simulation Results And Comparison
  • ODSB is more energy efficient with stable source
    like solar
  • ODSB can reduce the backup energy penalty by 69% with
    0.002% area overhead
  • BEC does not need time to accumulate energy in the
    capacitor; viable when power failure is extremely
    frequent (less than 1 in 10 cycles)

13
N-stage-pipeline
  • Increased circuit complexity and activity factor
    results in higher power threshold compared to
    non-pipelined processor
  • 5 Stage Pipeline (5SP) under study
  • Two backup schemes proposed
  • Shifted PC and Volatile Flip-flops (SPC/VFF)
  • Nonvolatile Flip-flops Solution (NVFF)

14
Shifted PC and Volatile FF (SPC/VFF)
  • Pipelined data flow with bypass and forward,
    complex control flow to handle hazard
  • Shifter buffer stores the PC value in each
    pipeline stage
  • When power goes down, the instruction in the write-back
    stage is allowed to finish; the unfinished PC to be
    backed up is the one in the data memory stage
  • A shifter is used instead of rolling back the PC, since
    a different PC needs to be backed up after a jump or
    branch
  • An extra 4 clock cycles are needed after recovery to
    re-execute the last 4 instructions lost from the latter
    pipeline stages
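
The shifter buffer can be pictured as a five-slot shift register that carries each instruction's PC along with it through the pipeline. The sketch below is an assumption-laden model (stage names, class and method names are hypothetical, and hazards and stalls are ignored); it shows why simply subtracting from the fetch PC fails once a jump is in flight.

```python
from collections import deque

STAGES = ("IF", "ID", "EX", "MEM", "WB")  # assumed classic 5-stage ordering


class PcShifter:
    """Shift register holding the PC of the instruction in each stage."""

    def __init__(self):
        self.slots = deque([None] * len(STAGES), maxlen=len(STAGES))

    def clock(self, fetched_pc):
        # New PC enters IF; with maxlen set, the oldest (WB) entry drops out.
        self.slots.appendleft(fetched_pc)

    def pc_in(self, stage):
        return self.slots[STAGES.index(stage)]

    def backup_pc(self):
        # WB is allowed to finish, so the first unfinished PC is in MEM.
        return self.pc_in("MEM")


# Fetch sequence with a jump from address 8 to address 40.
shifter = PcShifter()
for pc in (0, 4, 8, 40, 44, 48):
    shifter.clock(pc)

saved = shifter.backup_pc()                # PC of the instruction in MEM
naive = shifter.pc_in("IF") - 3 * 4        # rollback by arithmetic, ignoring the jump
```

Here `saved` is 8 while the arithmetic rollback gives 36, an address that was never fetched.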

15
Nonvolatile FF Solution (NVFF)
  • This solution uses NVM flip-flops
  • SPC/VFF requires 11% less time and 57% less
    energy than NVFF

16
Out-of-order Processor (OoO)
  • More complex than NP and 5SP
  • System state is broadly distributed across
    structures such as PC, ROB, RegFile, Map Table,
    Issue Queue, Load Store Queue, BHT and BTB
  • Larger power requirement means fewer periods where
    the input power exceeds the minimum threshold
  • Which structures need to be backed up?

17
Resource Selection Strategies
  • The resource selection strategies proposed are
  • Minimum State Resource backup solution (MinR)
  • Low-latency Backup solution (LLB)
  • Middle-level Backup solution (MLB)
  • Min-state-lost Backup solution (MPL)
  • Integrated Flexible Atomic Backup Solution (IFA)

18
Resource Selection Strategies
  • Minimum State Resource backup solution (MinR)
  • Backs up min number of bits required to preserve
    functionality
  • Depends on branch misprediction mechanism to
    minimize the number of valid/ relevant state bits
    prior to backup.
  • ROB and PC: backs up the first uncommitted PC at
    the head of the ROB
  • ARegFile is backed up as it is small
  • Map Table: pseudo-misprediction is used to
    restore the Map Table
  • PRegFile, Ready Table, Free List, BHT, BTB can be
    recovered

19
Resource Selection Strategies
  • Low-latency Backup solution (LLB): aims to
    minimize the number of bits to store if backup
    begins immediately
  • Backs up the entire ROB, IQ, ARegFile, Map Table
    and PRegFile
  • Middle-level Backup solution (MLB): backs up the
    Ready Table and Free List as well
  • Min-state-lost Backup solution (MPL): all
    structures, including BHT and BTB, are backed up
  • Integrated Flexible Atomic Backup Solution (IFA):
    even if the power is below threshold, it can
    allow an optional state (BHT) to be stored,
    subject to an optimistic attempt
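
One way to compare these strategies is to list which structures each one backs up and total the bits. The structure sets below follow the slides literally; every size in `SIZES_BITS` is a made-up placeholder, not a figure from the paper, and IFA is left out because its mandatory set is not spelled out here.

```python
# Hypothetical structure sizes in bits (placeholders for illustration only).
SIZES_BITS = {
    "PC": 32, "ARegFile": 1024, "ROB": 4096, "IQ": 1024, "MapTable": 256,
    "PRegFile": 2048, "ReadyTable": 64, "FreeList": 384, "BHT": 2048, "BTB": 8192,
}

# Structures backed up by each strategy, as listed on the slides.
MINR = {"PC", "ARegFile"}
LLB = {"ROB", "IQ", "ARegFile", "MapTable", "PRegFile"}
MLB = LLB | {"ReadyTable", "FreeList"}
MPL = MLB | {"BHT", "BTB"}

STRATEGIES = {"MinR": MINR, "LLB": LLB, "MLB": MLB, "MPL": MPL}


def backup_bits(strategy):
    """Total bits written to NVM at backup for the named strategy."""
    return sum(SIZES_BITS[name] for name in STRATEGIES[strategy])


totals = {name: backup_bits(name) for name in STRATEGIES}
```

With any plausible sizes the ordering MinR < LLB < MLB < MPL holds, since each of the last three sets contains the previous one and MinR holds only the small structures.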

20
OoO Strategies Comparison
  • In MinR, the pseudo-misprediction operation for the Map
    Table requires extra backup clock cycles. During
    recovery, extra clock cycles are needed to restore
    the PRegFile, Ready Table and Free List

21
OoO Strategies Comparison
  • LLB: the ROB and PRegFile are large, which increases
    backup time and energy. Recovery energy is smaller as
    instructions in the ROB are backed up (no
    re-execution)
  • MPL incurs largest backup and recovery penalties,
    but backing up all structures incurs min latency
    to return to peak performance after a power
    failure
  • OoO needs higher threshold, but periods of
    sufficient power are common enough to allow
    superior performance to pay for lost clock cycles

22
Simulation Results
  • The configurations are compared with baseline
    non-pipelined volatile processor without
    checkpointing or data backup
  • The volatile processor's progress returns to zero
    when power drops below the threshold
  • Nonvolatile NP and 5SP have a higher power
    threshold
  • OoO runs for only a small fraction of the time, but
    its performance can be up to 4x faster than NP and
    5SP
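
The contrast between a volatile baseline and a nonvolatile processor can be sketched with a per-cycle power trace. This is a simplified model under stated assumptions: one unit of progress per powered cycle, backup cost ignored, and a hypothetical `restore_cycles` penalty after each outage. It is not the simulator used in the paper.

```python
def forward_progress(power_ok, nonvolatile, restore_cycles=0):
    """Return progress (in cycles of useful work) at the end of the trace.

    power_ok: iterable of booleans, True when input power is above threshold.
    nonvolatile: if False, progress is lost whenever power drops.
    restore_cycles: assumed cycles spent restoring state after an outage.
    """
    progress = 0
    stall = 0
    prev_ok = True
    for ok in power_ok:
        if not ok:
            if not nonvolatile:
                progress = 0          # volatile state is lost, start over
        else:
            if nonvolatile and not prev_ok:
                stall = restore_cycles  # pay the recovery penalty first
            if stall > 0:
                stall -= 1
            else:
                progress += 1
        prev_ok = ok
    return progress


trace = [True] * 5 + [False] * 2 + [True] * 3
```

On this trace the volatile processor ends with 3 cycles of progress, while the nonvolatile one ends with 8 (or 7 with a one-cycle restore).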

23
Validation
  • The non-pipelined On Demand strategy was explored
    using an actual fabricated processor (THU1010N)
  • It has an Intel 8051-like CISC architecture
  • The saved state includes the state machine that
    captures the current instruction
  • PC and RegFiles are FeRAM-based FFs. The FFs have
    an additional backup FeCap
  • NV processor based system interfaced to a solar
    panel and UV sensor

24
Operation
  • Upon power failure detection, NV control logic
    backs up DFFs to FeCaps
  • When power resumes, data is restored from FeCaps
    to DFFs
  • An internal RC oscillator is used. An external oscillator
    becomes unstable at low power
  • Simulator calibration
  • Several kernels executed both on the platform and on
    the simulator
  • Intermittent power supply modeled by a 1 kHz
    square waveform
  • Processor frequency: 3 MHz
  • Each kernel is executed 1000 times to obtain the
    completion time
  • Stable power case: no mismatch. Unstable power
    case: mismatch < 5%
  • Simulator averages the energy consumed per instruction
    to estimate remaining energy
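
Two small calculations follow from the calibration setup above. The first counts clock cycles available in each powered phase of the square wave; the 50% duty cycle is an assumption, since the slide gives only the 1 kHz and 3 MHz figures. The second is one plausible reading of the energy estimate, using a single average energy per instruction; the function names and nanojoule units are hypothetical.

```python
def cycles_per_on_phase(clock_hz, supply_hz, duty=0.5):
    """Clock cycles in one powered phase of a square-wave supply.

    duty is the assumed fraction of each supply period with power available.
    """
    return int(clock_hz / supply_hz * duty)


def average_energy(per_instruction_nj):
    """Mean energy per instruction from a list of measured values."""
    return sum(per_instruction_nj) / len(per_instruction_nj)


def remaining_energy(stored_nj, instructions_executed, avg_energy_nj):
    """Estimated energy left after executing a number of instructions."""
    return stored_nj - instructions_executed * avg_energy_nj


on_cycles = cycles_per_on_phase(3_000_000, 1_000)
```

With a 3 MHz clock and a 1 kHz supply at 50% duty, each powered phase offers 1500 cycles before the next outage.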

25
Dependence On Input Power
  • Input signal characteristics play a major role
    in determining the optimal design.
  • Performance of backup schemes with home and
    office Wi-Fi sources for harvesting
  • In home, NP ODSB architecture is best performing,
    in office OoO MPL is most desirable

26
Dependence On Nature Of Input Source
  • Input energy sources differ in magnitude
  • For each case, the best performing backup policy
    is adopted
  • For same input power source, the actual execution
    time for NP and 5SP is almost same
  • Higher power threshold in OoO results in longer
    Off time

27
Meeting QoS Requirements
  • Some applications (like ECG) require periodic
    outputs within fixed time periods (QoS
    constraints)
  • Ambient energy is unreliable
  • Piezo and solar can provide almost 100% QoS
  • QoS can be improved by:
  • Shrinking size and using FinFETs
  • Power reduction techniques: dark-silicon-aware
    architecture, clock gating, DVFS, DATS, Tunnel
    FET, low-power sub-threshold circuits
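
One simple way to quantify QoS for periodic outputs is the fraction of periods whose output was produced within the deadline. This metric definition is an assumption for illustration (the slides do not define QoS numerically); `None` stands for a period in which no output was produced at all.

```python
def qos(completion_times, deadline):
    """Fraction of periodic outputs completed within the deadline.

    completion_times: per-period completion time, or None if never completed.
    """
    if not completion_times:
        raise ValueError("need at least one period")
    met = sum(1 for t in completion_times if t is not None and t <= deadline)
    return met / len(completion_times)
```

For instance, if two of four periods finish within a 2.0 time-unit deadline, the function reports a QoS of 0.5.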

28
Conclusion
  • Explored various factors affecting battery-less systems
    powered by ambient energy
  • Intermittent energy source: different nonvolatile
    processor configurations and techniques to conserve
    state while maximizing forward progress
  • Examined tradeoffs between performance and energy
    for different architecture
  • Compared and validated simulation results with
    nonvolatile solar energy harvesting processor
    platform
  • The video of HPCA 2015 Best Paper Competition Demo

29
References
  • Kaisheng Ma, Yang Zheng, Shuangchen Li, Karthik
    Swaminathan, Xueqing Li, Yongpan Liu, Jack
    Sampson, Yuan Xie, Vijaykrishnan Narayanan.
    "Architecture Exploration for Ambient Energy
    Harvesting Nonvolatile Processors." In The
    International Symposium on High-Performance
    Computer Architecture (HPCA-21), 2015.
  • A. Parks, A. Sample, Z. Yi, and J. Smith. "A
    wireless sensing platform utilizing ambient RF
    energy." In IEEE Radio and Wireless Symposium
    (RWS), 2013.
  • S. Kannan, A. Gavrilovska, K. Schwan, and D.
    Milojicic. "Optimizing checkpoints using NVM as
    virtual memory." In IPDPS, 2013.
  • X. Dong, C. Xu, Y. Xie, and N. Jouppi. "NVSim: A
    circuit-level performance, energy, and area model
    for emerging nonvolatile memory." IEEE
    Transactions on Computer-Aided Design of
    Integrated Circuits and Systems, 31(7):994-1007,
    2012.

30
Questions?
31
Thank You