A First-step Towards an Architecture Tuning Methodology for Low Power

1
A First-step Towards an Architecture Tuning
Methodology for Low Power
  • Greg Stitt, Frank Vahid, Tony Givargis
  • Dept. of Computer Science & Engineering
  • University of California, Riverside
  • also with the Center for Embedded Computer
    Systems, UC Irvine

Roman Lysecky, Department of IP Management,
Conexant, Newport Beach
This work was supported by the National Science
Foundation under grants CCR-9811164 and
CCR-9876006, and by a Design Automation
Conference graduate scholarship.
2
Introduction: advent of cores
  • In the past, board-level embedded systems were
    built using discrete ICs
  • Today, single-IC systems are increasingly being
    built, using IPs (Intellectual Property)
  • A.k.a. cores
  • Hard core: layout
  • Firm core: structure (HDL)
  • Soft core: synthesizable behavior (HDL)
  • System-on-a-chip (SOC)

3
Introduction: embedded systems
  • SOCs implementing an embedded system have a
    unique feature
  • Each implements one particular application
  • Thus, the processor may execute a single fixed
    program that never changes
  • Unlike desktop systems, which execute a variety
    of programs
  • Examples: digital camera, automobile
    cruise-controller
  • We can exploit this fixed-program feature
  • For example, by using mask-programmed ROM
  • But much more can be done

4
Introduction: architecture tuning
  • Architecture tuning
  • A way to exploit the fixed-program feature of
    embedded systems
  • First, do architecture design for the particular
    application
  • Then, tune the core-based system architecture
    to the particular application program, before IC
    fabrication
  • Goals: better performance, power, size

5
Introduction: architecture tuning
  • Examples of tuning optimizations
  • Memory hierarchy: no cache, L1 cache, L1+L2 cache
  • Cache organization: size, associativity, line
    size
  • Bus structure, data/address encoding
  • Microprocessor optimizations
  • Internal small-loop table
  • Controller partitioning
  • Datapath shortcuts
  • Register file copies

6
Introduction: tuning is a special case of Y-Chart
iteration
  • Philips/TriMedia approach of simultaneously
    developing architecture and its applications

[Figure: Y-chart relating Architecture, Applications, Mapping, Analysis, and Numbers]
7
Problem description
  • Focus of this work
  • Tuning a microcontroller to its program
  • Goal is reduced power without performance loss
  • Restrict tuning to maintain exact instruction set
    compatibility
  • No instructions may be added or deleted
  • Thus, no modification to software development
    environment
  • Also, no problems with porting software to/from
    other versions of the microcontroller
  • Instruction set incompatibility can be a show
    stopper

8
Previous work
  • Application-specific instruction-set processors
    [Fisher99]
  • Customize a microprocessor to its application(s)
  • e.g., Tensilica
  • Customized instruction-set, requiring customized
    tools
  • Tuning compiler to architecture [Tiwari et al 94]
  • Architectural description languages to inform
    compiler of architecture features [Halambi et al
    99]
  • Tuning cache and cache/bus organization to
    application [Givargis et al 99]

9
Tuning environment
  • Currently for the 8051 microcontroller
  • Starts from VHDL synthesizable model of 8051
    (soft core)
  • Uses Synopsys synthesis, simulation and power
    analysis
  • Uses 8051 instruction-set simulator
  • Uses numerous scripts
  • Goal of the environment
  • Understand how power is being consumed for a
    particular application, so that modifications to
    the architecture (or application) can be made to
    minimize that power
  • Three main tools
  • Architectural view
  • Instruction-set view
  • Program/data memory view

10
Tuning environment: architectural view tool
11
Tuning environment: instruction-set view tool
Instruction  Power (mW)
ADDC_1       7.340834
ADD_1        7.350741
ANL_1        6.631394
CLR_1        3.76228
CPL_1        5.481627
DA           5.28897
DEC_1        5.368807
DIV          7.716592
INC_1        4.662862
MOVC_1       6.078014
MOVC_2       5.021021
MOV_1        5.577664
MOV_2        6.164267
MUL          5.522886
NOP          4.900275
ORL_1        6.954121
POP          8.103867
PUSH         8.7116
12
Tuning environment: program/data memory view tool
Addr   Ins    Freq  Pwr      Freq*Pwr
00000  LJMP   1     0        0
00003  MOV_9  108   5.46067  589.752
00005  MOV_9  108   5.46067  589.752
00007  MOV_9  108   5.46067  589.752
00009  MOV_9  108   5.46067  589.752
00011  RET    108   0        0
00012  MOV_9  27    5.46067  147.438
00014  MOV_9  27    5.46067  147.438
00016  MOV_9  27    5.46067  147.438
00018  MOV_9  27    5.46067  147.438
00020  MOV_4  27    4.83507  130.547
00022  LCALL  27    0        0

Addr   Purpose  Accesses
00128  P0       1311
00129  SP       70317
00130  DPL      31189
00131  DPH      7977
00144  P1       161
00208  PSW      413527
00224  ACC      360949
00240  B        2598
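The last column of the program memory listing appears to be execution frequency multiplied by per-instruction power (for example, 108 × 5.46067 ≈ 589.752). A minimal Python sketch of that weighting follows, using a few rows copied from the listing. The names `PROGRAM_VIEW` and `weighted_power` are illustrative assumptions; the environment's actual scripts are not shown in the slides.

```python
# Rows copied from the program memory view:
# (address, instruction, execution frequency, power in mW).
# All names here are hypothetical, chosen only for this sketch.
PROGRAM_VIEW = [
    (0, "LJMP", 1, 0.0),
    (3, "MOV_9", 108, 5.46067),
    (11, "RET", 108, 0.0),
    (12, "MOV_9", 27, 5.46067),
    (20, "MOV_4", 27, 4.83507),
    (22, "LCALL", 27, 0.0),
]


def weighted_power(rows):
    """Return (address, instruction, freq * power), largest weight first."""
    weighted = [(addr, ins, freq * pwr) for addr, ins, freq, pwr in rows]
    return sorted(weighted, key=lambda row: row[2], reverse=True)


hot_spots = weighted_power(PROGRAM_VIEW)
for addr, ins, freq_pwr in hot_spots[:3]:
    print(f"{addr:05d} {ins:<6} {freq_pwr:.3f}")
```

Sorting by the weighted value puts frequently executed, power-hungry instructions first, which is one way to pick candidates for tuning.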
13
Tuning environment
14
Design flow using the tuning environment
15
Sample tuning optimization
  • Observation
  • RAM consumes much power
  • Address 224 accessed frequently
  • Possible tuning optimization
  • Replace this RAM location by a register inside
    the CTRL module
  • Steps
  • Modify VHDL model
  • Run all three view tools
  • Results
  • Power reduction: 7.67 to 7.27 mW

Addr   Purpose  Accesses
00128  P0       1311
00129  SP       70317
00130  DPL      31189
00131  DPH      7977
00144  P1       161
00208  PSW      413527
00224  ACC      360949
00240  B        2598
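The observation on this slide can be reproduced from the data memory view by ranking addresses by access count, and the reported result can be restated as a relative saving. The Python sketch below does both; `DATA_VIEW`, `most_accessed`, and `relative_saving` are names invented for illustration and are not part of the tuning environment.

```python
# Access counts copied from the data memory view: address -> (purpose, accesses).
# Function and variable names are hypothetical, for illustration only.
DATA_VIEW = {
    128: ("P0", 1311),
    129: ("SP", 70317),
    130: ("DPL", 31189),
    131: ("DPH", 7977),
    144: ("P1", 161),
    208: ("PSW", 413527),
    224: ("ACC", 360949),
    240: ("B", 2598),
}


def most_accessed(view, n=2):
    """Return the n (address, purpose) pairs with the highest access counts."""
    ranked = sorted(view.items(), key=lambda item: item[1][1], reverse=True)
    return [(addr, purpose) for addr, (purpose, _count) in ranked[:n]]


def relative_saving(before_mw, after_mw):
    """Fraction of power saved when going from before_mw to after_mw."""
    return (before_mw - after_mw) / before_mw


top_two = most_accessed(DATA_VIEW)
saving = relative_saving(7.67, 7.27)
print(top_two)
print(f"{saving:.1%}")
```

On this data the two most-accessed locations are PSW (208) and ACC (224), and moving from 7.67 mW to 7.27 mW corresponds to a reduction of roughly 5%.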
16
Some recent data
  • Applied the tuning environment for a particular
    application
  • Converted two frequently-accessed RAM locations
    to registers
  • 15% total power savings
  • Introduced datapath shortcuts for the two most
    common register-to-register moves of the
    application, thus bypassing the ALU
  • 10% total power savings
  • Partitioned the controller into two, one small
    one implementing the frequently-executed
    instructions
  • 10-15% power savings, but we expect much more if
    we do a better job partitioning the design

17
Conclusions
  • Described an environment for tuning a
    microprocessor to its application for low power
  • Full instruction set compatibility
  • Multiple views help find power hogs
  • Fully automated
  • Focus is now on developing tuning optimizations
  • Controller partitioning, small-loop table,
    datapath shortcuts, register-file copies, etc.
  • Investigate possibility of automating tuning
    optimizations, develop more general tuning
    methodology
  • Environment for the 8051 is available on the web
  • http://www.cs.ucr.edu/dalton