Retrospective on the VIRAM-1 Design Decisions

1
Retrospective on the VIRAM-1 Design Decisions
  • Christoforos E. Kozyrakis
  • kozyraki_at_cs.berkeley.edu
  • IRAM Retreat
  • January 9, 2001

2
What We Probably Got Right
  • Low power design approach
  • Use of a commercial MIPS core
  • Permutation instructions
  • Fixed-point arithmetic model
  • Single load-store unit
  • Dropping of the network interface
  • Testing infrastructure

3
Low Power Design Approach
  • Two design alternatives for VIRAM-1
  • 200 MHz, 2 W, 4 vector lanes
  • 500 MHz, 10 W (?), 4-8 vector lanes (?)
  • Low power was the right choice because
  • Low power is important for embedded and
    multimedia applications
  • It is easier to design a low power processor than
    a high frequency one
  • High power consumption would severely interfere
    with DRAM operation

4
Use of Commercial MIPS Core
  • Scalar core alternatives
  • Custom design optimized for a vector unit
  • Commercial core with generic coprocessor
    interface
  • The MIPS m5Kc core was a great choice because
  • It is a flexible, synthesizable design with a lot
    of documentation and support
  • It comes with an RTL simulation environment which
    we reused for VIRAM-1
  • It allowed us to work on a demo system based on a
    MIPS daughter-card and demo board

5
Other Issues We Got Right
  • Simple instructions for intra-register
    permutations
  • Allow the vectorization of reductions and FFT
  • Simple implementation compared to a general
    permutation
  • Single load-store unit
  • Not sufficient memory bandwidth for two units
  • Address calculation and translation resources are
    expensive
  • Not obviously useful for most media applications
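
The reduction case from the permutation bullets above can be sketched as follows. This is a minimal Python model, assuming a hypothetical `half_down` helper that extracts the upper half of the active elements of a vector register so it can be added to the lower half; the name and semantics are illustrative and are not taken from the VIRAM-1 instruction set. Each step pairs one such permutation with a vector add on the remaining elements, so a sum over n elements takes log2(n) steps.

```python
def half_down(reg, active):
    """Hypothetical intra-register permutation (not a real VIRAM-1 opcode):
    return the upper half of the first `active` elements so it can be
    lined up with the lower half."""
    half = active // 2
    return reg[half:active]


def vector_sum(reg):
    """Sum a power-of-two-length register with log2(n) permute+add steps."""
    n = len(reg)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    reg = list(reg)
    active = n
    while active > 1:
        half = active // 2
        upper = half_down(reg, active)
        # Element-wise add: models one vector add over `half` elements.
        reg[:half] = [a + b for a, b in zip(reg[:half], upper)]
        active = half
    return reg[0]
```

Without a permutation of this kind, the partial sums would have to be moved through memory or the scalar core between steps, which is what makes the reduction hard to vectorize.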

6
Other Issues We Got Right
  • Dropping of the network interface
  • Not necessary for embedded/multimedia systems
  • Would introduce significant design complexity
  • Testing infrastructure
  • Highly automated and easy to use for developing
    tests and verifying the complete VIRAM-1 design

7
What We Probably Got Wrong
  • Insufficient benchmarking at early project stages
  • Support for 64-bit data-types
  • Lack of sub-banks in DRAM macros
  • Dropping the decoupled pipeline
  • Use of a crossbar for memory transfers
  • Too much support for arithmetic exceptions
  • Too much support for conditional execution

8
Insufficient Benchmarking
  • Limited benchmarking was performed early enough
    to affect major design decisions
  • Previous experience and intuition used in several
    cases
  • Reasons for limited benchmarking
  • Lack of compiler
  • Lack of flexible performance model
  • Lack of manpower and time
  • Some of the following issues could probably have
    been avoided if we had done more benchmarking

9
Support for 64-bit Data Types
  • VIRAM-1 supports 64-bit integer operations
  • Excluding encryption, few multimedia applications
    require 64-bit operations
  • Benefits from not supporting 64-bit operations
  • Large area savings from datapaths and pipeline
    registers
  • Large wiring savings from reduced width of data
    busses
  • Fewer modes to support and verify

10
Lack of DRAM Sub-banks
  • The DRAM macro used has a single bank
  • No overlapping of accesses to different rows is
    allowed
  • Significant performance bottleneck for
    applications with strided or random accesses
  • 4 addresses per cycle for 8 banks with a 5-cycle
    random access time
  • Bank conflicts reduce random bandwidth even
    further
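
The mismatch in the numbers above can be made explicit with a little arithmetic. The sketch below assumes one reading of the bullet: 4 addresses are generated per cycle, there are 8 independent single-bank macros, and each random access keeps its bank busy for 5 cycles. Under those assumptions, and before any bank conflicts, the banks sustain at most 8/5 = 1.6 random accesses per cycle.

```python
def peak_random_accesses_per_cycle(banks, busy_cycles):
    """Upper bound on sustained random accesses per cycle when each
    access occupies one bank for `busy_cycles` cycles (no conflicts)."""
    return banks / busy_cycles


# Figures from the slide; how they combine is an assumption of this sketch.
ADDRESSES_PER_CYCLE = 4
BANKS = 8
RANDOM_ACCESS_CYCLES = 5

sustained = peak_random_accesses_per_cycle(BANKS, RANDOM_ACCESS_CYCLES)
utilization = sustained / ADDRESSES_PER_CYCLE
print(f"{sustained:.1f} accesses/cycle, "
      f"{utilization:.0%} of the generated addresses")
```

Sub-banks would raise the number of rows that can be open at once, and therefore this bound, without adding more macros.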

11
Other Issues We Got Wrong
  • Dropping the decoupled pipeline
  • The delayed pipeline was preferred to a
    decoupled one due to complexity and power
    advantages, despite the performance issues
  • Due to the length of the pipeline and the lack of
    sub-banks, it is not obvious that this was a wise
    decision
  • Use of a crossbar for memory transfers
  • The memory crossbar is the weakest design
    component in terms of scalability and flexibility
  • Alternative approaches (e.g. ring) were probably
    worth a closer examination before being rejected

12
Other Issues We Got Wrong
  • Too much support for arithmetic exceptions
  • VIRAM-1 includes extensive support for software
    speculation, user-level handlers, precise
    execution (slower) for arithmetic exceptions
  • Many of these features will never be used by the
    compiler, multimedia applications, or system
    software
  • Too much support for conditional execution
  • VIRAM-1 implements all possible alternatives for
    vector conditional execution (masked
    instructions, masked merger, scatter-gather,
    compress-expand)
  • Some of these are quite complex to implement and
    not obviously needed for multimedia codes
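
To illustrate why these alternatives overlap, the sketch below models two of them on plain Python lists: masked execution and compress-expand. It is an illustration only; the function names are hypothetical and do not correspond to VIRAM-1 instructions. Both paths produce the same result, but in this model the masked version visits every element while the compressed version operates only on the selected ones.

```python
def masked_apply(op, vec, mask):
    """Masked execution: apply `op` only where the mask is set,
    leaving the other elements unchanged."""
    return [op(x) if m else x for x, m in zip(vec, mask)]


def compress(vec, mask):
    """Pack the elements selected by the mask into a shorter vector."""
    return [x for x, m in zip(vec, mask) if m]


def expand(packed, mask, background):
    """Scatter packed elements back to the masked positions, taking
    unselected positions from `background`."""
    it = iter(packed)
    return [next(it) if m else b for m, b in zip(mask, background)]


def compress_expand_apply(op, vec, mask):
    """Same result as masked_apply, computed on the packed vector."""
    packed = [op(x) for x in compress(vec, mask)]
    return expand(packed, mask, vec)
```

For example, with `vec = [1, 2, 3, 4]`, `mask = [True, False, True, False]` and an operation that multiplies by 10, both functions return `[10, 2, 30, 4]`.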

13
What May Be Too Early To Call
  • Full-custom design of integer datapaths
  • Optimal area and power consumption but requires
    significant design time
  • Maybe we could use an ASIC approach based on
    tiling specialized macro-cells or library
    components
  • Use of two multipliers per vector lane
  • Most applications don't have such a high ratio
    of multiply or multiply-add operations
  • Consumes a significant amount of area