The ESW Paradigm

Manoj Franklin & Gurindar S. Sohi. 05/10/2002.

1
The ESW Paradigm
  • Manoj Franklin & Gurindar S. Sohi
  • 05/10/2002

2
Observations
  • Large exploitable ILP, theoretically
  • Close instructions are dependent; parallelism is
    possible further downstream
  • Centralized resources are bad
  • Minimizing communication cost is important

3
What about others?
  • Dataflow model
    • Most general
    • Unconventional PL paradigm
    • Communication cost can be high
  • SS, VLIW (sequential)
    • Temporal locality
    • Large centralized HW
    • Compiler too dumb
    • Not scalable
  • ESW = dataflow + sequential

4
Design Goals
  • Decentralized resources
  • Minimize wasted execution
  • Speculative memory address disambiguation
  • Realizability

Replace the large dynamic window with many small ones

5
How it works
  • Basic window
    • Single-entry, loop-free, call-free block
    • Equal to, a superset of, or a subset of a basic block
  • Execute basic windows in parallel
    • Multiple independent stages
    • Each complete with branch prediction, L1 cache,
      register file, etc.

6
Dist Inst Supply
Optimization: snooping on L2-L1 cache traffic

7
Dist Inter-Inst Comm
  • Architecture
    • Distributed future file
    • Create/use masks for dependency checks
  • Observation
    • Register use is mostly within a basic block
    • The rest is in subsequent blocks
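
The create/use mask check above can be illustrated with a minimal sketch. It assumes each basic window carries two bit masks over the register file, one for registers it creates (writes) and one for registers it uses from outside itself. The names `WindowMasks`, `mask_of`, and `registers_to_wait_for` are hypothetical and not taken from the slides; real hardware would do this with wired logic rather than a loop.

```python
from dataclasses import dataclass


def mask_of(regs):
    """Pack an iterable of register numbers into an integer bit mask."""
    mask = 0
    for r in regs:
        mask |= 1 << r
    return mask


@dataclass
class WindowMasks:
    create: int  # bit r set: the window writes (creates) register r
    use: int     # bit r set: the window reads register r from outside itself


def registers_to_wait_for(consumer, earlier_windows):
    """Mask of registers the consumer must receive from earlier, still-active windows."""
    pending = 0
    for w in earlier_windows:
        pending |= w.create  # anything an earlier window may still produce
    return consumer.use & pending


# Illustrative values only: w0 writes r1 and r2, w1 reads r2 and r5.
w0 = WindowMasks(create=mask_of([1, 2]), use=mask_of([4]))
w1 = WindowMasks(create=mask_of([3]), use=mask_of([2, 5]))
print(bin(registers_to_wait_for(w1, [w0])))  # only the bit for register 2 is set
```

A single AND of the use mask against the OR of earlier create masks is enough here because the register name space is small, which is exactly what the memory system on the next slide lacks.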

8
Dist DMem System
  • Problem
    • Address space is large, so create/use masks can't
      be used
    • Need to maintain consistency between multiple
      copies
  • Solution: ARB (Address Resolution Buffer)

9
ARB
  • Bits cleared upon commit
  • Restart stages when a dependency is violated
  • On a load, forward the value from the ARB if it
    already exists there
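
The three ARB behaviors listed above can be sketched in a toy model. This is a simplification under stated assumptions, not the design from the slides: stages are numbered in program order (0 is oldest), each address keeps a load bit, a store bit, and a value per stage, and the table is an unbounded dictionary, so the "ARB full" case is not modeled. The class and method names are hypothetical.

```python
class ARB:
    """Toy address resolution buffer; stage 0 is the oldest (assumption)."""

    def __init__(self, num_stages, memory):
        self.n = num_stages
        self.memory = memory   # dict: address -> committed value
        self.entries = {}      # address -> per-stage [load_bit, store_bit, value]

    def _entry(self, addr):
        fresh = [[False, False, None] for _ in range(self.n)]
        return self.entries.setdefault(addr, fresh)

    def load(self, stage, addr):
        """Forward from this stage or the nearest earlier stage that stored."""
        entry = self._entry(addr)
        for s in range(stage, -1, -1):
            if entry[s][1]:
                if s != stage:
                    entry[stage][0] = True  # value came from outside this stage
                return entry[s][2]
        entry[stage][0] = True              # speculative load from memory
        return self.memory.get(addr, 0)

    def store(self, stage, addr, value):
        """Record the store; return the first later stage to restart, or None."""
        entry = self._entry(addr)
        entry[stage][1] = True
        entry[stage][2] = value
        for s in range(stage + 1, self.n):
            if entry[s][0]:
                return s      # a later stage loaded too early: violation
            if entry[s][1]:
                return None   # a later store shields everything after it
        return None

    def squash_from(self, stage):
        """Clear the bits of restarted stages."""
        for entry in self.entries.values():
            for s in range(stage, self.n):
                entry[s] = [False, False, None]

    def commit(self, stage):
        """Write the stage's stores to memory and clear its bits."""
        for addr in list(self.entries):
            load_bit, store_bit, value = self.entries[addr][stage]
            if store_bit:
                self.memory[addr] = value
            self.entries[addr][stage] = [False, False, None]
            if not any(l or s for l, s, _ in self.entries[addr]):
                del self.entries[addr]
```

In this sketch the caller is expected to commit stages oldest first and to call `squash_from` with the stage number that `store` returns before re-executing those stages.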

Q. What happens when the ARB is full?

10
Simulation Environment
  • Custom simulator using MIPS R2000 pipeline
  • Up to 2 instructions fetched/decoded/issued per IE
  • Up to 32 instructions per basic window
  • 4K word L1 cache, 64KB L2 DM cache (100% hit
    rate, what??)
  • 3-bit counter branch prediction
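
The slides do not say how the 3-bit counters are organized, so the following is only a generic sketch of a saturating-counter predictor. The table size, the indexing by `pc` modulo table size, the initial counter value, and the taken threshold of 4 are all assumptions, and `CounterPredictor` is a hypothetical name.

```python
class CounterPredictor:
    """Table of saturating counters, indexed by pc modulo table size (assumption)."""

    def __init__(self, bits=3, table_size=1024):
        self.max_value = (1 << bits) - 1    # 7 for a 3-bit counter
        self.threshold = 1 << (bits - 1)    # predict taken at 4 or above (assumption)
        self.table = [self.threshold] * table_size  # start weakly taken (assumption)

    def _index(self, pc):
        return pc % len(self.table)

    def predict(self, pc):
        """Return True if the branch at pc is predicted taken."""
        return self.table[self._index(pc)] >= self.threshold

    def update(self, pc, taken):
        """Move the counter toward the actual outcome, saturating at 0 and max."""
        i = self._index(pc)
        if taken:
            self.table[i] = min(self.max_value, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)
```

With 3 bits, a branch that has saturated at 7 needs four consecutive not-taken outcomes before the prediction flips, which gives more hysteresis than the common 2-bit scheme.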

11
Results
  • Optimizations
    • Moving up instructions
    • Expanding the basic window (in eqntott and espresso)

Basic window < basic block
But is a 100% cache hit rate reasonable?

12
Discussion
  • Compare this to CMP? RAW?
  • Does the trade-off strike a balance?

13
New Results (1)
In-order execution

14
New Results (2)
Out-of-order execution