1
CHP and Verilog Modeling of Asynchronous Processes
  • Peter A. Beerel
  • University of Southern California

2
Outline
  • Weak Condition Half Buffer Template
  • CHP and Verilog
  • Precharge Half and Full Buffer Template
  • CHP and Verilog
  • Verilog Modeling Issues
  • Conditional Reads / Writes
  • Performance (Forward Backward Latency)
  • Modeling larger blocks
  • Top-down design in Verilog

3
Four-phase Protocol
  • Two active communication phases
  • Two return-to-zero phases

4
Asynchronous Functional Blocks
[Figure: function block F with input channels (L) and output channels (R)]
  • Functionality
  • Read a subset of input channels
    (wait for the input channels to hold data tokens)
  • Compute F and write to a subset of output channels
    (wait for the output channels to be reset)
  • Acknowledge the input channels
  • Reset the output channels upon being acknowledged

5
Fine Grain Pipelining
[Figure: function block F with input channels (L) and output channels (R)]
  • Basic Idea
  • Each stage very small
  • Fast latency
  • Two gate delays forward latency
  • One dynamic gate plus the inverter
  • Fast cycle time
  • 10-18 gate delays depending on template
  • Achieves 700 in 0.25 micron technology

6
Weak Condition Half Buffer (Lines)
  • Weak Condition
  • Validity of outputs implies validity of inputs
  • Inputs consumed -> can assert acknowledgement
  • Reset of outputs implies reset of inputs
  • Inputs reset -> can de-assert acknowledgement

[Figure: WCHB 1-of-N buffer (bank of C-latches between L and R, with enables Le and Re) and optimized WCHB 1-of-2 buffer (C-elements producing R0 and R1 from L0 and L1)]

7
CHP Behavioral Description of WCHB
  • *[ [¬Ra ∧ L]; R↑; La↑; [Ra ∧ ¬L]; R↓; La↓ ]
  • Interpretation: repeat indefinitely
  • [¬Ra ∧ L]: wait for right channel to be reset
    and new token to arrive on left channel
  • R↑: send new token on right channel
  • La↑: acknowledge left-hand side
  • [Ra ∧ ¬L]: wait for right-hand side to
    acknowledge and left-hand side to reset
  • R↓: reset right-hand-side data
  • La↓: lower left acknowledge
  • Initial Condition
  • Ra = La = 0; L and R reset

8
Verilog Pseudocode for WCHB
  • Initial
  • La = 0; R = space
  • Always
  • Begin
  • Fork
  • Wait negedge(Ra)
  • Wait posedge(L0) or posedge(L1) /* assuming
    dual rail */
  • Join
  • R <- Func(L)
  • La = 1
  • Fork
  • Wait posedge(Ra)
  • Wait negedge(L0) or negedge(L1)
  • Join
  • Reset R
  • La = 0
  • End

9
Verilog Pseudocode for WCHB
  • Alternative: wait on conditions rather than events
  • Wait tests and stalls until condition is
    satisfied
  • Need not worry about missing events
  • Always
  • Begin
  • Wait (Ra == 0)
  • Wait (L0 == 1 or L1 == 1) /* assuming dual rail
    */
  • R <- Func(L)
  • La = 1
  • Wait (Ra == 1)
  • Wait (L0 == 0 and L1 == 0)
  • Reset R
  • La = 0
  • End
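
The condition-based pseudocode above can be turned into a small simulatable model. What follows is a minimal sketch, not the author's code: it assumes a single dual-rail channel, takes Func to be the identity, and uses hypothetical module and task names (`wchb_buf`, `tb`, `send`). Acknowledges are active-high, as in the slides.

```verilog
// Behavioral WCHB buffer for one dual-rail channel (a sketch, not gate level).
// Rail names L0/L1/R0/R1 and acknowledges La/Ra follow the slides.
module wchb_buf (input wire L0, L1, output reg La,
                 output reg R0, R1, input wire Ra);
  initial begin
    La = 0; R0 = 0; R1 = 0;          // La = 0, R = space
  end

  always begin
    wait (Ra == 0);                  // right channel has been reset
    wait (L0 == 1 || L1 == 1);       // a token has arrived on L
    R0 = L0; R1 = L1;                // R <- Func(L); Func assumed to be identity
    La = 1;                          // acknowledge the input
    wait (Ra == 1);                  // output token has been consumed
    wait (L0 == 0 && L1 == 0);       // input has returned to zero
    R0 = 0; R1 = 0;                  // reset R
    La = 0;                          // lower the acknowledge
  end
endmodule

// Hypothetical test bench: a sender and a receiver that obey the
// four-phase protocol, with 1 time unit of environment delay per step.
module tb;
  reg  L0, L1, Ra;
  wire La, R0, R1;
  wchb_buf dut (.L0(L0), .L1(L1), .La(La), .R0(R0), .R1(R1), .Ra(Ra));

  task send(input b);                // send one bit as a dual-rail token
    begin
      wait (La == 0);
      #1 L0 = ~b; L1 = b;
      wait (La == 1);
      #1 L0 = 0; L1 = 0;
    end
  endtask

  initial begin
    L0 = 0; L1 = 0;
    send(1'b1); send(1'b0); send(1'b1);
    #10 $finish;
  end

  initial Ra = 0;
  always begin                       // receiver: print and acknowledge tokens
    wait (R0 == 1 || R1 == 1);
    $display("t=%0t received %b", $time, R1);
    #1 Ra = 1;
    wait (R0 == 0 && R1 == 0);
    #1 Ra = 0;
  end
endmodule
```

Because every step is a level-sensitive `wait`, the model does not depend on catching an edge, which is the point the slide makes about missing events. Running `tb` should report the bits 1, 0, 1 in that order.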

10
Precharge Half-Buffer (Lines)
  • Goals
  • Use pre-charge logic and an input completion
    detector instead of weak-condition logic
  • Requires 2 guard transistors (Pc and Eval) in
    Function blocks
  • Removes many P-transistors in pull-up and
    pull-down

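[Figure: PCHB template: function block F between L and R with left and right completion detectors (LCD, RCD) and a C-element combining them with Re to produce Le; function block schematic for each output rail Rx: pull-down logic guarded by the Pc and Eval transistors]
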
11
CHP Behavioral Description of PCHB
  • *[ [¬Ra ∧ L]; R↑; La↑; [Ra]; R↓; [¬L]; La↓ ]
  • Interpretation: repeat indefinitely
  • [¬Ra ∧ L]: wait for right channel to be reset
    and new token to arrive on input channel
  • R↑: send new token on output channel
  • La↑: acknowledge to input
  • [Ra]; R↓: wait for output channel to
    acknowledge before resetting outputs
  • [¬L]; La↓: wait for input channel to reset
    before resetting acknowledge
  • Initial Condition
  • Ra = La = 0; L and R reset
  • Increased parallelism over WCHB
  • Output side resets without waiting for input
    side reset
12
Verilog Pseudocode for PCHB
  • Initial
  • La = 0; R = space
  • Always
  • Begin
  • Wait (Ra == 0)
  • Wait (L0 == 1 or L1 == 1) /* assuming dual
    rail */
  • R <- Func(L)
  • La = 1
  • Wait (Ra == 1)
  • Reset R
  • Wait (L0 == 0 and L1 == 0)
  • La = 0
  • End

13
CHP Behavioral Description of PCFB
  • *[ [¬Ra ∧ L]; R↑; La↑; ( [Ra]; R↓ ), ( [¬L];
    La↓ ) ]
  • Interpretation: repeat indefinitely
  • [¬Ra ∧ L]: wait for right channel to be reset
    and new token to arrive on input channel
  • R↑: send new token on output channel
  • La↑: acknowledge to input
  • ( [Ra]; R↓ ), ( [¬L]; La↓ ): wait for output
    channel to acknowledge before resetting outputs
    and concurrently wait for input channel to reset
    before resetting acknowledge
  • Initial Condition
  • Ra = La = 0; L and R reset
  • Increased parallelism over PCHB
  • Input and output handshakes complete concurrently
  • New token can arrive on inputs while output
    channel still busy
  • Slack 1 instead of ½.

14
Verilog Pseudocode for PCFB
  • Initial
  • La = 0; R = space
  • Always
  • Begin
  • Wait (Ra == 0)
  • Wait (L0 == 1 or L1 == 1) /* assuming dual
    rail */
  • R <- Func(L); La = 1
  • Fork
  • Begin
  • Wait (Ra == 1); Reset R
  • End
  • Begin
  • Wait (L0 == 0 and L1 == 0); La = 0
  • End
  • Join
  • End
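
As a sketch of how the fork/join in this pseudocode maps onto real Verilog, the module below models a PCFB stage for one dual-rail channel. The module name `pcfb_buf` is hypothetical and Func is again assumed to be the identity; only the handshake structure is taken from the slide.

```verilog
// Behavioral PCFB buffer for one dual-rail channel (illustrative sketch).
module pcfb_buf (input wire L0, L1, output reg La,
                 output reg R0, R1, input wire Ra);
  initial begin
    La = 0; R0 = 0; R1 = 0;          // La = 0, R = space
  end

  always begin
    wait (Ra == 0);                  // output channel is free
    wait (L0 == 1 || L1 == 1);       // input token present
    R0 = L0; R1 = L1;                // R <- Func(L); Func assumed to be identity
    La = 1;                          // acknowledge the input
    fork
      begin                          // output side: reset once acknowledged
        wait (Ra == 1);
        R0 = 0; R1 = 0;
      end
      begin                          // input side: complete the handshake
        wait (L0 == 0 && L1 == 0);
        La = 0;
      end
    join                             // both sides must finish before looping
  end
endmodule
```

In this model the input branch can lower La while the output branch is still waiting on Ra, so the sender may present a new token early. The `join` then holds that token on L until the output side has reset and Ra has returned to 0.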

15
Conditional Reading and Writing
  • Conditional Reading of Input Channels
  • Le generation modified
  • Need one Le per input rail
  • Acknowledge only input channel that is read
  • Control signals need to be part of Le logic
  • Conditional Writing of Output Channels
  • Skip circuitry
  • Generate an extra (N+1)th output that is not routed
    out but goes to completion detection
  • Modify circuitry so that you don't wait for reset
    of output channels that were not written

16
Conditional Join: A Merge Circuit

17
Verilog Pseudocode for PCHB Join
  • Initial
  • La = 0; R = space
  • Always
  • Begin
  • Wait (Ra == 0)
  • Wait (S0 == 1 or S1 == 1) /* assuming dual
    rail */
  • If (S0 == 1)
  • Wait L1
  • R <- L1 /* pseudocode must be expanded */
  • La = 1; Sa = 1
  • Else
  • Wait L2
  • R <- L2 /* pseudocode must be expanded */
  • La = 1; Sa = 1
  • Wait (Ra == 1); Reset R
  • Fork
  • Begin Wait (L0 == 0 and L1 == 0); La = 0 End
  • Begin Wait (S0 == 0 and S1 == 0); Sa = 0 End
  • Join
  • End
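
One possible expansion of the join pseudocode is sketched below. It is an assumption-laden illustration rather than the slide's intended design. The two data inputs are renamed A and B (hypothetical names standing in for the slide's L1 and L2, to avoid clashing with the rail names L0/L1), each with its own acknowledge, so that only the channel actually read is acknowledged.

```verilog
// Behavioral PCHB-style merge: select channel S picks input A or B.
// Channel names A/B and acknowledges Aa/Ba are illustrative assumptions.
module pchb_merge (input wire S0, S1, output reg Sa,
                   input wire A0, A1, output reg Aa,
                   input wire B0, B1, output reg Ba,
                   output reg R0, R1, input wire Ra);
  initial begin
    Sa = 0; Aa = 0; Ba = 0; R0 = 0; R1 = 0;
  end

  always begin
    wait (Ra == 0);                        // output channel is free
    wait (S0 == 1 || S1 == 1);             // select token present
    if (S0 == 1) begin
      wait (A0 == 1 || A1 == 1);           // read only input A
      R0 = A0; R1 = A1;
      Aa = 1;
    end else begin
      wait (B0 == 1 || B1 == 1);           // read only input B
      R0 = B0; R1 = B1;
      Ba = 1;
    end
    Sa = 1;
    wait (Ra == 1);                        // output acknowledged
    R0 = 0; R1 = 0;                        // reset R
    fork
      begin                                // finish handshake on the input read
        if (Aa) begin wait (A0 == 0 && A1 == 0); Aa = 0; end
        else    begin wait (B0 == 0 && B1 == 0); Ba = 0; end
      end
      begin                                // finish handshake on the select
        wait (S0 == 0 && S1 == 0); Sa = 0;
      end
    join
  end
endmodule
```

The unselected input is never waited on or acknowledged, which mirrors the earlier point that a conditional read acknowledges only the input channel that is read.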

18
Conditional Fork: A Split Circuit

19
Verilog Pseudocode for PCHB Split
  • Initial
  • La = 0; R = space
  • Always
  • Begin
  • Wait (Rxa == 0 and Rya == 0)
  • Wait (S0 == 1 or S1 == 1) /* assuming dual
    rail */
  • Wait (L0 == 1 or L1 == 1) /* assuming dual
    rail */
  • If (S0 == 1)
  • Rx <- L /* pseudocode must be expanded */
  • La = 1; Sa = 1; Wait (Rxa == 1); Reset Rx
  • Else
  • Ry <- L /* pseudocode must be expanded */
  • La = 1; Sa = 1; Wait (Rya == 1); Reset Ry
  • Wait (L0 == 0 and L1 == 0); La = 0
  • Wait (S0 == 0 and S1 == 0); Sa = 0
  • End
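
The split pseudocode can be expanded along the same lines. This is a sketch under assumptions: the output rails are named Rx0/Rx1 and Ry0/Ry1 (the slide only names the channels Rx and Ry and their acknowledges Rxa and Rya), and the module name `pchb_split` is hypothetical.

```verilog
// Behavioral PCHB-style split: select channel S routes L to Rx or Ry.
module pchb_split (input wire S0, S1, output reg Sa,
                   input wire L0, L1, output reg La,
                   output reg Rx0, Rx1, input wire Rxa,
                   output reg Ry0, Ry1, input wire Rya);
  initial begin
    Sa = 0; La = 0; Rx0 = 0; Rx1 = 0; Ry0 = 0; Ry1 = 0;
  end

  always begin
    wait (Rxa == 0 && Rya == 0);           // both outputs are free
    wait (S0 == 1 || S1 == 1);             // select token present
    wait (L0 == 1 || L1 == 1);             // data token present
    if (S0 == 1) begin
      Rx0 = L0; Rx1 = L1;                  // write only Rx
      La = 1; Sa = 1;
      wait (Rxa == 1);
      Rx0 = 0; Rx1 = 0;                    // reset Rx
    end else begin
      Ry0 = L0; Ry1 = L1;                  // write only Ry
      La = 1; Sa = 1;
      wait (Rya == 1);
      Ry0 = 0; Ry1 = 0;                    // reset Ry
    end
    wait (L0 == 0 && L1 == 0); La = 0;     // finish input handshake
    wait (S0 == 0 && S1 == 0); Sa = 0;     // finish select handshake
  end
endmodule
```

Only the output that was written is waited on for its acknowledge, so the model never stalls on the reset of a channel it did not write.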

20
Modeling Delays of PCHB in Verilog
  • Begin
  • Wait (Ra == 0)
  • Wait (L0 == 1 or L1 == 1) /* assuming dual
    rail */
  • #Forward_latency R <- Func(L)
  • #Lack_delay La = 1
  • Wait (Ra == 1)
  • #Precharge_delay Reset R
  • Wait (L0 == 0 and L1 == 0)
  • #Lackreset_delay La = 0
  • End
  • Delay Interpretations
  • Lack_delay = RCD + C-element
  • Assumes LCD < Forward_latency + RCD
  • Lackreset_delay = Max(RCD, LCD) + C-element
  • Which one of the max depends on arrival time of
    L
21
Local Cycle Time of PCHB
[Figure: three-stage PCHB pipeline (F1, F2, F3), each stage with its own LCD, RCD, and C-element]
  • Maximum of
  • 3 Evaluations + 1 RCD + 1 C-element + 1 Precharge
    + 1 RCD + 1 C-element
  • 2 Evaluations + 1 RCD + 1 C-element + 1 Precharge
    + 1 LCD + 1 C-element
  • 1 Evaluation + 1 RCD + 1 C-element + 1 Precharge
    + 1 RCD + 1 C-element
  • 1 Evaluation + 1 LCD + 1 C-element + 1 Precharge
    + 1 RCD + 1 C-element

22
Cycle time of PCHB in Verilog
  • Begin
  • Wait (Ra == 0)
  • Wait (L0 == 1 or L1 == 1) /* assuming dual
    rail */
  • #Forward_latency R <- Func(L)
  • #Lack_delay La = 1
  • Wait (Ra == 1)
  • #Precharge_delay Reset R
  • Wait (L0 == 0 and L1 == 0)
  • #Lackreset_delay La = 0
  • End
  • Cycle Time is Max of
  • 3 × Forward_latency + Lack_delay +
    Precharge_delay + Lackreset_delay
  • 2 × Forward_latency + Lack_delay + Precharge_delay
    + Lackreset_delay
  • Forward_latency + Lack_delay + Precharge_delay
    + Lackreset_delay
  • Backward latency = Cycle_time − Forward_latency
    (?)
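
As a quick arithmetic check, suppose every stage has the same delays and take illustrative values (assumptions, not figures from the slides) of Forward_latency = 2, Lack_delay = 3, Precharge_delay = 1, and Lackreset_delay = 3 time units. Reading the first cycle-time term as three forward latencies plus the three remaining delays, and the tentative backward-latency relation as a difference, gives:

```latex
\begin{align*}
T_{\text{cycle}} &= 3\,t_{\text{Forward\_latency}} + t_{\text{Lack\_delay}}
                    + t_{\text{Precharge\_delay}} + t_{\text{Lackreset\_delay}} \\
                 &= 3(2) + 3 + 1 + 3 = 13 \\
T_{\text{backward}} &= T_{\text{cycle}} - t_{\text{Forward\_latency}} = 13 - 2 = 11
\end{align*}
```

With identical stages the three-latency term is always the largest of the three, so the other terms only matter when neighboring stages have different delays.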

23
Modeling Larger Blocks in Verilog
  • Asynchronous Matrix Multiply Example
  • y30 := 0, i := 0;
  • *[ Mulin?x;
  • [ i < 3 ->
  • y30 := y30 + coeff(j) * x;
  • i := i + 1
  • [] i = 3 ->
  • Accout30 ! (y30 + coeff(j) * x);
  • i := 0, y30 := 0
  • ] ]
24
Matrix Multiply in Verilog Pseudocode
  • Initial
  • Begin
  • y30 = 0; i = 0; MulinAck = 0; Accout = space
  • End
  • Always
  • Begin
  • Wait (AccoutAck == 0); Wait (Valid(Mulin)); x =
    ChannelTypetoInt(Mulin)
  • Fork
  • Begin
  • If (i < 3)
  • y30 = y30 + coeff(j) * x; i = i + 1
  • Else
  • Begin
  • Accout30 = InttoChannelType(y30 +
    coeff(j) * x)
  • i = 0; y30 = 0
  • Wait (AccoutAck == 1); Accout30 = space
  • End
  • End
  • Begin

25
Issues with Matrix Multiply
  • Model not pipelined
  • Presented model is a full buffer
  • No further pipelining described
  • Real Implementation
  • Possibly much finer grain pipelined
  • Created by top-down design methodology
  • Presented model can be used to check functional
    correctness of more detailed designs
  • But not performance due to lack of pipelining

26
Top-Down Refinement
[Figure: top-down refinement flow with a test bench, a fork, the behavioral non-pipelined model, the structural pipelined model, and a comparator]

27
Top-Down Design in Verilog/Cadence
  • Verilog
  • Channels must be precisely defined at all levels
  • 32-bit bus implementation decision must be made
    early
  • Sixteen 1-of-4 channels OR
  • Thirty-two 1-of-2 channels
  • Little support for abstract channel types
  • Also a limitation of Cadence (?)
  • Conversion to and from internal data
    representation needs to be done for every block
  • No clean method
  • Functions can be defined but must be included in
    each module that uses them
  • VHDL is somewhat more powerful and can handle
    these issues much better
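
To make the conversion issue concrete, here is a sketch of a channel-to-integer function for the sixteen 1-of-4 option. The function names `one_of_4_to_bits` and `channel_to_int`, the one-hot rail ordering, and the packing of 64 rails into one vector are all assumptions made for illustration. The slides name a `ChannelTypetoInt` conversion but do not define it.

```verilog
// Illustrative conversion from sixteen 1-of-4 groups (64 rails) to 32 bits.
module conv_demo;
  // Decode one 1-of-4 group (one-hot on 4 rails) into 2 bits.
  function [1:0] one_of_4_to_bits(input [3:0] rails);
    begin
      case (rails)
        4'b0001: one_of_4_to_bits = 2'd0;
        4'b0010: one_of_4_to_bits = 2'd1;
        4'b0100: one_of_4_to_bits = 2'd2;
        4'b1000: one_of_4_to_bits = 2'd3;
        default: one_of_4_to_bits = 2'bxx;   // space or illegal code
      endcase
    end
  endfunction

  // Convert all sixteen groups; group k occupies rails [4k+3 : 4k].
  function [31:0] channel_to_int(input [63:0] ch);
    integer k;
    begin
      for (k = 0; k < 16; k = k + 1)
        channel_to_int[2*k +: 2] = one_of_4_to_bits(ch[4*k +: 4]);
    end
  endfunction

  initial begin
    // Lowest group encodes 3 (4'b1000); the other fifteen encode 0 (4'b0001).
    $display("%h", channel_to_int({{15{4'b0001}}, 4'b1000}));  // prints 00000003
    $finish;
  end
endmodule
```

Since a Verilog function is scoped to the module that declares it, one common workaround is to keep such functions in a separate file and pull it into each module body with an `include` directive. This matches the limitation noted above.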