Learn more at: http://klabs.org

Title: VHDL Design Tips and Low Power Design Techniques


1
VHDL Design Tips and Low Power Design Techniques
  • Jonathan Alexander
  • Applications Consulting Manager
  • Actel Corporation
  • MAPLD 2004

2
Agenda
  • Advanced VHDL
  • ProASICPlus Synthesis, Options and Attributes
  • Timing Specifications
  • Design Hints
  • Power-Conscious Design Techniques
  • Summary

3
Actel ProASICPlus Design Flow
[Figure: design flow. VHDL Source → Synthesis (Logic Optimization, Technology Mapping) → Place & Route → Technology Implementation. Inputs labeled alongside the flow: Directives; Attributes; Timing; Timing, Pin, Placement.]

4
What is Synthesis?
  • The mapping of a behavioral description to a
    specific target technology, i.e. generating a
    structural netlist from an HDL description
  • Includes optimization steps
  • Optimizes the design implementation for:
  • Higher speed
  • Smaller area
  • Lower power

5
ProASICPlus HDL Attributes and Directives
  • Attributes are used to direct the way your design
    is optimized and mapped during synthesis.
  • Directives control the way your design is
    analyzed prior to synthesis. Because of this,
    directives must be included in your VHDL source
    code.
  • Three important ProASICPlus attributes or
    directives are available:
  • syn_maxfan (attribute)
  • syn_keep (directive)
  • syn_encoding (attribute)

6
ProASICPlus HDL Attributes and Directives (contd)
  • syn_maxfan = Value
  • Value range: > 4
  • Can be assigned to an input port, register
    output, or a net
  • Overrides the global Fanout Limit setting
  • The tool will replicate the signal if this
    attribute is associated with it
  • Syntax
  • In the HDL code:
  • attribute syn_maxfan of data_in : signal is 1000;
  • In the constraint file:
  • define_attribute clk syn_maxfan 200

7
ProASICPlus HDL Attributes and Directives (contd)
  • syn_keep = 1
  • When associated with a signal, this directive
    prevents Synplify from combining or collapsing
    the node.
  • This directive can be associated with
    combinatorial signals only
  • Syntax
  • In the HDL code (syn_keep declared as an integer attribute):
  • attribute syn_keep : integer;
  • attribute syn_keep of st : signal is 1;
  • In the constraint file:
  • define_attribute st syn_keep 1
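
The slides list syn_encoding among the three attributes but show no syntax for it. The following is a minimal sketch, assuming the Synplify convention of a string-valued syn_encoding attribute attached to the state signal. The entity, port, and state names are hypothetical, and other synthesis tools may ignore the attribute.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical three-state FSM used only to show where the attribute goes.
entity fsm_enc_sketch is
  port (
    clk, rst_n, start, done : in  std_logic;
    busy                    : out std_logic
  );
end entity fsm_enc_sketch;

architecture rtl of fsm_enc_sketch is
  type state_t is (IDLE, RUN, FINISH);
  signal state : state_t;

  -- Assumption: Synplify-style encoding request on the state register.
  -- Typical values include "onehot", "gray" and "sequential".
  attribute syn_encoding : string;
  attribute syn_encoding of state : signal is "gray";
begin
  process (clk, rst_n)
  begin
    if rst_n = '0' then
      state <= IDLE;
    elsif rising_edge(clk) then
      case state is
        when IDLE   => if start = '1' then state <= RUN; end if;
        when RUN    => if done = '1' then state <= FINISH; end if;
        when FINISH => state <= IDLE;
      end case;
    end if;
  end process;

  busy <= '0' when state = IDLE else '1';
end architecture rtl;
```

Keeping the encoding in an attribute rather than in hand-written state constants makes it easy to compare encodings for area, speed, and power without touching the FSM logic.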

8
Agenda
  • Advanced VHDL
  • ProASICPlus Synthesis, Options and Attributes
  • Timing Specifications
  • Design Hints
  • Power-Conscious Design Techniques
  • Summary

9
Timing Constraints Specification
  • The Synplify ProASICPlus mapper allows
    specification of the following:
  • Global Design Frequency
  • Multi-clock design
  • Skew between two clocks
  • Input and output delays
  • Functional multi-cycle and false paths
  • All these timing specifications are available in
    the GUI; this presentation covers the sdc
    constructs only.

10
Design Frequency Specification
  • Multiple Clocks
  • The Graphical User Interface Frequency item allows
    specification of a global value for all clocks
  • This setting influences the operator architecture
    selection (speed or area) during mapping
  • This value should be set to the highest frequency
    required in the design
  • To specify individual values for different
    clocks, use the following sdc construct:
  • define_clock clock_1 -freq <Value1>
  • define_clock clock_2 -freq <Value2>

11
Skew Specification in Synplify
  • To define a skew between two clocks, use the
    following constraint:
  • define_clock_delay -rise clock1 -rise clock2 value
  • Example
  • define_clock_delay -rise CLK19M -rise MPU_CLK 1.0
  • define_clock_delay -rise MPU_CLK -rise CLK19M 2.0

12
Input Delay
  • Specifies the input arrival time of a signal in
    relation to the clock.
  • It is used at the input ports, to model the
    interface of the inputs of the FPGA with the
    outside environment.
  • The value entered should represent the delay
    outside of the chip before the signal arrives at
    the input pin
  • To specify the input delay on an input port,
    use the following constraint:
  • define_input_delay InputPortName Value

13
Output Delay
  • Specifies the delay of the logic outside the FPGA
    driven by the top-level outputs.
  • Used to model the interface of the outputs of the
    FPGA with the outside environment.
  • To specify the output delay, use the following
    constraint:
  • define_output_delay OutputPortName Value

14
Functional False Path
  • define_false_path allows the user to specify paths
    that are ignored for timing analysis but are
    still optimized, without priority, within
    Synplify.
  • The following options are available:
  • -from <a register or input pin>
  • -to <a register or output pin>
  • -through <a net>
  • Example
  • define_false_path -from Register_A
  • Paths from Register_A are ignored
  • define_false_path -to Register_B
  • Paths to Register_B are ignored
  • define_false_path -through test_net
  • Paths through test_net are ignored

15
Agenda
  • Advanced VHDL
  • ProASICPlus Synthesis, Options and Attributes
  • Timing Specifications
  • Design Hints
  • Power-Conscious Design Techniques
  • Summary

16
Late Arrival Signals Prioritization
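-- Initial Description
-- (WAIT is a reserved word in VHDL; compilable code needs another name, e.g. WAIT_ST)
case State is
  when WAIT =>
    if Critical then
      Target <= Source_1;
    else
      Target <= Source_2;
    end if;
  when ACTIVE =>
    if Critical then
      Target <= Source_1;
    else
      Target <= Source_3;
    end if;
  when ...
end case;

-- Modified Description: the late signal Critical is tested last,
-- so it passes through fewer logic levels
if Critical then
  Target <= Source_1;
else
  case State is
    when WAIT   => Target <= Source_2;
    when ACTIVE => Target <= Source_3;
    when ...
  end case;
end if;

[Figure: multiplexer structures for the initial and modified descriptions, with signals State, Critical, Source_1, Source_2 and Target.]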
17
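Late Arrival Signal: Another Hint!
-- Initial description: A_late passes through the adder and then the comparator
...
begin
  if ((A_late + B) > Max) then
    Out := C;
  else
    Out := D;
  end if;
end process;

-- Modified description: Max - B is computed while waiting for A_late,
-- so A_late passes through the comparator only
if (A_late > (Max - B)) then
  Out := C;
else
  Out := D;
end if;

[Figure: the two implementations, each a comparator (>) driving a mux that selects C or D onto Out; inputs A_late, B and Max.]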
18
Signal vs Variable
  • Variable assignments are sensitive to order.
  • Variables are updated immediately
  • Signal assignments are order independent.
  • Signal assignments are scheduled

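-- Signals, assignments in chain order: three flip-flops in series
process (Clk)
begin
  if (Clk'event and Clk = '1') then
    Trgt1 <= In1 xor In2;
    Trgt2 <= Trgt1;
    Trgt3 <= Trgt2;
  end if;
end process;

-- Variables assigned before they are read: values pass straight through,
-- so only vTrgt3 is registered
signal vTrgt3 : std_logic;
process (Clk)
  variable vTrgt1, vTrgt2 : ...
begin
  if (Clk'event and Clk = '1') then
    vTrgt1 := In1 xor In2;
    vTrgt2 := vTrgt1;
    vTrgt3 <= vTrgt2;
  end if;
end process;

-- Variables read before they are assigned: each one holds its previous
-- value, so the result is again three flip-flops in series
process (Clk)
  variable vTrgt1, vTrgt2 : ...
begin
  if (Clk'event and Clk = '1') then
    Trgt3  <= vTrgt2;
    vTrgt2 := vTrgt1;
    vTrgt1 := In1 xor In2;
  end if;
end process;

-- Signals, assignments reordered: same hardware as the first process
process (Clk)
begin
  if (Clk'event and Clk = '1') then
    Trgt2 <= Trgt1;
    Trgt3 <= Trgt2;
    Trgt1 <= In1 xor In2;
  end if;
end process;

[Figure: the register structure driving Trgt3 for each description.]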
19
Resource Sharing and Operand Alignment
HDL Code
process (X, Y, Z, Sel)
begin
  if (Sel = '0') then
    Res <= X + Y;
  else
    Res <= Y + Z;
  end if;
end process;

Implementations
[Figure: three implementations of the code above. Without Resource Sharing (Larger and Slower); With Resource Sharing (Smaller); Operand Alignment (Faster).]
(*) Especially if Y is a Late Arrival Signal
20
Resource Sharing to Avoid
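  • Buses

VHDL Code
process (X, Y, Z, T, Sel)
begin
  if (Sel = '0') then
    Eq <= (X = Y);
  else
    Eq <= (Z = T);
  end if;
end process;

Implementation
[Figure: two implementations of the code above for 16-bit operands X, Y, Z, T and a 1-bit result Eq. With Resource Sharing (Larger and Slower): the 16-bit operands are multiplexed by Sel. Without Resource Sharing (Smaller and Faster): the multiplexer selected by Sel is 1 bit wide.]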
21
Internal Three-state Buffers
  • At the VHDL level
  • Either use the multiplexer-based modified VHDL
    code, or
  • Replace the three-state structure with the
    following equivalent AND-OR structure

[Figure: a three-state bus tri_out driven by tri_in1 to tri_in4 under enables tri_en1 to tri_en4, and the equivalent AND-OR structure driving mux_out.]
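
The AND-OR replacement described above can be sketched as follows. This is an illustrative sketch only: the signal names follow the figure labels, while the entity name is hypothetical and the enables are assumed to be mutually exclusive, as they must be on a contention-free three-state bus.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: AND-OR replacement for a four-driver internal bus.
entity and_or_bus is
  port (
    tri_en1, tri_en2, tri_en3, tri_en4 : in  std_logic;
    tri_in1, tri_in2, tri_in3, tri_in4 : in  std_logic;
    mux_out                            : out std_logic
  );
end entity and_or_bus;

architecture rtl of and_or_bus is
begin
  -- Each source is gated by its enable and the results are ORed.
  -- Assumption: at most one enable is active at a time.
  -- With no enable active the output is '0' instead of 'Z'.
  mux_out <= (tri_en1 and tri_in1) or
             (tri_en2 and tri_in2) or
             (tri_en3 and tri_in3) or
             (tri_en4 and tri_in4);
end architecture rtl;
```

Unlike a true three-state bus, this structure never floats, so downstream logic must tolerate a '0' when no driver is selected.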
22
Agenda
  • Advanced VHDL
  • Power-Conscious Design Techniques
  • Data Path Selection
  • FSM Encoding
  • Gating Clocks and Signals
  • Advanced Power Design Practices
  • Summary

23
Sources of Dynamic Power Consumption
  • Switching
  • CMOS circuits dissipate power during switching
  • The more logic levels used, the more switching
    activity needed
  • Frequency
  • Dynamic power increases linearly with frequency
  • Loading
  • Dynamic power increases with capacitive loading
  • Glitch Propagation
  • Glitches cause excessive switching to occur at
    relatively high frequencies.
  • Clock Trees
  • Clock Trees operate at high frequency under heavy
    loading, so they contribute significantly to the
    total power consumption.
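
The factors listed above combine in the usual first-order CMOS relation, given here as a general reminder rather than as a formula taken from the slides:

```latex
P_{\text{dyn}} \approx \alpha \, C_L \, V_{DD}^{2} \, f
```

Here α is the switching activity (average fraction of nodes toggling per clock cycle, glitches included), C_L the switched capacitive load, V_DD the supply voltage, and f the clock frequency. The techniques in the following slides mostly act on α and C_L.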

24
Data Path Elements Selection
  • Basic block selection is critical as the
    power/speed tradeoff has to be well identified
  • Power is switching activity dependent, thus input
    data pattern dependent
  • Watch the architecture of the basic arithmetic
    and logic blocks
  • Check area/speed and fanout distribution/number
    of logic levels
  • High fanout + large number of logic levels =
    higher glitch propagation
  • Investigate pipelining effect on power
    dissipation
  • Impact on clock tree power consumption
  • Impact on block fanout distribution

25
Data Path Architectures
  • Adders Architectures
  • Architecture Evaluation
  • Test Results
  • Multipliers
  • Architectures and Power Implications
  • Pipelined Configurations
  • Pipeline Effect on Power
  • Pipelining vs re-Timing

26
Review Ripple Adder
Carry signal switching propagates through all
the stages and consumes Power
27
Review Carry Look-Ahead Adder
  • Carry signal switching propagates through fewer
    stages
  • However, a higher number of logic levels

28
Carry Select Adder Overview
Principle: compute the result twice (for Carry = 0 and
Carry = 1); then, when the actual carry is ready,
select the appropriate result
  • Carry signal switching propagates through fewer
    stages
  • However, higher duplication and complexity
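
The carry-select principle can be sketched for an 8-bit adder split into two 4-bit blocks. This is a minimal illustration, not the implementation evaluated in the presentation: the entity name, widths, and block split are assumptions, and a synthesis tool is free to restructure the arithmetic.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 8-bit carry-select adder built from two 4-bit blocks.
entity csel_add8 is
  port (
    a, b : in  unsigned(7 downto 0);
    cin  : in  std_logic;
    sum  : out unsigned(7 downto 0);
    cout : out std_logic
  );
end entity csel_add8;

architecture rtl of csel_add8 is
  signal cin_u : unsigned(0 downto 0);
  signal lo    : unsigned(4 downto 0);  -- low block sum, bit 4 is its carry out
  signal hi0   : unsigned(4 downto 0);  -- high block sum assuming carry-in 0
  signal hi1   : unsigned(4 downto 0);  -- high block sum assuming carry-in 1
  signal hi    : unsigned(4 downto 0);  -- selected high block sum
begin
  cin_u(0) <= cin;

  lo  <= resize(a(3 downto 0), 5) + resize(b(3 downto 0), 5) + cin_u;

  -- The high block is computed twice, in parallel with the low block.
  hi0 <= resize(a(7 downto 4), 5) + resize(b(7 downto 4), 5);
  hi1 <= resize(a(7 downto 4), 5) + resize(b(7 downto 4), 5) + 1;

  -- The real carry from the low block only drives a multiplexer.
  hi  <= hi1 when lo(4) = '1' else hi0;

  sum  <= hi(3 downto 0) & lo(3 downto 0);
  cout <= hi(4);
end architecture rtl;
```

The duplicated high block is the extra area and switching the slide refers to; the benefit is that the low-block carry no longer ripples through the high-block adder.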

29
Adder Architectures
  • Forward Carry Look Ahead (CLF): fastest but also
    largest
  • Brent and Kung (BK): almost the same speed as CLF
    but drastically smaller
  • Carry Look Ahead (CLA): relatively small and slow
  • Ripple (RPL): smallest but slowest
  • Brent and Kung: best area/speed tradeoff

30
Adders Power Dissipation
  • Brent and Kung: lowest power dissipation
  • Lowest logic levels
  • Lowest fanout

31
Data Path Architectures
  • Adders Architectures
  • Architecture Evaluation
  • Test Results
  • Multipliers
  • Architectures and Power Implications
  • Pipelined Configurations
  • Pipeline Effect on Power
  • Pipelining vs re-Timing

32
Multipliers Power Consumption
  • Wallace advantages over the Carry-Save Multiplier
    (CSM):
  • Uniform switching propagation
  • Fewer logic levels
  • Lower average fanout

33
Data Path Architectures
  • Adders Architectures
  • Architecture Evaluation
  • Test Results
  • Multiplier
  • Architectures and Power Implications
  • Pipelined Configurations
  • Pipeline Effect on Power
  • Pipelining vs re-Timing

34
Pipelining for Glitch Reduction
  • A logically deep internal net is typically
    affected by more primary inputs switching, and is
    therefore more susceptible to glitches
  • Pipelining shortens the depth of combinatorial
    logic by inserting pipeline registers
  • Pipelining is very effective for data path
    elements such as parity trees and multipliers
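
As an illustration of the point above, here is a sketch of a 16-bit parity tree cut into two stages by a pipeline register. The entity name, width, and the position of the cut are assumptions made for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-stage pipelined parity tree (result valid after 2 clocks).
entity parity16_pipe is
  port (
    clk    : in  std_logic;
    d      : in  std_logic_vector(15 downto 0);
    parity : out std_logic
  );
end entity parity16_pipe;

architecture rtl of parity16_pipe is
  signal stage1 : std_logic_vector(3 downto 0);  -- registered group parities
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Stage 1: parity of each 4-bit group, captured in a register so
      -- glitches inside a group do not reach the second stage.
      for i in 0 to 3 loop
        stage1(i) <= d(4*i) xor d(4*i + 1) xor d(4*i + 2) xor d(4*i + 3);
      end loop;
      -- Stage 2: combine the four registered group parities.
      parity <= stage1(0) xor stage1(1) xor stage1(2) xor stage1(3);
    end if;
  end process;
end architecture rtl;
```

The four extra flip-flops add clock-tree load, which is the tradeoff discussed on the next slide.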

35
Pipelining Effect on Power
Pipelining increases clock tree power, but
overall power is lowered
36
Pipelining vs. Re-timing
  • Pipelining introduces new registers
  • Re-timing does not introduce new registers
  • Example FIR re-timing
  • Re-timing also reduces power
  • Registers prevent glitch propagation through high
    logic-level paths (i.e. multipliers)

37
Agenda
  • Advanced VHDL
  • Power Conscious Design Techniques
  • Data Path Selection
  • FSM Encoding
  • Gating Clocks and Signals
  • Advanced Power Design Practices
  • Summary

38
FSM and Counter Encoding Impact on Power
39
Counters and FSMs: State Register Transitions
40
Counters Power Measurement on ProASIC
Power dissipation for 200 instances of 8-bit
counters. As expected, Gray counters dissipate
less power (25).
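
For reference, a Gray counter of the kind compared above can be sketched as follows. This is an assumed implementation (Gray state register, converted to binary for the increment), not the design measured in the presentation, and the entity name is hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 8-bit Gray counter: exactly one state bit toggles per count.
entity gray_cnt8 is
  port (
    clk, rst_n : in  std_logic;
    q          : out std_logic_vector(7 downto 0)
  );
end entity gray_cnt8;

architecture rtl of gray_cnt8 is
  signal gray : std_logic_vector(7 downto 0);

  -- Gray to binary: each binary bit is the XOR of all higher Gray bits.
  function gray_to_bin (g : std_logic_vector(7 downto 0)) return unsigned is
    variable b : unsigned(7 downto 0);
  begin
    b(7) := g(7);
    for i in 6 downto 0 loop
      b(i) := b(i + 1) xor g(i);
    end loop;
    return b;
  end function;

  -- Binary to Gray: b xor (b shifted right by one).
  function bin_to_gray (b : unsigned(7 downto 0)) return std_logic_vector is
  begin
    return std_logic_vector(b xor shift_right(b, 1));
  end function;
begin
  process (clk, rst_n)
  begin
    if rst_n = '0' then
      gray <= (others => '0');
    elsif rising_edge(clk) then
      gray <= bin_to_gray(gray_to_bin(gray) + 1);
    end if;
  end process;

  q <= gray;
end architecture rtl;
```

The conversion logic adds combinatorial depth, so the saving in register and fanout switching has to be weighed against that extra logic for each design.
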
41
FSM Encoding Effects on Power
42
Agenda
  • Advanced VHDL
  • Power Conscious Design Techniques
  • Data Path Selection
  • FSM Encoding and Effect on Power
  • Gating Clocks and Signals
  • Advanced Power Design Practices
  • Summary

43
Signal Gating
  • There are several logic implementations of signal
    gating

[Figure: signal gating implemented with a latch or flip-flop, and with a tri-state buffer.]
44
Gating Clocks
  • Most used mechanism to gate clocks

[Figure: clock gating schematic. Labels: FSM, LD_Enable, LATCH, CLK_En, CLK, New_Data (N bits), Data_Out (N bits).]

Gating clock signals with combinatorial logic is
not recommended: the clock gate can easily create
glitches, which may trigger the register
incorrectly.
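
A latch-based clock gate of the kind shown in the figure can be sketched as below. The port names follow the figure labels; the entity name, the generic, and the choice of a low-transparent latch are assumptions, and many FPGA flows prefer a register clock enable over a gated clock.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical N-bit register with a latch-based gated clock.
entity gated_reg is
  generic (N : positive := 8);
  port (
    clk       : in  std_logic;
    ld_enable : in  std_logic;
    new_data  : in  std_logic_vector(N - 1 downto 0);
    data_out  : out std_logic_vector(N - 1 downto 0)
  );
end entity gated_reg;

architecture rtl of gated_reg is
  signal clk_en : std_logic;  -- enable held stable while clk is high
  signal gclk   : std_logic;  -- gated clock
begin
  -- Latch is transparent only while clk is low, so clk_en cannot change
  -- during the high phase and cannot create a glitch on gclk.
  en_latch : process (clk, ld_enable)
  begin
    if clk = '0' then
      clk_en <= ld_enable;
    end if;
  end process en_latch;

  gclk <= clk and clk_en;

  -- The register only sees clock edges when a load is requested.
  data_reg : process (gclk)
  begin
    if rising_edge(gclk) then
      data_out <= new_data;
    end if;
  end process data_reg;
end architecture rtl;
```

In simulation the gated clock arrives one delta cycle after clk, so data coming from registers on the ungated clock needs care; timing analysis must also treat gclk as a clock.
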
45
Gating Signals Address Decoder Example
[Figure: address decoder with inputs IN0, IN1 and Enable/Select, and outputs OUT0 to OUT3.]
Switching activity on one input of the decoder
induces a large number of toggling outputs. The
Enable/Select signal prevents this switching
activity from propagating.
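
The gated decoder can be sketched as follows, assuming a 2-to-4 decoder as suggested by the figure labels. The entity and port names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 2-to-4 address decoder with an enable that holds all
-- outputs low, so input toggling does not propagate when disabled.
entity dec2to4_en is
  port (
    enable : in  std_logic;
    sel    : in  std_logic_vector(1 downto 0);  -- IN1 & IN0
    dout   : out std_logic_vector(3 downto 0)   -- OUT3 .. OUT0
  );
end entity dec2to4_en;

architecture rtl of dec2to4_en is
begin
  process (enable, sel)
  begin
    dout <= (others => '0');        -- default: every output inactive
    if enable = '1' then
      case sel is
        when "00"   => dout(0) <= '1';
        when "01"   => dout(1) <= '1';
        when "10"   => dout(2) <= '1';
        when "11"   => dout(3) <= '1';
        when others => null;        -- metavalues in simulation
      end case;
    end if;
  end process;
end architecture rtl;
```
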
46
Agenda
  • Advanced VHDL
  • Power Conscious Design Techniques
  • Data Path Selection
  • FSM Encoding and Effect on Power
  • Gating Clocks and Signals
  • Advanced Practices
  • Summary

47
VHDL Coding Effect on Power
  • Example: IF ... THEN ... ELSE ...
  • Re-organizing the code helps to prevent
    propagation of switching activity

48
Delay Balancing
  • If all primary inputs have the same arrival time
    and the same switching probability, balancing
    trees eliminates switching propagation

[Figure: un-balanced versus balanced logic trees.]
49
Guarded Evaluation
  • Technique used to reduce switching activity by
    adding latches or floating gates at the inputs of
    combinatorial blocks if their outputs are not
    used.
  • Example: the result of a multiplier may or may not
    be used, depending on a condition. Adding
    transparent latches or AND gates on the inputs
    avoids power dissipation because they mask
    useless input activity.
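
The AND-gate variant of guarded evaluation can be sketched as below. The entity, port names, and widths are assumptions for illustration. A synthesis tool may optimize the gating away, in which case a directive such as syn_keep on the gated signals may be needed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical datapath: the product is only used when use_prod = '1'.
entity guarded_mult is
  port (
    use_prod : in  std_logic;
    a, b     : in  unsigned(7 downto 0);
    c        : in  unsigned(15 downto 0);
    result   : out unsigned(15 downto 0)
  );
end entity guarded_mult;

architecture rtl of guarded_mult is
  signal a_g, b_g : unsigned(7 downto 0);   -- guarded multiplier operands
  signal prod     : unsigned(15 downto 0);
begin
  -- Guard: hold the multiplier inputs at zero when the product is unused,
  -- so activity on a and b does not toggle the multiplier internals.
  a_g <= a when use_prod = '1' else (others => '0');
  b_g <= b when use_prod = '1' else (others => '0');

  prod <= a_g * b_g;

  result <= prod when use_prod = '1' else c;
end architecture rtl;
```

Forcing the operands to zero still causes one transition when the guard closes; transparent latches hold the last value instead, at the cost of latch-based timing.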

50
Pre-computation Based Power Reduction
[Figure: pre-computation architecture. Labels: Common Clock, registers R1 and R2, Pre-Computation Input, Gated Input, Pre-Computation Logic, Combinatorial Logic, Outputs.]
51
Operator Reduction
  • Based on transformations of operations into
    computationally equivalent implementations
  • Example: distributivity of multiplication over
    addition (resource sharing)
  • (X*Y) + (Z*Y) = (X+Z) * Y
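
The two equivalent forms can be written side by side to compare their synthesis results. This is a sketch with hypothetical names and assumed 8-bit unsigned operands; both outputs carry the same 17-bit value.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical comparison of (X*Y) + (Z*Y) against (X+Z) * Y.
entity op_reduce is
  port (
    x, y, z     : in  unsigned(7 downto 0);
    r_two_mults : out unsigned(16 downto 0);
    r_one_mult  : out unsigned(16 downto 0)
  );
end entity op_reduce;

architecture rtl of op_reduce is
begin
  -- Two multipliers and one adder; products widened to 17 bits so the
  -- sum cannot overflow.
  r_two_mults <= resize(x * y, 17) + resize(z * y, 17);

  -- One adder and one multiplier: 9-bit sum times 8-bit operand gives
  -- a 17-bit product.
  r_one_mult <= (resize(x, 9) + resize(z, 9)) * y;
end architecture rtl;
```

In a real design only the second form would be kept; removing a multiplier removes its area and its switching activity.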

52
Input Signals Ordering
  • Never forget that adders are commutative and
    associative
  • Amplitude of IN is larger than the amplitude of
    IN >> 7 and IN >> 8

[Figure: switching probability versus bit number (2, 4, 6, ... 14, ...) for IN, IN >> 7 and IN >> 8, with the sign bit correlation region marked.]
53
Summary
  • Advanced VHDL Design Tips
  • Identify critical and late arrival signals in
    your design
  • Write code in a way that reduces the logic levels
    for such signals
  • Perform functions such as state determination
    while waiting for late signals
  • Low Power Design Techniques
  • Reduce switching activity per clock cycle
  • Reduce propagation of switching activity
  • Use power-efficient architecture and encoding
  • Disable logic blocks whose outputs are not used
  • Re-evaluate expressions to achieve the above

54
Additional Resources
  • Documents available on http://www.actel.com
  • Low Power Resource Center
  • http://www.actel.com/products/rescenter/power/index.html
  • Power Conscious Design with ProASIC
  • http://www.actel.com/documents/PowerConscious.pdf
  • Low Power Design for Antifuse FPGAs
  • http://www.actel.com/documents/lowpower.pdf