Power/Performance/Cost Efficiency of Adiabatic Circuits, as a function of Device On/Off Power Ratios

About This Presentation

Slides: 52

Provided by: Jam123

Transcript and Presenter's Notes

Title: Power/Performance/Cost Efficiency of Adiabatic Circuits, as a function of Device On/Off Power Ratios

1
Power/Performance/Cost Efficiency of Adiabatic Circuits, as a function of Device On/Off Power Ratios

Michael P. Frank
CISE Department / ECE Dept. Brown Bag Seminar
Tue., Mar. 26

2
(No Transcript)
3
Source: ITRS '99
4
Across Multiple Technologies
Vacuum Tubes
Integrated Circuits
Mechanical
Discrete Transistors
Electromechanical Relays
Source: Kurzweil, The Age of Spiritual Machines, pp. 22-25
5
$\tfrac{1}{2}CV^2$ based on ITRS '99 figures for $V_{dd}$ and minimum transistor gate capacitance. $T = 300$ K.
6
Information & Entropy
Example: System with 3 two-state subsystems, such as quantum spins: $2^3 = 8$ states.
Some states are ruled out by some knowledge.
Spin label: Informational status
1: Entropy
2: Known information
3: Entropy
7
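Illustrating Landauer's principle
[Figure: unitary (1-1) evolution during bit erasure. Before bit erasure, the bit to be erased is 0 or 1, and in each case the rest of the system (thermal modes, etc.) is in one of $N$ states: $s_0 \ldots s_{N-1}$ when the bit is 0, $s'_0 \ldots s'_{N-1}$ when it is 1, for $2N$ states in total. After bit erasure, the bit is always 0, and the rest of the system is in one of $2N$ states $s''_0 \ldots s''_{2N-1}$.]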
8
Conventional Gates are Irreversible

Logic gate behavior (on receiving new input):
Many-to-one transformation of local state!
Required to dissipate at least $bT$ by Landauer's principle (with $b = k_B \ln 2$, one bit of entropy).
Incurs $\tfrac{1}{2}CV^2$ dissipation in 2 out of 4 cases.

[Figure: transformation of local state. Example: static CMOS inverter, with terminals "in" and "out".]
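To put the two dissipation figures on this slide side by side, here is a minimal sketch that evaluates the Landauer bound $k_B T \ln 2$ and a conventional $\tfrac{1}{2}CV^2$ switching energy. The node capacitance (1 fF) and voltage (1 V) are illustrative assumptions, not values taken from the slides.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def landauer_bound(temperature_k):
    """Minimum dissipation for erasing one bit: k_B * T * ln 2, in joules."""
    return K_B * temperature_k * math.log(2)

def switching_energy(capacitance_f, voltage_v):
    """Conventional (non-adiabatic) switching dissipation: (1/2) C V^2, in joules."""
    return 0.5 * capacitance_f * voltage_v ** 2

# Hypothetical circuit node: 1 fF at 1 V, at T = 300 K (assumed values).
e_min = landauer_bound(300.0)
e_cmos = switching_energy(1e-15, 1.0)
print(f"Landauer bound at 300 K: {e_min:.3e} J")
print(f"(1/2) C V^2:             {e_cmos:.3e} J")
print(f"ratio:                   {e_cmos / e_min:.3e}")
```

Under these assumed values the conventional switching energy sits several orders of magnitude above the Landauer bound, which is the headroom that adiabatic techniques aim at.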
9
(No Transcript)
10
Adiabatic Charging in CMOS
Exact formula (if $R$ const.) for frequency factor $f = RC/t$
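The formula itself did not survive in the transcript. As a hedged stand-in, the sketch below uses the standard textbook result for charging a capacitor $C$ through a constant resistance $R$ with a linear voltage ramp of duration $t$: $E_{diss} = CV^2 f\,[1 - f(1 - e^{-1/f})]$ with $f = RC/t$. Whether this is exactly the expression shown on the slide is an assumption. The function name and the example values are illustrative.

```python
import math

def ramp_dissipation(c, v, r, t):
    """Energy dissipated when charging C to V through constant R
    with a linear supply ramp of duration t (standard result, assumed here)."""
    f = r * c / t  # frequency factor f = RC/t
    # -expm1(-1/f) equals 1 - exp(-1/f), computed without cancellation for large f.
    return c * v ** 2 * f * (1.0 - f * (-math.expm1(-1.0 / f)))

C, V, R = 1e-15, 1.0, 1e4          # assumed: 1 fF, 1 V, 10 kOhm, so RC = 10 ps
slow = ramp_dissipation(C, V, R, 1e-9)    # ramp much longer than RC
fast = ramp_dissipation(C, V, R, 1e-17)   # ramp much shorter than RC
print(f"slow ramp: {slow:.3e} J (approaches C V^2 * RC/t)")
print(f"fast ramp: {fast:.3e} J (approaches (1/2) C V^2)")
```

The two limits are the useful checks: for $t \gg RC$ the loss falls as $CV^2(RC/t)$, and for $t \ll RC$ it returns to the conventional $\tfrac{1}{2}CV^2$.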
11
Adiabaticity is Fundamental

Adiabatic (dissipation $\propto$ quickness) processes can occur in any type of system.
Cf. the adiabatic theorem of quantum mechanics.
Specific adiabatic logics have been described for many proposed future device technologies:
Superconducting (Likharev '82, Averin et al. '01)
Nanomechanical (Drexler '92, Merkle mid-'90s)
Quantum-dot (Lent & Tougaw, mid-'90s-present)
Quantum computing implementations (inherently)
Claim: Work on architectures & analysis for adiabatic CMOS will still apply post-CMOS!

12
Adiabatic Rules for Transistors

Rule 1: Never turn on a transistor if it has a nonzero voltage across it!
I.e., between its source & drain terminals.
Why: This erases info. & causes $\tfrac{1}{2}CV^2$ dissipation.
Rule 2: Never apply a nonzero voltage across a transistor, even during any on↔off transition!
Why: When partially turned on, the transistor has relatively low $R$, and gets relatively high $P = V^2/R$ dissipation.
Corollary: Never turn off a transistor when it has a nonzero current going through it!
Why: As $R$ gradually increases, the $V = IR$ voltage drop will build, and then rule 2 will be violated.

13
Adiabatic Rules continued

Transistor Rule 3: Never apply a large voltage across any on transistor.
Why: So the transition will be more reversible; dissipation will approach $CV^2(RC/t)$, not $\tfrac{1}{2}CV^2$.
Adiabatic rules for other components:
Diodes: Don't use them at all!
There is always a built-in voltage drop across them!
Resistors: Avoid moderate network resistances.
E.g., stay away from the range >10 kΩ and <1 MΩ.
Capacitors: Minimize, reliability permitting.
Note: Adiabatic dissipation scales with $C^2$!

14
Transistor Rules Summarized
Legal transitions in green. (For n- or p-FETs.) Dissipative states and transitions in red.
[Figure: state-transition diagram of transistor states, each labeled off or on with high or low voltages on its terminals.]
15
?
Transformation of local state
16
Simple Reversible CMOS Latch

Uses a standard CMOS transmission gate.
Sequence of operation:
(1) input initially matches latch contents (output),
(2) input changes → output changes,
(3) latch closes,
(4) input removed.

[Figure: transmission-gate latch with control signal P, input "in", and output "out", with a table of in/out values (a, b) before the input arrived, after the input arrived, and after the input was removed.]
17
Generic Frictional Coefficients

Normal definitions of friction (coefficient of sliding friction, viscosity, etc.) may not apply to all processes.
For a given mechanism executing a specified process (i.e., following a specified desired trajectory or trajectories) adiabatically over a time $t$:
Energy coefficient: $c_E = E_{lost}\,t = E_{lost}/q$
Energy dissipated from the trajectory per unit of quickness.
Note: quickness $q = 1/t$ has units like Hz.
Entropy coefficient: $c_S = S_{made}\,t = S_{made}/q$
New entropy generated per unit of quickness.
Note that $c_E = c_S T$ at temperature $T$.

What matters!
18
Energy Coefficient in Electronics

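For charging capacitive load $C$ by voltage $V$ through effective resistance $R$: $c_E = E_{lost}\,t = (CV^2 \cdot RC/t)\,t = C^2V^2R$
If the resistances are voltage-controlled switches with gain factor $k$, controlled by the same voltage $V$, then effective $R \propto 1/(kV)$, so $c_E \propto C^2V/k$.
In constant-field-scaled CMOS with length scale $\ell$: $k \propto 1/h_{ox} \propto 1/\ell$, $C \propto \ell$, and $V \propto \ell$, so $c_E \propto \ell^3/\ell^{-1} = \ell^4$, and $E_{lost} = c_E/t \propto \ell^4/\ell = \ell^3$ (like $CV^2$ energy).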
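The energy coefficient and its scaling can be checked numerically. This is a minimal sketch assuming the relations on this slide ($c_E = C^2V^2R$, $E_{lost} = c_E/t$, and an effective $R \propto 1/(kV)$ that stays constant when $k \propto 1/\ell$ and $V \propto \ell$). The device values are illustrative assumptions.

```python
def energy_coefficient(c, v, r):
    """c_E = C^2 V^2 R, in joule-seconds."""
    return c ** 2 * v ** 2 * r

def energy_lost(c_e, t):
    """Adiabatic energy lost for a transition of duration t: E_lost = c_E / t."""
    return c_e / t

# Assumed device: 1 fF, 1 V, 10 kOhm.
C, V, R = 1e-15, 1.0, 1e4
c_e = energy_coefficient(C, V, R)

# Constant-field scaling by a length factor s: C and V scale with s,
# while R ~ 1/(kV) is unchanged because k ~ 1/s and V ~ s.
s = 0.5
c_e_scaled = energy_coefficient(s * C, s * V, R)

print(f"c_E         = {c_e:.3e} J*s")
print(f"c_E ratio   = {c_e_scaled / c_e:.4f} (expected s^4 = {s ** 4:.4f})")
print(f"E_lost at 1 ns = {energy_lost(c_e, 1e-9):.3e} J")
```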

19
Entropy coefficients of some reversible logic
gate operations

From Frank '98, "Ultimate theoretical models of nanocomputers" (Nanotechnology journal):
SCRL, circa 1997: 1 b/Hz
Optimistic reversible CMOS: 10 b/kHz
Merkle's quantum FET: 1.2 b/GHz
Nanomechanical rod logic: 0.07 b/GHz
Superconducting PQ gate: 25 b/THz
Helical logic: 0.01 b/THz

How low can you go? We don't really know!
20
Quantifying Leakage

For a given structured system:
Leakage power: $P_{leak} = dE_{leak}/dt$
Spontaneous entropy generation rate: $\dot{S}_{leak} = dS_{leak}/dt$
Again, note $P_{leak} = \dot{S}_{leak}\,T$ at temperature $T$.

21
Minimum Losses w. Leakage
$E_{tot} = E_{adia} + E_{leak}$
$E_{leak} = P_{leak}\,t_r$
$E_{adia} = c_E / t_r$
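The trade-off on this slide has a closed-form optimum: setting the derivative of $c_E/t_r + P_{leak}t_r$ to zero gives $t_r = (c_E/P_{leak})^{1/2}$ and $E_{min} = 2(c_E P_{leak})^{1/2}$. A small sketch, with function names and example numbers chosen here purely for illustration:

```python
import math

def total_energy(c_e, p_leak, t_r):
    """E_tot = E_adia + E_leak = c_E / t_r + P_leak * t_r."""
    return c_e / t_r + p_leak * t_r

def optimal_transition_time(c_e, p_leak):
    """Transition time that minimizes E_tot: sqrt(c_E / P_leak)."""
    return math.sqrt(c_e / p_leak)

def minimum_energy(c_e, p_leak):
    """E_min = 2 * sqrt(c_E * P_leak), reached at the optimal transition time."""
    return 2.0 * math.sqrt(c_e * p_leak)

# Illustrative (assumed) values in arbitrary consistent units.
c_e, p_leak = 4.0, 1.0
t_opt = optimal_transition_time(c_e, p_leak)
print(t_opt, total_energy(c_e, p_leak, t_opt), minimum_energy(c_e, p_leak))
```

Running slower than `t_opt` loses more to leakage than it saves in adiabatic loss, and running faster does the opposite.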
22
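Min. energy & Roff/Ron ratio

Note that $c_E = C^2V^2R_{on}$, and if the dominant leakage is source/drain, $P_{leak} = V^2/R_{off}$.
So $c_E P_{leak} = C^2V^4/(R_{off}/R_{on})$, and $E_{min} = 2(c_E P_{leak})^{1/2} = 2CV^2(R_{off}/R_{on})^{-1/2}$.
So $Q_{max} = \tfrac{1}{2}CV^2 / \big(2CV^2(R_{off}/R_{on})^{-1/2}\big) = \tfrac{1}{4}(R_{off}/R_{on})^{1/2} = \tfrac{1}{4}(I_{on}/I_{off})^{1/2}$.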
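As a quick numerical check of this result, the sketch below computes $Q_{max}$ two ways: directly from the on/off ratio, and from first principles via $c_E$ and $P_{leak}$. The device values are assumptions for illustration only.

```python
import math

def q_max(on_off_ratio):
    """Q_max = (1/4) * sqrt(R_off / R_on) = (1/4) * sqrt(I_on / I_off)."""
    return 0.25 * math.sqrt(on_off_ratio)

def q_from_device(c, v, r_on, r_off):
    """Same quantity from (1/2) C V^2 divided by E_min = 2 * sqrt(c_E * P_leak)."""
    c_e = c ** 2 * v ** 2 * r_on      # energy coefficient
    p_leak = v ** 2 / r_off           # source/drain leakage power
    e_min = 2.0 * math.sqrt(c_e * p_leak)
    return 0.5 * c * v ** 2 / e_min

# Assumed device: 1 fF, 1 V, R_on = 10 kOhm, R_off = 10 GOhm (ratio 1e6).
print(q_max(1e6))                           # 250.0
print(q_from_device(1e-15, 1.0, 1e4, 1e10))
```

Note the square root: each additional factor of 100 in the on/off ratio buys only a factor of 10 in the maximum energy savings.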

23
Clock/Power Supply Desiderata

Requirements for an adiabatic timing signal / power supply:
Generate a trapezoidal waveform with very flat high/low regions.
Flatness limits Q of logic.
Waveform during transitions is ideally linear,
but this does not affect maximum Q, only the energy coefficient.
Operate resonantly with logic, with high Q.
Power supply Q will limit overall system Q.
Reasonable cost, compared to the logic it powers.
If possible, scale $Q \propto t$ (cycle time).
Required to be considered an adiabatic mechanism.
May conflict w. inductor scaling laws!
At the least, Q should be high at leakage-limited speed.

(Ideally, independent of $t$.)
24
Supply concepts in my research

Superpose several sinusoidal signals from phase-synchronized oscillators at harmonics of the fundamental frequency.
Weight these frequency components as per the Fourier transform of the desired waveform.
Create relatively high-L integrated inductors via vertical, helical metal coils.
Only thin oxide layers between turns.
Use mechanically oscillating, capacitive MEMS structures in vacuo as high-Q (10k) oscillators.
Use geometry to get the desired wave shape directly.
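The first supply concept above (superposing weighted harmonics) can be sketched numerically. This is an illustrative sketch only: it assumes a trapezoid with ramps of a quarter period each, estimates Fourier-series weights by a simple Riemann sum, and measures how closely a finite number of harmonics tracks the target. The waveform shape, sample count, and function names are assumptions, not details from the slides.

```python
import math

def trapezoid(phase, ramp=0.25):
    """Target waveform over one period (phase in [0, 1)), ranging from 0 to 1."""
    p = phase % 1.0
    if p < ramp:
        return p / ramp                   # rising ramp
    if p < 0.5:
        return 1.0                        # flat high region
    if p < 0.5 + ramp:
        return 1.0 - (p - 0.5) / ramp     # falling ramp
    return 0.0                            # flat low region

def harmonic_weights(wave, n_harmonics, samples=2000):
    """Estimate the mean and (cos, sin) weights of the first n harmonics."""
    ts = [i / samples for i in range(samples)]
    xs = [wave(t) for t in ts]
    mean = sum(xs) / samples
    weights = []
    for n in range(1, n_harmonics + 1):
        a = 2.0 / samples * sum(x * math.cos(2 * math.pi * n * t) for x, t in zip(xs, ts))
        b = 2.0 / samples * sum(x * math.sin(2 * math.pi * n * t) for x, t in zip(xs, ts))
        weights.append((a, b))
    return mean, weights

def synthesize(mean, weights, t):
    """Sum the weighted harmonics at time t (in units of the period)."""
    return mean + sum(a * math.cos(2 * math.pi * n * t) + b * math.sin(2 * math.pi * n * t)
                      for n, (a, b) in enumerate(weights, start=1))

def max_error(wave, n_harmonics, samples=2000):
    """Largest sampled deviation between the synthesized and target waveforms."""
    mean, weights = harmonic_weights(wave, n_harmonics, samples)
    return max(abs(synthesize(mean, weights, i / samples) - wave(i / samples))
               for i in range(samples))

for h in (1, 3, 7, 15):
    print(h, round(max_error(trapezoid, h), 4))
```

The residual error in the flat regions is the quantity of interest here, since the slides note that flatness limits the Q of the logic.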

25
A MEMS Supply Concept

Energy stored mechanically.
Variable coupling strength -> custom wave shape.
Can reduce losses through balancing, filtering.
Issue: How to adjust frequency?

26
Summary of Limiting Factors

When considering adiabaticizing a system:
What fraction of system power is in logic? $f_L$
Vs. displays, transmitters, propulsion.
What fraction of logic is done adiabatically? $f_a$
Can be all, but w. cost-efficiency overheads.
How large is the $I_{on}/I_{off}$ ratio of switches?
Affects leakage & minimum adiabatic energy.
What is the $Q_{sup}$ of the resonant power supply?
What is the relative cost of power & logic? $r$ (power premium)
E.g., decreasing power cost by a factor of $r$ by increasing HW cost by a factor of $r$ or more will not help.

27
Minimizing cost/performance

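$P$ = cost of power in original system
$H$ = cost of logic HW in original system
$P = rH$, so $H = P/r$
For cost-efficiency inverse to energy savings:
$\mathrm{Cost}_{tot,min} = P r^{-1/2} + H r^{1/2} = 2P r^{-1/2}$
$\mathrm{Cost}_{tot,orig} = P + H = (1+r)H = ((1+r)/r)\,P$
$\mathrm{Cost}_{tot,orig}/\mathrm{Cost}_{tot,min} = \tfrac{1}{2}(1+r)\,r^{-1/2} \approx \tfrac{1}{2}r^{1/2}$ for large $r$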
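A short sketch of this cost model, assuming (as the slide does) that cutting power cost by a factor $s$ multiplies hardware cost by the same factor $s$, so that total cost is $P/s + Hs$ and the best choice is $s = r^{1/2}$. The function names and the example $r = 100$ are illustrative.

```python
import math

def total_cost(p, h, s):
    """Cost after saving energy by factor s at the price of s-times the hardware."""
    return p / s + h * s

def cost_advantage(r):
    """Cost_tot,orig / Cost_tot,min = (1/2) * (1 + r) / sqrt(r)."""
    return 0.5 * (1.0 + r) / math.sqrt(r)

# Assumed example: power costs 100x as much as the logic hardware (r = 100).
p, h = 100.0, 1.0
r = p / h
s_best = math.sqrt(r)
print(total_cost(p, h, s_best))    # 20.0, i.e. 2 * P * r**-0.5
print((p + h) / total_cost(p, h, s_best), cost_advantage(r))
```

With $r = 1$ the advantage is exactly 1, which matches the remark that adiabatics only pay off for heavily energy-dominated applications.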

28
Summary of adiabatic limits

Cost-effective adiabatic energy savings factor: $S_a = E_{conv}/E_{adia}$ in a cost-effective adiabatic system.
Some rough upper bounds on $S_a$:
$S_a \lesssim 1/(1-f_L)$
$S_a \lesssim 1/(1-f_a)$
$S_a \lesssim \tfrac{1}{4}(I_{on}/I_{off})^{1/2}$
$S_a \lesssim Q_{sup}$
$S_a \lesssim r^{1/2}$
Discussion ignores benefits from adiabatics of denser packing & smaller communications delays in parallel algorithms.

(worse than these for non-ideal computations)
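Since each item above is an upper bound on the same quantity, the smallest of them is the binding one. The sketch below simply takes that minimum. The function name and the example inputs are assumptions for illustration, and the result is only as rough as the bounds themselves.

```python
import math

def savings_bound(f_logic, f_adiabatic, on_off_ratio, q_supply, r):
    """Smallest of the rough upper bounds on the savings factor S_a."""
    def fraction_bound(f):
        # 1 / (1 - f); unbounded if the fraction is exactly 1.
        return math.inf if f >= 1.0 else 1.0 / (1.0 - f)
    return min(
        fraction_bound(f_logic),            # share of system power in logic
        fraction_bound(f_adiabatic),        # share of logic done adiabatically
        0.25 * math.sqrt(on_off_ratio),     # device leakage limit
        q_supply,                           # resonant supply quality factor
        math.sqrt(r),                       # cost-effectiveness limit
    )

# Assumed example: 90% of power in logic, 99% of logic adiabatic,
# on/off ratio 1e8, supply Q of 1000, power premium r = 1e4.
print(savings_bound(0.9, 0.99, 1e8, 1000.0, 1e4))
```

In this assumed example the non-logic power share is the limiting factor, not the devices.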
29
Motivation for this study

We want to know how to carry out any arbitrary computation in a way that is reversible to an arbitrarily high degree.
Up to limits set by leakage, power supply, etc.
We want to do this as efficiently as possible:
Using as few "device ticks" as possible (spacetime).
Minimizes HW cost, leakage losses.
Using as few adiabatic transitions as possible (ops).
Minimizes frictional losses.
But, a desired computation may be originally spec'd in terms of irreversible primitives.

30
General-Case vs. Special-Case

We'd like to know two kinds of things:
For arbitrary general-purpose computations,
how to automatically emulate them in a fairly efficient reversible way,
w/o needing new intelligent/creative design work in each case?
For various specific computations of interest,
what are the most efficient reversible algorithms?
Or at least, the most efficient that we can find?
Note: These may not look anything like the most efficient irreversible algorithms!

31
The Landauer embedding

The obvious embedding of irreversible ops into expanding reversible ones leads to a linear increase in space through time. (Landauer '61)
Or, an increase in width of an input-consuming circuit.

[Figure: circuit of expanding operations (e.g., AND) taking the input to the desired output plus garbage bits. Horizontal axis: circuit depth, or time →.]
32
Lecerf Reversal

Lecerf ('63) was interested in the group-theory question of whether an iterated permutation of items would eventually return to the initial item.
Proved undecidable by reducing Turing's halting problem to this question, w. a reversible TM.
Reversible TM reverses direction instead of halting.
Returns to initial state iff the irreversible TM would halt.
Only problem: No useful output data!

[Figure: blocks $f$ and $f^{-1}$ in sequence, with labels: Input, Desired output, Garbage, Copy of input.]
33
The Bennett Trick

Bennett ('73) pointed out that you could simply fan out (reversibly copy) the desired output before reversing.
Note: $O(T)$ storage is still temporarily needed!

[Figure: blocks $f$ and $f^{-1}$ in sequence, with labels: Input, Garbage, Desired output, Copy of input.]
34
Triangle Representation

Represents any use of the Bennett '73 embedding.

[Figure: triangle spanning the forward phase and reverse phase of an adiabatic process, between the state of the irreversible computation at time $t_i$ and its state at time $t_i + \Delta t_i$. Axes: time in reversible system, time in irreversible system. Mass on any vertical line = space usage at that time.]
35
Improving Spacetime Efficiency

Bennett '73 transforms a computation taking spacetime $ST$ into one taking $\Theta(ST^2)$ in the worst case.
Can we do better?
Bennett '89: Described a technique that takes spacetime [formula not in transcript].
Actually, can generalize slightly and arrange for the exponent on $T$ to be $1+\epsilon$, where $\epsilon \to 0$ (very slowly).
Lange, McKenzie, Tapp '97: Space $\Theta(S)$ is possible, if you use time $\Theta(\exp(\Theta(S)))$.
Not any more spacetime-efficient than Bennett.

36
Pebble Game Representation
37
Triangle representation
$k = 2$, $n = 3$
$k = 3$, $n = 2$
38
Analysis of Bennett Algorithm

$n$ = # of recursive levels of algorithm
$k$ = # of lower-level iterations to go forward 1 higher-level step
$T_r$ = # of reversible lowest-level steps executed = $c(2k-1)^n$ ($c$ a small constant, e.g. 2)
$T_i$ = # of irreversible steps emulated = $k^n$
So $n = \log_k T_i$, and so $T_r = c(2k-1)^{\log T_i/\log k} = c\,e^{\log(2k-1)\log(T_i)/\log k} = c\,T_i^{\log(2k-1)/\log k}$

(n1 spikes)
E.g., $k = 2$: $T_r = 2\,T_i^{\log 3/\log 2}$
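The step count in this analysis follows from a simple recursion: to advance one level-$n$ segment, run $k$ lower-level segments forward and $k-1$ in reverse to clear the intermediate checkpoints. The sketch below counts segments that way (taking $c = 1$, i.e. counting in units of lowest-level segments) and checks the closed form. The function names are illustrative.

```python
import math

def bennett_steps(k, n):
    """Count lowest-level reversible segments run by an n-level Bennett recursion."""
    if n == 0:
        return 1                                   # a single lowest-level segment
    forward = k * bennett_steps(k, n - 1)          # k sub-segments run forward
    cleanup = (k - 1) * bennett_steps(k, n - 1)    # k-1 sub-segments run in reverse
    return forward + cleanup                       # equals (2k-1)^n overall

def time_exponent(k):
    """Exponent e such that T_r grows as T_i**e, i.e. log(2k-1)/log(k)."""
    return math.log(2 * k - 1) / math.log(k)

for k, n in [(2, 3), (3, 2), (4, 1)]:
    t_i = k ** n
    print(k, n, t_i, bennett_steps(k, n), round(time_exponent(k), 4))
```

Larger $k$ pushes the time exponent toward 1, at the price of more stored checkpoints per level.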
39
Cost-Efficiency Analysis

Total cost of doing a computation includes:
Spacetime costs (storage used, integrated over time).
Includes time-amortized manufacturing cost.
Includes cost of total energy leakage: leakage from any in-use storage element.
Irreversibility costs (energy loss from irrev. ops).
Total number of irreversible bit-erasures, $CV^2 > kT$ each.
Adiabatic costs (energy loss from reversible ops).
Proportional to the number $n_a$ of adiabatic ops performed, times $c_E$, divided by the time $t_{op}$ of a single op.

40
Bennett 89 alg. is not optimal
$k = 2$, $n = 3$
$k = 3$, $n = 2$
Just look at all the spacetime it wastes!!!
41
Parallel Frank02 algorithm

We can simply squish the triangles closer together to eliminate the wasted spacetime!
Resulting algorithm is linear time for all $n$ and $k$ and dominates Ben89 for time, spacetime, & energy!

[Figure: squished triangle diagrams for $k = 3$, $n = 2$; $k = 2$, $n = 3$; and $k = 4$, $n = 1$. Axes: emulated time, real time.]
42
Setup for Analysis

For the energy-dominated limit, let cost equal energy.
$c$ = energy coefficient, $r = r_{min}$ = leakage power,
$i$ = energy dissipation per irreversible state-change.
Let the on/off ratio $R_{on/off} = r_{max}/r_{min} = P_{max}/P_{min}$.
Note that $c \approx i\,t_{min} = i\,(i/r_{max})$, so $r_{max} \approx i^2/c$.
So $R_{on/off} \approx i^2/(c\,r_{min}) = i^2/(c\,r)$.

43
Time Taken

There are $n$ levels of recursion.
Each multiplies the width of the base of the triangle by $k$.
Lowest-level triangles take time $c\,t_{op}$.
Total time is thus $c\,t_{op}k^n$.

[Figure: $k = 4$, $n = 1$. Width: 4 sub-units.]
44
Number of Adiabatic Ops

Each triangle contains $k + (k-1) = 2k-1$ immediate sub-triangles.
There are $n$ levels of recursion.
Thus the number of adiabatic ops is $c(2k-1)^n$.

[Figure: $k = 3$, $n = 2$. $5^2 = 25$ little triangles (adiabatic operations).]
45
Spacetime Usage

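Each triangle includes the spacetime usage of all $2k-1$ of its subtriangles,
plus $(k-1)(k-2)/2 = (k^2-3k+2)/2$ additional spacetime units, each consisting of 1 storage unit, for time $t_{op}k^{n-1}$.

[Figure: $k = 5$, $n = 1$. Each additional unit is 1 state of the irreversible machine being stored for time $t_{op}k^{n-1}$; rows of 1, 2, and 3 units, 1 + 2 + 3 units in total.]
Resulting recurrence relation:
$ST(k,0) = 1$ (or $c$)
$ST(k,n) = (2k-1)\,ST(k,n-1) + (k^2-3k+2)\,k^{n-1}/2$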
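The recurrence can be evaluated directly. This is a small sketch that reads the recurrence as $ST(k,0) = 1$ and $ST(k,n) = (2k-1)\,ST(k,n-1) + (k^2-3k+2)\,k^{n-1}/2$; that reading of the garbled transcript, and the function name, are assumptions.

```python
def spacetime(k, n):
    """Spacetime usage ST(k, n), in units of one storage unit held for one t_op."""
    st = 1  # ST(k, 0) = 1: a single lowest-level triangle
    for level in range(1, n + 1):
        # (k^2 - 3k + 2) = (k-1)(k-2) is always even, so the division is exact.
        extra = (k * k - 3 * k + 2) * k ** (level - 1) // 2
        st = (2 * k - 1) * st + extra
    return st

# k = 5, n = 1: nine sub-triangles plus 1 + 2 + 3 = 6 extra storage units.
print(spacetime(5, 1))   # 15
print(spacetime(2, 3))   # 27: no extra units when k = 2
print(spacetime(3, 2))   # 33
```

For $k = 2$ the extra term vanishes, so the spacetime equals the number of adiabatic ops, $(2k-1)^n$.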
46
Reversible Cost

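Adiabatic cost plus spacetime cost: $\mathrm{Cost}_r = \mathrm{Cost}_a + \mathrm{Cost}_{st} = (2k-1)^n\,c/t + ST(k,n)\,r\,t$
Minimizing over $t$ gives $\mathrm{Cost}_r = 2\,[(2k-1)^n\,ST(k,n)\,c\,r]^{1/2}$
But, in the energy-dominated limit, $c\,r \approx i^2/R_{on/off}$,
so $\mathrm{Cost}_r = 2i\,[(2k-1)^n\,ST(k,n)/R_{on/off}]^{1/2}$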

47
Tot. Cost, Orig. Cost, Advantage

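Total cost: $i$ for the irreversible operation performed at the end of the algorithm, plus the reversible cost, gives $\mathrm{Cost}_{tot} = i\,\big[1 + 2\,((2k-1)^n\,ST(k,n)/R_{on/off})^{1/2}\big]$.
The original irreversible machine performing $k^n$ ops would use cost $\mathrm{Cost}_{orig} = i\,k^n$, so the
advantage ratio between irreversible & reversible cost is $R_{i/r} = \mathrm{Cost}_{orig}/\mathrm{Cost}_{tot} = k^n / \big[1 + 2\,((2k-1)^n\,ST(k,n)/R_{on/off})^{1/2}\big]$.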

48
Optimization Algorithm

For any given value of $R_{on/off}$:
Scan the possible values of $n$ (up to some limit).
For each of those, scan the possible values of $k$
until the maximum $R_{i/r}$ for that $n$ is found
(the function only has a single local maximum),
and return the max $R_{i/r}$ over all $n$ tried.
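The scan described above can be sketched as follows. The advantage formula used here, $k^n / [1 + 2((2k-1)^n\,ST(k,n)/R_{on/off})^{1/2}]$, is a reconstruction from the cost expressions on the preceding slides and should be treated as an assumption, as should the recurrence for $ST(k,n)$. The scan limits `max_n` and `max_k` are arbitrary choices made here. For simplicity this version scans every $k$ up to the limit instead of stopping at the first local maximum.

```python
import math

def spacetime(k, n):
    """ST(k, n) from the recurrence (as reconstructed; see lead-in)."""
    st = 1
    for level in range(1, n + 1):
        st = (2 * k - 1) * st + (k * k - 3 * k + 2) * k ** (level - 1) // 2
    return st

def advantage(k, n, on_off):
    """Assumed advantage ratio: irreversible cost over total reversible cost."""
    reversible = 2.0 * math.sqrt((2 * k - 1) ** n * spacetime(k, n) / on_off)
    return k ** n / (1.0 + reversible)

def best_advantage(on_off, max_n=12, max_k=64):
    """Exhaustively scan n and k; return (best ratio, k, n)."""
    best = (0.0, None, None)
    for n in range(1, max_n + 1):
        for k in range(2, max_k + 1):
            adv = advantage(k, n, on_off)
            if adv > best[0]:
                best = (adv, k, n)
    return best

for ratio in (1e6, 1e9, 1e12):
    adv, k, n = best_advantage(ratio)
    print(f"R_on/off = {ratio:.0e}: advantage ~ {adv:.0f} at k = {k}, n = {n}")
```

Because the advantage at every fixed $(k, n)$ grows with the on/off ratio, the best value found over a fixed scan range grows with it as well.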

49
[Figure: plots labeled "Spacetime blowup" and "Energy saved", with axes $k$ and $n$.]
50
Asymptotic Scaling

The potential energy savings factor scales as $R_{i/r} \propto R_{on/off}^{0.4}$,
while the spacetime overhead goes only as $R_{i/r}^{0.45}$, or $R_{on/off}^{0.18}$.
E.g., with an $R_{on/off}$ of $10^9$, you can do worst-case computation in an adiabatic circuit with
an energy savings of up to a factor of 1,200!
But, this point is 700,000× less hardware-efficient!

51
Conclusions

A new, more spacetime-efficient & energy-efficient algorithm for doing arbitrary computations adiabatically has been described.
The energy savings in worst-case computations goes as the 0.4th power of the device on/off ratio.
Best-case computations: 0.5th power.
However, the reduction in spacetime efficiency scales with energy savings to the 1.6th power.
Still much faster than we would like!
Adiabatics can be generally cost-effective, but still only for heavily energy-dominated apps.