EEL 5930 sec. 5, Spring '05: Physical Limits of Computing

http://www.eng.fsu.edu/mpf

- Slides for a course taught by Michael P. Frank in the Department of Electrical & Computer Engineering

Module 6: Fundamental Physical Limits of Computing

- A Brief Survey

Fundamental Physical Limits of Computing

[Diagram: thoroughly confirmed physical theories imply universal facts, which in turn affect quantities in information processing]
- Thoroughly confirmed physical theories: theory of relativity, quantum theory
- Implied universal facts: speed-of-light limit, uncertainty principle, definition of energy, reversibility, 2nd law of thermodynamics, adiabatic theorem, gravity
- Affected quantities in information processing: communications latency, information capacity, information bandwidth, memory access times, processing rate, energy loss per operation

A Slightly More Detailed View

The Speed-of-Light Limit on Information Propagation Velocity

- What are its implications for future computer architectures?

Implications for Computing

- Minimum communications latency!
- Minimum memory-access latency!
- Need for Processing-in-Memory architectures!
- Mesh-type topologies are optimally scalable!
  - Hillis, Vitanyi, Bilardi & Preparata
- Together w. the 3-dimensionality of space, this implies:
  - No network topology with $\omega(n^3)$ connectivity (more than order $n^3$ nodes reachable in $n$ hops) is scalable!
  - Meshes w. 2-3 dimensions are optimally scalable.
  - Precise number depends on reversible computing theory!

How Bad it Is, Already

- Consider a 3.2 GHz processor (off today's shelf)
- In 1 cycle a signal can propagate at most:
  - c/(3.2 GHz) ≈ 9.4 cm
- For a 1-cycle round trip to cache memory & back:
  - Cache location can be at most 4.7 cm away!
- Electrical signals travel at ~0.5 c in typical materials
  - In practice, a 1-cycle memory can be at most 2.34 cm away!
- Already there are logics in labs at 100 GHz speeds!
  - E.g., superconducting logic technology: RSFQ
  - 1-cycle round trips only within 1.5 mm!
  - Much smaller than a typical chip diameter!
- As f increases, architectures must be increasingly local.
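The distances on this slide can be reproduced with a short sketch. The function name and its parameters are illustrative assumptions, not part of the course material; the only physics used is distance = velocity / frequency.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def max_one_cycle_distance(clock_hz, velocity_fraction=1.0, round_trip=True):
    """Farthest distance (m) a signal can cover in one clock cycle.

    velocity_fraction is the signal speed as a fraction of c (assumed input).
    With round_trip=True the signal must also come back within the cycle.
    """
    distance = velocity_fraction * C / clock_hz
    return distance / 2.0 if round_trip else distance

one_way_cm = 100 * max_one_cycle_distance(3.2e9, round_trip=False)        # about 9.4 cm
round_trip_cm = 100 * max_one_cycle_distance(3.2e9)                        # about 4.7 cm
practical_cm = 100 * max_one_cycle_distance(3.2e9, velocity_fraction=0.5)  # about 2.3 cm
rsfq_mm = 1000 * max_one_cycle_distance(100e9)                             # about 1.5 mm
```

Doubling the clock frequency halves every one of these distances, which is why faster logic forces more local architectures.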

Latency Scaling w. Memory Size

- Avg. time to randomly access any one of n bits of storage (accessible information) scales as $\Omega(n^{1/3})$.
- This will remain true in all future technologies!
  - Quantum mechanics gives a minimum size for bits
    - Esp. assuming temperature & pressure are limited.
  - Thus n bits require an $\Omega(n)$-volume region of space.
  - Minimum diameter of this region is $\Omega(n^{1/3})$.
  - At lightspeed, random access takes $\Omega(n^{1/3})$ time!
    - Assuming a non-negative-curvature region of spacetime.
- Of course, specific memory technologies (or a suite of available technologies) may scale even worse than this!

[Figure: a region of space holding n bits, with diameter $\Omega(n^{1/3})$]
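The cube-root scaling can be illustrated numerically. This is a minimal sketch that assumes a spherical memory with a fixed bit density; the density value below is a made-up placeholder, and only the ratio between the two latencies matters.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def min_access_latency(n_bits, bits_per_m3):
    """Light-crossing time (s) from the center to the edge of a sphere
    just large enough to hold n_bits at the given bit density."""
    volume = n_bits / bits_per_m3
    radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return radius / C

ASSUMED_DENSITY = 1e27  # bits per cubic meter, hypothetical (1 bit per nm^3)

small = min_access_latency(1e12, ASSUMED_DENSITY)
large = min_access_latency(1e15, ASSUMED_DENSITY)
ratio = large / small  # 1000x more bits costs about 10x more latency
```

Whatever density is assumed, the ratio stays at the cube root of the growth in capacity.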

Scalability & Maximal Scalability

- A multiprocessor architecture & accompanying performance model is scalable if
  - it can be scaled up to arbitrarily large problem sizes, and/or arbitrarily large numbers of processors, without the predictions of the performance model breaking down.
- An architecture (& model) is maximally scalable for a given problem if
  - it is scalable, and if no other scalable architecture can claim asymptotically superior performance on that problem.
- It is universally maximally scalable (UMS) if it is maximally scalable on all problems!
- I will briefly mention some characteristics of architectures that are universally maximally scalable.

Shared Memory Isn't Scalable

- Any implementation of shared memory requires communication between nodes.
- As the # of nodes increases, we get:
  - Extra contention for any shared BW
  - Increased latency (inevitably).
- Can hide communication delays to a limited extent, by latency hiding:
  - Find other work to do during the latency delay slot.
  - But, the amount of other work available is limited by node storage capacity, parallelizability of the set of running applications, etc.

Unit-Time Message Passing Isn't Scalable

- Model: Any node can pass a message to any other in a single constant-time interval (independent of the total number of nodes)
- Same scaling problems as shared memory!
- Even if we assume BW contention (traffic) isn't a problem, the unit-time assumption is still a problem.
  - Not possible for all N, given speed-of-light limit!
  - Need cube root of N asymptotic time, at minimum.

Many Interconnect Topologies Aren't Scalable

- Suppose we don't require that a node can talk to any other in 1 time unit, but only to selected others.
- Some such schemes still have scalability problems, e.g.:
  - Hypercubes
  - Binary trees, fat trees
  - Butterfly networks
- Any topology in which the number of unit-time hops to reach any one of N nodes is of order less than $N^{1/3}$ is necessarily doomed to failure!
  - Caveat: Except in negative-curvature spacetimes!
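To see why a hypercube falls under this rule, the sketch below compares its hop count with the cube-root figure. It assumes unit-size nodes and ignores constant factors; both function names are illustrative.

```python
import math

def hypercube_diameter(n_nodes):
    """Hops between the farthest pair of nodes in a hypercube of
    n_nodes = 2**d nodes, which equals its dimension d."""
    return math.ceil(math.log2(n_nodes))

def cube_root_hops(n_nodes):
    """Order-of-magnitude hop count that 3-D space permits: N**(1/3)."""
    return n_nodes ** (1.0 / 3.0)

n = 2 ** 30
promised = hypercube_diameter(n)  # 30 hops claimed by the topology
physical = cube_root_hops(n)      # about 1024 unit-length hops across the machine
```

A topology promising about 30 unit-time hops where about 1024 are physically needed must have some hops that take far longer than one time unit, so its performance model breaks down as N grows.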

Only Meshes (or subgraphs thereof) Are Scalable

- See papers by Hillis, Vitanyi, Bilardi & Preparata
- 1-D meshes
  - linear chain, ring, star (w. fixed # of arms)
- 2-D meshes
  - square grid, hex grid, cylinder, 2-sphere, 2-torus, ...
- 3-D meshes
  - crystal-like lattices w. various symmetries
- Caveat:
  - Scalability in 3rd dimension is limited by energy/information I/O considerations!

Amorphous arrangements in ≤3-D, w. local comms., are also ok

Ideally Scalable Architectures

Claim: A 2- or 3-D mesh multiprocessor with a fixed-size memory hierarchy per node is an optimal scalable computer systems design (for any application).

[Figure: a grid of processing nodes, each with its own local memory hierarchy (optimal fixed size), joined by a mesh interconnection network]

Landauer's Principle

- Low-level physics is reversible
  - Means the time-evolution of a state is bijective
  - Deterministic looking backwards in time
    - as well as forwards
- Physical information (like energy) is conserved
  - Cannot be created or destroyed, only reversibly rearranged and modified
  - Implies the 2nd Law of Thermodynamics:
    - Entropy (unknown info.) in a closed, unmeasured system can only increase (as we lose track of the state)
- Irreversible bit erasure really just moves the bit into surroundings, increasing entropy & heat

Scaling in 3rd Dimension?

- Computing based on ordinary irreversible bit operations only scales in 3-D up to a point.
  - Discarded information & associated energy must be removed thru the surface. Energy flux is limited.
  - Even a single layer of circuitry in a high-performance CPU can barely be kept cool today!
- Computing with reversible, adiabatic operations does better:
  - Scales in 3-D, up to a point
  - Then with square root of further increases in thickness, up to a point. (Scales in 2.5 dimensions!)
  - Scales to much larger thickness than irreversible!

Universal Maximum Scalability

- Existence proof for universally maximally scalable (UMS) architectures:
  - Physics itself is a universal maximally scalable "architecture" because any real computer is merely a special case of a physical system.
  - Obviously, no restricted class of real computers can beat the performance scalability of physical systems in general.
- Unfortunately, physics doesn't give us a very simple or convenient programming model.
  - Comprehensive expertise at "programming physics" means mastery of all physical engineering disciplines: chemical, electrical, mechanical, optical, etc.
  - We'd like an easier programming model than this!

Simpler UMS Architectures

- (I propose) any practical UMS architecture will have the following features:
  - Processing elements characterized by constant parameters (independent of # of processors)
    - Makes it easy to scale multiprocessors to large capacities.
  - Mesh-type message-passing interconnection network, arbitrarily scalable in 2 dimensions
    - w. limited scalability in 3rd dimension.
  - Processing elements that can be operated in an arbitrarily reversible way, at least, up to a point.
    - Enables improved 3-D scalability in a limited regime
  - (In long term) Have capability for quantum-coherent operation, for extra perf. on some probs.

Limits on Amount of Information Content

Some Quantities of Interest

- We would like to know if there are limits on:
  - Information density
    - = Bits per unit volume
    - Affects physical size and thus propagation delay across memories and processors. Also affects cost.
  - Information flux
    - = Bits per unit area per unit time
    - Affects cross-sectional bandwidth, data I/O rates, rates of standard-information input & effective-entropy removal
  - Rate of computation
    - = Number of distinguishable-state changes per unit time
    - Affects rate of information processing achievable in individual devices

Bit Density: No Classical Limit

- In classical (continuum) physics, even a single particle has a real-valued position + momentum
  - All such states are considered physically distinct
  - Each position & momentum coordinate in general requires an infinite string of digits to specify:
    - x = 4.592181291845019587661625618991009 meters
    - p = 2.393492301938881726153514427394001 kg m/s
- Even the smallest system contains an infinite amount of information! ⇒ No limit to bit density.
- This picture is the basis for various analog computing models studied by some theoreticians.
- Wee problem: Classical physics is dead wrong!

The Quantum Continuum

- In QM, there are still uncountably many describable states (mathematically possible wavefunctions)
  - Can theoretically take infinite info. to describe
- But, not all this info. has physical relevance!
  - States are only physically distinguishable when their state vectors are orthogonal.
  - States that are only indistinguishably different can only lead to indistinguishably different consequences (resulting states)
    - due to linearity of quantum physics
- There is no physical consequence from presuming an infinite # of bits in one's wavefunction!

Quantum Particle-in-a-Box

- Uncountably many continuous wavefunctions?
- No, can express the wave as a vector over countably many orthogonal normal modes.
  - Fourier transform
- High-frequency modes have higher energy ($E = hf$); a limit on average energy implies they have low probability.

Ways of Counting States

- The entire field of quantum statistical mechanics is all about this, but here are some simple ways:
- For a system w. a constant # of particles:
  - # of states = numerical volume of the position-momentum configuration space (phase space)
    - When measured in units where h = 1.
    - Exactly approached in the macroscopic limit.
  - Unfortunately, # of particles is not usually constant!
- Quantum field theory bounds:
  - Smith-Lloyd bound. Still ignores gravity.
- General relativistic bounds:
  - Bekenstein bound, holographic bound.

Smith-Lloyd Bound

[Smith '95; Lloyd '00]

[Equation: bound on entropy S in terms of mass M, volume V, and q]

- Based on counting modes of quantum fields.
- S = entropy, M = mass, V = volume
- q = number of distinct particle types
- Lloyd's bound is tighter by a constant factor [value not shown]
- Note:
  - Maximum entropy density scales with only the 3/4 power of mass-energy density!
  - E.g., increasing entropy density by a factor of 1,000 requires increasing energy density by a factor of 10,000.

Whence this scaling relation?

- Note that in the field theory limit, $S \propto E^{3/4}$.
- Where does the 3/4 power come from?
- Consider a typical mode in the field spectrum
  - Note that the minimum size of a given wavelet is ~ its wavelength $\lambda$.
  - # of distinguishable wave-packet location states in a given volume $\propto 1/\lambda^3$
  - Each such state carries just a little entropy
    - occupation number of that state (# of photons in it)
  - $\propto 1/\lambda^3$ particles, each of energy $\propto 1/\lambda$, give $\propto 1/\lambda^4$ energy
  - $S \propto 1/\lambda^3$ and $E \propto 1/\lambda^4$ $\Rightarrow$ $S \propto E^{3/4}$
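The 3/4-power relation is easy to check against the "factor of 1,000" example quoted for the Smith-Lloyd bound. The sketch below inverts the relation at fixed volume; the function name is an illustrative assumption.

```python
def energy_factor_needed(entropy_factor):
    """Factor by which energy density must grow to raise entropy density
    by entropy_factor, given S proportional to E**(3/4) at fixed volume."""
    return entropy_factor ** (4.0 / 3.0)

factor = energy_factor_needed(1_000)  # about 10,000
```

Each additional factor of 10 in entropy density costs a bit more than a factor of 21 in energy density, so packing bits more densely into radiation gets steadily more expensive.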

Whence the distribution?

- Could the use of more particles (with less energy per particle) yield greater entropy?
  - What frequency spectrum (power level or particle number density as a function of frequency) gives the largest # of states?
  - Note there is a minimum particle energy in a finite-sized box
- No. The Smith-Lloyd bound is based on the blackbody radiation spectrum.
  - We know this spectrum has the maximum info. content among abstract states, b/c it's the equilibrium state!
  - Empirically verified in hot ovens, etc.

Examples w. Smith-Lloyd Bound

- For systems at the density of water (1 g/cm³), composed only of photons:
  - Smith's example: 1 m³ box holds $6\times10^{34}$ bits
    - 60 kb/Å³
  - Lloyd's example: 1 liter "ultimate laptop", $2\times10^{31}$ b
    - 21 kb/Å³
- Pretty high, but what's wrong with this picture?
  - Example requires very high temperature + pressure!
    - Temperature around 1/2 billion Kelvins!!
    - Photonic pressure on the order of $10^{16}$ psi!!
  - "Like a miniature piece of the big bang." -Lloyd
  - Probably not feasible to implement any time soon!

More Normal Temperatures

- Let's pick a more reasonable temperature: 1356 K (melting point of copper)
  - The entropy density of light is only 0.74 bits/µm³!
  - Less than the bit density in a DRAM today!
  - Bit size is comparable to avg. wavelength of optical-frequency light emitted by melting copper
- Lesson: Photons are not a viable nanoscale info. storage medium at ordinary temperatures.
  - They simply aren't dense enough!
- CPUs that do logic with optical photons can't have their logic devices packed very densely.

Entropy Density of Solids

- Can easily calculate from standard empirical thermochemical data.
  - E.g. see CRC Handbook of Chemistry & Physics.
  - Obtain entropy by integrating heat capacity ÷ temperature, as temperature increases
- Example result, for copper:
  - Has one of the highest entropy densities among pure elements, at atmospheric pressure.
  - At room temperature: 6 bits/atom, 0.5 b/Å³
  - At boiling point: 1.5 b/Å³
- Cesium has one of the highest bits/atom at room temperature, about 15.
  - But, only 0.13 b/Å³
- Lithium has a high bits/mass, 0.7 bits/amu.

[Callouts: "$10^{12}\times$ denser than its light!"; "Related to conductivity?"]

General-Relativistic Bounds

- Note the Smith-Lloyd bound does not take into account the effects of general relativity.
- Earlier bound from Bekenstein: derives a limit on entropy from black-hole physics:
  - $S \le 2\pi E R / (\hbar c)$ nats
  - E = total energy of system
  - R = radius of the system (min. sphere)
- Limit only attained by black holes!
- Black holes have 1/4 nat entropy per square Planck length of surface (event horizon) area.
  - Absolute minimum size of a nat: 2 Planck lengths, square

[Callout: $4\times10^{39}$ b/Å³ average entropy density of a 1-m radius black hole! (Mass ≈ Saturn)]
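The black-hole figures can be checked with a small sketch. It assumes a Schwarzschild (non-rotating, uncharged) black hole and uses rounded CODATA constants; the function names are illustrative.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s
G = 6.67430e-11         # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0       # speed of light, m/s

def black_hole_entropy_bits(radius_m):
    """Horizon entropy in bits: 1/4 nat per square Planck length of area."""
    planck_area = HBAR * G / C**3            # Planck length squared, m^2
    horizon_area = 4.0 * math.pi * radius_m**2
    return horizon_area / (4.0 * planck_area) / math.log(2)

def schwarzschild_mass_kg(radius_m):
    """Mass whose Schwarzschild radius 2GM/c^2 equals radius_m."""
    return radius_m * C**2 / (2.0 * G)

radius = 1.0                                            # meters
volume_A3 = (4.0 / 3.0) * math.pi * radius**3 * 1e30    # cubic angstroms
density_bits_per_A3 = black_hole_entropy_bits(radius) / volume_A3  # about 4e39
mass_kg = schwarzschild_mass_kg(radius)                 # about 6.7e26 kg
```

Because the entropy grows with area while the volume grows with the cube of the radius, the average density falls as the hole gets larger.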

The Holographic Bound

- Based on Bekenstein black-hole bound.
- The information content I within any surface of area A (independent of its energy content!) is $I \le A/(2\ell_P)^2$ nats
  - $\ell_P$ is the Planck length (see lecture on units)
- Implies that any 3D object (of any size) is completely definable via a flat (2D) "hologram" on its surface having Planck-scale resolution.
- This information is all entropy only in the case of a black hole with event horizon = that surface.

Holographic Bound Example

- The age of the universe is 13.7 Gyr ± 1% [WMAP].
  - Radius of currently-observed part would thus be 13.7 Glyr
    - But, due to expansion, its edge is actually 46.6 Glyr away today.
  - Cosmic horizon due to acceleration is 62 Glyr away
- The universe is "flat," so Euclidean formulas apply:
  - The surface area of the eventually-observable universe is:
    - $A = 4\pi r^2 = 4\pi(62\ \mathrm{Glyr})^2 = 4.33\times10^{54}\ \mathrm{m}^2$
  - The volume of the eventually-observable universe is:
    - $V = (4/3)\pi r^3 = (4/3)\pi(62\ \mathrm{Glyr})^3 = 8.48\times10^{80}\ \mathrm{m}^3$
- Now, we can calculate the universe's total info. content, and its average information density!
  - $I = A/(4\ell_P^2)$ nats $= (\pi r^2/\ell_P^2)$ nats $= 4.15\times10^{123}$ nats $= 5.98\times10^{123}$ b
  - $I/V = 7.06\times10^{42}$ b/m³ $= 7.06\times10^{-3}$ b/fm³ $\approx 1$ b/(5.2 fm)³
  - A proton is ~1 fm in radius.
  - About 1 bit per region a few proton diameters across!
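The arithmetic of this example can be rerun as a sketch. It takes the 62 Glyr horizon radius from the slide as given and uses rounded constants; the light-year conversion and the function name are assumptions of the sketch, and small differences from the slide's figures come from rounding.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light, m/s
LIGHT_YEAR_M = 9.4607e15 # meters per light-year (rounded)

def holographic_bound_bits(radius_m):
    """Holographic bound A / (4 * Planck area) for a sphere, in bits."""
    planck_area = HBAR * G / C**3
    area = 4.0 * math.pi * radius_m**2
    return area / (4.0 * planck_area) / math.log(2)

r = 62e9 * LIGHT_YEAR_M                      # horizon radius from the slide
total_bits = holographic_bound_bits(r)       # about 6e123 bits
volume_m3 = (4.0 / 3.0) * math.pi * r**3     # about 8.5e80 m^3
bits_per_m3 = total_bits / volume_m3         # about 7e42 bits/m^3
# Side of the cube that holds one bit on average, in femtometers
cube_side_fm = (1.0 / bits_per_m3) ** (1.0 / 3.0) * 1e15
```

Under these inputs the average density is about 0.007 bits per cubic femtometer, so one bit corresponds to a cube roughly 5 fm on a side.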

Do Black Holes Destroy Information?

- Currently, it seems that no one completely understands exactly how information is preserved during black hole accretion, for later re-emission in the Hawking radiation.
  - Perhaps via infinite time dilation at event horizon?
- Some researchers have claimed that black holes must be doing something irreversible in their interior (destroying information).
  - However, the arguments for this may not be valid.
  - Recent string theory calculations contradict this claim.
- The issue seems not yet fully resolved, but I have many references on it if you're interested.
- Interesting note: Stephen Hawking recently conceded a bet he had made, and decided black holes do not destroy information.

Implications of Information Density Limits

- There is a minimum size for a bit-device.
  - thus there is a minimum communication latency to randomly access a memory containing n bits
    - as we discussed earlier.
- There is also a minimum cost per bit, if there is a minimum cost per unit of matter/energy.
- Implications for communications bandwidth limits
  - coming up

Some Quantities of Interest

- We would like to know if there are limits on:
  - Information density
    - = Bits per unit volume
    - Affects physical size and thus propagation delay across memories and processors. Also affects cost.
  - Information flux
    - = Bits per unit area per unit time
    - Affects cross-sectional bandwidth, data I/O rates, rates of standard-information input & effective-entropy removal
  - Rate of computation
    - = Number of distinguishable-state changes per unit time
    - Affects rate of information processing achievable in individual devices

Communication Limits

- Latency (propagation-time delay) limit from earlier, due to speed of light.
  - Teaches us scalable interconnection technologies
- Bandwidth (information rate) limits:
  - Classical information-theory limit (Shannon)
    - Limit, per-channel, given signal bandwidth & SNR.
  - Limits based on field theory (Smith/Lloyd)
    - Limit given only area and power.
    - Applies to I/O, cross-sectional bandwidths in parallel machines, and entropy removal rates.

Hartley-Shannon Law

- The maximum information rate (capacity) of a single wave-based communication channel is $C = B \log(1 + S/N)$
- Where:
  - B = bandwidth of channel, in frequency (1/T_per) units
  - S = signal power level
  - N = noise power level
  - The log base gives the information unit, as usual
- Law not sufficiently powerful for our purposes!
  - Does not tell us how many effective channels are possible,
    - given available power and/or area.
  - Does not give us any limit if..
    - we are allowed to indefinitely increase bandwidth used,
    - or indefinitely decrease the noise floor (better isolation).
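A short sketch of the law shows both how it is used and why it sets no absolute limit. The function name and the sample numbers are illustrative assumptions; base 2 is chosen so the result is in bits per second.

```python
import math

def shannon_capacity(bandwidth_hz, signal_power, noise_power, log_base=2.0):
    """Channel capacity C = B * log(1 + S/N); log_base=2 gives bits/s."""
    return bandwidth_hz * math.log(1.0 + signal_power / noise_power, log_base)

base = shannon_capacity(1e6, 1023.0, 1.0)        # 1 MHz, S/N = 1023: about 10 Mb/s
wider = shannon_capacity(2e6, 1023.0, 1.0)       # doubling B doubles C
quieter = shannon_capacity(1e6, 1023.0, 1e-3)    # lowering N raises C further
```

Since capacity keeps growing as B increases or N decreases, the law alone gives no ceiling for a device of given area and power, which is the gap the field-theory bounds are meant to fill.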

Density & Flux

- Note that any time you have:
  - a limit $\rho$ on density (per volume) of something,
  - a limit v on its propagation velocity,
- this automatically implies:
  - a limit $F = \rho v$ on the flux
    - by which I mean amount per time per area
- Note also we always have a limit (c) on velocity!
  - At speeds near c, must account for relativistic effects
- Often, slower velocities v < c may also be relevant:
  - Electron saturation velocity, in various materials
  - Max velocity of air or liquid coolant in a cooling system
- Thus, a density limit $\rho$ implies flux limit $F = \rho c$

[Figure: material moving at velocity v through a cross-section]

Relativistic Effects

- For normal matter (bound massive-particle states) moving at a velocity v approaching c:
  - Entropy density increases by a factor $1/\sqrt{1 - v^2/c^2}$
    - Due to relativistic length contraction
  - But, energy density increases by a factor $1/(1 - v^2/c^2)$
    - Both length contraction & mass amplification!
  - ⇒ entropy density scales up only w. square root (1/2 power) of energy density from high velocity
- Note that light travels at c already,
  - & its entropy density scales with energy density to the 3/4 power. ⇒ Light wins in limit as v → c.
    - If you want to maximize entropy flux/energy flux

Max. Entropy Flux Using Light

[Smith '95]

[Equation: bound on entropy flux $F_S$ in terms of energy flux $F_E$ and $\sigma_{SB}$]

- Where:
  - $F_S$ = entropy flux
  - $F_E$ = energy flux
  - $\sigma_{SB}$ = Stefan-Boltzmann constant, $\pi^2 k_B^4 / (60 c^2 \hbar^3)$
- This is derived from the same field-theory arguments as the information density bound.
- Again, the blackbody spectrum maximizes the entropy flux, given the energy flux
  - Because it is the equilibrium spectrum!

Entropy Flux Examples

- Consider a 10 cm wide, flat, square wireless tablet with a 10 W power supply.
- What's its maximum possible rate of bit transmission?
  - Independent of spectrum used, noise floor, etc.
- Answer:
  - Energy flux: 10 W / (2 × (10 cm)²) (use both sides)
  - Smith's formula gives $2.2\times10^{21}$ bps
- What's the rate per square nanometer of surface?
  - Only 109 kbps! (ISDN speed, in a 100 GHz CPU?)
  - 100 Gbps/nm² ⇒ nearly 1 GW power!

Light is not informationally dense enough for high-bandwidth communication between densely packed nanometer-scale devices at reasonable power levels!!!
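The "nearly 1 GW" figure follows from the 3/4-power scaling of entropy with energy for light. The sketch below starts from the slide's own reference point (109 kbps per square nanometer at 10 W) rather than from Smith's formula itself, and assumes the tablet's area stays fixed; the function name and defaults are illustrative.

```python
def power_needed_w(target_bps_per_nm2, ref_bps_per_nm2=109e3, ref_power_w=10.0):
    """Power needed to reach a target entropy flux with light, assuming
    entropy flux scales as (energy flux)**(3/4) at fixed area."""
    return ref_power_w * (target_bps_per_nm2 / ref_bps_per_nm2) ** (4.0 / 3.0)

gigawatt_case = power_needed_w(100e9)  # about 9e8 W for 100 Gbps/nm^2
```

Raising the per-area rate by a factor of about a million therefore costs a factor of about ninety million in power.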

Entropy Flux w. Atomic Matter

- Consider liquid copper ($\rho_S$ = 1.5 b/Å³) moving along at a leisurely v = 10 cm/s
  - BW = $1.5\times10^{27}$ bps through the 10-cm wide square!
  - A million times higher BW than with 10 W light!
  - 150 Gbps/nm² entropy flux!
  - Plenty for nano-scale devices to talk to their neighbors
- Most of this entropy is in the conduction electrons...
  - Less conductive materials have much less entropy
- Can probably do similarly well (or better) just moving the electrons in solid copper. (Higher velocities attainable.)
  - Nano-wires can probably carry >100 Gbps electrically.
- Lesson:
  - For maximum bandwidth density at realistic power levels, encode information using states of matter (electrons) rather than states of radiation (light).

Exercise: Kinetic energy flux?
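The copper numbers are a direct application of flux = density × velocity. This sketch only converts units and multiplies; the constant and function names are illustrative.

```python
ANGSTROM3_PER_M3 = 1e30  # cubic angstroms in a cubic meter
NM2_PER_M2 = 1e18        # square nanometers in a square meter

def entropy_flux(bits_per_m3, velocity_m_per_s):
    """Flux F = rho * v, in bits per second per square meter."""
    return bits_per_m3 * velocity_m_per_s

density = 1.5 * ANGSTROM3_PER_M3        # 1.5 b/A^3 expressed in bits/m^3
flux = entropy_flux(density, 0.10)      # v = 10 cm/s
total_bps = flux * 0.10 ** 2            # through a 10 cm x 10 cm square: 1.5e27
bps_per_nm2 = flux / NM2_PER_M2         # 1.5e11, i.e. 150 Gbps per nm^2
```

The same function gives the light-speed ceiling for any density limit by passing c as the velocity.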

Some Quantities of Interest

- We would like to know if there are limits on:
  - Infropy density
    - = Bits per unit volume
    - Affects physical size and thus propagation delay across memories and processors. Also affects cost.
  - Infropy flux
    - = Bits per unit area per unit time
    - Affects cross-sectional bandwidth, data I/O rates, rates of standard-information input & effective entropy removal
  - Rate of computation
    - = Number of distinguishable-state changes per unit time
    - Affects rate of information processing achievable in individual devices

Computation Speed Limits

The Margolus-Levitin Bound

- The maximum rate $\nu_\perp$ at which a system can transition between distinguishable (orthogonal) states is $\nu_\perp \le 4(E - E_0)/h$
- where:
  - E = average energy (expectation value of energy over all states, weighted by their probability)
  - $E_0$ = energy of lowest-energy or ground state of system
  - h = Planck's constant (converts energy to frequency)
- Implication for computing:
  - A circuit node can't switch between 2 logic states faster than this frequency determined by its energy.

[Callout: This is for pops; rate of nops is half as great.]

Example of Frequency Bound

- Consider Lloyd's 1 liter, 1 kg "ultimate laptop"
  - Total gravitating mass-energy E of $9\times10^{16}$ J
  - Gives a limit of $5\times10^{50}$ bit-operations per second!
- If laptop contains $2\times10^{31}$ bits (photonic maximum),
  - each bit can change state at a frequency of $2.5\times10^{19}$ Hz (25 EHz)
    - 12 billion times higher-frequency than today's 2 GHz Intel processors
    - 250 million times higher-frequency than today's 100 GHz superconducting logic
- But, the Margolus-Levitin limit may be far from achievable in practice!
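The laptop figures can be reproduced from the bound 4(E − E0)/h. This sketch assumes the ground-state energy is negligible next to the rest-mass energy and uses unrounded constants, so its results come out slightly above the slide's rounded values; the function name is illustrative.

```python
H = 6.62607015e-34   # Planck constant, J*s
C = 299_792_458.0    # speed of light, m/s

def max_transition_rate(energy_above_ground_j):
    """Margolus-Levitin bound: orthogonal-state transitions per second."""
    return 4.0 * energy_above_ground_j / H

laptop_energy_j = 1.0 * C**2                          # 1 kg of mass-energy, about 9e16 J
total_rate = max_transition_rate(laptop_energy_j)     # about 5.4e50 per second
per_bit_rate_hz = total_rate / 2e31                   # spread over 2e31 bits: about 2.7e19 Hz
speedup_vs_2ghz = per_bit_rate_hz / 2e9               # on the order of 1e10
```

The same function applied to an accessible energy of a few electron-volts gives rates in the petahertz range, far below the whole-mass figure.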

More Realistic Estimates

- Most of the energy in complex stable structures is not accessible for computational purposes...
  - Tied up in the rest masses of atomic nuclei,
    - Which form anchor points for electron orbitals
  - mass & energy of core atomic electrons,
    - Which fill up low-energy states not involved in bonding,
  - & of electrons involved in atomic bonds
    - Which are needed to hold the structure together
- Conjecture: Can obtain tighter valid quantum bounds on info. densities & state-transition rates by considering only the accessible energy.
  - Energy whose state-information is manipulable.

More Realistic Examples

- Suppose the following system is accessible: 1 electron confined to a (10 nm)³ volume, at an average potential of 10 V above ground state.
  - Accessible energy: 10 eV
  - Accessible-energy density: 10 eV/(10 nm)³
- Maximum entropy in Smith bound: 1.4 bits?
  - Not clear yet whether bound is applicable to this case.
- Maximum rate of change: 9.7 PHz
  - 5 million × typical frequencies in today's CPUs
  - 100,000 × frequencies in today's superconducting logics

Summary of Fundamental Limits