Loading...

PPT – Quantum Computer Architectures for Physical Simulations Dr. Mike Frank University of Florida CISE Department mpf@cise.ufl.edu PowerPoint presentation | free to download - id: 808c8-ZDc1Z

The Adobe Flash plugin is needed to view this content

Quantum Computer Architectures for Physical

Simulations Dr. Mike Frank University of

Florida CISE Department mpf_at_cise.ufl.edu

- Presented at
- Quantum Computation for Physical Modeling

Workshop Wed., May 8, 2002

Summary of Current Research

- One of my two major research projects
- Reversible Quantum Computing Project
- Studying developing physical computing theory
- Technology-independent physical limits of

computing - Ultimate models of computing for complexity

theory - Nanocomputer systems engineering
- Optimizing cost-efficiency scalability of future

computing - Reversible computing quantum computing
- Realistic models, CPU architectures, optimal

scaling advantages

MIT 1995-1999, UF 1999-

RevComp project heritage

- Grew out of work started by the MIT Information

Mechanics group in the 1970s. - Key members Fredkin, Toffoli, Margolus
- Occasional collaboration with Feynman, Bennett,
- This group laid much theoretical groundwork for

reversible ( eventually quantum) computing. - MIT Reversible Computing project (1990s)
- Group leaders Knight, Margolus
- Students Younis, DSouza, Becker, Vieri, Frank,

Ammer - Focus Reducing reversible computing to practice
- CMOS circuit styles, test chips, architectures,

complexity theory, algorithms, high-level

languages.

RevComp group at UF

- Reversible Quantum Computing group
- Organized by myself (CISE/ECE depts.)
- UF collabs. in CISE, ECE, Math, Phys./Chem.
- Notable graduate DoRon Motter (highest honors)
- Now doing PhD work in quantum circuits at U.

Mich. - 2 current Ph.D students (all ECE)
- Current focus
- Removing remaining barriers to near-term

practicality of reversible computing - Improved circuit styles, efficient power supplies
- Other applications
- Long-term study of physical computing theory and

scaling advantages of reversible/quantum models.

Who We Are

- Dr. Michael Frank
- MIT Ph.D. stud. postdoc, 1996-97 1999.
- Area exam studies on quantum computing.
- DARPA-funded reversible computing research.
- 1999-now Head of Reversible Quantum Computing

group at UFs CISE dept. - http//www.cise.ufl.edu/research/revcomp

Feynman

Some lineage

Knight

Sussman

Minsky

Frank

Fredkin

Toffoli

Margolus

Who We Are, cont.

- DoRon Motter
- Undergrad in UF CISE dept., 1997-2000.
- Coursework in CS quantum mechanics.
- Sr. highest honors thesis w. Dr. Frank, 2000.
- Now a Masters student at U. Mich.
- Advisor Igor Markov, U. Mich.
- DARPA-funded project on quantum logic systhesis

Overview of Talk

- 1. Optimally scalable QC models/architectures
- Universally maximally scalable (UMS) models
- Physical limits shape the ultimate model
- Appropriate programming models
- 2. Physics sim. algorithms on our QC model
- Many-particle Schrödinger equation
- Numerically stable classical reversible sims
- Quantum versions
- Quantum field theory
- 3. QC simulation on classical computers
- Visualization techniques, various optimizations
- Polynomial-space techniques

1. Optimally scalable quantum computer

architectures

- Universally maximally scalable (UMS) models
- Physical limits shape the ultimate model
- Appropriate programming models

Nanocomputer systems engineering

- A key goal of my long-term research program
- Develop key foundations for a new discipline of

nanocomputer systems engineering suited for

meeting the challenges of computing at the

nanoscale. - And convey it to peers, teach it to students.
- The new field will integrate concerns methods

from a variety of disciplines - Physics of computing ? Algorithm design
- Systems engineering ? Electrical eng., etc.
- Computer architecture ? Quantum computing!
- Complexity theory

Source ITRS 99

½CV2 based on ITRS 99 figures for Vdd and

minimum transistor gate capacitance. T300 K

Physical Computing Theory

- The study of theoretical models of computation

that are based on (or closely tied to) physics - Make no nonphysical assumptions!
- Includes the study of
- Fundamental physical limits of computing
- Physically-based models of computing
- Includes reversible and/or quantum models
- Ultimate (asymptotically optimal) models
- An asymptotically tight Churchs thesis
- Model-independent basis for complexity theory
- Basis for design of future nanocomputer

architectures - Asymptotic scaling of architectures algorithms
- Physically optimal algorithms

Ultimate Models of Computing

- We would like models of computing that match the

real computational power of physics. - Not too weak, not too strong.
- Most traditional models of computing only match

physics to within polynomial factors. - Misleading asymptotic performance of algorithms.
- Not good enough to form the basis for a real

systems engineering optimization of

architectures. - Develop models of computing that are
- As powerful as physically possible on all

problems - Realistic within asymptotic constant factors

Unstructured Search Problem

- Given a set S of N elements and a black-box

function fS?0,1, find an element x?S such that

f(x)1, if one exists (or if not, say so). - Any NP problem can be cast as an unstructured

search problem. - Not necessarily the optimal approach, however.
- Bounds on classical run-time
- ?(N) expected queries in worst case (0 or 1

solns) - Have to try N/2 elements on average before

finding soln. - Have to try all N if there is no solution.
- If elements are length-? bit strings,
- Expected trials is ?(2?) - exponential in ?.

Bad!

Quantum Unstructured Search

- Minimum time to solve unstructured search problem

on a (serial) quantum computer is - ?(N1/2) queries (2?/2) (21/2)?
- Still exponential, but with a smaller base.
- The minimum of queries can be achieved using

Grovers algorithm.

Classical Unstructured Search

- The classical serial algorithm takes ?(N) time.
- But Suppose we search in parallel!
- Have MltN processors running in parallel.
- Each searches a different subset of N/M elements

of the search space. - If processors are ballistic reversible
- Can cluster them in a dense mesh of diameter

?(M1/3). - Time accounting
- Computation time ?(N/M)
- Communication time ?(M1/3) (at lightspeed)
- Total T ? N/M M1/3 is minimized when M ?

N3/4 ? N1/4 Faster than Grovers

algorithm!

M1/3

M

ClassicalQuantum Parallelism

- Similar setup to classical parallelism
- M processors, searching N/M items each.
- Except, each processor uses Grovers algorithm.
- Time accounting
- Computation T ? (N/M)1/2
- Communication T ? M1/3 (as before)
- Total T ? (N/M)1/2 M1/3
- Total is minimized when M?N 3/5
- Minimized total is T ? N1/5.
- I.e., quantum unstructured search is really only

N1/4/N1/5 N1/20 faster than classical!

Scalability Maximal Scalability

- A multiprocessor architecture accompanying

performance model is scalable if - it can be scaled up to arbitrarily large

problem sizes, and/or arbitrarily large numbers

of processors, without the predictions of the

performance model breaking down. - An architecture ( model) is maximally scalable

for a given problem if - it is scalable and if no other scalable

architecture can claim asymptotically superior

performance on that problem - It is universally maximally scalable (UMS) if it

is maximally scalable on all problems! - I will briefly mention some characteristics of

architectures that are universally maximally

scalable

Universal Maximum Scalability

- Existence proof for universally maximally

scalable (UMS) architectures - Physics itself is a universal maximally scalable

architecture because any real computer is

merely a special case of a physical system. - Obviously, no real computer can beat the

performance of physical systems in general. - Unfortunately, physics doesnt give us a very

simple or convenient programming model. - Comprehensive expertise at programming physics

means mastery of all physical engineering

disciplines chemical, electrical, mechanical,

optical, etc. - Wed like an easier programming model than this!

Physics Constrains the Ultimate Model

Limits of Quantum Computers

- Quantum computers remain subject to all the

fundamental limits previously mentioned! - Entropy density limit - only 2n distingable

states! - Contrary to press manglings, a quantum computer

cannot store exponentially large amounts of

arbitrary data! - Information propagation limit - at most c
- Bells theorem, teleportation, superluminal

wave velocities do not give gtc information

propagation - Quantum field theory is explicitly local
- Computation rate limit - at most 4E/h rate of

orthogonal transitions, given available energy E. - Non-orthogonal ones are faster, but accomplish

less work - Speedups are due to fewer ops needed, not faster

ops

Simple UMS Architectures

- (I propose) any practical UMS architecture will

have the following features - Processing elements characterized by constant

parameters (independent of of processors) - Mesh-type message-passing interconnection

network, arbitrarily scalable in 2 dimensions - w. limited scalability in 3rd dimension.
- Processing elements that can be operated in a

highly reversible mode, at least up to some

limit. - Enables improved 3-d scalability, in a limited

regime - (In long term) Have capability for

quantum-coherent operation, for extra perf. on

some probs.

Ideally Scalable Architectures

Conjecture A 2- or 3-D mesh multiprocessor with

a fixed-size memory hierarchy per node is an

optimal scalable computer systems design (for any

application).

Processing Node

Processing Node

Processing Node

Local memory hierarchy (optimal fixed size)

Local memory hierarchy (optimal fixed size)

Local memory hierarchy (optimal fixed size)

Processing Node

Processing Node

Processing Node

Local memory hierarchy (optimal fixed size)

Local memory hierarchy (optimal fixed size)

Local memory hierarchy (optimal fixed size)

Mesh interconnection network

Some device parameters

- The following parameters are considered

fixed (for a given device/node technology) - (Maximum) number of bits of state per node
- (Minimum) node volume
- (Minimum) transition time (per ? transition)
- (Minimum) entropy generated per bit erased (?k ln

2) - (Minimum) static entropy generation rates
- For devices even just quiescently maintaining

state info - Related to energy leakage rates, decoherence

times - (Minimum) adiabatic frictional coefficient
- For devices undergoing reversible transitions
- (Maximum) quality factor Q of active transitions
- (Minimum) device cost

System-level parameters

- The following parameters may be adjusted as the

problem size increases - Number of nodes utilized
- Arrangement of utilized nodes in x, y, z
- Spreading nodes out optimizes perf. on some

problems - Rate of change of an externally-applied clocking

signal (time-dependent potential) - Allows trading off adiabaticity vs. speed of

computation, as a function of the number of nodes

Entropy coefficients of some reversible logic

gate operations

- From Frank, Ultimate theoretical models of

nanocomputers (Nanotechnology, 1998) - SCRL, circa 1997 1 b/Hz
- Optimistic reversible CMOS 10 b/kHz
- Merkles quantum FET 1.2 b/GHz
- Nanomechanical rod logic .07 b/GHz
- Superconducting PQ gate 25 b/THz
- Helical logic .01 b/THz

How low can you go? We dont really know!

Thermodynamics Scalability

- The fastest parallel algorithms for many problems

ideally require a 3-D mesh topology. - Minimizes communication latencies between points
- But, entropy flux bounds imply entropy generation

rates can scale only proportionally to systems

2-D outer (convex hull) surface area. - Assuming upper bounds on temperature pressure
- So, can harness 3rd dimension only to the extent

that useful operations can be made reversible. - Optimizing efficiency requires a careful tradeoff

between performance, power, cost...

(No Transcript)

Reversible/Adiabatic CMOS

- Chips designed at MIT, 1996-1999

Minimum Losses w. Leakage

Etot Eadia Eleak

Eleak Pleaktr

Eadia cE / tr

Reversible Emulation - Ben89

k 2 n 3

k 3 n 2

Bennett 89 alg. is not optimal

k 2 n 3

k 3 n 2

Just look at all the spacetime it wastes!!!

Parallel Frank02 algorithm

- We can move the triangles closer together, to

eliminate the wasted spacetime. - Resulting algorithm is linear time for all n and

k and dominates Ben89 for time, spacetime,

energy!

k3 n2

k2 n3

Emulated time

k4 n1

Real time

Spacetime blowup

Energy saved

k

n

Perf. scaling w. of devices

- If alg. is not limited by communications needs,
- Use irreversible processors spread in a 2-D

layer. - Remove entropy along perpendicular dimension.
- No entropy rate limits,
- so no speed advantage from reversibility.
- If alg. requires only local communication, latency

? cyc. time, in an NDNDND mesh, - Leak-free reversible machine perf. scales better!
- Irreversible tcyc ?(ND1/3)
- Reversible tcyc ?(ND1/4) ?(ND1/12) faster!
- To boost reversibility speedup by 10, one must

consider 1036-CPU machines (1.7 trillion moles

of CPUs!) - 1.7 trillion moles of H atoms weighs 1.7 million

metric tons! - A 100-m tall hill of H-atom sized CPUs!

Lower bound on irrev. time

- Simulate Nproc ND3 cells for Nsteps ND steps.
- Consider a sequence of ND update steps.
- Final cell value depends on ND4 ops in time T.
- All ops must occur within radius r cT of cell.
- Surface area A ? T2, rate Rop ? T2 sustainable.
- Nops ? Rop T ? T3 needs to be at least ND4.
- ? T must be ?(ND4/3) to do all ND steps.
- Average time per step must be ?(ND1/3).
- Any irreversible machine (of any technology or

architecture) must obey this bound!

Irreversible 3-D Mesh

Reversible 3-D Mesh

Non-local Communication

- Best computational task for reversibility
- Each processor must exchange messages with

another that is ND1/2 nodes away on each cycle - Unsure what real-world problem demands this

pattern! - In this case, reversible speedup scales with

number of CPUs to only the 1/18th power. - To boost reversibility speedup by 10, only

need 1018 (or 1.7 micromoles) of CPUs - If each was a 1-nm cluster of 100 C atoms, this

is only 2 mg mass, volume 1 mm3. - Current VLSI Need cost level of 25B before

you see a speedup.

Open issues for reversible comp.

- Integrate realistic fundamental models of the

clocking system into the engineering analysis. - There is an open issue about the scalability of

clock distribution systems. - Exist quantum bounds on reusability of timing

signals. - Not yet clear if reversible clocking is scalable.
- Fortunately, self-timed reversible computing also

appears to be a possibility. - Not yet clear if this approach works above 1-D

models. - Simulation experiments planned to investigate

this. - Develop efficient physical realizations of

nano-scale bit-devices timing systems.

Timing in Adiabatic Systems

- When multiple adiabatic devices interact, the

relative timing must be precise, in order to

ensure that adiabatic rules are met. - There are two basic approaches to timing
- Global (a.k.a. clocked, a.k.a. synchronous)

timing - Approach in nearly all conventional irreversible

CPUs - Basis for all practical adiabatic/quantum

computing mechanisms proposed to date - Local (a.k.a. self-timed, a.k.a. asynchronous)

timing - Implemented in a few commercial irreversible

chips. - Feynman 86 showed a self-timed serial reversible

computation was implementable in QM, in principle - Margolus 90 extended this to a 2-D model with

1-D of parallelism. - Will it work in 3-D?

Global Timing

- Examples of adiabatic systems designed on the

basis of global, synchronous timing - Adiabatic CMOS with external power/clock rails
- Superconducting parametric quantron (Likharev)
- Adiabatic Quantum-Dot Cellular Automaton (Lent)
- Adiabatic mechanical logics (Merkle, Drexler)
- All proposed quantum computers
- But, a problem Synchronous timing may not scale!
- Work by Janzig others raises issues of possible

limits due to quantum uncertainty. Unresolved.

Clock/Power Supply Desiderata

- Requirements for an adiabatic timing signal /

power supply - Generate trapezoidal waveform with very flat

high/low regions - Flatness limits Q of logic.
- Waveform during transitions is ideally linear,
- But this does not affect maximum Q, only energy

coefficient. - Operate resonantly with logic, with high Q.
- Power supply Q will limit overall system Q
- Reasonable cost, compared to logic it powers.
- If possible, scale Q ? t (cycle time)
- Required to be considered an adiabatic mechanism.
- May conflict w. inductor scaling laws!
- At the least, Q should be high at leakage-limited

speed

(Ideally, independent of t.)

Supply concepts in my research

- Superpose several sinusoidal signals from

phase-synchronized oscillators at harmonics of

fundamental frequency - Weight these frequency components as per Fourier

transform of desired waveform - Create relatively high-L integrated inductors via

vertical, helical metal coils - Only thin oxide layers between turns
- Use mechanically oscillating, capacitive MEMS

structures in vacuo as high-Q (10k) oscillator - Use geometry to get desired wave shape directly

Newer Supply Concepts

- Transmission-line-based adiabatic resonators.
- See transparency.
- MEMS-based resonant power supply
- See transparency, next slide
- Ideal adiabatic supplies - Can they exist?
- Idealized mechanical model See transparency.
- But, may be quantum limits to reusability/scalabil

ity of global timing signals. - This is a very fundamental issue!

A MEMS Supply Concept

- Energy stored mechanically.
- Variable coupling strength -gt custom wave shape.
- Can reduce losses through balancing, filtering.
- Issue How to adjust frequency?

Programming Model Desiderata

- Should permit optimally efficient quantum

algorithms (constant-factor slowdowns only). - Should have reasonable constant factor overheads.
- Unit cell complexity should be kept low for ease

of design assembly. - Should provide a clear separation between program

and data, where appropriate. - Should be straightforward to program (and to

write compilers for).

Candidate Programming Model

- Unit-cell capabilities
- A small number of n-qubit integer registers.
- Perform programmable 2- and 3- qubit ops on

selected data bits (or n-qubit words) - Classical digital ops CNOT, CCNOT, swaps, etc.
- 1-bit analog unitary ops
- w. precision up to the limit of the qubit device

technology - Treat imprecision like decoherence noise, correct

it away? - Flow of control
- For reversibility, could have 2 instruction

registers, which take turns executing loading

each other. - Data movement
- Streaming between neighboring unit cells.

Node Architecture Sketch

Instruction registers

Data path to/from neighbor node (in 2d or 3d)

I/O registers

Execution unit

Data registers

2. Simulating physical systems on our QC model

- Many-particle Schrödinger equation
- Numerically stable classical reversible sims
- Quantum equivalents
- Quantum field theory

Simulating Wave Mechanics

- The basic problem situation
- Given
- A (possibly complex) initial wavefunction

in an N-dimensional position basis,

and - a (possibly complex and time-varying) potential

energy function , - a time t after (or before) t0,
- Compute
- Many practical physics applications...

The Problem with the Problem

- An efficient technique (when possible)
- Convert V to the corresponding Hamiltonian H.
- Find the energy eigenstates of H.
- Project ? onto eigenstate basis.
- Multiply each component by .
- Project back onto position basis.
- Problem
- It may be intractable to find the eigenstates!
- We resort to numerical methods...

History of Reversible Schrödinger Sim.

See http//www.cise.ufl.edu/mpf/sch

- Technique discovered by Ed Fredkin and student

William Barton at MIT in 1975. - Subsequently proved by Feynman to exactly

conserve a certain probability measure - Pt Rt2 It?1It1
- 1-D simulations in C/Xlib written by Frank at MIT

in 1996. Good behavior observed. - 1 2-D simulations in Java, and proof of

stability by Motter at UF in 2000. - User-friendly Java GUI by Holz at UF, 2002.

(Rreal, Iimag., ttime step index)

Difference Equations

- Consider any system with state x that evolves

according to a diff. eq. that is 1st-order in

time x f(x) - Discretize time to finite scale ?t, and use a

difference equation instead x(t ?t) x(t)

?t f(x(t)) - Problem Behavior not always numerically stable.
- Errors can accumulate and grow exponentially.

Centered Difference Equations

- Discretize derivatives in a symmetric fashion
- Leads to update rules like x(t ?t) x(t ?

?t) 2?t f(x(t)) - Problem States at odd- vs. even- numbered time

steps not constrained to stay close to each other!

2?tf

x1

g

x2

g

x3

g

x4

Centered Schrödinger Equation

- Schrödingers equation for 1 particle in 1-D
- Replace time ( also space) derivatives with

centered differences. - Centered difference equation has real part at odd

times that depends only on imaginary part at even

times, vice-versa. - Drift not an issue - real imaginary parts

represent different state components!

R1

g

?

I2

g

R3

g

I4

?

Proof of Stability

- Technique is proved perfectly numerically stable

convergent assuming V is 0 and ?x2/?t gt ?/m

(an angular velocity) - Elements of proof
- Lax-Richmyer equivalence convergence?stability.
- Analyze amplitudes of Fourier-transformed basis
- Sufficient due to Parsevals relation
- Use theorem (cf. Strikwerda) equating stability

to certain conditions on the roots of an

amplification polynomial ?(g,?), which are

satisfied by our rule. - Empirically, technique looks perfectly stable

even for more complex potential energy funcs.

Phenomena Observed in Model

- Perfect reversibility
- Wave packet momentum
- Conservation of probability mass
- Harmonic oscillator
- Tunnelling/reflection at potential energy

barriers - Interference fringes
- Diffraction

Gaussian wave packet moving to the right Array

of small sharp potential-energy barriers

Initial reflection/refraction of wave packet

A little later

Aimed a little higher

A faster-moving particle

Interesting Features of this Model

- Can be implemented perfectly reversibly, with

zero asymptotic spacetime overhead - Every last bit is accounted for!
- As a result, algorithm can run adiabatically,

with power dissipation approaching zero - Modulo leakage frictional losses
- Can map it to a unitary quantum algorithm
- Direct mapping
- Classical reversible ops only, no quantum speedup
- Indirect (implicit) mapping
- Simulate p particles on kd lattice sites using pd

lg k qubits - Time per update step is order pd lg k instead of

kpd

Implicit Mapping

- Use pd integer registers xj, each lg k qubits

long - Amplitude of joint state of all registers

represents amplitude of wavefunction point x - The difference equation term for dimension j

amounts to multiplication of state by matrix - (can be normalized to be) nearly unitary for

small ? - Idea Can approximate Dj using 1-qubit ops on

low-order bit of xi, plus CCNOTs to do carries.

xi

Field Theory Systems

- Goal Simulate field theory for p particle types

in d-dimensional space over kd lattice sites - General approach At each lattice site, have p

integer qubit registers nj, denoting the

occupancy number of particle type j. - nj 0 or 1 for each type of fermion
- nj from 0 to nmax for bosons
- nmax determined by available total energy
- Use quantum LGCA model (type I)
- Interact particles at site using collision

operator - Includes particle creation/annihilation operators
- Stream particles between sites after collision

step

3. Quantum computer simulation on classical

computers

- Visualization techniques, various optimizations
- Polynomial-space techniques

Simulation of QC Algorithms

- Visualization
- Project states onto 2-D/3-D spaces
- Corresponding to register pairs/triplets.
- Use HSV color space to represent amplitudes.
- Visualize gate ops with continuous color change.
- Simulation Efficiency
- Optimizations
- Track only states having non-zero amplitude.
- Linear-space simulations of n-qubit systems.

Visualization Technique

- Illustration 3 stages of Shors algorithm
- Register value ? spatial position of pixel
- Phase angle ? pixel color hue.
- Magnitude ? pixel color saturation.

Initial State

After doing Hadamard transform on all bits of a

After modular exponentiation bxa (mod N)

State After Fourier Transform

Efficient QC Simulations

- Task Simulate an n-qubit quantum computer.
- Maximally stupid approach
- Store a 2n-element vector
- Multiply it by a full 2n2n matrix for each gate

op - Some obvious optimizations
- Never store whole matrix (compute dynamically)
- Store only nonzero elements of state vector
- Especially helpful when qubits are highly

correlated - Do only constant work per nonzero vector element
- Scatter amplitude from each state to 1 or 2

successors - Drop small-probability-mass sets of states
- Linearity of QM implies no chaotic growth of

errors

Linear-space quantum simulation

- A popular myth
- Simulating an n-qubit (or n-particle) quantum

system takes e?(n) space (as well as time). - The usual justification
- It takes e?(n) numbers even to represent a single

?(n)-dimensional state vector, in general. - The hole in that argument
- Can simulate the statistical behavior of a

quantum system w/o ever storing a state vector! - Result BQP ? PSPACE known since BV93...
- But practical poly-space sims are rarely described

The Basic Idea

- Inspiration
- Feynmans path integral formulation of QED.
- Gives the amplitude of a given final

configuration by accumulating amplitude over all

paths from initial to final configurations. - Each path consists of only a single

?(n)-coordinate configuration at each time, not a

full wavefunction over the configuration space. - Can enumerate all paths, while only ever

representing one path at a time.

Simulating Quantum Computations

- Given
- Any n-qubit quantum computation, expressed as a

sequence of 1-qubit gates and CNOT gates. - An initial state s0 which is just a basis state

in the classical bitwise basis, e.g. ?00000?. - Goal
- Generate a final basis state stochasically with

the same probability distribution as the quantum

computer would do.

U2

U3

U4

U1

Matrix Representation

- Consider each gate as rank-2n unitary matrix
- Each CNOT application is a 0-1 (permutation)

matrix - a classical reversible bit-operation. - With appropriate row ordering, each Ui gate

application is block-diagonal, w. each 22 block

equal to Ui. - We need never represent these full matrices!
- The 1 or 2 nonzero entries in a given row can be

located computed immediately given the row id

(bit string) and Ui.

The Linear-Space Algorithm

- Generate a random coin c?0,1.
- Initialize probability accumulator p?0.
- For each final n-bit string y at time t,
- Compute its amplitude ?(y) as follows
- Generate its possible 1 or 2 predecessor strings

x1 (and maybe x2) given the gate-op preceding t. - For each predecessor, compute its amplitude at

time t?1 recursively using this same algorithm, - unless t0, in which case ?1 if ?x?s0, 0

otherwise. - Add predecessor amplitudes, weighted by entries.
- Maybe output y, using roulette wheel algorithm
- Accumlate Pry into total p ? p ?(y)2
- Output y and halt if pgtc.

A Further Optimization

- Dont even have to enumerate all final states!
- Instead Stochasically follow a trajectory.
- Basic idea
- Keep track of 1 current state its amplitude

?0. - For CNOTs Deterministically transform state.
- For Us
- Calculate amplitude ?1 of neighbor state w.

path-integral - Calculate amplitudes ?0 and ?1 after qubit op
- Choose 1 successor as new current state, using

?2 distrib.

u00

?0

?0

Current state

u10

Possible successors

u01

?1

?1

u11

Neighbor state

Complexity Comparison

- To simulate t gate ops (c CNOTs u 1-bit unitary

ops) of an n-qubit quantum computer - Space Time
- Traditional method 2n t2n
- Path-integral method tn n2t
- (Actually, only the u unitary ops, not all t ops

or all n qubits, contribute to any of the

exponents here.) - Upshot
- Lower space usage can allow larger systems to be

simulated, for short periods. - Run time is competitive for case when t lt n

Conclusion

- A grab-bag of tricks and techniques
- Outline of a research program is taking shape
- Quantum computing is really interesting...
- Now if only I can get someone to pay me to devote

my full time to studying it!

Slides left over from USC talk

- To import as needed

Reversibility of Physics

- The universe is (apparently) a closed system
- Closed systems evolve via unitary transforms
- Apparent wavefunction collapse doesnt contradict

this (confirmed by work of Everett, Zurek, etc.) - Time-evolution of concrete state of universe (or

closed subsystems) is reversible - Invertible (bijective)
- Deterministic looking backwards in time
- Total info. (log of poss. states) doesnt

decrease - Can increase, though, if volume is increasing
- Information cannot be destroyed!

Illustrating Landauers principle

Before bit erasure

After bit erasure

s0

0

0

s0

N states

sN-1

0

0

sN-1

Unitary (1-1) evolution

2N states

s0

sN

1

0

N states

0

sN-1

s2N-1

1

Benefits of Reversible Computing

- Reduces energy/cooling costs of computing
- Improves performance per unit power consumed
- Given heat flux limits in the cooling system,
- Improves performance per unit convex hull area
- A faster machine in a given size box.
- For communication-intensive parallel algorithms,
- Improves performance, period!
- All these benefits are by small polynomial

factors in the integration scale the device

properties.

Quantum Computing pros/cons

- Pros
- Removes an unnecessary restriction on the types

of quantum states ops usable for computation. - Opens up exponentially shorter paths to solving

some types of problems (e.g., factoring,

simulation) - Cons
- Sensitive, requires overhead for error

correction. - Also, still remains subject to fundamental

physical bounds on info. density, rate of state

change! - Myth A quantum memory can store an

exponentially large amount of data. - Myth A quantum computer can perform operations

at an exponentially faster rate than a classical

one.

Some goals of my QC work

- Develop a UMS model of computation that

incorporates quantum computing. - Design simulate quantum computer architectures,

programming languages, etc. - Describe how to do the systems-engineering

optimization of quantum computers for various

problems of interest.

Conclusion

- As we near the physical limits of computing,
- Further improvements will require an increasingly

sophisticated interdisciplinary integration of

concerns across many levels of engineering. - I am developing a principled nanocomputer systems

engineering methodology - And applying it to the problem of determining the

real cost-efficiency of new models of computing - Reversible computing
- Quantum computing
- Building the foundations of a new discipline that

will be critical in coming decades.