Title: Quantum Computer Architectures for Physical Simulations Dr. Mike Frank University of Florida CISE Department mpf@cise.ufl.edu
1Quantum Computer Architectures for Physical
Simulations Dr. Mike FrankUniversity of
FloridaCISE Departmentmpf_at_cise.ufl.edu
- Presented at
- Quantum Computation for Physical Modeling
WorkshopWed., May 8, 2002
2Summary of Current Research
- One of my two major research projects
- Reversible Quantum Computing Project
- Studying developing physical computing theory
- Technology-independent physical limits of
computing - Ultimate models of computing for complexity
theory - Nanocomputer systems engineering
- Optimizing cost-efficiency scalability of future
computing - Reversible computing quantum computing
- Realistic models, CPU architectures, optimal
scaling advantages
MIT 1995-1999,UF 1999-
3RevComp project heritage
- Grew out of work started by the MIT Information
Mechanics group in the 1970s. - Key members Fredkin, Toffoli, Margolus
- Occasional collaboration with Feynman, Bennett,
- This group laid much theoretical groundwork for
reversible ( eventually quantum) computing. - MIT Reversible Computing project (1990s)
- Group leaders Knight, Margolus
- Students Younis, DSouza, Becker, Vieri, Frank,
Ammer - Focus Reducing reversible computing to practice
- CMOS circuit styles, test chips, architectures,
complexity theory, algorithms, high-level
languages.
4RevComp group at UF
- Reversible Quantum Computing group
- Organized by myself (CISE/ECE depts.)
- UF collabs. in CISE, ECE, Math, Phys./Chem.
- Notable graduate DoRon Motter (highest honors)
- Now doing PhD work in quantum circuits at U.
Mich. - 2 current Ph.D students (all ECE)
- Current focus
- Removing remaining barriers to near-term
practicality of reversible computing - Improved circuit styles, efficient power supplies
- Other applications
- Long-term study of physical computing theory and
scaling advantages of reversible/quantum models.
5Who We Are
- Dr. Michael Frank
- MIT Ph.D. stud. postdoc, 1996-97 1999.
- Area exam studies on quantum computing.
- DARPA-funded reversible computing research.
- 1999-now Head of Reversible Quantum Computing
group at UFs CISE dept. - http//www.cise.ufl.edu/research/revcomp
Feynman
Some lineage
Knight
Sussman
Minsky
Frank
Fredkin
Toffoli
Margolus
6Who We Are, cont.
- DoRon Motter
- Undergrad in UF CISE dept., 1997-2000.
- Coursework in CS quantum mechanics.
- Sr. highest honors thesis w. Dr. Frank, 2000.
- Now a Masters student at U. Mich.
- Advisor Igor Markov, U. Mich.
- DARPA-funded project on quantum logic systhesis
7Overview of Talk
- 1. Optimally scalable QC models/architectures
- Universally maximally scalable (UMS) models
- Physical limits shape the ultimate model
- Appropriate programming models
- 2. Physics sim. algorithms on our QC model
- Many-particle Schrödinger equation
- Numerically stable classical reversible sims
- Quantum versions
- Quantum field theory
- 3. QC simulation on classical computers
- Visualization techniques, various optimizations
- Polynomial-space techniques
81. Optimally scalable quantum computer
architectures
- Universally maximally scalable (UMS) models
- Physical limits shape the ultimate model
- Appropriate programming models
9Nanocomputer systems engineering
- A key goal of my long-term research program
- Develop key foundations for a new discipline of
nanocomputer systems engineering suited for
meeting the challenges of computing at the
nanoscale. - And convey it to peers, teach it to students.
- The new field will integrate concerns methods
from a variety of disciplines - Physics of computing ? Algorithm design
- Systems engineering ? Electrical eng., etc.
- Computer architecture ? Quantum computing!
- Complexity theory
10Source ITRS 99
11½CV2 based on ITRS 99 figures for Vdd and
minimum transistor gate capacitance. T300 K
12Physical Computing Theory
- The study of theoretical models of computation
that are based on (or closely tied to) physics - Make no nonphysical assumptions!
- Includes the study of
- Fundamental physical limits of computing
- Physically-based models of computing
- Includes reversible and/or quantum models
- Ultimate (asymptotically optimal) models
- An asymptotically tight Churchs thesis
- Model-independent basis for complexity theory
- Basis for design of future nanocomputer
architectures - Asymptotic scaling of architectures algorithms
- Physically optimal algorithms
13Ultimate Models of Computing
- We would like models of computing that match the
real computational power of physics. - Not too weak, not too strong.
- Most traditional models of computing only match
physics to within polynomial factors. - Misleading asymptotic performance of algorithms.
- Not good enough to form the basis for a real
systems engineering optimization of
architectures. - Develop models of computing that are
- As powerful as physically possible on all
problems - Realistic within asymptotic constant factors
14Unstructured Search Problem
- Given a set S of N elements and a black-box
function fS?0,1, find an element x?S such that
f(x)1, if one exists (or if not, say so). - Any NP problem can be cast as an unstructured
search problem. - Not necessarily the optimal approach, however.
- Bounds on classical run-time
- ?(N) expected queries in worst case (0 or 1
solns) - Have to try N/2 elements on average before
finding soln. - Have to try all N if there is no solution.
- If elements are length-? bit strings,
- Expected trials is ?(2?) - exponential in ?.
Bad!
15Quantum Unstructured Search
- Minimum time to solve unstructured search problem
on a (serial) quantum computer is - ?(N1/2) queries (2?/2) (21/2)?
- Still exponential, but with a smaller base.
- The minimum of queries can be achieved using
Grovers algorithm.
16Classical Unstructured Search
- The classical serial algorithm takes ?(N) time.
- But Suppose we search in parallel!
- Have MltN processors running in parallel.
- Each searches a different subset of N/M elements
of the search space. - If processors are ballistic reversible
- Can cluster them in a dense mesh of diameter
?(M1/3). - Time accounting
- Computation time ?(N/M)
- Communication time ?(M1/3) (at lightspeed)
- Total T ? N/M M1/3 is minimized when M ?
N3/4 ? N1/4 Faster than Grovers
algorithm!
M1/3
M
17ClassicalQuantum Parallelism
- Similar setup to classical parallelism
- M processors, searching N/M items each.
- Except, each processor uses Grovers algorithm.
- Time accounting
- Computation T ? (N/M)1/2
- Communication T ? M1/3 (as before)
- Total T ? (N/M)1/2 M1/3
- Total is minimized when M?N 3/5
- Minimized total is T ? N1/5.
- I.e., quantum unstructured search is really only
N1/4/N1/5 N1/20 faster than classical!
18Scalability Maximal Scalability
- A multiprocessor architecture accompanying
performance model is scalable if - it can be scaled up to arbitrarily large
problem sizes, and/or arbitrarily large numbers
of processors, without the predictions of the
performance model breaking down. - An architecture ( model) is maximally scalable
for a given problem if - it is scalable and if no other scalable
architecture can claim asymptotically superior
performance on that problem - It is universally maximally scalable (UMS) if it
is maximally scalable on all problems! - I will briefly mention some characteristics of
architectures that are universally maximally
scalable
19Universal Maximum Scalability
- Existence proof for universally maximally
scalable (UMS) architectures - Physics itself is a universal maximally scalable
architecture because any real computer is
merely a special case of a physical system. - Obviously, no real computer can beat the
performance of physical systems in general. - Unfortunately, physics doesnt give us a very
simple or convenient programming model. - Comprehensive expertise at programming physics
means mastery of all physical engineering
disciplines chemical, electrical, mechanical,
optical, etc. - Wed like an easier programming model than this!
20Physics Constrains the Ultimate Model
21Limits of Quantum Computers
- Quantum computers remain subject to all the
fundamental limits previously mentioned! - Entropy density limit - only 2n distingable
states! - Contrary to press manglings, a quantum computer
cannot store exponentially large amounts of
arbitrary data! - Information propagation limit - at most c
- Bells theorem, teleportation, superluminal
wave velocities do not give gtc information
propagation - Quantum field theory is explicitly local
- Computation rate limit - at most 4E/h rate of
orthogonal transitions, given available energy E. - Non-orthogonal ones are faster, but accomplish
less work - Speedups are due to fewer ops needed, not faster
ops
22Simple UMS Architectures
- (I propose) any practical UMS architecture will
have the following features - Processing elements characterized by constant
parameters (independent of of processors) - Mesh-type message-passing interconnection
network, arbitrarily scalable in 2 dimensions - w. limited scalability in 3rd dimension.
- Processing elements that can be operated in a
highly reversible mode, at least up to some
limit. - Enables improved 3-d scalability, in a limited
regime - (In long term) Have capability for
quantum-coherent operation, for extra perf. on
some probs.
23Ideally Scalable Architectures
Conjecture A 2- or 3-D mesh multiprocessor with
a fixed-size memory hierarchy per node is an
optimal scalable computer systems design (for any
application).
Processing Node
Processing Node
Processing Node
Local memory hierarchy(optimal fixed size)
Local memory hierarchy(optimal fixed size)
Local memory hierarchy(optimal fixed size)
Processing Node
Processing Node
Processing Node
Local memory hierarchy(optimal fixed size)
Local memory hierarchy(optimal fixed size)
Local memory hierarchy(optimal fixed size)
Mesh interconnection network
24Some device parameters
- The following parameters are considered
fixed(for a given device/node technology) - (Maximum) number of bits of state per node
- (Minimum) node volume
- (Minimum) transition time (per ? transition)
- (Minimum) entropy generated per bit erased (?k ln
2) - (Minimum) static entropy generation rates
- For devices even just quiescently maintaining
state info - Related to energy leakage rates, decoherence
times - (Minimum) adiabatic frictional coefficient
- For devices undergoing reversible transitions
- (Maximum) quality factor Q of active transitions
- (Minimum) device cost
25System-level parameters
- The following parameters may be adjusted as the
problem size increases - Number of nodes utilized
- Arrangement of utilized nodes in x, y, z
- Spreading nodes out optimizes perf. on some
problems - Rate of change of an externally-applied clocking
signal (time-dependent potential) - Allows trading off adiabaticity vs. speed of
computation, as a function of the number of nodes
26Entropy coefficients of some reversible logic
gate operations
- From Frank, Ultimate theoretical models of
nanocomputers (Nanotechnology, 1998) - SCRL, circa 1997 1 b/Hz
- Optimistic reversible CMOS 10 b/kHz
- Merkles quantum FET 1.2 b/GHz
- Nanomechanical rod logic .07 b/GHz
- Superconducting PQ gate 25 b/THz
- Helical logic .01 b/THz
How low can you go? We dont really know!
27Thermodynamics Scalability
- The fastest parallel algorithms for many problems
ideally require a 3-D mesh topology. - Minimizes communication latencies between points
- But, entropy flux bounds imply entropy generation
rates can scale only proportionally to systems
2-D outer (convex hull) surface area. - Assuming upper bounds on temperature pressure
- So, can harness 3rd dimension only to the extent
that useful operations can be made reversible. - Optimizing efficiency requires a careful tradeoff
between performance, power, cost...
28(No Transcript)
29Reversible/Adiabatic CMOS
- Chips designed at MIT, 1996-1999
30Minimum Losses w. Leakage
Etot Eadia Eleak
Eleak Pleaktr
Eadia cE / tr
31Reversible Emulation - Ben89
k 2n 3
k 3n 2
32Bennett 89 alg. is not optimal
k 2n 3
k 3n 2
Just look at all the spacetime it wastes!!!
33Parallel Frank02 algorithm
- We can move the triangles closer together, to
eliminate the wasted spacetime. - Resulting algorithm is linear time for all n and
k and dominates Ben89 for time, spacetime,
energy!
k3n2
k2n3
Emulated time
k4n1
Real time
34Spacetime blowup
Energy saved
k
n
35Perf. scaling w. of devices
- If alg. is not limited by communications needs,
- Use irreversible processors spread in a 2-D
layer. - Remove entropy along perpendicular dimension.
- No entropy rate limits,
- so no speed advantage from reversibility.
- If alg. requires only local communication,latency
? cyc. time, in an NDNDND mesh, - Leak-free reversible machine perf. scales better!
- Irreversible tcyc ?(ND1/3)
- Reversible tcyc ?(ND1/4) ?(ND1/12) faster!
- To boost reversibility speedup by 10, one must
consider 1036-CPU machines (1.7 trillion moles
of CPUs!) - 1.7 trillion moles of H atoms weighs 1.7 million
metric tons! - A 100-m tall hill of H-atom sized CPUs!
36Lower bound on irrev. time
- Simulate Nproc ND3 cells for Nsteps ND steps.
- Consider a sequence of ND update steps.
- Final cell value depends on ND4 ops in time T.
- All ops must occur within radius r cT of cell.
- Surface area A ? T2, rate Rop ? T2 sustainable.
- Nops ? Rop T ? T3 needs to be at least ND4.
- ? T must be ?(ND4/3) to do all ND steps.
- Average time per step must be ?(ND1/3).
- Any irreversible machine (of any technology or
architecture) must obey this bound!
37Irreversible 3-D Mesh
38Reversible 3-D Mesh
39Non-local Communication
- Best computational task for reversibility
- Each processor must exchange messages with
another that is ND1/2 nodes away on each cycle - Unsure what real-world problem demands this
pattern! - In this case, reversible speedup scales with
number of CPUs to only the 1/18th power. - To boost reversibility speedup by 10, only
need 1018 (or 1.7 micromoles) of CPUs - If each was a 1-nm cluster of 100 C atoms, this
is only 2 mg mass, volume 1 mm3. - Current VLSI Need cost level of 25B before
you see a speedup.
40Open issues for reversible comp.
- Integrate realistic fundamental models of the
clocking system into the engineering analysis. - There is an open issue about the scalability of
clock distribution systems. - Exist quantum bounds on reusability of timing
signals. - Not yet clear if reversible clocking is scalable.
- Fortunately, self-timed reversible computing also
appears to be a possibility. - Not yet clear if this approach works above 1-D
models. - Simulation experiments planned to investigate
this. - Develop efficient physical realizations of
nano-scale bit-devices timing systems.
41Timing in Adiabatic Systems
- When multiple adiabatic devices interact, the
relative timing must be precise, in order to
ensure that adiabatic rules are met. - There are two basic approaches to timing
- Global (a.k.a. clocked, a.k.a. synchronous)
timing - Approach in nearly all conventional irreversible
CPUs - Basis for all practical adiabatic/quantum
computing mechanisms proposed to date - Local (a.k.a. self-timed, a.k.a. asynchronous)
timing - Implemented in a few commercial irreversible
chips. - Feynman 86 showed a self-timed serial reversible
computation was implementable in QM, in principle - Margolus 90 extended this to a 2-D model with
1-D of parallelism. - Will it work in 3-D?
42Global Timing
- Examples of adiabatic systems designed on the
basis of global, synchronous timing - Adiabatic CMOS with external power/clock rails
- Superconducting parametric quantron (Likharev)
- Adiabatic Quantum-Dot Cellular Automaton (Lent)
- Adiabatic mechanical logics (Merkle, Drexler)
- All proposed quantum computers
- But, a problem Synchronous timing may not scale!
- Work by Janzig others raises issues of possible
limits due to quantum uncertainty. Unresolved.
43Clock/Power Supply Desiderata
- Requirements for an adiabatic timing signal /
power supply - Generate trapezoidal waveform with very flat
high/low regions - Flatness limits Q of logic.
- Waveform during transitions is ideally linear,
- But this does not affect maximum Q, only energy
coefficient. - Operate resonantly with logic, with high Q.
- Power supply Q will limit overall system Q
- Reasonable cost, compared to logic it powers.
- If possible, scale Q ? t (cycle time)
- Required to be considered an adiabatic mechanism.
- May conflict w. inductor scaling laws!
- At the least, Q should be high at leakage-limited
speed
(Ideally,independentof t.)
44Supply concepts in my research
- Superpose several sinusoidal signals from
phase-synchronized oscillators at harmonics of
fundamental frequency - Weight these frequency components as per Fourier
transform of desired waveform - Create relatively high-L integrated inductors via
vertical, helical metal coils - Only thin oxide layers between turns
- Use mechanically oscillating, capacitive MEMS
structures in vacuo as high-Q (10k) oscillator - Use geometry to get desired wave shape directly
45Newer Supply Concepts
- Transmission-line-based adiabatic resonators.
- See transparency.
- MEMS-based resonant power supply
- See transparency, next slide
- Ideal adiabatic supplies - Can they exist?
- Idealized mechanical model See transparency.
- But, may be quantum limits to reusability/scalabil
ity of global timing signals. - This is a very fundamental issue!
46A MEMS Supply Concept
- Energy storedmechanically.
- Variable couplingstrength -gt customwave shape.
- Can reduce lossesthrough balancing,filtering.
- Issue How toadjust frequency?
47Programming Model Desiderata
- Should permit optimally efficient quantum
algorithms (constant-factor slowdowns only). - Should have reasonable constant factor overheads.
- Unit cell complexity should be kept low for ease
of design assembly. - Should provide a clear separation between program
and data, where appropriate. - Should be straightforward to program (and to
write compilers for).
48Candidate Programming Model
- Unit-cell capabilities
- A small number of n-qubit integer registers.
- Perform programmable 2- and 3- qubit ops on
selected data bits (or n-qubit words) - Classical digital ops CNOT, CCNOT, swaps, etc.
- 1-bit analog unitary ops
- w. precision up to the limit of the qubit device
technology - Treat imprecision like decoherence noise, correct
it away? - Flow of control
- For reversibility, could have 2 instruction
registers, which take turns executing loading
each other. - Data movement
- Streaming between neighboring unit cells.
49Node Architecture Sketch
Instructionregisters
Data pathto/fromneighbornode(in 2d or 3d)
I/O registers
Execution unit
Data registers
502. Simulating physical systems on our QC model
- Many-particle Schrödinger equation
- Numerically stable classical reversible sims
- Quantum equivalents
- Quantum field theory
51Simulating Wave Mechanics
- The basic problem situation
- Given
- A (possibly complex) initial wavefunction
in an N-dimensional position basis,
and - a (possibly complex and time-varying) potential
energy function , - a time t after (or before) t0,
- Compute
-
- Many practical physics applications...
52The Problem with the Problem
- An efficient technique (when possible)
- Convert V to the corresponding Hamiltonian H.
- Find the energy eigenstates of H.
- Project ? onto eigenstate basis.
- Multiply each component by .
- Project back onto position basis.
- Problem
- It may be intractable to find the eigenstates!
- We resort to numerical methods...
53History of Reversible Schrödinger Sim.
See http//www.cise.ufl.edu/mpf/sch
- Technique discovered by Ed Fredkin and student
William Barton at MIT in 1975. - Subsequently proved by Feynman to exactly
conserve a certain probability measure - Pt Rt2 It?1It1
- 1-D simulations in C/Xlib written by Frank at MIT
in 1996. Good behavior observed. - 1 2-D simulations in Java, and proof of
stability by Motter at UF in 2000. - User-friendly Java GUI by Holz at UF, 2002.
(Rreal, Iimag., ttime step index)
54Difference Equations
- Consider any system with state x that evolves
according to a diff. eq. that is 1st-order in
time x f(x) - Discretize time to finite scale ?t, and use a
difference equation instead x(t ?t) x(t)
?t f(x(t)) - Problem Behavior not always numerically stable.
- Errors can accumulate and grow exponentially.
55Centered Difference Equations
- Discretize derivatives in a symmetric fashion
- Leads to update rules like x(t ?t) x(t ?
?t) 2?t f(x(t)) - Problem States at odd- vs. even-numbered time
steps not constrainedto stay close to each other!
2?tf
x1
g
x2
g
x3
g
x4
56Centered Schrödinger Equation
- Schrödingers equation for 1 particle in 1-D
- Replace time ( also space) derivatives with
centered differences. - Centered difference equation has realpart at odd
times that depends only onimaginary part at even
times, vice-versa. - Drift not an issue - real imaginaryparts
represent different state components!
R1
g
?
I2
g
R3
g
I4
?
57Proof of Stability
- Technique is proved perfectly numerically stable
convergent assuming V is 0 and ?x2/?t gt ?/m
(an angular velocity) - Elements of proof
- Lax-Richmyer equivalence convergence?stability.
- Analyze amplitudes of Fourier-transformed basis
- Sufficient due to Parsevals relation
- Use theorem (cf. Strikwerda) equating stability
to certain conditions on the roots of an
amplification polynomial ?(g,?), which are
satisfied by our rule. - Empirically, technique looks perfectly stable
even for more complex potential energy funcs.
58Phenomena Observed in Model
- Perfect reversibility
- Wave packet momentum
- Conservation of probability mass
- Harmonic oscillator
- Tunnelling/reflection at potential energy
barriers - Interference fringes
- Diffraction
59Gaussian wave packet moving to the rightArray
of small sharp potential-energy barriers
60Initial reflection/refraction of wave packet
61A little later
62Aimed a little higher
63A faster-moving particle
64Interesting Features of this Model
- Can be implemented perfectly reversibly, with
zero asymptotic spacetime overhead - Every last bit is accounted for!
- As a result, algorithm can run adiabatically,
with power dissipation approaching zero - Modulo leakage frictional losses
- Can map it to a unitary quantum algorithm
- Direct mapping
- Classical reversible ops only, no quantum speedup
- Indirect (implicit) mapping
- Simulate p particles on kd lattice sites using pd
lg k qubits - Time per update step is order pd lg k instead of
kpd
65Implicit Mapping
- Use pd integer registers xj, each lg k qubits
long - Amplitude of joint state of all registers
represents amplitude of wavefunction point x - The difference equation term for dimension j
amounts to multiplication of state by matrix - (can be normalized to be) nearly unitary for
small ? - Idea Can approximate Dj using 1-qubit ops on
low-order bit of xi, plus CCNOTs to do carries.
xi
66Field Theory Systems
- Goal Simulate field theory for p particle types
in d-dimensional space over kd lattice sites - General approach At each lattice site, have p
integer qubit registers nj, denoting the
occupancy number of particle type j. - nj 0 or 1 for each type of fermion
- nj from 0 to nmax for bosons
- nmax determined by available total energy
- Use quantum LGCA model (type I)
- Interact particles at site using collision
operator - Includes particle creation/annihilation operators
- Stream particles between sites after collision
step
673. Quantum computer simulation on classical
computers
- Visualization techniques, various optimizations
- Polynomial-space techniques
68Simulation of QC Algorithms
- Visualization
- Project states onto 2-D/3-D spaces
- Corresponding to register pairs/triplets.
- Use HSV color space to represent amplitudes.
- Visualize gate ops with continuous color change.
- Simulation Efficiency
- Optimizations
- Track only states having non-zero amplitude.
- Linear-space simulations of n-qubit systems.
69Visualization Technique
- Illustration 3 stages of Shors algorithm
- Register value ? spatial position of pixel
- Phase angle ? pixel color hue.
- Magnitude ? pixel color saturation.
70Initial State
71After doing Hadamard transform on all bits of a
72After modular exponentiationbxa (mod N)
73State After Fourier Transform
74Efficient QC Simulations
- Task Simulate an n-qubit quantum computer.
- Maximally stupid approach
- Store a 2n-element vector
- Multiply it by a full 2n2n matrix for each gate
op - Some obvious optimizations
- Never store whole matrix (compute dynamically)
- Store only nonzero elements of state vector
- Especially helpful when qubits are highly
correlated - Do only constant work per nonzero vector element
- Scatter amplitude from each state to 1 or 2
successors - Drop small-probability-mass sets of states
- Linearity of QM implies no chaotic growth of
errors
75Linear-space quantum simulation
- A popular myth
- Simulating an n-qubit (or n-particle) quantum
system takes e?(n) space (as well as time). - The usual justification
- It takes e?(n) numbers even to represent a single
?(n)-dimensional state vector, in general. - The hole in that argument
- Can simulate the statistical behavior of a
quantum system w/o ever storing a state vector! - Result BQP ? PSPACE known since BV93...
- But practical poly-space sims are rarely described
76The Basic Idea
- Inspiration
- Feynmans path integral formulation of QED.
- Gives the amplitude of a given final
configuration by accumulating amplitude over all
paths from initial to final configurations. - Each path consists of only a single
?(n)-coordinate configuration at each time, not a
full wavefunction over the configuration space. - Can enumerate all paths, while only ever
representing one path at a time.
77Simulating Quantum Computations
- Given
- Any n-qubit quantum computation, expressed as a
sequence of 1-qubit gates and CNOT gates. - An initial state s0 which is just a basis state
in the classical bitwise basis, e.g. ?00000?. - Goal
- Generate a final basis state stochasically with
the same probability distribution as the quantum
computer would do.
U2
U3
U4
U1
78Matrix Representation
- Consider each gate as rank-2n unitary matrix
- Each CNOT application is a 0-1 (permutation)
matrix - a classical reversible bit-operation. - With appropriate row ordering, each Ui gate
application is block-diagonal, w. each 22 block
equal to Ui. - We need never represent these full matrices!
- The 1 or 2 nonzero entries in a given row can be
located computed immediately given the row id
(bit string) and Ui.
79The Linear-Space Algorithm
- Generate a random coin c?0,1.
- Initialize probability accumulator p?0.
- For each final n-bit string y at time t,
- Compute its amplitude ?(y) as follows
- Generate its possible 1 or 2 predecessor strings
x1 (and maybe x2) given the gate-op preceding t. - For each predecessor, compute its amplitude at
time t?1 recursively using this same algorithm, - unless t0, in which case ?1 if ?x?s0, 0
otherwise. - Add predecessor amplitudes, weighted by entries.
- Maybe output y, using roulette wheel algorithm
- Accumlate Pry into total p ? p ?(y)2
- Output y and halt if pgtc.
80A Further Optimization
- Dont even have to enumerate all final states!
- Instead Stochasically follow a trajectory.
- Basic idea
- Keep track of 1 current state its amplitude
?0. - For CNOTs Deterministically transform state.
- For Us
- Calculate amplitude ?1 of neighbor state w.
path-integral - Calculate amplitudes ?0 and ?1 after qubit op
- Choose 1 successor as new current state, using
?2 distrib.
u00
?0
?0
Current state
u10
Possiblesuccessors
u01
?1
?1
u11
Neighbor state
81Complexity Comparison
- To simulate t gate ops (c CNOTs u 1-bit unitary
ops) of an n-qubit quantum computer - Space Time
- Traditional method 2n t2n
- Path-integral method tn n2t
- (Actually, only the u unitary ops, not all t ops
or all n qubits, contribute to any of the
exponents here.) - Upshot
- Lower space usage can allow larger systems to be
simulated, for short periods. - Run time is competitive for case when t lt n
82Conclusion
- A grab-bag of tricks and techniques
- Outline of a research program is taking shape
- Quantum computing is really interesting...
- Now if only I can get someone to pay me to devote
my full time to studying it!
83Slides left over from USC talk
84Reversibility of Physics
- The universe is (apparently) a closed system
- Closed systems evolve via unitary transforms
- Apparent wavefunction collapse doesnt contradict
this (confirmed by work of Everett, Zurek, etc.) - Time-evolution of concrete state of universe (or
closed subsystems) is reversible - Invertible (bijective)
- Deterministic looking backwards in time
- Total info. (log of poss. states) doesnt
decrease - Can increase, though, if volume is increasing
- Information cannot be destroyed!
85Illustrating Landauers principle
Before bit erasure
After bit erasure
s0
0
0
s0
Nstates
sN-1
0
0
sN-1
Unitary(1-1)evolution
2Nstates
s0
sN
1
0
Nstates
0
sN-1
s2N-1
1
86Benefits of Reversible Computing
- Reduces energy/cooling costs of computing
- Improves performance per unit power consumed
- Given heat flux limits in the cooling system,
- Improves performance per unit convex hull area
- A faster machine in a given size box.
- For communication-intensive parallel algorithms,
- Improves performance, period!
- All these benefits are by small polynomial
factors in the integration scale the device
properties.
87Quantum Computing pros/cons
- Pros
- Removes an unnecessary restriction on the types
of quantum states ops usable for computation. - Opens up exponentially shorter paths to solving
some types of problems (e.g., factoring,
simulation) - Cons
- Sensitive, requires overhead for error
correction. - Also, still remains subject to fundamental
physical bounds on info. density, rate of state
change! - Myth A quantum memory can store an
exponentially large amount of data. - Myth A quantum computer can perform operations
at an exponentially faster rate than a classical
one.
88Some goals of my QC work
- Develop a UMS model of computation that
incorporates quantum computing. - Design simulate quantum computer architectures,
programming languages, etc. - Describe how to do the systems-engineering
optimization of quantum computers for various
problems of interest.
89Conclusion
- As we near the physical limits of computing,
- Further improvements will require an increasingly
sophisticated interdisciplinary integration of
concerns across many levels of engineering. - I am developing a principled nanocomputer systems
engineering methodology - And applying it to the problem of determining the
real cost-efficiency of new models of computing - Reversible computing
- Quantum computing
- Building the foundations of a new discipline that
will be critical in coming decades.