Title: Low Power Passive Equalizer Design for Computer-Memory Links
1Low Power Passive Equalizer Design for
Computer-Memory Links
Ling Zhang (1), Wenjian Yu (2), Yulei Zhang (1), Renshen Wang (1), Alina Deutsch (3), George A. Katopis (3), Daniel M. Dreps (3), James Buckwalter (1), Ernest Kuh (4), Chung-Kuan Cheng (1)
(1) UC San Diego, (2) Tsinghua Univ., (3) IBM Research Labs, (4) UC Berkeley
2Outline
- Introduction
- The CPU-memory links in IBM P6 system
- Equalization structures and schemes
- Simulated annealing optimization flow
- Experimental results
- Conclusion
3Introduction
- On-chip interconnects have Gbps data rates
- M. Hashimoto et al., 2004: 40 Gbps (simulation), 10 mm, 45 nm
- M. P. Flynn et al., 2005: 14 Gbps, 7.2 mm, 180 nm
- Package-level interconnects need improvements
- High bandwidth: reducing inter-symbol interference (ISI)
- Low power: using passive components
- Alleviate ISI: equalization
- H. A. Affel, 1924: equalization of carrier transmissions
- H. W. Bode, 1936: attenuation equalizer
- E. Kuh, 1959: constant-R ladder
- R. Sun et al., 2005: adaptive passive T-junction equalizer
4IBM P6 system
- IBM introduced the P6 system in 2007.
- Dual-core microprocessor
- 65 nm SOI process
- Both speed and power are important for the P6 I/O circuitry and interconnect.
- For high-performance applications, it operates at over 5 GHz.
- For low-power applications, it consumes less than 100 W.
5Structure of the CPU-Memory link in IBM P6 system
- The channel is a 20-inch-long differential pair, operating at 6.4 GHz.
- The model takes all the fan-out, connector, and via-array discontinuities into account.
6Eye diagram at each port (simulation)
7Our contributions
- We propose a set of passive equalizer schemes.
- We employ the schemes on the CPU-Memory link of the IBM P6 system and observe significant performance improvement with little power overhead.
- We compare and analyze the results of the different schemes.
- We demonstrate that the equalization approach is not sensitive to variations and crosstalk.
8Equalization structures
(a) T-junction, (b) parallel RC, (c) series RL
- T-junction and RC can be applied at both the driver and receiver sides; RL is used only at the receiver end.
- T-junction can be implemented on-chip or on-package.
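To illustrate why a parallel RC section acts as an equalizer, the sketch below evaluates the voltage gain of one such section. The topology (R in parallel with C, inserted in series with the signal path and terminated by a resistive Z0), the function name `parallel_rc_gain`, and all component values are assumptions for illustration; they are not taken from the slides.

```python
import cmath
import math


def parallel_rc_gain(freq_hz, r_ohm, c_farad, z0_ohm=50.0):
    """Voltage gain of a series-inserted parallel RC section into a Z0 load.

    Assumed topology (not from the slides): R || C in series with the
    signal path, followed by a purely resistive termination Z0.
    """
    omega = 2.0 * math.pi * freq_hz
    # Impedance of R in parallel with the capacitor 1/(j*omega*C).
    z_rc = r_ohm / (1.0 + 1j * omega * r_ohm * c_farad)
    # Simple voltage divider between the RC section and the load.
    return z0_ohm / (z0_ohm + z_rc)


# Illustrative values only.
R, C, Z0 = 100.0, 1e-12, 50.0
dc_gain = abs(parallel_rc_gain(0.0, R, C, Z0))
hf_gain = abs(parallel_rc_gain(50e9, R, C, Z0))
print(f"gain at DC     : {dc_gain:.3f}")
print(f"gain at 50 GHz : {hf_gain:.3f}")
```

Under these assumptions the section attenuates low frequencies by Z0 / (Z0 + R) and passes high frequencies almost unchanged, which counteracts the low-pass behavior of a lossy channel.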
9The settings we use for each equalization component
Label | Component           |        |
S     | RL                  | NA     | Infinity
P     | RC                  | 10 ohm | RL
Tmc   | On-chip T-junction  | Z0     | Z0
Tmp   | Off-chip T-junction | Z0     | Z0
Tuc   | On-chip T-junction  | 10 ohm | RL
Tup   | Off-chip T-junction | 10 ohm | RL
M     | No equalizer        | Z0     | Z0
Four different groups of equalization schemes we studied
Group   | Driver                | Receiver
Group 1 | Match: M, Tmc, Tmp    | Match: M, Tmc, Tmp
Group 2 | Un-match: P, Tuc, Tup | Match: M, Tmc, Tmp
Group 3 | Match: M, Tmc, Tmp    | Un-match: P, Tuc, Tup, S
Group 4 | Un-match: P, Tuc, Tup | Un-match: P, Tuc, Tup, S
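The scheme space implied by the four groups can be enumerated as driver/receiver pairs. The sketch below is only an illustration: the names `GROUPS` and `schemes` are hypothetical, and the matched driver options of Group 3 are taken as M, Tmc, Tmp, following the Group 3 summary and plot legends later in the deck.

```python
from itertools import product

# Driver and receiver options per group (driver list, receiver list).
GROUPS = {
    "Group 1": (["M", "Tmc", "Tmp"], ["M", "Tmc", "Tmp"]),
    "Group 2": (["P", "Tuc", "Tup"], ["M", "Tmc", "Tmp"]),
    "Group 3": (["M", "Tmc", "Tmp"], ["P", "Tuc", "Tup", "S"]),
    "Group 4": (["P", "Tuc", "Tup"], ["P", "Tuc", "Tup", "S"]),
}


def schemes(group):
    """Return every 'driver receiver' combination of one group."""
    drivers, receivers = GROUPS[group]
    return [f"{d} {r}" for d, r in product(drivers, receivers)]


for name in GROUPS:
    combos = schemes(name)
    print(f"{name}: {len(combos)} schemes -> {', '.join(combos)}")
```

With these option lists, Groups 1 and 2 contain 9 schemes each and Groups 3 and 4 contain 12 each, which agrees with the 12 curves listed in the Group 3 and Group 4 plots.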
10Optimization Flow
- Variables
-
- Objective function (minimize)
-
- Constraints
- Simulated annealing method is used to find the
optimal solution.
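A minimal sketch of a simulated annealing loop over bounded R, L, C values follows. It is not the authors' flow: the variables, objective, and constraints are not recoverable from the slides, so `toy_cost` is a placeholder standing in for a circuit simulation, the upper bounds are borrowed from Experiment 1, and the lower bounds, cooling schedule, and move size are assumptions.

```python
import math
import random

# Upper bounds follow Experiment 1; the lower bounds are assumptions.
BOUNDS = {"R": (1.0, 500.0), "C": (0.1e-12, 100e-12), "L": (0.1e-9, 100e-9)}


def toy_cost(x):
    """Placeholder objective; a real flow would simulate the link instead."""
    target = {"R": 120.0, "C": 20e-12, "L": 30e-9}  # arbitrary optimum
    return sum(
        ((x[k] - target[k]) / (BOUNDS[k][1] - BOUNDS[k][0])) ** 2 for k in x
    )


def neighbor(x, rng, step=0.1):
    """Perturb one randomly chosen parameter, clamped to its bounds."""
    k = rng.choice(sorted(x))
    lo, hi = BOUNDS[k]
    y = dict(x)
    y[k] = min(hi, max(lo, x[k] + rng.uniform(-step, step) * (hi - lo)))
    return y


def anneal(cost, rng, t_start=1.0, t_end=1e-4, alpha=0.95, moves_per_temp=50):
    """Geometric-cooling simulated annealing; returns the best point seen."""
    x = {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}
    fx = cost(x)
    best, fbest = x, fx
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            y = neighbor(x, rng)
            fy = cost(y)
            # Always accept improvements; accept worse moves with
            # probability exp(-(fy - fx) / t) (Metropolis criterion).
            if fy <= fx or rng.random() < math.exp((fx - fy) / t):
                x, fx = y, fy
                if fx < fbest:
                    best, fbest = x, fx
        t *= alpha
    return best, fbest


best, fbest = anneal(toy_cost, random.Random(0))
print(best, fbest)
```

Clamping each move to its bounds is one simple way to honor box constraints; a penalty term in the cost would be an alternative.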
11Experiment 1
- For all possible schemes
- Apply the optimization flow to find the optimal solution.
- Upper bound of R: 500 ohm
- Upper bound of C: 100 pF
- Upper bound of L: 100 nH
- Compare the results.
12
1. MM: no eye.
2. TM, MT: Veye is 0.178-0.180 V, jitter is 22-23 ps.
3. TT: Veye is 0.195-0.196 V, jitter is 12-16 ps.
13Summary of Group 1
- Using Tmc or Tmp at both sides is better than using it at one side only.
- Tmc and Tmp are equivalent when used at the driver.
- At the receiver, Tmp has smaller jitter than Tmc.
14Transfer functions of Group 1
15Step responses of Group 1
17Summary of Group 2
- Driver side
- Tup is better than Tuc: larger eye and smaller jitter.
- Tuc is better than P: larger eye and smaller jitter.
- Receiver side
- M: largest jitter; TupM has the largest eye.
- Tmp: lowest cost function and lowest jitter; Veye is slightly smaller than with Tmc.
18Transfer functions of Group 2
19Transfer functions of Group 2
20Step responses of Group 2
21Step responses of Group 2
23Summary of Group 3
- Driver side
- Tmc is the same as Tmp
- M is worse than T: smaller eye opening and larger jitter.
- Receiver side
- P has smallest eye opening.
- Tuc has largest eye opening.
- Tup has lowest jitter.
- S has largest jitter.
24Transfer functions of Group 3
Curves: M S, Tmp S, M P, Tmc S, Tmp Tuc, Tmc Tuc, M Tup, Tmp P, Tmc P, Tmc Tup, Tmp Tup, M Tuc
25Transfer functions of Group 3
Curves: Tmc S, Tmc Tuc, Tmc P, Tmc Tup, Tmp S, Tmp Tuc, Tmp P, Tmp Tup, M S, M P, M Tup, M Tuc
26Step responses of Group 3
27Step responses of Group 3
Curves: Tmc S, Tmc Tuc, Tmc P, Tmc Tup, Tmp S, Tmp Tuc, Tmp P, Tmp Tup, M S, M P, M Tup, M Tuc
28
- At the driver side, Tup is better than Tuc, and Tuc is better than P.
- At the receiver side, Tup is similar to Tuc; S has a larger eye and larger jitter.
29Summary of Group 4
- Driver side
- Tup is better than Tuc: larger Veye and smaller jitter.
- P has the largest jitter; PTuc has the largest Veye, but the others' Veye is the smallest.
- Receiver side
- P has smallest eye opening.
- S has largest jitter.
- TucTup, TupTup, TucTuc and TupTuc are very
similar.
30Transfer functions of Group 4
Curves: P S, P P, Tuc S, Tup S, Tuc Tuc, Tuc Tup, Tup Tuc, Tup Tup, Tuc P, P Tuc, Tup P, P Tup
31Transfer functions of Group 4
Curves: Tuc S, Tuc Tuc, Tuc Tup, Tuc P, Tup S, Tup Tuc, Tup Tup, Tup P, P S, P P, P Tuc, P Tup
32Step responses of Group 4
Curves: P S, P P, Tuc S, Tup S, Tuc Tuc, Tuc Tup, Tup Tuc, Tup Tup, Tuc P, P Tuc, Tup P, P Tup
33Step responses of Group 4
Curves: Tuc S, Tuc Tuc, Tuc Tup, Tuc P, Tup S, Tup Tuc, Tup Tup, Tup P, P S, P P, P Tuc, P Tup
34Summary of experiment 1
- Schemes in Group 1 have lower jitter because of matching.
- Schemes in Group 4 have larger eye-opening due to reflections.
- When used at the receiver end, structure Tmc has slightly lower jitter than Tmp, and structure Tuc is very similar to Tup.
- When used at the receiver end, structure P has smaller eye-opening, while S has larger jitter.
- When used at the driver side, structure Tmc is very similar to Tmp, and structure Tup is slightly better than Tuc, with larger eye-opening and smaller jitter.
35Experiment 2
- Equivalent/similar schemes are merged.
- Apply optimization flow on each scheme.
- Size limits on L and C are enforced
- Upper bound of L: 5 nH
- Upper bound of C: 15 pF
- Compare the results: power and eye diagram
36Choose representative schemes
- G1: MTmc (smallest jitter)
37Choose representative schemes
- MP (lowest power)
- PTuc (largest eye-opening)
- Tup S (smallest cost function)
38Eye diagrams at output
(a) TupS: Veye = 0.37 V, jitter = 19.3 ps
(b) MTmc: Veye = 0.19 V, jitter = 18.9 ps
(c) MP: Veye = 0.23 V, jitter = 24.5 ps
(d) PTuc: Veye = 0.39 V, jitter = 26.0 ps
39Transfer functions of selected schemes
Curves: MM, PTuc, Tup S, MP, MTmc
40Step responses of selected schemes
Curves: MM, PTuc, MP, Tup S, MTmc
41Step responses of selected schemes
Curves: MM, PTuc, MP, Tup S, MTmc
42Eye diagrams at input
(a) TupS
(b) MTmc
(c) MP
(d) PTuc
43Eye diagrams at TXPKG
(a) TupS
(b) MTmc
(c) MP
(d) PTuc
44Eye diagrams at RXPKG
(a) TupS
(b) MTmc
(c) MP
(d) PTuc
45Input impedances of selected schemes
Curves: MP, MTmc, MM, PTuc, Tup S
- MP has high Zin → low power
- Tup S has low Zin → high power
46Sensitivity comparison of selected schemes
Parameters are perturbed by .
47Eye diagrams at output with xtalk
(a) TupS: Veye = 0.33 V, jitter = 21.4 ps
(b) MTmc: Veye = 0.19 V, jitter = 19.4 ps
(c) MP: Veye = 0.24 V, jitter = 22.0 ps
(d) PTuc: Veye = 0.38 V, jitter = 26.2 ps
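To quantify the effect of crosstalk, the sketch below compares the Veye and jitter figures reported on the two output eye-diagram slides (without and with crosstalk). The numbers are copied from those slides; the dictionary and function names are illustrative.

```python
# (Veye in V, jitter in ps) read from the output eye-diagram slides.
NO_XTALK = {
    "TupS": (0.37, 19.3),
    "MTmc": (0.19, 18.9),
    "MP": (0.23, 24.5),
    "PTuc": (0.39, 26.0),
}
XTALK = {
    "TupS": (0.33, 21.4),
    "MTmc": (0.19, 19.4),
    "MP": (0.24, 22.0),
    "PTuc": (0.38, 26.2),
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Per-scheme (Veye change %, jitter change %) caused by crosstalk.
changes = {
    name: (
        percent_change(NO_XTALK[name][0], XTALK[name][0]),
        percent_change(NO_XTALK[name][1], XTALK[name][1]),
    )
    for name in NO_XTALK
}

for name, (d_veye, d_jitter) in changes.items():
    print(f"{name:5s} Veye {d_veye:+6.1f}%   jitter {d_jitter:+6.1f}%")
```

On these figures the largest eye reduction is for TupS, at roughly 11%, while the MTmc eye opening is unchanged.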
48Conclusion
- Simple and effective passive equalizer schemes are proposed.
- SA flow is used to optimize the equalizer parameters.
- For the CPU-Memory link of the IBM P6 system:
- Without equalizer: eye is closed, power is 7.9 mW
- Largest eye after equalization: 0.39 V with 26 ps jitter, 8.8 mW power
- Smallest jitter after equalization: 19 ps with 0.19 V eye-opening, 7.9 mW power
- Significant performance improvement can be seen with very little overhead on power.
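The power overhead can be worked out directly from the figures in the conclusion. The short calculation below uses only the reported 7.9 mW baseline and 8.8 mW largest-eye numbers; the variable names are illustrative.

```python
# Power figures reported in the conclusion, in mW.
baseline_mw = 7.9          # no equalizer (eye closed)
largest_eye_mw = 8.8       # scheme with the 0.39 V eye
smallest_jitter_mw = 7.9   # scheme with 19 ps jitter

# Relative overhead of each equalized scheme over the unequalized link.
overhead_pct = 100.0 * (largest_eye_mw - baseline_mw) / baseline_mw
overhead_jitter_pct = 100.0 * (smallest_jitter_mw - baseline_mw) / baseline_mw

print(f"largest-eye scheme overhead    : {overhead_pct:.1f}%")
print(f"smallest-jitter scheme overhead: {overhead_jitter_pct:.1f}%")
```

The largest-eye scheme therefore costs about 11% more power than the unequalized link, and the smallest-jitter scheme costs no additional power at the reported precision.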