Title: Soft Error Rate Determination for Nanometer CMOS VLSI Circuits
1 Soft Error Rate Determination for Nanometer CMOS VLSI Circuits
Master's Defense
Fan Wang
- Thesis Advisor: Dr. Vishwani D. Agrawal
- Thesis Committee: Dr. Fa Foster Dai and Dr. Victor P. Nelson
Department of Electrical and Computer Engineering
Auburn University, AL 36849 USA
2 Outline
- Background
- Problem Statement
- Contributions
- Proposed soft error model
- Proposed soft error propagation through logic
- Experimental results
- Discussion of results
- Conclusion
3 Motivation for This Work
- With the continuous downscaling of CMOS technologies, device reliability has become a major bottleneck.
- The sensitivity of electronic systems can potentially become a major cause of soft (non-permanent) failures.
- The determination of soft error rate in logic circuits is a complex problem. No existing analysis method comprehensively considers all the factors that influence the soft error rate.
4 Background
- Certain behaviors in state-of-the-art electronic circuits are caused by random factors.
- A single event upset (SEU) is a non-permanent, or transient, error.
- Definition from the NASA Thesaurus:
  - Single Event Upset (SEU): Radiation-induced errors in microelectronic circuits caused when charged particles (also high-energy particles, usually from the radiation belts or from cosmic rays) lose energy by ionizing the medium through which they pass, leaving behind a wake of electron-hole pairs.
5 What is a Soft Error
- A fault is the cause of errors. Faults can be permanent (hardware faults) or non-permanent.
- A non-permanent fault is a non-destructive fault and falls into two categories:
  - Transient faults, caused by environmental conditions like temperature, humidity, pressure, voltage, power supply, vibrations, fluctuations, electromagnetic interference, ground loops, cosmic rays and alpha particles.
  - Intermittent faults, caused by non-environmental conditions like loose connections, aging components, critical timing, interconnect coupling, resistive or capacitive variations and noise in the system.
- An error caused by a non-permanent fault is a soft error.
- With advances in manufacturing, soft errors caused by cosmic rays and alpha particles remain the dominant causes of failures in electronic systems.
6 Soft Error Rate (SER) in Specific Applications
- Figures of merit:
  - Failures In Time (FIT): number of failures per 10^9 device hours.
  - MTTF (Mean Time To Failure): a 1-year MTTF corresponds to 10^9/(24 x 365) FIT = 114,155 FIT.
- SER of contemporary commercial chips is controlled to within 100 to 1000 FIT.
- Most hard failure mechanisms produce error rates on the order of 1 to 100 FIT.
- Programmable logic SER is almost 100 times larger than that of combinational logic.
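FIT and MTTF are reciprocal views of the same rate. A minimal sketch of the conversion, assuming a constant failure rate; the function names are illustrative and do not come from the slides:

```python
# FIT counts failures per 10^9 device hours.
HOURS_PER_FIT_WINDOW = 1e9


def mttf_hours_to_fit(mttf_hours: float) -> float:
    """Convert a mean time to failure (hours) into a failure rate in FIT."""
    if mttf_hours <= 0:
        raise ValueError("MTTF must be positive")
    return HOURS_PER_FIT_WINDOW / mttf_hours


def fit_to_mttf_hours(fit: float) -> float:
    """Convert a failure rate in FIT back into a mean time to failure (hours)."""
    if fit <= 0:
        raise ValueError("FIT must be positive")
    return HOURS_PER_FIT_WINDOW / fit


# One year of 24 x 365 hours, as used on the slide.
one_year_fit = mttf_hours_to_fit(24 * 365)
print(f"1-year MTTF = {one_year_fit:,.0f} FIT")
```

The same helper can be applied to any MTTF value quoted in hours.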
7 Soft Error Rate (SER) for SRAM-Based FPGA
- Effects of smaller design rules and lower supply voltages
- Radiation chamber measurement of SER at an altitude of 10 km at 60°N (Sweden)
FPGA (Xilinx)   XC4010E          XC4010XL
Process         0.60 µm          0.35 µm
Vcc             5 V              3.3 V
1 SEU every     1 x 10^6 hours   2.8 x 10^5 hours
Projecting through 3 design rule shrinks and 2 voltage reductions, we get 1 SEU every 28.2 hours.
M. Ohlsson, P. Dyreklev, K. Johansson and P. Alfke, "Neutron Single Event Upsets in SRAM-Based FPGAs," Proc. IEEE Nuclear and Space Radiation Effects Conference, 1998.
C. E. Stroud, "FPGA Architectures and Operation for Tolerating SEUs," VLSI Design and Test Seminar, Auburn University, January 31, 2007.
8 Reliability Requirements
Commodity flash memory reliability requirements (from the 2002 International Technology Roadmap for Semiconductors, ITRS):
Year                      2007          2010          2013          2016
Density (megabit)         1024          2048          4096          8192
Maximum data rate (MHz)   166           200           250           300
MTTF (hours)              4020          4654          5388          6237
FIT                       2.487 x 10^5  2.149 x 10^5  1.856 x 10^5  1.603 x 10^5
FIT = 10^9/MTTF
9 Single Event Transient (SET)
- An SET is caused by the generation of charge due to a high-energy particle passing through a sensitive node.
- Each SET has its unique characteristics like polarity, waveform, amplitude, duration, etc., depending on particle impact location, particle energy, device technology, device supply voltage and output load.
- An off transistor struck by a heavy ion with high enough LET in the junction area is most sensitive to SEU.
- Specifically, the channel region of an off-NMOS transistor and the drain region of an off-PMOS transistor are sensitive regions.
Linear Energy Transfer (LET) is a measure of the energy transferred to the device per unit length as an ionizing particle travels through material. Unit: MeV-cm^2/mg.
10 Measured Environmental Data
- Typical ground-level total neutron flux: 56.5 m^-2 s^-1.
  - J. F. Ziegler, "Terrestrial cosmic rays," IBM Journal of Research and Development, vol. 40, no. 1, pp. 19-39, 1996.
- Particle energy distribution at ground level:
  - For both 0.5 µm and 0.35 µm CMOS technology at ground level, the largest population has an LET of 20 MeV-cm^2/mg or less. Particles with an LET greater than 30 MeV-cm^2/mg are exceedingly rare.
  - K. J. Hass and J. W. Ambles, "Single Event Transients in Deep Submicron CMOS," Proc. 42nd Midwest Symposium on Circuits and Systems, vol. 1, 1999.
[Figure: probability density versus linear energy transfer (LET) in MeV-cm^2/mg, with the LET axis marked at 0, 15 and 30.]
11 Details of SET Generation
- (a) Along the path it traverses, the particle produces a dense radial distribution of electron-hole pairs.
- (b) Outside the depletion region, the non-equilibrium charge distribution induces a temporary funnel-shaped potential distortion along the trajectory of the event (drift component).
- (c) The funnel collapses; the diffusion component then dominates the collection process until all excess carriers have been collected, recombined, or diffused away from the junction area.
- (d) Current versus time, illustrating the charge collection and SET generation.
12 SET in CMOS Inverter
For example, in ami12 technology, when the output load capacitance is 100 fF and the cumulative collected charge is 0.65 pC, the amplitude of the voltage pulse is 0.65 pC / 100 fF = (0.65 x 10^-12 C)/(100 x 10^-15 F) = 6.5 V.
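The amplitude estimate is a charge-to-voltage conversion, V = Q/C. A minimal sketch that evaluates it for the charge and capacitance quoted on this slide, assuming an ideal capacitive node (no clamping by the supply rails and no restoring current from the driving transistor); the function name is illustrative:

```python
def set_pulse_amplitude_v(collected_charge_c: float, load_capacitance_f: float) -> float:
    """Ideal voltage excursion V = Q / C at a purely capacitive node."""
    if load_capacitance_f <= 0:
        raise ValueError("load capacitance must be positive")
    return collected_charge_c / load_capacitance_f


# Values quoted on the slide: 0.65 pC collected on a 100 fF load.
amplitude_v = set_pulse_amplitude_v(0.65e-12, 100e-15)
print(f"ideal pulse amplitude: {amplitude_v:.2f} V")
```

In a real gate the excursion is limited by the supply voltage, so this figure is an upper-bound style estimate rather than a simulated waveform peak.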
13 Original Contributions of This Research
14 Problem Statement
- Given background environment data:
  - Neutron flux
  - Background LET distribution
  - These two factors are location dependent.
- Given circuit characteristics:
  - Technology
  - Circuit netlist
  - Circuit node sensitive region data
  - These three factors depend on the circuit.
- Estimate the neutron-caused soft error rate in standard FIT units.
15 Proposed Soft Error Model
- A single event effect appears as a single event transient (SET).
- An SET has its unique characteristics like polarity, waveform, amplitude and duration.
- Environmental neutrons come from cascaded interactions when galactic cosmic rays traverse Earth's atmosphere.
16 Error Occurrence Rate
- Environmental neutron flux is N/cm^2-s, where N is the number of particles.
- Each neutron carries a different energy when it interacts with silicon.
- Not all particles with enough energy will cause an error. There is some probability P per hit for a given particle energy.
For a circuit node with sensitive region A (cm^2) and a given particle energy, the soft error probability per hit is P. If the neutron flux is N/cm^2-s, then the soft error occurrence rate at this node is (A x P x N)/s.
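The product A x P x N can be evaluated directly once the units are aligned. A minimal sketch, using the values assumed later in the experimental setup (10 µm^2 sensitive region, 10^-4 probability per hit, 56.5 m^-2 s^-1 flux); the conversion to FIT and the function names are illustrative additions, not part of the slide:

```python
SECONDS_PER_HOUR = 3600.0
CM2_PER_UM2 = 1e-8        # 1 um^2 = 1e-8 cm^2
PER_CM2_PER_M2 = 1e-4     # a flux of 1 m^-2 equals 1e-4 cm^-2


def node_error_rate_per_s(area_cm2: float, prob_per_hit: float, flux_per_cm2_s: float) -> float:
    """Soft error occurrence rate at one node: A x P x N, in errors per second."""
    return area_cm2 * prob_per_hit * flux_per_cm2_s


def rate_per_s_to_fit(rate_per_s: float) -> float:
    """Convert errors per second into FIT (errors per 10^9 hours)."""
    return rate_per_s * SECONDS_PER_HOUR * 1e9


area_cm2 = 10 * CM2_PER_UM2            # 10 um^2 sensitive region
prob_per_hit = 1e-4                    # assumed SEU probability per hit
flux_per_cm2_s = 56.5 * PER_CM2_PER_M2 # 56.5 m^-2 s^-1 expressed per cm^2

rate = node_error_rate_per_s(area_cm2, prob_per_hit, flux_per_cm2_s)
print(f"{rate:.3e} errors/s = {rate_per_s_to_fit(rate):.4f} FIT at the struck node")
```

This is the raw occurrence rate at the node where the particle strikes; it does not yet account for masking along the path to an output.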
17 Single Event Transient (SET)
- For a circuit node, a soft error occurs as a transient signal whose width depends on the energy of the striking neutron.
- The transient width determines whether it can propagate through logic gates. Transient pulse width is the interval between the Vdd/2 points.
- The LET probability density function determines the transient width density statistics.
- Typical charge collection depth L is 2 µm for bulk silicon.
- An ionizing particle with an LET of 1 MeV-cm^2/mg deposits about 10.8 fC of charge along each micron of its track.
τα is the collection time constant and τβ is the ion-track establishment time constant. Typical values for τα and τβ are 1.64 x 10^-10 s and 5 x 10^-11 s, respectively.
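The equations that accompany these parameters are not preserved in this text. The sketch below assumes the commonly used double-exponential current pulse, I(t) = Q/(τα - τβ) x (exp(-t/τα) - exp(-t/τβ)), with collected charge Q = 10.8 fC x L x LET; that functional form is an assumption, while the constants are the ones quoted above. Function names are illustrative.

```python
import math

CHARGE_PER_MICRON_FC = 10.8  # fC per micron per unit LET (MeV-cm^2/mg), from the slide
TAU_ALPHA_S = 1.64e-10       # collection time constant
TAU_BETA_S = 5e-11           # ion-track establishment time constant


def collected_charge_fc(let: float, depth_um: float = 2.0) -> float:
    """Collected charge in fC for a given LET and charge collection depth."""
    return CHARGE_PER_MICRON_FC * depth_um * let


def set_current_a(t_s: float, charge_c: float) -> float:
    """Assumed double-exponential SET current (amperes) at time t_s."""
    if t_s < 0:
        return 0.0
    scale = charge_c / (TAU_ALPHA_S - TAU_BETA_S)
    return scale * (math.exp(-t_s / TAU_ALPHA_S) - math.exp(-t_s / TAU_BETA_S))


def integrate_current_c(charge_c: float, t_end_s: float = 5e-9, steps: int = 50000) -> float:
    """Midpoint-rule integral of the current; it should recover the charge."""
    dt = t_end_s / steps
    return sum(set_current_a((i + 0.5) * dt, charge_c) * dt for i in range(steps))


charge_fc = collected_charge_fc(let=20.0)  # LET of 20 MeV-cm^2/mg, 2 um depth
charge_c = charge_fc * 1e-15
recovered_c = integrate_current_c(charge_c)
print(f"Q = {charge_fc:.1f} fC, integral of I(t) = {recovered_c * 1e15:.1f} fC")
```

The integration step is only a self-check: under the assumed waveform, the area under the current pulse equals the collected charge.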
18 Summarizing
- We model the soft error with two parameters:
  - Occurrence rate
  - Single event transient width
- Next, we propose a propagation algorithm for the modeled soft error transient pulses.
19 Pulse Width Probability Density Propagation
- X and Y are random variables:
  - X: input pulse width; Y: output pulse width
  - fX(x): probability density function of X
  - fY(y): probability density function of Y
- Given a function g such that Y = g(X):
  - g is the propagation function through a sensitized gate.
  - g depends on the PMOS and NMOS W/L ratios (pW/L, nW/L), Cload and technology.
- Assume g is differentiable and an increasing function of X, so g' and g^-1 exist. Then,
$$f_Y(y) = \frac{f_X\left(g^{-1}(y)\right)}{g'\left(g^{-1}(y)\right)}$$
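For a differentiable, increasing g, the output density follows from the change-of-variables rule fY(y) = fX(g^-1(y)) / g'(g^-1(y)). A minimal sketch that checks this numerically: the linear g(x) = 2x - 200 is a hypothetical stand-in for a gate's propagation function, and the normal input density (mean 150, standard deviation 50) borrows the values assumed in the experiments.

```python
import math


def normal_pdf(x: float, mean: float, std: float) -> float:
    """Probability density of a normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def transformed_pdf(y, pdf_x, g_inverse, g_derivative):
    """Density of Y = g(X) for a differentiable, increasing g."""
    x = g_inverse(y)
    return pdf_x(x) / g_derivative(x)


# Hypothetical increasing propagation function g(x) = SLOPE * x + OFFSET.
SLOPE, OFFSET = 2.0, -200.0


def g(x): return SLOPE * x + OFFSET
def g_inverse(y): return (y - OFFSET) / SLOPE
def g_derivative(x): return SLOPE
def pdf_x(x): return normal_pdf(x, 150.0, 50.0)
def pdf_y(y): return transformed_pdf(y, pdf_x, g_inverse, g_derivative)


def integrate(func, lower: float, upper: float, steps: int = 20000) -> float:
    """Midpoint-rule numerical integration."""
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width


# Probability mass is preserved: P(100 < X < 200) equals P(g(100) < Y < g(200)).
mass_x = integrate(pdf_x, 100.0, 200.0)
mass_y = integrate(pdf_y, g(100.0), g(200.0))
print(f"mass over X interval: {mass_x:.6f}, mass over mapped Y interval: {mass_y:.6f}")
```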
20 Propagation Rule
- We use a 3-interval piecewise linear propagation model to approximate the non-linear function g.
- The three intervals are:
  - Non-propagation, if Din ≤ tp.
  - Propagation with attenuation, if tp < Din < 2tp.
  - Propagation with no attenuation, if Din ≥ 2tp.
- Where:
  - Din: input pulse width
  - Dout: output pulse width
  - tp: gate input-to-output delay
[Figure: Dout (Y) versus Din (X), with tp and 2tp marked on the axes.]
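The three intervals can be written as a small function. This is a sketch under one stated assumption: the attenuated segment is taken to join the other two continuously, which gives Dout = 2(Din - tp) for tp < Din < 2tp; the slide's fitted equation is not preserved, so that slope is inferred rather than quoted. The gate delay tp = 100 and the function names are hypothetical, and the normal width distribution (mean 150, standard deviation 50) is the one assumed in the experiments, in the same time unit as tp.

```python
import math


def propagate_width(d_in: float, t_p: float) -> float:
    """Output pulse width under the 3-interval piecewise linear model."""
    if d_in <= t_p:
        return 0.0                    # non-propagation
    if d_in < 2.0 * t_p:
        return 2.0 * (d_in - t_p)     # attenuation (assumed continuous segment)
    return d_in                       # no attenuation


def normal_cdf(x: float, mean: float, std: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def interval_fractions(t_p: float, mean: float, std: float):
    """Fractions of input pulses that are blocked, attenuated, or unattenuated."""
    blocked = normal_cdf(t_p, mean, std)          # includes the small negative tail
    unattenuated = 1.0 - normal_cdf(2.0 * t_p, mean, std)
    attenuated = 1.0 - blocked - unattenuated
    return blocked, attenuated, unattenuated


T_P = 100.0  # hypothetical gate delay
for width in (80.0, 150.0, 250.0):
    print(f"Din = {width:5.1f} -> Dout = {propagate_width(width, T_P):5.1f}")

blocked, attenuated, unattenuated = interval_fractions(T_P, mean=150.0, std=50.0)
print(f"blocked {blocked:.3f}, attenuated {attenuated:.3f}, unattenuated {unattenuated:.3f}")
```

Splitting the input distribution this way gives a quick feel for how much electrical masking a single gate contributes before the full density propagation is carried out.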
21 Determination of Model Parameter
- We simulated a CMOS inverter using HSPICE.
- This CMOS inverter is in TSMC035 technology, with NMOS W/L ratio 0.6 µm/0.24 µm and PMOS W/L ratio 1.08 µm/0.24 µm.
- The proposed 3-interval piecewise linear equation is approximated as
[Equation from the original slide is not preserved.]
22 Pulse Width Density Propagation Through a CMOS Inverter
23 Validating Propagation Model Using HSPICE Simulation
- Simulation of a CMOS inverter in TSMC035 technology with load capacitance 10 fF
24 Logic SEU Occurrence Rate Propagation
- Because all pulse widths are greater than or equal to 0, we have
$$\int_0^{\infty} f_X(x)\,dx = 1$$
- In the fX(x) to fY(y) conversion, a fraction of the pulses is filtered out or attenuated due to electrical masking. We define the electrical masking ratio (EMR) as
[Equation from the original slide is not preserved.]
25 Soft Error Occurrence Rate Calculation for a Generic Gate
26 Experimental Results for ISCAS85 Circuits
- Assume the probability of SEU per particle hit is 10^-4.
- Assume the SET width density per circuit node follows a normal distribution with mean µ = 150 and standard deviation σ = 50 for the ground-level environment.
- At ground level, the total neutron flux is 56.5 m^-2 s^-1.
- Circuits are in TSMC035 technology and the sensitive region per node is 10 µm^2.
- For a circuit with n primary outputs and m nodes, we calculate the SER as
[Equation from the original slide is not preserved.]
27 SER Results on Workstation Sun Fire 280R
Circuit  PIs  POs  Gates  CPU (s)  FIT/gate/output
C17      5    2    6      0.01     0.3679
C432     36   7    160    0.04     1.0563
C499     41   32   202    0.14     0.2188
C880     60   26   383    0.08     0.3882
C1908    33   25   880    1.14     0.7427
C2670    233  140  1193   0.77     0.2882
C5315    178  123  2307   2.78     0.5572
C7552    207  108  3512   10.82    0.6652
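The normalized figures can be scaled back to a whole-circuit FIT. A minimal sketch, assuming the normalization is a plain division by (gates x primary outputs); under that assumption the products land within about 0.5% of the circuit totals quoted in the comparison table later in the deck. The function name is illustrative.

```python
def total_circuit_fit(fit_per_gate_per_output: float, gates: int, primary_outputs: int) -> float:
    """Undo the per-gate, per-output normalization (assumed to be a plain product)."""
    return fit_per_gate_per_output * gates * primary_outputs


# (circuit, POs, gates, FIT/gate/output from the table above,
#  total FIT quoted in the later comparison table)
rows = [
    ("C432", 7, 160, 1.0563, 1.18e3),
    ("C499", 32, 202, 0.2188, 1.41e3),
    ("C880", 26, 383, 0.3882, 3.86e3),
    ("C1908", 25, 880, 0.7427, 1.63e4),
]

for name, outputs, gates, normalized, quoted in rows:
    total = total_circuit_fit(normalized, gates, outputs)
    deviation = abs(total - quoted) / quoted
    print(f"{name}: computed {total:.1f} FIT, quoted {quoted:.3g} FIT, deviation {deviation:.2%}")
```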
28 SER Results for Inverter Chains
Circuit  PIs  POs  Gates  CPU (s)  FIT/gate
Inv2     1    1    2      0.00     0.2819
Inv5     1    1    5      0.00     0.5388
Inv10    1    1    10     0.00     0.9654
Inv20    1    1    20     0.00     1.1819
Inv50    1    1    50     0.00     4.3780
Inv100   1    1    100    0.04     8.6473
29 Methods Comparison
Factors considered     LET spec.  Re-conv. fanout  Sensitive region  Occurrence rate  Vectors applied  Location/altitude  Circuit tech.  SET degrad.
Our work               Yes        No               Yes               Yes              No               Yes                Yes            Yes
Rao et al. [1]         Yes        No               No                No               Yes              Yes                Yes            Yes
Rajaraman et al. [2]   No         No               No                No               Yes              No                 No             Yes
Asadi-Tahoori [3]      No         No               No                Yes              No               No                 No             No
Zhang-Shanbhag [4]     Yes        No               Yes               Yes              Yes              Yes                Yes            No
Rejimon-Bhanja [5]     No         No               No                Yes              Yes              No                 No             No
30 Experimental Results Comparison
                           Our approach           Rao et al. [1]          Rajaraman et al. [2]
Circuit  PI  PO  Gates     CPU (s)  FIT           CPU (s)  FIT            CPU (min)  Error prob.
C432     36  7   160       0.04     1.18 x 10^3   <0.01    1.75 x 10^-5   108        0.0725
C499     41  32  202       0.14     1.41 x 10^3   0.01     6.26 x 10^-5   216        0.0041
C880     60  26  383       0.08     3.86 x 10^3   0.01     6.07 x 10^-5   102        0.0188
C1908    33  25  880       1.14     1.63 x 10^4   0.01     7.50 x 10^-5   1073       0.0011
Computing platform         Sun Fire 280R          Pentium 2.4 GHz         Sun Fire v210
Circuit technology         TSMC035                Std. 0.13 µm            70 nm BPTM
Altitude                   Ground                 Ground                  N/A
BPTM: Berkeley Predictive Technology Model
31 More Result Comparison
                                Measured data       Logic circuit SER estimation, ground level
Devices                         SER (FIT/Mbit)      Our work          Rao et al. [1]
0.13 µm SRAMs [6]               10,000 to 100,000   1,000 to 10,000   1 x 10^-5 to 8 x 10^-5
SRAMs, 0.25 µm and below [7]    10,000 to 100,000   1,000 to 10,000   1 x 10^-5 to 8 x 10^-5
1 Gbit memory in 0.25 µm [8]    4,200               1,000 to 10,000   1 x 10^-5 to 8 x 10^-5
The altitude is not mentioned for these data.
32 Discussion
- We take the energy of the neutron to be the key factor in inducing an SEU. In real cases, there can also be secondary particles generated through interaction with neutrons.
- Estimating sensitive regions in silicon is a hard task. Also, the polarity of the SET should be taken into account.
- Because typical error rates on the earth's surface are very small, their measurement is time consuming and can produce large discrepancies. This motivates the use of analytical methods.
  - For example, a circuit may experience 1 SEU in 6 months (4320 hours), which equals 10^9/4320, or about 231,481 FIT. It is also likely that the circuit has 0 SEUs in these 6 months, in which case the measured SER is 0 FIT.
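The spread between these two outcomes can be quantified. A minimal sketch, assuming SEU arrivals follow a Poisson process (an assumption introduced here; the slides do not state a statistical model) and that the true rate is exactly 1 SEU per 4320 hours; function names are illustrative:

```python
import math


def fit_from_observation(events: int, hours: float) -> float:
    """Measured failure rate in FIT from an observed event count."""
    return events * 1e9 / hours


def poisson_probability(count: int, expected: float) -> float:
    """Probability of observing `count` events when `expected` are expected."""
    return math.exp(-expected) * expected ** count / math.factorial(count)


OBSERVATION_HOURS = 4320.0  # 6 months, as on the slide

true_fit = fit_from_observation(1, OBSERVATION_HOURS)
expected_events = true_fit * OBSERVATION_HOURS / 1e9  # 1 expected SEU in the window

p_zero = poisson_probability(0, expected_events)
p_one = poisson_probability(1, expected_events)
print(f"true rate: {true_fit:,.0f} FIT")
print(f"P(0 SEUs) = {p_zero:.3f}, P(exactly 1 SEU) = {p_one:.3f}")
```

Under this assumption, a single 6-month test is as likely to report 0 FIT as it is to report the true rate, which illustrates why short field measurements of rare events scatter so widely.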
33 Discussion Continued
- Fan-out stems should be considered. Two situations can arise:
  - When an SET goes through a large fan-out, the large load capacitance can eliminate the SET, or
  - If it is not canceled at the fan-out node, it will go through multiple fan-out paths and increase the SER.
- It is highly recommended to have more field tests for logic circuits.
- None of these SER approaches considers the effects of process variation on SER.
34 Conclusion
- SER in logic and memory chips will continue to increase as devices become more sensitive to soft errors at sea level.
- By modeling soft errors with two parameters, the occurrence rate and the single event transient pulse width density, we are able to effectively account for the electrical masking of the circuit.
- Our approach considers more factors and thus gives a more realistic soft error rate estimate.
35 Publications Related to This Work
- F. Wang and V. D. Agrawal, "Single Event Upset: An Embedded Tutorial," in Proc. 21st IEEE International Conference on VLSI Design, January 2008, pp. 429-434.
- F. Wang and V. D. Agrawal, "Soft Error Rate Determination for Nanometer CMOS VLSI Circuits," in Proc. 40th IEEE Southeastern Symposium on System Theory, March 16-18, 2008, Paper TA1.
- F. Wang and V. D. Agrawal, "Probabilistic Soft Error Rate Estimation from Statistical SEU Parameters," in Proc. 17th IEEE North Atlantic Test Workshop, May 2008.
- Unpublished work:
  - F. Wang and V. D. Agrawal, "Soft Error Considerations for Computer Web Servers."
36 References
- [1] R. R. Rao, K. Chopra, D. Blaauw, and D. Sylvester, "An Efficient Static Algorithm for Computing the Soft Error Rates of Combinational Circuits," Proc. Design, Automation and Test in Europe Conference, 2006, pp. 164-169.
- [2] R. Rajaraman, J. S. Kim, N. Vijaykrishnan, Y. Xie, and M. J. Irwin, "SEAT-LA: A Soft Error Analysis Tool for Combinational Logic," Proc. 19th International Conference on VLSI Design, 2006, pp. 499-502.
- [3] G. Asadi and M. B. Tahoori, "An Accurate SER Estimation Method Based on Propagation Probability," Proc. Design, Automation and Test in Europe Conference, 2005, pp. 306-307.
- [4] M. Zhang and N. R. Shanbhag, "A Soft Error Rate Analysis (SERA) Methodology," Proc. IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2004, pp. 111-118.
- [5] T. Rejimon and S. Bhanja, "An Accurate Probabilistic Model for Error Detection," Proc. 18th International Conference on VLSI Design, 2005, pp. 717-722.
- [6] J. Graham, "Soft errors a problem as SRAM geometries shrink," http://www.ebnews.com/story/OEG20020128S0079, ebn, 28 Jan 2002.
- [7] W. Leung, F.-C. Hsu, and M. E. Jones, "The ideal SoC memory: 1T-SRAM™," Proc. 13th Annual IEEE International ASIC/SOC Conference, 2000, pp. 32-36.
- [8] "Soft Errors in Electronic Memory - A White Paper," Technical report, Tezzaron Semiconductor, 2004.
37 Thank You