Title: Thermal Via Allocation for 3D ICs Considering Temporally and Spatially Variant Thermal Power
1. Thermal Via Allocation for 3D ICs Considering Temporally and Spatially Variant Thermal Power
- Tanay Karnik, Circuit Research Lab, Intel, USA
- Hao Yu, Yiyu Shi and Lei He, Electrical Engineering Dept., UCLA, USA
Partially supported by NSF and a UC-MICRO fund from Intel
2. New Solution for High-performance Integration
- 2D SoC design has limited density and interconnect performance
- Potential solution: 3D integration [Banerjee-Saraswat IEEE01]
- Fabrication technologies: chip-level wafer bonding or die-level silicon epitaxial growth
- Inter-layer vias play a crucial role in signaling, power delivery, and heat removal
3. Thermal Challenges in 3D ICs
- Temperature increases along the third dimension
- Inter-layer dielectric layers are poor thermal conductors
- High temperature degrades interconnect and device reliability and causes timing variations
- Thermal analysis and thermal-aware design for 3D ICs become necessary
4. Via Planning Problem
- Motivation
  - Inter-layer vias are good thermal conductors for removing heat
  - Inter-layer vias take additional chip area and routing resources
- Previous work
  - Iterative via planning during placement [Goplen-Sapatnekar ISPD05]
  - Multilevel alternating-direction via planning during routing [Zhang-Cong ICCAD05]
  - Both use steady-state analysis and assume maximum thermal power, which may lead to over-design
- Primary contributions of our work
  - Minimize a thermal-violation integral that considers transient temperature
  - Develop an efficient sensitivity-driven sequential programming method that uses a macromodel
5. Outline
- Background and Problem Formulation
- Structured and Parameterized Macromodel
- Sequential Optimization
- Experimental Results
- Conclusions
6. Thermal Model Overview
- Electrical-thermal duality
    Temperature          <->  Voltage: state variables x(t)
    Thermal-power input  <->  Current sources u(t)
    Thermal conductance  <->  Electrical conductance (G)
    Thermal capacitance  <->  Electrical capacitance (C)
- Both electrical and thermal systems can be described by an MNA (modified nodal analysis) equation
- Via conductance g_i and capacitance c_i are both proportional to the via size A_i, or to the via density A_i/a (a is the unit via area)
  - Each via can therefore be added parametrically to the MNA equation
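For reference, an MNA description of this kind is commonly written in the state-space form below. The selector matrices B and L, the output y(t), and the unit-via constants g_0 and c_0 are notation assumed here for illustration; the slide itself names only G, C, x(t), u(t), g_i, c_i, A_i, and a.

```latex
\begin{aligned}
G\,x(t) + C\,\frac{dx(t)}{dt} &= B\,u(t), \qquad y(t) = L^{T} x(t), \\
g_i &= g_0\,\frac{A_i}{a}, \qquad c_i = c_0\,\frac{A_i}{a}.
\end{aligned}
```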
7. Steady State Model and Analysis
- Active-device and inter-dielectric layers are discretized into tiles
  - Tiles are connected by thermal resistances
  - Heat sources are modeled as time-invariant current sources
- The steady-state temperature can be obtained by directly solving a time-invariant linear equation
[Figure: tile network of thermal resistances (R)]
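The steady-state solve described above can be sketched on a toy network. The 1-D chain of four tiles, the conductance values, and the helper name `chain_conductance` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def chain_conductance(n_tiles, g_tile, g_sink):
    """Conductance matrix of a 1-D chain of tiles; tile 0 is tied to ambient via g_sink."""
    G = np.zeros((n_tiles, n_tiles))
    for i in range(n_tiles - 1):
        # thermal resistor between neighbouring tiles i and i+1
        G[i, i] += g_tile
        G[i + 1, i + 1] += g_tile
        G[i, i + 1] -= g_tile
        G[i + 1, i] -= g_tile
    G[0, 0] += g_sink  # path to the heat sink (ambient reference)
    return G

G = chain_conductance(n_tiles=4, g_tile=2.0, g_sink=1.0)  # W/K, illustrative values
P = np.array([0.0, 0.0, 0.0, 1.0])                        # 1 W injected at the far tile
T_rise = np.linalg.solve(G, P)                            # temperature rise over ambient, K
print(T_rise)
```

Because all of the injected heat must flow through every resistor on its way to the sink, the temperature rise grows along the chain, which is the one-dimensional analogue of temperature increasing along the stacking direction.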
8. Transient Model and Analysis
- Tiles are connected by thermal resistances and thermal capacitances
- Heat sources are modeled as time-variant current sources
[Figure: tile network of thermal resistances and capacitances (RC)]
- The transient temperature can be obtained by solving a linear differential equation with time-varying sources
9. Thermal Power Variation and Analysis
- Different workloads and dynamic power management introduce temporal and spatial power variations
  - Thermal power is the running average of cycle-accurate power and is constant in neither space nor time
- Steady-state analysis must assume that all regions reach their maximum thermal power simultaneously
  - All parts of the chip seldom reach their maximum at the same time, so this assumption can result in over-design
- Transient analysis is accurate but time-consuming
  - Design automation calls for transient thermal simulation that is both accurate and efficient
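The over-design argument can be illustrated with a minimal sketch. It assumes two coupled tiles that are never active at the same time; the network values, the alternating power pattern, and the step size are illustrative assumptions (only the 250 ms gating period is borrowed from the experiment settings later in the deck).

```python
import numpy as np

g_sink, g_couple, c_tile = 1.0, 0.5, 1.0      # W/K, W/K, J/K (illustrative)
G = np.array([[g_sink + g_couple, -g_couple],
              [-g_couple, g_sink + g_couple]])
C = np.diag([c_tile, c_tile])
p_max, period, h, t_end = 1.0, 0.25, 0.005, 20.0

def power(t):
    """Tile 0 is active in the first half of each period, tile 1 in the second half."""
    first_half = (t % period) < period / 2
    return np.array([p_max, 0.0]) if first_half else np.array([0.0, p_max])

# Worst-case steady state: both tiles at maximum power simultaneously.
T_steady = np.linalg.solve(G, np.array([p_max, p_max]))

# Backward Euler: (G + C/h) T[n+1] = (C/h) T[n] + P(t[n+1])
lhs = G + C / h
T = np.zeros(2)
peak = np.zeros(2)
for n in range(1, int(round(t_end / h)) + 1):
    T = np.linalg.solve(lhs, C @ T / h + power(n * h))
    peak = np.maximum(peak, T)

print("steady-state worst case:", T_steady)
print("transient peak:         ", peak)
```

In this toy setting the transient peak stays below the worst-case steady-state temperature, so a via plan sized for the steady-state number would remove more heat than the workload ever requires.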
10. Thermal Violation Integral
- A thermal violation is a temperature overshoot that lasts for a long enough period, so maximum temperature alone is not a good figure of merit (FOM)
- The thermal-violation integral f_k(A) is a more accurate FOM
  - Time-domain integral of the transient temperature y above a defined ceiling temperature T_ceiling over a long enough period (t0, tp) at the kth tile
- The FOM f(A) covers a group K of critical tiles
- A is the via density vector
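A minimal numerical sketch of this figure of merit follows. It assumes that the integral accumulates only the temperature excess above T_ceiling, i.e. the integrand is max(y - T_ceiling, 0); the function name `violation_integral` and the two test waveforms are hypothetical.

```python
import numpy as np

def violation_integral(t, y, t_ceiling):
    """Trapezoidal estimate of the integral of max(y - t_ceiling, 0) over t."""
    t = np.asarray(t, dtype=float)
    excess = np.maximum(np.asarray(y, dtype=float) - t_ceiling, 0.0)
    return float(np.sum(0.5 * (excess[1:] + excess[:-1]) * np.diff(t)))

t = np.linspace(0.0, 4.0, 401)                                 # seconds
short_spike = np.where((t >= 1.0) & (t <= 1.1), 80.0, 50.0)    # brief overshoot to 80 C
long_plateau = np.where((t >= 1.0) & (t <= 3.0), 60.0, 50.0)   # sustained 60 C

f_spike = violation_integral(t, short_spike, t_ceiling=52.0)
f_plateau = violation_integral(t, long_plateau, t_ceiling=52.0)
print(f_spike, f_plateau)
```

The spike has the higher maximum temperature but the smaller integral, which is the distinction the slide draws between maximum temperature and the violation integral.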
11. Problem Formulation
- Find a via density vector A that minimizes the thermal-violation integral under global and local routing-congestion constraints
- Two keys to solving this problem efficiently
  - Efficient models of the transient response and of its first-order and second-order sensitivities with respect to via density
  - Efficient yet effective mathematical programming
[Formulation: objective with a global constraint and a local constraint]
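One way such a formulation is often written is shown below. The specific constraint forms (a total via budget A_max for the global constraint and per-tile bounds for the local constraint) and the summation of f_k over the critical tiles are assumptions made for illustration; the exact expressions from the slide are not preserved in this transcript.

```latex
\begin{aligned}
\min_{A}\quad & f(A) = \sum_{k \in K} f_k(A) \\
\text{s.t.}\quad & \sum_{i} A_i \le A_{\max} && \text{(global constraint)} \\
& A_i^{\min} \le A_i \le A_i^{\max} \quad \forall i && \text{(local constraint)}
\end{aligned}
```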
13. Macromodel by Moment Matching
[Figure: large linear network reduced to a small linear network]
- Krylov-subspace-based projection reduces model size and preserves accuracy by matching moments of the input-output response [Odabasioglu-Celik-Pileggi TCAD98]
  - Flat projection does not preserve block-matrix structure such as sparsity
  - The reduced macromodel contains no sensitivity information for design automation
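The moment-matching idea can be sketched with a single-input Arnoldi iteration followed by a congruence projection. This is a simplified illustration in the spirit of the cited approach, not the authors' implementation; the RC ladder, the port choices, and all function names are assumptions.

```python
import numpy as np

def krylov_basis(G, C, b, q):
    """Orthonormal basis of span{r, A r, ..., A^(q-1) r}, A = -G^-1 C, r = G^-1 b."""
    n = G.shape[0]
    V = np.zeros((n, q))
    v = np.linalg.solve(G, b)
    for j in range(q):
        for i in range(j):                      # modified Gram-Schmidt
            v = v - (V[:, i] @ v) * V[:, i]
        V[:, j] = v / np.linalg.norm(v)
        v = -np.linalg.solve(G, C @ V[:, j])    # next Krylov direction
    return V

def reduce_model(G, C, b, l, q):
    """Congruence projection of (G, C, b, l) onto the q-dimensional Krylov subspace."""
    V = krylov_basis(G, C, b, q)
    return V.T @ G @ V, V.T @ C @ V, V.T @ b, V.T @ l

def moment(G, C, b, l, k):
    """k-th moment of H(s) = l^T (G + s C)^-1 b about s = 0."""
    x = np.linalg.solve(G, b)
    for _ in range(k):
        x = -np.linalg.solve(G, C @ x)
    return float(l @ x)

# Illustrative 30-node RC ladder: grounded at node 0, driven at node 29, observed at node 15.
n, q = 30, 4
G = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
G[-1, -1] = 1.0
C = 0.1 * np.eye(n)
b = np.zeros(n)
b[-1] = 1.0
l = np.zeros(n)
l[15] = 1.0

Gr, Cr, br, lr = reduce_model(G, C, b, l, q)
for k in range(q):
    print(k, moment(G, C, b, l, k), moment(Gr, Cr, br, lr, k))
```

The 4 x 4 reduced model reproduces the first four moments of the 30-node ladder, but, as the slide notes, it carries no information about how the response changes with a design parameter.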
14. Parameterization (I)
- The via insertion locations are described by an adjacency matrix X
- The via density A_i is parameterized and added to the MNA equation
- The sensitivity needs to be separated from the nominal response
15. Parameterization (II)
- Expand the state variables x(A_1, ..., A_K, s) by Taylor expansion with respect to A_i (up to second order)
  - x^(0), x^(1), and x^(2) are the nominal values, first-order sensitivities, and second-order sensitivities
- The expanded system has a lower-triangular structure
  - The system size is enlarged and needs to be reduced by projection
  - Traditional flat projection cannot separate the nominal state variables from their sensitivities [Li-Pileggi ICCAD04]
  - This can be solved by a structure-preserved projection [Yu-He-Tan BMAS05]
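For a single via parameter A, the expansion and the resulting lower-triangular system can be written out as follows. The shorthand Y_0 = G_0 + sC_0 for the nominal network, Y_1 = (g + sc)X for the stamp of a unit-density via at the location given by X, and the input matrix B are notation assumed here; the multi-parameter case adds one block row per sensitivity term in the same pattern.

```latex
\begin{aligned}
(Y_0 + A\,Y_1)\,x(A,s) &= B\,u(s), \qquad
x(A,s) \approx x^{(0)} + A\,x^{(1)} + A^{2}\,x^{(2)}, \\[4pt]
\begin{bmatrix} Y_0 & 0 & 0 \\ Y_1 & Y_0 & 0 \\ 0 & Y_1 & Y_0 \end{bmatrix}
\begin{bmatrix} x^{(0)} \\ x^{(1)} \\ x^{(2)} \end{bmatrix}
&= \begin{bmatrix} B\,u(s) \\ 0 \\ 0 \end{bmatrix}.
\end{aligned}
```

Collecting equal powers of A in the first equation gives Y_0 x^(0) = Bu, Y_0 x^(1) = -Y_1 x^(0), and Y_0 x^(2) = -Y_1 x^(1), which are exactly the three block rows.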
16. Structured Projection (I)
- Partition the projection matrix block-diagonally according to the sizes of the nominal state variables, the first-order sensitivities, and the second-order sensitivities
- Structured projection results in a reduced triangular system in which the nominal values and the sensitivities can be solved independently
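The effect of the block-diagonal partition can be checked on a small example. The sketch below uses a random orthonormal matrix as a stand-in for a Krylov basis and only the conductance part of a single-parameter expanded system; the matrices and the helper names `expanded_matrix` and `structured_basis` are illustrative assumptions.

```python
import numpy as np

def expanded_matrix(M0, M1):
    """Block lower-triangular matrix coupling x(0), x(1), x(2) for one parameter."""
    Z = np.zeros_like(M0)
    return np.block([[M0, Z, Z], [M1, M0, Z], [Z, M1, M0]])

def structured_basis(V, n):
    """Split a flat basis V (3n x q) by block rows and place the pieces block-diagonally."""
    blocks = [np.linalg.qr(V[k * n:(k + 1) * n, :])[0] for k in range(3)]
    q = blocks[0].shape[1]
    Vs = np.zeros((3 * n, 3 * q))
    for k, Vk in enumerate(blocks):
        Vs[k * n:(k + 1) * n, k * q:(k + 1) * q] = Vk
    return Vs

rng = np.random.default_rng(0)
n, q = 8, 3
G0 = 2.0 * np.eye(n) - 0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))  # nominal conductance
G1 = np.zeros((n, n))                                            # stamp of one via (nodes 2, 5)
G1[2, 2] = G1[5, 5] = 1.0
G1[2, 5] = G1[5, 2] = -1.0
G_exp = expanded_matrix(G0, G1)

V_flat = np.linalg.qr(rng.standard_normal((3 * n, q)))[0]  # stand-in for a Krylov basis
G_flat = V_flat.T @ G_exp @ V_flat                         # q x q, block structure lost
Vs = structured_basis(V_flat, n)
G_struct = Vs.T @ G_exp @ Vs                               # 3q x 3q, block lower-triangular

print(np.abs(G_struct[:q, q:]).max(), np.abs(G_struct[q:2 * q, 2 * q:]).max())
```

The flat projection mixes the three block rows into one dense q x q matrix, while the structured projection keeps the zero upper blocks, so the reduced nominal block can still be solved first and its result fed forward to the sensitivity blocks.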
17. Structured Projection (II)
- The time-domain transient response can be solved with the backward-Euler method
- The nominal response and the sensitivities can be solved separately and efficiently
  - The reduced model is sparse
  - Only one LU factorization is needed, of the reduced diagonal block G0 + (1/h)C0
- The generated sensitivities can be used in any gradient-based optimization
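The shared left-hand side can be seen in a short backward-Euler sketch for one parameter. The three-node network, the via stamps G1 and C1, and the function name `simulate` are assumptions for illustration, and only the first-order sensitivity is carried.

```python
import numpy as np

def simulate(G0, C0, G1, C1, b, u, h):
    """Backward-Euler response x0[n] and first-order sensitivity x1[n] = dx/dA at A = 0."""
    n_steps, n = len(u), G0.shape[0]
    # Both recurrences share G0 + C0/h. A production code would LU-factorise it once;
    # NumPy exposes no reusable LU object, so the inverse is formed once as a stand-in.
    M_inv = np.linalg.inv(G0 + C0 / h)
    x0 = np.zeros((n_steps + 1, n))
    x1 = np.zeros((n_steps + 1, n))
    for k in range(n_steps):
        # nominal response; u[k] is the input at time (k + 1) * h
        x0[k + 1] = M_inv @ (C0 @ x0[k] / h + b * u[k])
        # sensitivity: same matrix, right-hand side driven by the nominal response
        dx0 = (x0[k + 1] - x0[k]) / h
        x1[k + 1] = M_inv @ (C0 @ x1[k] / h - G1 @ x0[k + 1] - C1 @ dx0)
    return x0, x1

G0 = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]])
C0 = np.eye(3)
G1 = np.diag([0.0, 0.0, 1.0])      # a via from node 2 to the heat sink (illustrative)
C1 = 0.1 * G1
b = np.array([0.0, 0.0, 1.0])
u = np.ones(200)                   # unit power step
h = 0.01

x0, x1 = simulate(G0, C0, G1, C1, b, u, h)
print("final temperature rise:", x0[-1])
print("d(temperature)/dA:     ", x1[-1])
```

The sensitivity at the heated node comes out negative, meaning that adding via density there lowers its temperature, which is the gradient information the optimizer consumes.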
19. Sequential Approximation of Objective Function
- The objective function f(A) can be approximated locally by a linear model f_lp or a quadratic model f_qp
- Find the update ΔA that minimizes f_lp or f_qp at each step
- The objective function becomes semi-definite when the integration is approximated by a discretized summation [Visweswariah TCAD00]
  - Sequential programming converges for convex programming problems and still converges well for semi-definite problems
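The two local models are presumably the standard first-order and second-order Taylor approximations of f around the current via density vector A; they are written out below under that assumption, since the expressions from the slide are not preserved in this transcript.

```latex
\begin{aligned}
f_{lp}(\Delta A) &= f(A) + \nabla f(A)^{T}\,\Delta A, \\
f_{qp}(\Delta A) &= f(A) + \nabla f(A)^{T}\,\Delta A
                   + \tfrac{1}{2}\,\Delta A^{T}\,\nabla^{2} f(A)\,\Delta A .
\end{aligned}
```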
20. Sensitivity Calculation
- Direct sensitivity calculation for the objective function
- Structured and parameterized reduction provides an efficient calculation of both the nominal values and the sensitivities
  - The via density vector A can be updated efficiently during each iteration
- The computation cost could be reduced further by using an adjoint Lagrangian method to calculate the sensitivity [Visweswariah TCAD00]
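A heavily simplified sketch of a sensitivity-driven sequential update follows. It replaces the macromodel with independent single-RC tiles, takes the objective as the discretized sum of temperature excess above the ceiling, and minimizes the linearized objective over a box-shaped step limit, for which the linear program has a closed-form solution. The tile model, every parameter value, and the function names are assumptions; the global routing constraint is omitted.

```python
import numpy as np

def response_and_sensitivity(A, p, h, g_base=0.02, g_via=0.01, c=0.05, t_amb=25.0):
    """Backward-Euler tile temperatures y[n] and dy/dA for via densities A (toy model)."""
    g = g_base + g_via * A            # per-tile conductance to the sink, W/K
    m = g + c / h
    rise = np.zeros_like(A)
    d_rise = np.zeros_like(A)
    y = np.empty_like(p)
    dy = np.empty_like(p)
    for n, p_n in enumerate(p):
        new_rise = (c / h * rise + p_n) / m
        # differentiate m * new_rise = (c/h) * rise + p_n with respect to A
        d_rise = (c / h * d_rise - g_via * new_rise) / m
        rise = new_rise
        y[n] = t_amb + rise
        dy[n] = d_rise
    return y, dy

def objective_and_gradient(A, p, h, t_ceiling):
    """Discretised violation integral h * sum(max(y - T_ceiling, 0)) and its gradient."""
    y, dy = response_and_sensitivity(A, p, h)
    over = y > t_ceiling
    f = h * np.sum(np.maximum(y - t_ceiling, 0.0))
    return f, h * np.sum(dy * over, axis=0)

h, t_ceiling, a_max, step = 0.01, 52.0, 40.0, 0.25
t = np.arange(1, 2001) * h
gate = (t % 5.0) < 2.5                                   # gated workload
p = np.stack([1.0 * gate, 0.5 * gate], axis=1)           # W per tile; tile 0 runs hotter

A = np.zeros(2)
history = []
for _ in range(50):
    f, grad = objective_and_gradient(A, p, h, t_ceiling)
    history.append(f)
    if f == 0.0:
        break
    # f + grad . dA is minimised on |dA_i| <= step by a full step against each gradient sign
    A = np.clip(A - step * np.sign(grad), 0.0, a_max)

print("via densities:", A)
print("objective per iteration:", history)
```

In this toy run only the tile that exceeds the ceiling receives vias, and the iteration stops as soon as the violation integral reaches zero.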
22. Experiment Settings
- A modest 3D stack with 1 heat sink, 2 die layers, and 2 dielectric layers is assumed; each layer is extracted as an RC mesh, and the layers are interconnected by an RC pair for each via
- Clock gating is assumed with a period of 250 ms
- The reduction algorithm uses SIMO (single-input-multiple-output) reduction when the number of inputs is large
- Our method (SP-MACRO) is compared with the steady-state solution
23. Accuracy of Reduced Macromodel
- Transient temperature responses of the exact and SP-MACRO models at ports 3, 18, and 58 of the top layer with a step-response input
  - The responses of the macromodels are visually identical to those of the exact models
24. Optimization Profile by SQP
- Temperature reduction at a selected location during via allocation by SQP (sequential quadratic programming)
  - The allocated vias result in a transient temperature that meets the targeted ceiling temperature of 52 °C
25. Temperature Map
- Temperature maps of the top layer before and after via allocation
  - The maximum temperature before allocation is about 150 °C
  - The temperature after allocation meets the targeted ceiling temperature of 52 °C
26. Allocated-via and Runtime Comparison
                                              Steady-state by direct solution       Transient by SP-MACRO
Total/critical    Total via   Original/       Solve-dc (s) Solve-tran (s) Allo-via  Redu-ckt (s) Solve-sens (s) QP/LP-plan (s) Allo-via
tiles             constraint  ceiling T (°C)
256/30            704         120/40          1.64         10.27          440       0.12         0.19           0.15           360
1024/60           2818        120/40          12.62        130.12         2281      1.08         0.96           0.42           1609
4096/80           5980        140/50          341.13       3872.98        5620      12.92        6.28           1.92           3217
8192/100          8218        140/50          7809.12      NA             8021      46.27        16.92          8.98           4382
16384/120         18000       160/60          NA           NA             17600     120.89       101.23         23.65          9280
32768/200         24000       160/60          NA           NA             23800     262.12       257.21         42.75          11660
- Compared with the steady-state solution
  - SP-MACRO needs less simulation and planning time as the circuit size increases
    - It reduces the runtime by 126X
  - SP-MACRO predicts the required via insertion more accurately
    - It reduces the number of inserted vias by up to 2.04X (23800 vs. 11660 vias for the largest case)
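The via-count ratios behind the 2.04X figure can be recomputed directly from the two Allo-via columns of the table. The snippet below only restates the table data; the variable names are arbitrary.

```python
tiles       = [256, 1024, 4096, 8192, 16384, 32768]
steady_vias = [440, 2281, 5620, 8021, 17600, 23800]   # Allo-via, steady-state direct solution
macro_vias  = [360, 1609, 3217, 4382, 9280, 11660]    # Allo-via, transient SP-MACRO

ratios = [s / m for s, m in zip(steady_vias, macro_vias)]
for n, r in zip(tiles, ratios):
    print(f"{n:>6} tiles: {r:.2f}x fewer vias with SP-MACRO")
```

The ratio grows with circuit size and reaches 2.04 for the 32768-tile case.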
27. Conclusions
- Via planning based on transient thermal analysis reduces the via number by 2.04X compared with steady-state thermal analysis
- An efficient via planning algorithm is developed
  - Structured and parameterized model reduction provides both nominal values and sensitivities
  - Sequential linear/quadratic programming minimizes the thermal-violation integral
- SP-MACRO is further extended to
  - Simultaneous power- and thermal-integrity-driven via planning [Yu-Ho-He ICCAD06]