2. Modeling of Chip and Package Power Network
• Chip power network (CPN) is modeled as an RC network with
independent current sources:
• RC network models the RC parasitics of on-chip power metals and vias.
• Independent current sources model the power consumption of cells.
• Package power network (PPN) is modeled as an RLC network.
• PPN might also include controlled sources.
3. Chip/Package Power Network Co-Analysis
• A hierarchical simulation flow for time-domain chip and package power
network co-analysis.
• Frequency-domain macromodeling of CPN.
• Reduce run time of PPN analysis and design.
• Inductive voltage drop (L * di/dt).
• Resonant frequency analysis.
[Figure: flow diagram with the labels "FFT", "Resonant frequency of CPN & PPN", and "Voltage drop of CPN & PPN".]
4. Time-Domain Nodal Formulation of RLC Network
• An RLC network with independent current sources can be formulated by
the following differential equations:
$$G\,x(t) + C\,\dot{x}(t) = b(t)$$
• G is the conductance matrix resulting from resistors.
• C is the matrix of energy-storage elements (capacitors and inductors).
• x(t) is a vector consisting of nodal voltages and inductor currents.
• b(t) is a vector of independent current sources.
• Solve the differential equations by the Backward-Euler (BE) method:
$$\left(G + \frac{C}{h}\right) x(t+h) = b(t+h) + \frac{C}{h}\,x(t)$$
• (G + C/h) is the coefficient matrix and h is the simulation time step.
• For each time step, solve linear equations Ax = b with A = (G + C/h).
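The time-stepping loop can be illustrated with a minimal sketch. It assumes the standard BE update (G + C/h) x(t+h) = b(t+h) + (C/h) x(t); the function name `backward_euler` and the one-node RC circuit are hypothetical and chosen only so that the result can be compared with a known exact solution.

```python
import numpy as np

def backward_euler(G, C, b, x0, h, steps):
    """Integrate G x + C dx/dt = b(t) with a fixed step h; returns all states."""
    A = G + C / h                          # coefficient matrix, constant for fixed h
    xs = [np.asarray(x0, dtype=float)]
    for k in range(1, steps + 1):
        rhs = b(k * h) + (C / h) @ xs[-1]  # right-hand side of the BE update
        xs.append(np.linalg.solve(A, rhs)) # a real solver would factor A only once
    return np.array(xs)

# Toy circuit (assumed values): one node, R = 1 ohm to ground, C = 1 F,
# driven by a 1 A current source. Exact solution: v(t) = 1 - exp(-t).
G = np.array([[1.0]])
C = np.array([[1.0]])
b = lambda t: np.array([1.0])
h = 1e-3
xs = backward_euler(G, C, b, x0=[0.0], h=h, steps=1000)
v_end = xs[-1, 0]                          # voltage at t = 1 s
v_exact = 1.0 - np.exp(-1.0)
```

Because the coefficient matrix does not change when h is fixed, a production flow would factor it once and reuse the factors at every step; the sketch calls a dense solver each step only for brevity.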
5. Basics of Solving Linear Equations
• The linear equations Ax = b can be solved in 3 steps:
• Step 1: LU factorization
• A = LU, where L is an (n x n) lower triangular matrix and
U is an (n x n) upper triangular matrix.
• Step 2: Forward substitution
• LUx = b → solve Ly = b for y, where y = Ux.
• Step 3: Backward substitution
• Solve Ux = y for x.
• If matrix A is symmetric positive definite (SPD), it can be decomposed
by Cholesky factorization $A = LL^T$.
• Cholesky factorization is about 2X faster than LU factorization (for dense matrices it needs roughly half the floating-point operations).
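The three steps can be sketched directly in NumPy. The helper names (`lu_nopivot`, `forward_sub`, `backward_sub`) and the 3 x 3 matrix are hypothetical; the LU routine omits pivoting, which is an assumption that is only safe for well-behaved matrices such as the SPD example used here.

```python
import numpy as np

def lu_nopivot(A):
    """Step 1: Doolittle LU without pivoting, A = L U (L has a unit diagonal)."""
    n = A.shape[0]
    L = np.eye(n)
    U = A.astype(float).copy()
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]
            U[i, k:] -= L[i, k] * U[k, k:]
    return L, U

def forward_sub(L, b):
    """Step 2: solve L y = b for a lower triangular L."""
    y = np.zeros(len(b))
    for i in range(len(b)):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def backward_sub(U, y):
    """Step 3: solve U x = y for an upper triangular U."""
    x = np.zeros(len(y))
    for i in reversed(range(len(y))):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])            # small SPD example (assumed values)
b = np.array([1.0, 2.0, 3.0])

L, U = lu_nopivot(A)
x_lu = backward_sub(U, forward_sub(L, b))

# Same solve through Cholesky, A = Lc Lc^T, reusing the two substitutions.
Lc = np.linalg.cholesky(A)
x_chol = backward_sub(Lc.T, forward_sub(Lc, b))
```

Both paths use the same forward and backward substitutions; only the factorization step differs.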
6. Hierarchical Analysis of Chip/Package Power Network
• CPN is modeled as an RC network. → coefficient matrix is SPD
• PPN is modeled as an RLC network. → coefficient matrix is asymmetric
• If CPN and PPN are connected and solved together (flat flow):
• The coefficient matrix is asymmetric.
• The coefficient matrix must be decomposed by LU factorization.
• Solve CPN and PPN by a hierarchical simulation flow [1]:
• Step 0: Cholesky factorization of the CPN coefficient matrix.
• Step 1: Generate the time-domain model of CPN.
• Step 2: Solve PPN with the CPN model by a general matrix solver.
• Step 3: Solve CPN by an SPD matrix solver.
• Since the run time of Step 0 dominates and Cholesky factorization is cheaper
than the LU factorization required by the flat flow, the hierarchical flow is
faster than the flat flow.
7. Hierarchical Analysis of Chip/Package Power Network
• Nodal formulation of CPN with m ports and n internal nodes:
• $v_p = [v_{p1}, v_{p2}, \ldots, v_{pm}]^T$ is the (m x 1) vector of port voltages.
• $v_I$ is the (n x 1) vector of internal node voltages.
• $i_p = [i_{p1}, i_{p2}, \ldots, i_{pm}]^T$ is the (m x 1) vector of port currents.
• $i_1$ and $i_2$ are the (m x 1) and (n x 1) vectors of internal current sources.
• $G_p$ is the (m x m) matrix of RC elements connecting to port nodes.
• $G_I$ is the (n x n) matrix of RC elements connecting to internal nodes.
• $G_c$ is the (m x n) matrix of RC elements connecting port and internal nodes.
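The formulation that these definitions describe did not survive extraction. One reconstruction that matches the stated dimensions and the relation Ip = Avp + S used in the later steps is sketched here. The sign convention is an assumption (port currents flow into the CPN, and $i_1$, $i_2$ are counted as injected currents), and in the time-domain flow the blocks are presumably blocks of the BE coefficient matrix (G + C/h).

```latex
\begin{bmatrix} G_p & G_c \\ G_c^T & G_I \end{bmatrix}
\begin{bmatrix} v_p \\ v_I \end{bmatrix}
=
\begin{bmatrix} i_p + i_1 \\ i_2 \end{bmatrix}
```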
[Figure: CPN block (RC + current sources) with port voltages vp1, vp2, ..., vpm and port currents ip1, ip2, ..., ipm.]
8. Hierarchical Analysis of Chip/Package Power Network
• Step 1: Generate the time-domain macromodel of CPN: $i_p = A v_p + S$
• A is the (m x m) port admittance matrix.
• S is the (m x 1) port current vector (Norton short-circuit currents).
• A and S together represent the time-domain macromodel of CPN.
• The coefficient matrix of CPN is SPD and can be decomposed by Cholesky
factorization: $G_I = LL^T$
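As a hedged sketch of where A and S come from: assume the partitioned system has the form $[G_p\ G_c;\ G_c^T\ G_I]\,[v_p;\ v_I] = [i_p + i_1;\ i_2]$ (the sign convention is an assumption). Eliminating the internal voltages then gives the macromodel as a Schur complement, which is why only $G_I$ needs to be factorized:

```latex
v_I = G_I^{-1}\left(i_2 - G_c^T v_p\right)
\qquad\Longrightarrow\qquad
i_p = \underbrace{\left(G_p - G_c\,G_I^{-1}\,G_c^T\right)}_{A}\, v_p
      + \underbrace{\left(G_c\,G_I^{-1}\, i_2 - i_1\right)}_{S}
```

With $G_I = LL^T$, each product $G_I^{-1}(\cdot)$ is a forward substitution followed by a backward substitution, so the factors can be reused for A, for S, and later for recovering $v_I$.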
9. Hierarchical Analysis of Chip/Package Power Network
• Step 2: Simulate PPN with the CPN macromodel to get port voltages:
• The coefficient matrix is asymmetric and must be factorized by LU.
• The dimension of the coefficient matrix is much smaller than that of CPN.
• Solving (PPN + CPN macromodel) is much faster than solving CPN!
[Figure: PPN (RLC) connected to the CPN macromodel, drawn as admittance matrix A with port current sources s1, s2, ..., sm.]
10. Hierarchical Analysis of Chip/Package Power Network
• Step 3: Simulate CPN with port currents as boundary conditions:
• Solve port currents $i_p$ by the following formula: $i_p = A v_p + S$
• Solve CPN with $i_p$ as boundary conditions:
[Figure: CPN block (RC + current sources) with port voltages vp1, vp2, ..., vpm and port currents ip1, ip2, ..., ipm.]
11. Hierarchical Analysis of Chip/Package Power Network
• Summary of computational steps:
• Step 0: Cholesky factorization of the coefficient matrix of CPN:
• Step 1: Generate time-domain macromodel of CPN:
• Step 2: Solve PPN and CPN macromodel to get port voltages vp:
• Step 3: Solve CPN with Ip = Avp + S as boundary conditions:
• Since Cholesky factorization still applies in the hierarchical flow, it is faster
than the flat flow, in which only LU factorization applies.
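Steps 0 to 3 can be sketched for a single time step on a toy problem and checked against a flat solve. Everything concrete here is an assumption made for illustration: the random SPD matrix stands in for the CPN coefficient matrix, the sign convention treats $i_1$, $i_2$ as injected currents, and the PPN is reduced to a resistive tie of each port to an ideal supply (a real PPN is RLC and yields an asymmetric system).

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 2, 6                                  # ports, internal nodes (toy sizes)

# Random SPD stand-in for the CPN matrix, partitioned as [[Gp, Gc], [Gc^T, GI]].
M = rng.standard_normal((m + n, m + n))
K = M @ M.T + (m + n) * np.eye(m + n)
Gp, Gc, GI = K[:m, :m], K[:m, m:], K[m:, m:]
i1 = -rng.random(m)                          # injected currents at port nodes
i2 = -rng.random(n)                          # injected currents at internal nodes

# Step 0: Cholesky factorization GI = L L^T, reused for every solve with GI.
L = np.linalg.cholesky(GI)
solve_GI = lambda rhs: np.linalg.solve(L.T, np.linalg.solve(L, rhs))

# Step 1: macromodel i_p = A v_p + S.
A = Gp - Gc @ solve_GI(Gc.T)
S = Gc @ solve_GI(i2) - i1

# Step 2: toy PPN, i_p = g_pkg * (vdd - v_p); solve the small (m x m) system.
g_pkg, vdd = 5.0, 1.0
vp = np.linalg.solve(A + g_pkg * np.eye(m), g_pkg * vdd * np.ones(m) - S)

# Step 3: port currents as boundary conditions, then internal voltages.
ip = A @ vp + S
vI = solve_GI(i2 - Gc.T @ vp)

# Reference: flat solve of CPN and toy PPN together.
K_flat = K.copy()
K_flat[:m, :m] += g_pkg * np.eye(m)
rhs_flat = np.concatenate([i1 + g_pkg * vdd * np.ones(m), i2])
x_flat = np.linalg.solve(K_flat, rhs_flat)
```

The only large factorization is the Cholesky of $G_I$; the general solver in Step 2 works on an (m x m) system, which is the source of the speed-up claimed for the hierarchical flow.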
12. Frequency-Domain Nodal Formulation of CPN
• Apply Laplace transform on time-domain nodal formulation:
• All independent current sources are removed.
• Derive frequency-domain admittance matrix Y(s):
• Simplify the computation of Y(s) by congruence transformation.
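The equations for this slide were lost in extraction. A reconstruction under stated assumptions: G and C are partitioned like the CPN matrix above, with $G_c$ and $C_c$ the (m x n) port-to-internal blocks, and the names $C_p$, $C_c$, $C_I$ for the capacitance blocks are assumed by analogy with $G_p$, $G_c$, $G_I$. With the current sources removed, eliminating the internal voltages gives Y(s) as a Schur complement:

```latex
\begin{bmatrix} G_p + sC_p & G_c + sC_c \\ (G_c + sC_c)^T & G_I + sC_I \end{bmatrix}
\begin{bmatrix} V_p(s) \\ V_I(s) \end{bmatrix}
=
\begin{bmatrix} I_p(s) \\ 0 \end{bmatrix}

I_p(s) = Y(s)\,V_p(s), \qquad
Y(s) = (G_p + sC_p) - (G_c + sC_c)\,(G_I + sC_I)^{-1}\,(G_c + sC_c)^T
```

Evaluating this form directly requires a solve with $(G_I + sC_I)$ at every frequency point, which is the cost the congruence transformation is meant to avoid.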
13. Congruence Transformation
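• The congruence transformation of a square matrix A by matrix X: $A' = X^T A X$
• Consider a nonsingular square matrix X with dimension (m+n):
$$X = \begin{bmatrix} I & 0 \\ X_1 & X_2 \end{bmatrix}$$
• I is the (m x m) identity matrix.
• $X_1$ is an (n x m) matrix.
• $X_2$ is an (n x n) nonsingular matrix.
• The inverse matrix $X^{-1}$:
$$X^{-1} = \begin{bmatrix} I & 0 \\ -X_2^{-1} X_1 & X_2^{-1} \end{bmatrix}$$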
14. Congruence Transformation
• Apply the congruence transformation to matrices G and C by matrix X: $G' = X^T G X$, $C' = X^T C X$
• $v_p$ and $i_p$ are NOT affected by the congruence transformations of G and C.
• The resulting admittance matrix Y’(s) is identical to Y(s).
• After the congruence transformations of matrices G and C by matrix X, the
admittance matrix Y(s) is preserved!
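The invariance claim can be checked numerically. The sketch below assumes X has the block form [[I, 0], [X1, X2]] with an (m x m) identity in the top-left corner; the random matrices, the helper names, and the test frequency are all hypothetical.

```python
import numpy as np

def port_admittance(G, C, m, s):
    """Y(s) at the first m nodes, obtained by eliminating the remaining nodes."""
    K = G + s * C
    return K[:m, :m] - K[:m, m:] @ np.linalg.solve(K[m:, m:], K[m:, :m])

rng = np.random.default_rng(1)
m, n = 2, 5                                  # ports, internal nodes (toy sizes)

def random_spd(k):
    """Random symmetric positive definite matrix (stand-in for G or C)."""
    M = rng.standard_normal((k, k))
    return M @ M.T + k * np.eye(k)

G, C = random_spd(m + n), random_spd(m + n)

# X = [[I, 0], [X1, X2]]: identity on the ports, arbitrary X1, nonsingular X2.
X = np.eye(m + n)
X[m:, :m] = rng.standard_normal((n, m))      # X1
X[m:, m:] = random_spd(n)                    # X2 (SPD, hence nonsingular)

G_t = X.T @ G @ X                            # congruence transformation of G
C_t = X.T @ C @ X                            # congruence transformation of C

s = 0.3 + 2.0j                               # arbitrary complex frequency
Y = port_admittance(G, C, m, s)
Y_t = port_admittance(G_t, C_t, m, s)        # agrees with Y up to rounding
```

The identity block is what keeps the port variables untouched; replacing it with any other matrix would change the port voltages and currents and therefore Y(s).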
15. Pole Analysis via Congruence Transformations
• Pole Analysis via Congruence Transformations (PACT) was proposed by
Kevin Kerns et al. [2] to reduce multiport RC networks.
• The algorithm applies two congruence transformations to matrices G
and C to simplify the admittance matrix Y(s).
• The 1st congruence transformation matrix $X_1$:
• Since the coefficient matrix $G_I$ is symmetric positive definite, it can be
decomposed by Cholesky factorization: $G_I = LL^T$.
• The first congruence transformation matrix $X_1$ is defined as:
16. Pole Analysis via Congruence Transformations
• Apply the congruence transformation to G and C by $X_1$:
• Derive the admittance matrix Y(s):
• The poles of Y(s) are the roots of the following equation:
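The formulas for the first transformation were lost in extraction. The reconstruction below follows the formulation in [2] and is a sketch under one notational assumption: $G_c$ and $C_c$ denote the (n x m) internal-to-port blocks, which is the convention the computational steps ($G_I A = G_c$, $L C_c' = B$) appear to use. If the (m x n) definition is used instead, replace $G_c$ and $C_c$ by their transposes.

```latex
X_1 = \begin{bmatrix} I & 0 \\ -G_I^{-1} G_c & L^{-T} \end{bmatrix},
\qquad
G' = X_1^T G X_1 = \begin{bmatrix} G_p' & 0 \\ 0 & I \end{bmatrix},
\qquad
C' = X_1^T C X_1 = \begin{bmatrix} C_p' & C_c'^{\,T} \\ C_c' & C_I' \end{bmatrix}

A = G_I^{-1} G_c, \quad B = C_c - C_I A, \quad
G_p' = G_p - G_c^T A, \quad
C_p' = C_p - A^T B - C_c^T A, \quad
C_c' = L^{-1} B, \quad
C_I' = L^{-1} C_I L^{-T}

Y(s) = G_p' + sC_p' - s^2\, C_c'^{\,T} \left(I + sC_I'\right)^{-1} C_c',
\qquad
\det\left(I + sC_I'\right) = 0
```

Because the internal block of G' is the identity, the poles depend only on $C_I'$: each eigenvalue $\lambda_i > 0$ of $C_I'$ gives a pole at $s = -1/\lambda_i$.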
17. Pole Analysis via Congruence Transformations
• The 2nd congruence transformation matrix $X_2$:
• Let U denote a square matrix whose columns are the orthonormal eigenvectors of $C_I'$,
and form the 2nd congruence transformation matrix:
$$X_2 = \begin{bmatrix} I & 0 \\ 0 & U \end{bmatrix}$$
• Apply the congruence transformation to G’ and C’ by $X_2$:
• $C_I'' = U^T C_I' U$ is a diagonal matrix consisting of the eigenvalues of $C_I'$.
18. Pole Analysis via Congruence Transformations
• Derive the admittance matrix Y(s):
• $r_i$ is the i-th row of $C_c''$.
• $\lambda_i$ is the i-th eigenvalue of $C_I'$.
• To derive $r_i$ and $\lambda_i$, we need to compute the eigenvalues and
eigenvectors of matrix $C_I'$.
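The expression for Y(s) on this slide was lost in extraction. Assuming the form given in [2], with $C_I'' = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ and $r_i$ taken as a (1 x m) row vector, the admittance separates into one term per pole:

```latex
Y(s) = G_p' + sC_p' - \sum_{i=1}^{n} \frac{s^2\, r_i^T r_i}{1 + s\lambda_i}
```

Each term has a pole at $s = -1/\lambda_i$, so the largest eigenvalues correspond to the poles closest to the origin. Keeping only the k largest $\lambda_i$ truncates the sum to k terms, which is where the model order reduction comes from.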
19. Invariant Subspace
• Let A be an (n x n) square matrix and S be a subspace of $\mathbb{R}^n$:
• If $Ax \in S$ for any $x \in S$, the subspace S is invariant under matrix A.
• Let $S_k$ be a k-dimensional subspace spanned by linearly
independent vectors $\{s_1, s_2, \ldots, s_k\}$ and matrix $S = [s_1\ s_2\ \ldots\ s_k]$:
• $S_k$ is invariant under A if and only if there exists a (k x k) matrix B
such that AS = SB.
• Every eigenvalue of B is an eigenvalue of A.
• If v is an eigenvector of B with eigenvalue λ, Sv is an eigenvector of A
with eigenvalue λ.
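The last two statements follow in one line from AS = SB; the short derivation is spelled out here as a reading aid:

```latex
Bv = \lambda v
\;\Longrightarrow\;
A(Sv) = (AS)v = (SB)v = S(Bv) = \lambda\,(Sv)
```

Since the columns of S are linearly independent and $v \neq 0$, the vector $Sv$ is nonzero, so it is a genuine eigenvector of A.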
20. Krylov Subspace
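• For an (n x n) matrix A and nonzero vector $x \in \mathbb{R}^n$, the subspace spanned
by vectors $\{x, Ax, A^2x, A^3x, \ldots\}$ is called the Krylov subspace.
• Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of A with $\lambda_1 > \lambda_2 > \ldots > \lambda_n$.
• Let $v_1, v_2, \ldots, v_n$ be the corresponding eigenvectors.
• Represent vector x as a linear combination of eigenvectors: $x = c_1 v_1 + c_2 v_2 + \ldots + c_n v_n$, so that $A^k x = c_1 \lambda_1^k v_1 + c_2 \lambda_2^k v_2 + \ldots + c_n \lambda_n^k v_n$.
• As k increases, vectors $A^k x$ and $A^{k+1} x$ become more nearly linearly dependent, because both are dominated by the $v_1$ component (assuming $\lambda_1$ is the largest in magnitude and $c_1 \neq 0$).
• If $A^k x$ and $A^{k+1} x$ are linearly dependent, the subspace spanned by
vectors $x, Ax, \ldots, A^k x$ is invariant under A.
• We can find an invariant subspace of A from the Krylov subspace.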
21. Arnoldi Iteration
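• Given an (n x n) matrix A and nonzero vector $x \in \mathbb{R}^n$, Arnoldi iteration
derives an orthonormal basis from the Krylov subspace [3] as follows:
• Step 1: $u_1 = x / \|x\|$
• Step 2: $h_{11} = u_1^T A u_1$, $w = A u_1 - h_{11} u_1$, $h_{21} = \|w\|$, $u_2 = w / h_{21}$
• Step j+1: $h_{ij} = u_i^T A u_j$ for $i = 1, \ldots, j$; $w = A u_j - \sum_{i=1}^{j} h_{ij} u_i$; $h_{j+1,j} = \|w\|$; $u_{j+1} = w / h_{j+1,j}$
• The iterations stop if $h_{j+1,j} = 0$.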
22. Arnoldi Iteration
• Arnoldi iteration with (k+1) steps can be represented in matrix form: $A U_k = U_k H_k + h_{k+1,k}\, u_{k+1} e_k^T$
• $U_k$ is the (n x k) matrix whose columns are $u_1, u_2, \ldots, u_k$.
• $H_k$ is the (k x k) matrix whose (i, j) entry is $h_{ij}$.
• $e_k$ is the k-th column of the (k x k) identity matrix.
• Assume $h_{k+1,k}$ is close to zero, so that $A U_k \approx U_k H_k$:
• The eigenvalues of $H_k$ approximate eigenvalues of A.
• If v is an eigenvector of $H_k$ with eigenvalue λ, $U_k v$ is an approximate eigenvector of A with
eigenvalue λ.
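A minimal Arnoldi sketch, written with modified Gram-Schmidt orthogonalization (an implementation choice, not something the slides specify). The function name `arnoldi`, the tolerance, and the test matrix with three well-separated large eigenvalues are hypothetical.

```python
import numpy as np

def arnoldi(matvec, x, k, tol=1e-12):
    """Return U_k (n x k), H_k (k x k) and the residual entry h_{k+1,k}."""
    n = x.shape[0]
    U = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    U[:, 0] = x / np.linalg.norm(x)
    for j in range(k):
        w = matvec(U[:, j])
        for i in range(j + 1):               # orthogonalize against u_1 .. u_{j+1}
            H[i, j] = U[:, i] @ w
            w = w - H[i, j] * U[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < tol:                # invariant subspace found, stop early
            k = j + 1
            break
        U[:, j + 1] = w / H[j + 1, j]
    return U[:, :k], H[:k, :k], H[k, k - 1]

# Symmetric test matrix with known eigenvalues: 10, 8, 6 and 47 values in [0, 1).
rng = np.random.default_rng(3)
n = 50
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
d = np.concatenate([[10.0, 8.0, 6.0], rng.random(n - 3)])
A = Q @ np.diag(d) @ Q.T

U, H, h_next = arnoldi(lambda v: A @ v, rng.standard_normal(n), k=15)
ritz = np.sort(np.linalg.eigvals(H).real)[::-1]   # eigenvalues of H_k, descending
```

The routine only needs a matrix-vector product, so a matrix such as $C_I' = L^{-1} C_I L^{-T}$ never has to be formed explicitly; `matvec` can apply the two triangular solves and the multiplication by $C_I$ in sequence.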
23. Computational Steps of PACT Algorithm
• Computations of the 1st congruence transformation:
• Step 1: Compute $G_p'$
• Step 1.1: Cholesky factorization of matrix $G_I$
• Step 1.2: Solve ($G_I A = G_c$) to get matrix A
• Step 1.3: Compute matrix $G_p'$
• Step 2: Compute $C_p'$
• Step 2.1: Compute matrix B
• Step 2.2: Compute matrix $C_p'$
24. Computational Steps of PACT Algorithm
• Computations of the 1st congruence transformation:
• Step 3: Compute $C_c'$
• Solve ($L C_c' = B$) to get matrix $C_c'$
• Step 4: Compute the k largest eigenpairs of matrix $C_I'$ by Arnoldi iterations:
• For (i = 1, 2, …, k+1) {
Compute $C_I' u_i$:
- Solve ($L^T w = u_i$) to get vector w
- Compute vector $y = C_I w$
- Solve ($Lz = y$) to get vector z
Orthonormalize $C_I' u_i$ against $u_1, u_2, \ldots, u_i$
}
• When Arnoldi iteration stops, we get $U_k$ and $H_k$.
25. Computational Steps of PACT Algorithm
• Computations of the 2nd congruence transformation:
• Step 5: Compute the eigenpairs of the (k x k) matrix $H_k$:
• The k largest eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_k$ of $C_I'$ are approximated by the eigenvalues of $H_k$.
• If $v_1, v_2, \ldots, v_k$ are the eigenvectors of matrix $H_k$, the corresponding
eigenvectors of matrix $C_I'$ are $U_k v_1, U_k v_2, \ldots, U_k v_k$.
• Step 6: Let the (n x k) matrix $U = [U_k v_1\ U_k v_2\ \ldots\ U_k v_k]$, compute matrix $C_c'' = U^T C_c'$.
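Steps 1 to 6 can be sketched end to end on a small random RC-like system. Several things here are assumptions: $G_c$ and $C_c$ are taken as the (n x m) internal-to-port blocks so that $G_I A = G_c$ and $L C_c' = B$ are dimensionally consistent, the formulas for $G_p'$, B and $C_p'$ follow [2], and a dense symmetric eigensolver replaces the Arnoldi iteration purely for brevity. The names `pact_reduce` and `y_reduced` are hypothetical.

```python
import numpy as np

def pact_reduce(Gp, Gc, GI, Cp, Cc, CI, k):
    """Return Gp', Cp', Cc'' (k x m) and the k largest eigenvalues of CI'."""
    L = np.linalg.cholesky(GI)                        # Step 1.1: GI = L L^T
    A = np.linalg.solve(L.T, np.linalg.solve(L, Gc))  # Step 1.2: GI A = Gc
    Gp1 = Gp - Gc.T @ A                               # Step 1.3: Gp'
    B = Cc - CI @ A                                   # Step 2.1: B
    Cp1 = Cp - A.T @ B - Cc.T @ A                     # Step 2.2: Cp'
    Cc1 = np.linalg.solve(L, B)                       # Step 3: L Cc' = B
    # Steps 4-5: eigenpairs of CI' = L^{-1} CI L^{-T} (dense eigh, not Arnoldi).
    CI1 = np.linalg.solve(L, np.linalg.solve(L, CI).T)
    lam, V = np.linalg.eigh(CI1)
    keep = np.argsort(lam)[::-1][:k]                  # k largest eigenvalues
    lam, U = lam[keep], V[:, keep]
    Cc2 = U.T @ Cc1                                   # Step 6: Cc'' = U^T Cc'
    return Gp1, Cp1, Cc2, lam

def y_reduced(Gp1, Cp1, Cc2, lam, s):
    """Y(s) = Gp' + s Cp' - sum_i s^2 r_i^T r_i / (1 + s lam_i)."""
    return Gp1 + s * Cp1 - (Cc2.T * (s**2 / (1 + s * lam))) @ Cc2

rng = np.random.default_rng(2)
m, n = 2, 6                                           # ports, internal nodes

def random_spd(size):
    M = rng.standard_normal((size, size))
    return M @ M.T + size * np.eye(size)

G, C = random_spd(m + n), random_spd(m + n)
Gp, Gc, GI = G[:m, :m], G[m:, :m], G[m:, m:]
Cp, Cc, CI = C[:m, :m], C[m:, :m], C[m:, m:]

s = 0.7j
K = G + s * C
Y_full = K[:m, :m] - K[:m, m:] @ np.linalg.solve(K[m:, m:], K[m:, :m])
Y_pact = y_reduced(*pact_reduce(Gp, Gc, GI, Cp, Cc, CI, k=n), s)
```

With k = n nothing is truncated, so the pole-residue form reproduces the full Y(s) up to rounding; choosing k < n keeps only the poles nearest the origin and trades accuracy at high frequency for a smaller model.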
26. Simulation Results
                                                 Run Time (sec)
Design          # of Ports  # of Internal Nodes  Port current generation (500 steps)  MOR     Disk Usage (M bytes)  Max Error *
Design A (vdd)  4           2,198,073            658                                  109     440                   2.2%
Design B (vdd)  8           7,101,589            3,392                                784     1,648                 8.8%
Design C (vdd)  44          64,875,371           13,685                               13,371  45,672                6.2%
Design C (gnd)  44          121,069,896          34,160                               30,361  81,358                9.8%
* The error of peak voltage drop/rise between the reduced and full P/G networks
27. Simulation Results
• Choose one port and overlay the voltage waveforms derived from the hierarchical
analysis and the frequency-domain model, which are shown in green and yellow
respectively.
[Figure: Design A (vdd) port voltage waveforms, annotated with |Error| values of 2.2%, 2.1%, 2.2%, and 2.2%.]
31. References
[1] M. Zhao, R. V. Panda, S. S. Sapatnekar, and D. Blaauw, “Hierarchical analysis of
power distribution networks,” IEEE Trans. on Computer-Aided Design of
Integrated Circuits and Systems, vol. 21, no. 2, pp. 159-168, Feb. 2002.
[2] K. J. Kerns and A. T. Yang, “Stable and efficient reduction of large, multiport RC
networks by pole analysis via congruence transformations,” IEEE Trans. on
Computer-Aided Design of Integrated Circuits and Systems, vol. 16, no. 7, pp.
734-744, July 1997.
[3] W. E. Arnoldi, “The principle of minimized iterations in the solution of the
matrix eigenvalue problem,” Quart. Appl. Math., vol. 9, pp. 17-29, 1951.