These are the draft slides we used for our DAC 2014 presentation.
Abstract: We propose MATEX, a distributed framework for transient simulation of power distribution networks (PDNs). MATEX uses a matrix exponential kernel with Krylov subspace approximations to solve the differential equations of linear circuits. First, the whole simulation task is divided into subtasks based on a decomposition of the current sources, in order to reduce computational overhead. These subtasks are then distributed to different computing nodes and processed in parallel. Within each node, after the matrix factorization at the beginning of the simulation, the adaptive time-stepping solver runs without any further matrix re-factorization. MATEX overcomes the stiffness obstacle of previous matrix exponential-based circuit simulators by using a rational Krylov subspace method, which allows larger step sizes with smaller Krylov subspace bases and greatly accelerates the whole computation. MATEX outperforms both traditional fixed and adaptive time-stepping methods, e.g., achieving a speedup of around 13X over the trapezoidal framework with a fixed time step on the IBM power grid benchmarks.
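To make the contrast between a matrix exponential step and a trapezoidal step concrete, here is a minimal scalar sketch. It is not MATEX: it assumes a hypothetical single-node RC circuit, dx/dt = (u - x) / tau, with the input u held constant over each step, so the "matrix exponential" reduces to an ordinary exponential and no Krylov subspace is needed. The function names and parameter values are illustrative assumptions.

```python
import math

def exponential_step(x, u, h, tau):
    """Exact update for dx/dt = (u - x) / tau when u is constant over the step."""
    decay = math.exp(-h / tau)
    return decay * x + (1.0 - decay) * u

def trapezoidal_step(x, u, h, tau):
    """Trapezoidal (implicit) update for the same equation."""
    a = h / (2.0 * tau)
    # Solve x1 = x + (h / 2) * ((u - x) / tau + (u - x1) / tau) for x1.
    return ((1.0 - a) * x + 2.0 * a * u) / (1.0 + a)

if __name__ == "__main__":
    exact = 1.0 - math.exp(-1.0)           # analytic value of x(1) for x(0)=0, u=1, tau=1
    one_big = exponential_step(0.0, 1.0, 1.0, 1.0)
    trap = trapezoidal_step(0.0, 1.0, 1.0, 1.0)
    print("exponential error:", abs(one_big - exact))
    print("trapezoidal error:", abs(trap - exact))
```

In this toy case the exponential update has no step-size-dependent truncation error, while the trapezoidal update does. In the full matrix setting the exponential has to be approximated (for example with a Krylov subspace), which this sketch does not attempt.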
1. The document proposes an algorithmic framework for large-scale circuit simulation using exponential integrators. It uses exponential Rosenbrock methods and an invert Krylov subspace approach to efficiently compute the matrix exponential-vector product to solve the circuit equations explicitly without needing Newton-Raphson iterations.
2. The framework was shown to accurately simulate benchmark circuits while achieving speedups over traditional approaches. It can handle large-scale, strongly coupled circuits that traditional methods have difficulty with.
3. Future work includes exploring parallelization opportunities to further accelerate the method using multicore/many-core systems and developing additional tools based on the proposed derivatives-based approach.
Principal component analysis and matrix factorizations for learning (part 1) ... - zukun
This document discusses principal component analysis (PCA) and matrix factorizations for learning. It provides an overview of PCA and singular value decomposition (SVD), their history and applications. PCA and SVD are widely used techniques for dimensionality reduction and data transformation. The document also discusses how PCA relates to other methods like spectral clustering and correspondence analysis.
Artificial Neural Networks Lect8: Neural networks for constrained optimization - Mohammed Bennamoun
This document summarizes a lecture on using neural networks for constrained optimization problems. It introduces the Boltzmann machine and continuous Hopfield nets, which are neural network architectures that can find solutions to constrained problems like the Traveling Salesman Problem (TSP). The Boltzmann machine uses a probabilistic update procedure and simulated annealing to search for optimal solutions represented by the network weights, which encode the problem constraints. Its architecture has units arranged in rows and columns with weights that encourage at most one unit active per row and column. The algorithm iteratively proposes state changes and accepts or rejects them probabilistically based on consensus function changes.
This document discusses mathematical modeling for control systems engineering. It introduces mathematical models as simplified representations of physical systems using assumptions. There are two common types of mathematical models: transfer functions in the frequency domain and state equations in the time domain. The document outlines steps for creating mathematical models using laws of physics and engineering, and describes various modeling techniques including transfer functions, Laplace transforms, and partial fraction expansions. It emphasizes that mathematical modeling permits representing physical systems as separate entities that can be algebraically manipulated.
I am Charles B. I am a Computer Science Homework Help Expert at eduassignmenthelp.com. I hold a Master's Degree in computer science from Texas University, USA. I have been helping students with their homework for the past 7 years. I solve assignments related to Computer Science.
Visit eduassignmenthelp.com or email info@eduassignmenthelp.com. You can also call +1 678 648 4277 for any assistance with Computer Science assignments.
This document provides an overview of phonons and lattice dynamics as well as tips for using the phonopy software package. It discusses the theory of phonons in crystals and the harmonic and quasi-harmonic approximations. It also outlines the workflow for using phonopy to calculate forces, construct the dynamical matrix, and post-process results to obtain phonon dispersions, densities of states, and thermal properties. Helpful tips are provided for optimizing VASP settings for force calculations and manipulating phonopy settings and output files.
Area-Delay Efficient Binary Adders in QCA - IJERA Editor
In this paper, a novel quantum-dot cellular automata (QCA) adder design is presented that decreases the number of QCA cells compared to previous designs. The proposed one-bit QCA adder is based on a new algorithm that requires only three majority gates (MGs) and two inverters for the QCA addition. A novel 128-bit adder designed in QCA was implemented. It achieved higher speed than all existing QCA adders, with an area requirement comparable to that of the established low-cost RCA and CFA designs. The novel adder operates in the RCA fashion, but it propagates a carry signal through significantly fewer cascaded MGs than conventional RCA adders. In addition, because of the adopted basic logic and layout strategy, the number of clock cycles required to complete the addition is limited. As transistors shrink, more and more of them can be accommodated on a single die, thus increasing chip computational capabilities. However, transistors cannot get much smaller than their current size. The QCA approach represents one possible way to overcome this physical limit, even though the design of logic modules in QCA is not always straightforward.
This document summarizes a method for calculating the sensitivity matrix that defines the linear relationship between circuit parameters and poles/response of an RLC network. The sensitivity matrix enables efficient statistical analysis and yield predictions. It is obtained by taking derivatives of the poles and transfer function, which are calculated from the eigenvalues and eigenvectors of the network's state equation. An example RLC circuit demonstrates calculating the sensitivity matrix and using it to predict yield based on Monte Carlo simulations.
The document describes an experiment using MATLAB to implement the Newton-Raphson load flow method to analyze a 3-bus power system network. The Newton-Raphson method approximates non-linear power flow equations using Taylor series expansion, allowing faster convergence compared to other methods. The experiment specifies bus voltages, real/reactive power demands and generations, and solves for reactive power output using a tolerance of 0.01 power mismatch. The MATLAB code is run and the load flow solution is obtained.
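The Newton-Raphson idea summarized above can be sketched on a much smaller problem than the 3-bus network. The example below is an assumption-laden toy, not the MATLAB code from the experiment: it solves the single power-angle equation of a hypothetical lossless two-bus line, P = (V1 V2 / X) sin(delta), for the angle delta, using the same first-order Taylor linearization and a mismatch tolerance. All names and values are illustrative.

```python
import math

def newton_raphson(f, dfdx, x0, tol=1e-8, max_iter=50):
    """Find a root of f by repeatedly linearizing it around the current guess."""
    x = x0
    for _ in range(max_iter):
        mismatch = f(x)
        if abs(mismatch) < tol:
            return x
        # First-order Taylor expansion: f(x + dx) ~ f(x) + f'(x) * dx = 0.
        x -= mismatch / dfdx(x)
    raise RuntimeError("Newton-Raphson did not converge")

if __name__ == "__main__":
    v1, v2, reactance, p_demand = 1.0, 1.0, 0.5, 1.0   # hypothetical per-unit values
    gain = v1 * v2 / reactance
    delta = newton_raphson(
        lambda d: gain * math.sin(d) - p_demand,   # power mismatch
        lambda d: gain * math.cos(d),              # its derivative (a 1x1 Jacobian)
        x0=0.0,
    )
    print("angle in degrees:", math.degrees(delta))
```

A real load flow replaces the scalar derivative with the full Jacobian of the power mismatch equations and solves a linear system at each iteration.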
The document discusses neural networks based on competition. It describes three fixed-weight competitive neural networks: Maxnet, Mexican Hat, and Hamming Net. Maxnet uses winner-take-all competition where only the neuron with the largest activation remains active. The Mexican Hat network enhances the activation of neurons receiving a stronger external signal by applying positive weights to nearby neurons and negative weights to those further away. An example demonstrates how the Mexican Hat network increases contrast over iterations.
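The winner-take-all behaviour of Maxnet can be sketched in a few lines. This is a minimal illustration under common textbook assumptions (non-negative initial activations, a mutual inhibition weight epsilon smaller than 1/m for m units, and a ramp activation that clips negative values to zero); the function name and the sample values are hypothetical and are not taken from the summarized document.

```python
def maxnet(activations, epsilon, max_iter=1000):
    """Iterate mutual inhibition until at most one unit remains active."""
    a = list(activations)
    for _ in range(max_iter):
        if sum(1 for value in a if value > 0) <= 1:
            break
        total = sum(a)
        # Each unit keeps its own activation and is inhibited by all the others.
        a = [max(0.0, value - epsilon * (total - value)) for value in a]
    return a

if __name__ == "__main__":
    final = maxnet([0.2, 0.4, 0.6, 0.8], epsilon=0.2)
    winners = [index for index, value in enumerate(final) if value > 0]
    print("final activations:", final)
    print("winning unit:", winners)
```

The iteration cap matters because two units with exactly equal maximal activations inhibit each other symmetrically and never produce a single winner.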
The document discusses power flow analysis and solutions using the Gauss-Seidel method. It describes setting up the bus admittance matrix and node-voltage equations based on impedance values between nodes. The Gauss-Seidel method is then used to iteratively solve the nonlinear power flow equations to determine bus voltages and power flows by updating the solution for one variable at a time. Instructions are provided on applying the method to different bus types including slack, PQ and PV buses.
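As a rough sketch of the one-variable-at-a-time update described above, the code below applies Gauss-Seidel to a network in which bus 0 is the slack bus and every other bus is a PQ bus. PV buses are deliberately left out, and the two-bus data (line impedance, load) are hypothetical values chosen only for illustration.

```python
def gauss_seidel_power_flow(ybus, s_injected, v_slack=1.0 + 0.0j, tol=1e-10, max_iter=200):
    """Solve for bus voltages; bus 0 is the slack bus, all other buses are PQ buses."""
    n = len(ybus)
    v = [v_slack] + [1.0 + 0.0j] * (n - 1)   # flat start
    for _ in range(max_iter):
        largest_change = 0.0
        for i in range(1, n):
            # Current drawn through the off-diagonal admittances, using the newest voltages.
            coupling = sum(ybus[i][k] * v[k] for k in range(n) if k != i)
            # From I_i = conj(S_i / V_i) = sum_k Y_ik V_k, solved for V_i.
            new_v = (s_injected[i].conjugate() / v[i].conjugate() - coupling) / ybus[i][i]
            largest_change = max(largest_change, abs(new_v - v[i]))
            v[i] = new_v
        if largest_change < tol:
            return v
    raise RuntimeError("Gauss-Seidel did not converge")

if __name__ == "__main__":
    y_line = 1.0 / (0.02 + 0.04j)                # hypothetical line admittance (per unit)
    ybus = [[y_line, -y_line], [-y_line, y_line]]
    s_injected = [0j, -(0.5 + 0.2j)]             # bus 1 carries a load, so injection is negative
    voltages = gauss_seidel_power_flow(ybus, s_injected)
    print("bus 1 voltage magnitude:", abs(voltages[1]))
```

Because each bus update immediately reuses the latest voltages of the other buses, no separate copy of the previous iterate is kept, which is what distinguishes Gauss-Seidel from a Jacobi-style sweep.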
Wang-Landau Monte Carlo simulation is a method for calculating the density of states function which can then be used to calculate thermodynamic properties like the mean value of variables. It improves on traditional Monte Carlo methods which struggle at low temperatures due to complicated energy landscapes with many local minima separated by large barriers. The Wang-Landau algorithm calculates the density of states function directly rather than relying on sampling configurations, allowing it to overcome barriers and fully explore the configuration space even at low temperatures.
1) The document presents designs for reversible logic gates and their applications in low power circuits. It proposes an improved design for a reversible programmable logic array (RPLA) using multiplexer and Feynman gates that is more efficient than existing designs.
2) It also proposes a method for structuring a reversible arithmetic logic unit (ALU) using reversible logic gates instead of traditional gates, achieving the same functionality with reduced information loss.
3) The RPLA design is demonstrated by implementing reversible 1-bit full adders and subtractors. Simulation results show the proposed design optimizes the number of reversible gates used.
This document summarizes research on using algebraic multigrid (AMG) methods to solve equations modeling porous media flow. The key points are:
1) A spectral element-based AMG method is used to build a coarse level that represents important components in the problem's near-nullspace added by high contrast in properties.
2) Directly applying standard AMG to the resulting coarse problem is ineffective since it has a different structure than assumed.
3) A "three-level" approach is taken where the coarse problem is transformed to match assumptions of a standard AMG, which enables accurate and scalable solution of problems with millions of unknowns.
The document provides an overview of MATLAB including its basic elements and functions. It discusses MATLAB's interface, variables, basic arithmetic and matrix operations, plotting functions, programming with MATLAB using control structures like if/else and for loops, and creating user-defined functions. Examples are provided throughout to demonstrate various MATLAB commands.
The document discusses using neural networks to solve the traveling salesman problem (TSP). It describes Hopfield neural networks and how they can be used for optimization problems like TSP. It then discusses a concurrent neural network approach that requires fewer neurons (N(logN)) than Hopfield for TSP. The document compares the performance of concurrent neural networks to Hopfield networks and self-organizing maps on TSP test cases, finding concurrent networks converge faster but are less reliable than self-organizing maps.
An overview of the Phonopy (and Phono3py) lattice-dynamics codes, covering features, examples, applications and troubleshooting (2014 presentation updated for 2015).
This document discusses different graph kernel methods including shortest path kernel, graphlet kernel, and Weisfeiler-Lehman kernel. It outlines the algorithms for each kernel and describes how they are used to compute similarity between graphs. An experiment is described that tests the performance of each kernel on different types of graph datasets using 10-fold SVM classification. The graphlet kernel achieved the highest accuracy while shortest path kernel had the lowest. Graphlet kernel also had the highest computational time complexity.
The document discusses computational electromagnetics and the finite element method. It provides 7 steps for the finite element method: 1) divide the problem domain into sub-domains, 2) approximate the potential for each element, 3) find the potential for each element in terms of end points, 4) find the energy for each element, 5) find the total energy, 6) obtain the general solution, and 7) obtain a unique solution by applying boundary conditions. The finite element method is useful for problems with complex geometries and boundary conditions that cannot be solved analytically.
The document discusses the Fast Decoupled Load Flow (FDLF) method for solving load flow problems. FDLF is based on the Newton-Raphson method but further simplifies the load flow equations by assuming that active power changes are more sensitive to voltage angle changes and reactive power changes are more sensitive to voltage magnitude changes. This allows the Jacobian matrix to be separated into two square submatrices related to voltage angle and magnitude. FDLF requires fewer iterations than Newton-Raphson, has higher reliability, and is faster and uses less storage. The method is physically justifiable and can be used in optimization studies involving multiple load flow solutions.
International Journal of Computational Engineering Research (IJCER) - ijceronline
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
Load Flow Analysis of Jamshoro Thermal Power Station (JTPS) Pakistan Using MA... - sunny katyara
This article summarizes a study analyzing the load flow of Jamshoro Thermal Power Station (JTPS) in Pakistan using MATLAB programming. The study models the power plant and transmission network in MATLAB to calculate active and reactive power flows, line losses, voltage profiles and angles at different buses. This provides information for efficient scheduling and future planning of the power system. MATLAB code was developed using the Gauss-Seidel iterative method to solve the load flow equations. The results provide voltage magnitudes and angles at each bus and active/reactive power flows on each transmission line. This analysis can help optimize the economic operation and future expansion of the JTPS power system.
Researchers such as Landauer and Bennett have shown that every bit of information lost generates kT ln 2 joules of energy, whereas this energy dissipation would not occur if the computation were carried out in a reversible way. Here k is Boltzmann’s constant and T is the absolute temperature at which the computation is performed. Reversible circuits are therefore among the most important solutions to heat dissipation in future circuit design. Reversible computing is motivated by the Von Neumann-Landauer (VNL) principle, a theorem of modern physics telling us that ordinary irreversible logic operations (which destructively overwrite previous outputs) incur a fundamental [...] that performance on most applications within realistic power constraints might still continue increasing indefinitely. Reversible logic is also a core part of the quantum circuit model.
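To give a sense of scale for the bound quoted above, the short sketch below evaluates k T ln 2 numerically, with the logarithm taken as the natural logarithm. The temperature of 300 K is an assumed room-temperature value, and the function name is hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23   # Boltzmann's constant (exact SI value)

def landauer_limit_joules(temperature_kelvin):
    """Minimum energy dissipated when one bit of information is erased."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2.0)

if __name__ == "__main__":
    per_bit = landauer_limit_joules(300.0)   # assumed room temperature
    print("energy per erased bit at 300 K:", per_bit, "J")
    print("energy for 1e9 erased bits:", per_bit * 1e9, "J")
```

At 300 K the result is on the order of 3 × 10^-21 J per erased bit.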
Cost Efficient PageRank Computation using GPU : NOTES - Subhajit Sahu
This document describes research on efficiently computing PageRank using GPUs. The key points are:
1. PageRank is computed using an iterative power method, which can be sped up using GPU parallelization. Common operations like sparse matrix-vector multiplication and vector operations are implemented using CUDA libraries.
2. Experiments show the parallel GPU implementation significantly outperforms the serial CPU implementation in terms of time taken for convergence, especially for large web graphs. Faster convergence is also achieved for higher damping factors.
3. Periodic use of Aitken extrapolation can further accelerate convergence by refining the vectors between power iterations. Results show reductions in the number of iterations required for convergence compared to the standard power method.
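The iterative power method mentioned in point 1 can be sketched serially as follows. This is a plain CPU illustration under stated assumptions, not the GPU implementation from the notes: it uses an adjacency-list graph, spreads the rank of dangling nodes uniformly, and omits Aitken extrapolation. The function name, the tiny sample graph, and the tolerance are hypothetical.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration on a graph given as out_links[u] = list of nodes u links to."""
    n = len(out_links)
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # Rank held by nodes without out-links is redistributed uniformly.
        dangling = sum(rank[u] for u in range(n) if not out_links[u])
        new_rank = [(1.0 - damping) / n + damping * dangling / n] * n
        for u, targets in enumerate(out_links):
            for v in targets:
                new_rank[v] += damping * rank[u] / len(targets)
        change = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if change < tol:
            break
    return rank

if __name__ == "__main__":
    graph = [[1, 2], [2], [0]]   # node 0 links to 1 and 2, node 1 to 2, node 2 to 0
    for node, score in enumerate(pagerank(graph)):
        print(node, round(score, 4))
```

The inner double loop is effectively a sparse matrix-vector product, which is the operation a GPU version would hand to a parallel library.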
Manifold Blurring Mean Shift algorithms for manifold denoising, presentation,... - Florent Renucci
(General) To retrieve a clean dataset by deleting outliers.
(Computer Vision) the recovery of a digital image that has been contaminated by additive white Gaussian noise.
We propose an efficient algorithmic framework for time-domain circuit simulation using exponential integrators. This work addresses several critical issues exposed by previous matrix exponential-based circuit simulation research, and makes the approach capable of simulating stiff nonlinear circuit systems at a large scale. In this framework, the system’s nonlinearity is treated with the exponential Rosenbrock-Euler formulation. The matrix exponential-vector product is computed using the invert Krylov subspace method. Our proposed method has several distinct advantages over conventional formulations (e.g., the well-known backward Euler with Newton-Raphson method). The matrix factorization is performed only for the conductance/resistance matrix G, and not for the combinations of the capacitance/inductance matrix C and matrix G that are used in traditional implicit formulations. Furthermore, due to the explicit nature of our formulation, we do not need to repeat LU decompositions when adjusting the length of time steps for error control. Our algorithm is better suited to solving tightly coupled post-layout circuits in pursuit of full-chip simulation. Our experimental results validate the advantages of our framework.
This document summarizes Andrew Myers' presentation on controlling numerical error in particle-in-cell simulations of collisionless dark matter. Standard PIC methods do not converge for cosmology applications. Two modifications are discussed: regularization, which involves replacing cloud-in-cell interpolation with higher-order kernels; and adaptive remapping to reduce noise from particle discreteness. While these techniques improve arithmetic intensity and convergence for plasma simulations, evidence suggests they may not significantly reduce errors for cosmology simulations of dark matter.
Linear regression [Theory and Application (In physics point of view) using py...ANIRBANMAJUMDAR18
Machine-learning models are behind many recent technological advances, including high-accuracy translations of the text and self-driving cars. They are also increasingly used by researchers to help in solving physics problems, like Finding new phases of matter, Detecting interesting outliers
in data from high-energy physics experiments, Founding astronomical objects are known as gravitational lenses in maps of the night sky etc. The rudimentary algorithm that every Machine Learning enthusiast starts with is a linear regression algorithm. In statistics, linear regression is a linear approach to modelling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent
variables). Linear regression analysis (least squares) is used in a physics lab to prepare the computer-aided report and to fit data. In this article, the application is made to experiment: 'DETERMINATION OF DIELECTRIC CONSTANT OF NON-CONDUCTING LIQUIDS'. The entire computation is made through Python 3.6 programming language in this article.
Tensor Spectral Clustering is an algorithm that generalizes graph partitioning and spectral clustering methods to account for higher-order network structures. It defines a new objective function called motif conductance that measures how partitions cut motifs like triangles in addition to edges. The algorithm represents a tensor of higher-order random walk transitions as a matrix and computes eigenvectors to find a partition that minimizes the number of motifs cut, allowing networks to be clustered based on higher-order connectivity patterns. Experiments on synthetic and real networks show it can discover meaningful partitions by accounting for motifs that capture important structural relationships.
1) The document discusses fault analysis and short circuit studies, which are important for designing protective schemes for power systems. It defines different types of faults like symmetrical, shunt, series, and explains their characteristics.
2) Bus impedance matrix calculation methods like bus building algorithm are explained. The algorithm allows modification of the matrix when network changes without complete rebuilding.
3) An example demonstrates using the bus building algorithm to determine the bus impedance matrix for a sample network. Assumptions and applications of short circuit studies in protection are also summarized.
Virus, Vaccines, Genes and Quantum - 2020-06-18Aritra Sarkar
This document discusses using a quantum computer to simulate DNA-based vaccines by indexing and aligning short DNA reads to a reference genome. It describes superimposing the reference genome segmented into short reads and evolving via controlled operations to the Hamming distance against the short read. The maximum probability entry indicates the alignment index. Steps include 1) superposing the indexed reference segments, 2) evolving via controlled operations to the Hamming distance, and 3) finding the maximum probability entry indicating the alignment index.
A chaotic particle swarm optimization (cpso) algorithm for solving optimal re...Alexander Decker
This document presents a chaotic particle swarm optimization (CPSO) algorithm for solving the multi-objective reactive power dispatch problem. The CPSO algorithm aims to avoid premature convergence by fusing ergodic and stochastic chaos. It formulates reactive power dispatch as an optimization problem with two objectives: minimizing real power losses and maximizing static voltage stability margin. The CPSO is tested on the IEEE 30 bus system and is shown to reduce power losses and maximize voltage stability more than other algorithms.
A chaotic particle swarm optimization (cpso) algorithm for solving optimal re...Alexander Decker
This document presents a chaotic particle swarm optimization (CPSO) algorithm for solving the multi-objective reactive power dispatch problem. The CPSO algorithm aims to avoid premature convergence by fusing ergodic and stochastic chaos. It formulates reactive power dispatch as an optimization problem with two objectives: minimizing real power losses and maximizing static voltage stability margin. The CPSO is tested on the IEEE 30 bus system and is shown to reduce power losses and maximize voltage stability more than other algorithms.
Approaches to online quantile estimationData Con LA
Data Con LA 2020
Description
This talk will explore and compare several compact data structures for estimation of quantiles on streams, including a discussion of how they balance accuracy against computational resource efficiency. A new approach providing more flexibility in specifying how computational resources should be expended across the distribution will also be explained. Quantiles (e.g., median, 99th percentile) are fundamental summary statistics of one-dimensional distributions. They are particularly important for SLA-type calculations and characterizing latency distributions, but unlike their simpler counterparts such as the mean and standard deviation, their computation is somewhat more expensive. The increasing importance of stream processing (in observability and other domains) and the impossibility of exact online quantile calculation together motivate the construction of compact data structures for estimation of quantiles on streams. In this talk we will explore and compare several such data structures (e.g., moment-based, KLL sketch, t-digest) with an eye towards how they balance accuracy against resource efficiency, theoretical guarantees, and desirable properties such as mergeability. We will also discuss a recent variation of the t-digest which provides more flexibility in specifying how computational resources should be expended across the distribution. No prior knowledge of the subject is assumed. Some familiarity with the general problem area would be helpful but is not required.
Speaker
Joe Ross, Splunk, Principal Data Scientist
- The document details a state space solver approach for analog mixed-signal simulations using SystemC. It models analog circuits as sets of linear differential equations and solves them using the Runge-Kutta method of numerical integration.
- Two examples are provided: a digital voltage regulator simulation and a digital phase locked loop simulation. Both analog circuits are modeled in state space and simulated alongside a digital design to verify mixed-signal behavior.
- The state space approach allows modeling analog circuits without transistor-level details, improving simulation speed over traditional mixed-mode simulations while still capturing system-level behavior.
The document describes a self-study course on finite element methods for structural analysis developed by Dr. Naveen Rastogi. The course is intended as a 3-credit, semester-long course for senior undergraduate and graduate engineering students. It covers foundational concepts of finite element analysis including shape functions, element stiffness matrices, and static/dynamic/thermal analysis of structures. Practice problems are provided to help students gain a better understanding of the subject matter.
Fundamentals of quantum computing part i revPRADOSH K. ROY
This document provides an introduction to the fundamentals of quantum computing. It discusses computational complexity classes such as P and NP and essential matrix algebra concepts like Hermitian, unitary, and normal matrices. It also contrasts the classical and quantum worlds. In the quantum world, quantum systems can exist in superposition states and qubits can represent more than just binary 0s and 1s. The document introduces the concept of a qubit register and how multiple qubits can be represented using tensor products. It discusses characteristics of quantum systems like superposition, Born's rule for probabilities, and the measurement postulate which causes wavefunction collapse.
- Chaos is the aperiodic behavior in deterministic systems that exhibits sensitivity to initial conditions. Systems can be categorized as deterministic, stochastic, or chaotic.
- Chaotic communication systems use chaotic signals as carriers due to their noise-like properties like broad spectrum and self-synchronization. Digital messages have been successfully sent over long distances using chaotic modulation.
- Examples of chaotic systems include the Lorenz, Rossler, and Chen systems which are continuous dynamical systems, and the logistic map and tent map which are discrete dynamical systems. Chaotic systems demonstrate properties like sensitivity to initial conditions and system parameters.
Using Machine Learning to Measure the Cross Section of Top Quark Pairs in the...m.a.kirn
Malina Kirn's 2011-09-06 University of Maryland Scientific Computation dissertation defense. Using neural networks and grid computing to measure top quark pair production cross section at the Compact Muon Solenoid detector at the Large Hadron Collider.
This paper proposes using a nonlinear autoregressive exogenous (NARX) feedback neural network for reference current generation in a 3-phase shunt active filter. The NARX network processes current quantities and a unit vector template to estimate the fundamental active component of the load current and the reference source currents. This allows determining the load current components for all three phases in a single step, without repeating calculations for each phase. Simulation results show the proposed method allows the shunt active filter to ensure sinusoidal and balanced supply currents at unity power factor under different load and supply conditions, with total harmonic distortion well within limits.
Hardware Acceleration for Machine LearningCastLabKAIST
This document provides an overview of a lecture on hardware acceleration for machine learning. The lecture will cover deep neural network models like convolutional neural networks and recurrent neural networks. It will also discuss various hardware accelerators developed for machine learning, including those designed for mobile/edge and cloud computing environments. The instructor's background and the agenda topics are also outlined.
Reduction of Active Power Loss byUsing Adaptive Cat Swarm Optimizationijeei-iaes
This paper presents, an Adaptive Cat Swarm Optimization (ACSO) for solving reactive power dispatch problem. Cat Swarm Optimization (CSO) is one of the new-fangled swarm intelligence algorithms for finding the most excellent global solution. Because of complication, sometimes conventional CSO takes a lengthy time to converge and cannot attain the precise solution. For solving reactive power dispatch problem and to improve the convergence accuracy level, we propose a new adaptive CSO namely ‘Adaptive Cat Swarm Optimization’ (ACSO). First, we take account of a new-fangled adaptive inertia weight to velocity equation and then employ an adaptive acceleration coefficient. Second, by utilizing the information of two previous or next dimensions and applying a new-fangled factor, we attain to a new position update equation composing the average of position and velocity information. The projected ACSO has been tested on standard IEEE 57 bus test system and simulation results shows clearly about the high-quality performance of the planned algorithm in tumbling the real power loss.
Machine learning ppt and presentation codesharma239172
Principal Component Analysis (PCA) is a technique for dimensionality reduction that projects high-dimensional data onto a lower-dimensional space in a way that maximizes variance. It works by finding the directions (principal components) along which the variance of the data is highest. These principal components become the new axes of the reduced space. PCA involves computing the covariance matrix of the data, performing eigendecomposition on the covariance matrix to obtain its eigenvectors, and projecting the data onto the top K eigenvectors corresponding to the largest eigenvalues, where K is the target dimensionality. This projection both reduces dimensionality and maximizes retained variance.
Embedded machine learning-based road conditions and driving behavior monitoringIJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team shares a deep dive into performance acceleration with Gradle build cache optimizations. Sinan shares their journey into solving complex build-cache problems that affect Gradle builds. By understanding the challenges and solutions found in our journey, we aim to demonstrate the possibilities for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
Introduction- e - waste – definition - sources of e-waste– hazardous substances in e-waste - effects of e-waste on environment and human health- need for e-waste management– e-waste handling rules - waste minimization techniques for managing e-waste – recycling of e-waste - disposal treatment methods of e- waste – mechanism of extraction of precious metal from leaching solution-global Scenario of E-waste – E-waste in India- case studies.
Comparative analysis between traditional aquaponics and reconstructed aquapon...bijceesjournal
The aquaponic system of planting is a method that does not require soil usage. It is a method that only needs water, fish, lava rocks (a substitute for soil), and plants. Aquaponic systems are sustainable and environmentally friendly. Its use not only helps to plant in small spaces but also helps reduce artificial chemical use and minimizes excess water use, as aquaponics consumes 90% less water than soil-based gardening. The study applied a descriptive and experimental design to assess and compare conventional and reconstructed aquaponic methods for reproducing tomatoes. The researchers created an observation checklist to determine the significant factors of the study. The study aims to determine the significant difference between traditional aquaponics and reconstructed aquaponics systems propagating tomatoes in terms of height, weight, girth, and number of fruits. The reconstructed aquaponics system’s higher growth yield results in a much more nourished crop than the traditional aquaponics system. It is superior in its number of fruits, height, weight, and girth measurement. Moreover, the reconstructed aquaponics system is proven to eliminate all the hindrances present in the traditional aquaponics system, which are overcrowding of fish, algae growth, pest problems, contaminated water, and dead fish.
Batteries -Introduction – Types of Batteries – discharging and charging of battery - characteristics of battery –battery rating- various tests on battery- – Primary battery: silver button cell- Secondary battery :Ni-Cd battery-modern battery: lithium ion battery-maintenance of batteries-choices of batteries for electric vehicle applications.
Fuel Cells: Introduction- importance and classification of fuel cells - description, principle, components, applications of fuel cells: H2-O2 fuel cell, alkaline fuel cell, molten carbonate fuel cell and direct methanol fuel cells.
The Python for beginners. This is an advance computer language.
MATEX @ DAC14
1. MATEX: A Distributed Framework of Transient Simulation for Power Distribution Networks
Hao Zhuang1*, Shih-Hung Weng2, Jeng-Hau Lin1, Chung-Kuan Cheng1
1 Computer Science & Engineering Dept., University of California, San Diego, CA
2 Facebook Inc., Menlo Park, CA
* Email: zhuangh@ucsd.edu
3. Problem Formulation for PDN Transient Simulation
Linear differential equations: $\mathbf{C}\dot{\mathbf{x}}(t) = -\mathbf{G}\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t)$
Tens of millions to billions of unknowns
$\mathbf{C}$: capacitance/inductance matrix
$\mathbf{G}$: conductance matrix
$\mathbf{x}(t)$: voltage/current vector
$\mathbf{B}$: input selection matrix
$\mathbf{u}(t)$: input current sources (vector)
[Figure: PDN structure and its RLC model]
4. Previous Work
Time step size $h$ is determined by:
Input transition distances, which define the upper bound of the time step, e.g. $h_2 = \min(h_1, h_2, h_3)$
Stiffness of systems
Local truncation error (LTE)
[Figure: a pulse input example with step sizes $h_1$, $h_2$, $h_3$]
Low order approximations, e.g. trapezoidal method (TR): $\left(\frac{\mathbf{C}}{h} + \frac{\mathbf{G}}{2}\right)\mathbf{x}(t+h) = \left(\frac{\mathbf{C}}{h} - \frac{\mathbf{G}}{2}\right)\mathbf{x}(t) + \mathbf{B}\,\frac{\mathbf{u}(t+h) + \mathbf{u}(t)}{2}$
TR with fixed time step $h$ was used by the top solvers in the TAU'12 power grid (PG) simulation contest
Efficient for IBM PG benchmarks
Only one matrix factorization for transient stepping
Forward and backward substitutions are processed to calculate $\mathbf{x}(t+h)$
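As a minimal sketch of the fixed-step TR scheme described above, the following factorizes $\mathbf{C}/h + \mathbf{G}/2$ once and then performs only forward and backward substitutions per step. The 2-node RC values, the function name `tr_fixed_step`, and the use of dense SciPy LU routines are assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def tr_fixed_step(C, G, B, u, x0, h, n_steps):
    """Trapezoidal stepping for C x' = -G x + B u(t) with a fixed step h."""
    lhs = lu_factor(C / h + G / 2.0)      # factorized once, reused at every step
    rhs_mat = C / h - G / 2.0
    x, t, out = x0.copy(), 0.0, [x0.copy()]
    for _ in range(n_steps):
        rhs = rhs_mat @ x + B @ (u(t + h) + u(t)) / 2.0
        x = lu_solve(lhs, rhs)            # forward and backward substitution only
        t += h
        out.append(x.copy())
    return np.array(out)

# Hypothetical 2-node RC example (values chosen only for illustration).
C = np.diag([1.0, 2.0])
G = np.array([[2.0, -1.0], [-1.0, 2.0]])
B = np.array([[1.0], [0.0]])
u = lambda t: np.array([1.0])             # constant current source
xs = tr_fixed_step(C, G, B, u, np.zeros(2), h=0.01, n_steps=5000)
```

Because $h$ is embedded in the factorized matrix, changing the step size in this scheme would require a new factorization.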
6. Advantage in Accuracy
With the same $h$, the matrix exponential method can reach the reference solution, while backward Euler cannot.
[Figure: comparison against the reference solution]
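7. Not $e^{\mathbf{A}}$, but $e^{\mathbf{A}}\mathbf{v}$ [Weng et al., IEEE TCAD 2012]
Computing $e^{\mathbf{A}}$ is very expensive when $\mathbf{A}$ is large!
$e^{\mathbf{A}}\mathbf{v}$: Matrix Exponential and Vector Product (MEVP)
Efficiently approximated via Krylov subspace (MEXP)
Standard Krylov subspace: $K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}$
Basis generation: $\mathbf{V}_m = [\mathbf{v}_1, \mathbf{v}_2, \cdots, \mathbf{v}_m]$
Arnoldi process and matrix reduction: $\mathbf{A}\mathbf{V}_m = \mathbf{V}_m\mathbf{H}_m + h_{m+1,m}\,\mathbf{v}_{m+1}\mathbf{e}_m^{\mathsf{T}}$
MEVP is computed by $e^{\mathbf{A}}\mathbf{v} \approx \|\mathbf{v}\|_2\,\mathbf{V}_m e^{\mathbf{H}_m}\mathbf{e}_1$
Time stepping only by scaling $h$: $e^{h\mathbf{A}}\mathbf{v} \approx \|\mathbf{v}\|_2\,\mathbf{V}_m e^{h\mathbf{H}_m}\mathbf{e}_1$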
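The Arnoldi-based MEVP approximation above can be sketched as follows. This is a dense illustration, not the authors' implementation: the tridiagonal test matrix, the breakdown tolerance, and the names `arnoldi` and `mevp` are assumptions.

```python
import numpy as np
from scipy.linalg import expm

def arnoldi(A, v, m):
    """Build an orthonormal basis V_m of K_m(A, v) and the Hessenberg matrix H_m."""
    n = v.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):                 # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:                # breakdown: the subspace is invariant
            return V[:, :j + 1], H[:j + 1, :j + 1], beta
        V[:, j + 1] = w / H[j + 1, j]
    return V[:, :m], H[:m, :m], beta

def mevp(V, H, beta, h=1.0):
    """Approximate e^{hA} v as ||v||_2 V_m e^{h H_m} e_1."""
    e1 = np.zeros(H.shape[0])
    e1[0] = 1.0
    return beta * (V @ (expm(h * H) @ e1))

# Hypothetical test matrix; a dense expm is used only as a reference.
n = 50
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
v = np.arange(1.0, n + 1)
V, H, beta = arnoldi(A, v, m=20)
approx = mevp(V, H, beta, h=0.1)
err = np.linalg.norm(approx - expm(0.1 * A) @ v)
```

Once `V` and `H` exist, evaluating another step size only needs the exponential of the small $m \times m$ matrix `h * H`.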
8. Algorithm of Computing $\mathbf{x}(t+h)$
PDN is a linear system, so the input matrices $\mathbf{X}_2$, $\mathbf{L}$, $\mathbf{U}$ do not change.
$[\mathbf{L}, \mathbf{U}] = \mathrm{lu\_decompose}(\mathbf{X}_1)$ is done only once for the whole simulation.
MEXP: $\mathbf{X}_1 = \mathbf{C}$, $\mathbf{X}_2 = \mathbf{G}$
9. Problem #1: Stiff PDN Circuits
PDNs are usually highly stiff circuits
Generalized eigenvalues spread over a wide range within the spectrum of $\mathbf{A}$ ($\mathbf{A} = -\mathbf{C}^{-1}\mathbf{G}$)
The standard Krylov subspace requires a very large number of basis vectors to approximate MEVP.
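11. Standard Krylov subspace (MEXP)
(a) Standard Krylov basis (MEXP): $K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}$
[Figure (a): complex plane (Re, Im) marking eigenvalues of $\mathbf{A}$ with small magnitude of real components and eigenvalues of $\mathbf{A}$ with large magnitude of real components]
$\mathbf{A} = -\mathbf{C}^{-1}\mathbf{G}$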
12. Standard Krylov subspace (MEXP)
(a) Standard Krylov basis (MEXP): $K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}$
[Figure (a): complex plane (Re, Im) marking eigenvalues of $\mathbf{A}$ with small magnitude of real components and eigenvalues of $\mathbf{A}$ with large magnitude of real components]
• Fast modes of the dynamical behavior of circuits.
• The standard Krylov basis tends to capture these eigenvalues with large magnitude.
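13. Standard Krylov subspace (MEXP)
(a) Standard Krylov basis (MEXP): $K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}$
[Figure (a): complex plane (Re, Im) marking eigenvalues of $\mathbf{A}$ with small magnitude of real components and eigenvalues of $\mathbf{A}$ with large magnitude of real components]
• These eigenvalues (small magnitude of real components) define the major dynamical behavior of circuits.
• More basis vectors are demanded in order to characterize these eigenvalues.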
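14. Inverted Krylov subspace (I-MATEX)
(a) Standard Krylov basis (MEXP): $K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}$
(b) Inverted Krylov basis (I-MATEX): $K_m(\mathbf{A}^{-1}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}^{-1}\mathbf{v}, \mathbf{A}^{-2}\mathbf{v}, \ldots, \mathbf{A}^{-m+1}\mathbf{v}\}$
[Figures (a), (b): complex plane (Re, Im) marking eigenvalues of $\mathbf{A}$ with small magnitude of real components and eigenvalues of $\mathbf{A}$ with large magnitude of real components]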
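15. Inverted Krylov subspace (I-MATEX)
(a) Standard Krylov basis (MEXP): $K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}$
(b) Inverted Krylov basis (I-MATEX): $K_m(\mathbf{A}^{-1}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}^{-1}\mathbf{v}, \mathbf{A}^{-2}\mathbf{v}, \ldots, \mathbf{A}^{-m+1}\mathbf{v}\}$
[Figures (a), (b): complex plane (Re, Im) marking eigenvalues of $\mathbf{A}$ with small magnitude of real components and eigenvalues of $\mathbf{A}$ with large magnitude of real components]
The inverted Krylov subspace is more likely to capture these “important” eigenvalues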
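16. Rational Krylov subspace (R-MATEX)
(a) Standard Krylov basis (MEXP): $K_m(\mathbf{A}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \ldots, \mathbf{A}^{m-1}\mathbf{v}\}$
(c) Rational Krylov basis (R-MATEX): $K_m((\mathbf{I} - \gamma\mathbf{A})^{-1}, \mathbf{v}) = \mathrm{span}\{\mathbf{v}, (\mathbf{I} - \gamma\mathbf{A})^{-1}\mathbf{v}, (\mathbf{I} - \gamma\mathbf{A})^{-2}\mathbf{v}, \ldots, (\mathbf{I} - \gamma\mathbf{A})^{-m+1}\mathbf{v}\}$
[Figures (a), (c): complex plane (Re, Im) marking eigenvalues of $\mathbf{A}$ with small magnitude of real components and eigenvalues of $\mathbf{A}$ with large magnitude of real components]
• Rational Krylov is still likely to capture these “important” eigenvalues
• More robust numerical properties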
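17. Error trend of R-MATEX
Directly compute $e^{h\mathbf{A}}$
MEVP via R-MATEX
$\mathrm{error} = |e^{h\mathbf{A}}\mathbf{v} - \mathbf{V}_m e^{h\mathbf{H}_m}\mathbf{e}_1|$ vs. $m$ vs. $h$
[Figure: error plotted against $m$ and $h$]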
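18. Same Algorithm with Different Input Matrices
Still only one $[\mathbf{L}, \mathbf{U}] = \mathrm{lu\_decompose}(\mathbf{X}_1)$
Method | $\mathbf{X}_1$ | $\mathbf{X}_2$ | $\mathbf{H}_m$
MEXP | $\mathbf{C}$ | $\mathbf{G}$ | $\mathbf{H}_m$
I-MATEX | $\mathbf{G}$ | $\mathbf{C}$ | $\mathbf{H}'^{-1}_m$
R-MATEX | $\mathbf{C} + \gamma\mathbf{G}$ | $\mathbf{C}$ | $(\mathbf{I} - \tilde{\mathbf{H}}_m^{-1})/\gamma$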
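The R-MATEX row of the table can be sketched as follows: since $\mathbf{A} = -\mathbf{C}^{-1}\mathbf{G}$, the operator $(\mathbf{I} - \gamma\mathbf{A})^{-1}$ equals $(\mathbf{C} + \gamma\mathbf{G})^{-1}\mathbf{C}$, so a single LU factorization of $\mathbf{X}_1 = \mathbf{C} + \gamma\mathbf{G}$ serves every Arnoldi step. The function name `rational_krylov_mevp`, the dense matrices, the choice of $\gamma$, and the test circuit are assumptions for illustration only.

```python
import numpy as np
from scipy.linalg import expm, lu_factor, lu_solve

def rational_krylov_mevp(C, G, v, gamma, m, steps):
    """Approximate e^{hA} v, A = -C^{-1} G, for each h in steps from one rational Krylov basis."""
    lu = lu_factor(C + gamma * G)                  # X1 = C + gamma*G, factorized once
    n = v.size
    V = np.zeros((n, m + 1))
    Ht = np.zeros((m + 1, m))                      # Hessenberg matrix of (I - gamma*A)^{-1}
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    k = m
    for j in range(m):
        w = lu_solve(lu, C @ V[:, j])              # X1^{-1} X2 v_j with X2 = C
        for i in range(j + 1):                     # modified Gram-Schmidt
            Ht[i, j] = V[:, i] @ w
            w -= Ht[i, j] * V[:, i]
        Ht[j + 1, j] = np.linalg.norm(w)
        if Ht[j + 1, j] < 1e-12:                   # breakdown: the subspace is invariant
            k = j + 1
            break
        V[:, j + 1] = w / Ht[j + 1, j]
    Vm, Htm = V[:, :k], Ht[:k, :k]
    Hm = (np.eye(k) - np.linalg.inv(Htm)) / gamma  # H_m = (I - Ht_m^{-1}) / gamma
    e1 = np.zeros(k)
    e1[0] = 1.0
    return [beta * (Vm @ (expm(h * Hm) @ e1)) for h in steps]

# Hypothetical circuit with widely spread capacitances; dense expm is only a reference.
n = 40
C = np.diag(np.logspace(-2, 0, n))
G = 2.1 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
v = np.arange(1.0, n + 1)
A = -np.linalg.solve(C, G)
steps = [0.05, 0.1]
approx = rational_krylov_mevp(C, G, v, gamma=0.05, m=12, steps=steps)
errs = [np.linalg.norm(a - expm(h * A) @ v) for a, h in zip(approx, steps)]
```

Swapping the operator inside the loop (and the mapping back to `Hm`) gives the MEXP and I-MATEX rows; the rest of the routine stays the same.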
20. Problem #2: Initial Vector Change
MEVP $= e^{\mathbf{A}}\mathbf{v}$
Once $\mathbf{v}$ changes, we need to compute $K_m$ for MEVP.
$\mathbf{v}$ is the initial vector of $K_m((\mathbf{I} - \gamma\mathbf{A})^{-1}, \mathbf{v})$
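21. Problem #2: Initial Vector Change
MEVP $= e^{\mathbf{A}}\mathbf{v}$
Once $\mathbf{v}$ changes, we need to compute $K_m$ for MEVP.
$\mathbf{v}$ is the initial vector of $K_m((\mathbf{I} - \gamma\mathbf{A})^{-1}, \mathbf{v})$
In the circuit solver, $\mathbf{x}(t+h) = e^{h\mathbf{A}}(\mathbf{x}(t) + \mathbf{F}(t,h)) - \mathbf{P}(t,h)$, with initial vector $\mathbf{x}(t) + \mathbf{F}(t,h)$,
where $\mathbf{F}(t,h) = \mathbf{A}^{-1}\mathbf{b}(t) + \mathbf{A}^{-2}\,\frac{\mathbf{b}(t+h) - \mathbf{b}(t)}{h}$
$\mathbf{F}(t,h)$ changes when input sources cannot keep the previous trend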
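22. Problem #2: Initial Vector Change
MEVP $= e^{\mathbf{A}}\mathbf{v}$
Once $\mathbf{v}$ changes, we need to compute $K_m$ for MEVP.
$\mathbf{v}$ is the initial vector of $K_m((\mathbf{I} - \gamma\mathbf{A})^{-1}, \mathbf{v})$
$\mathbf{F}(t,h) = \mathbf{A}^{-1}\mathbf{b}(t) + \mathbf{A}^{-2}\,\frac{\mathbf{b}(t+h) - \mathbf{b}(t)}{h}$
$\mathbf{F}(t,h)$ changes when input sources cannot keep the previous trend
A pulse input example:
• the dashed lines are places where the initial vector changes
• “transition spot”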
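23. Problem #2: Initial Vector Change
MEVP $= e^{\mathbf{A}}\mathbf{v}$
Once $\mathbf{v}$ changes, we need to compute $K_m$ for MEVP.
$\mathbf{v}$ is the initial vector of $K_m((\mathbf{I} - \gamma\mathbf{A})^{-1}, \mathbf{v})$
In the circuit solver, $\mathbf{x}(t+h) = e^{h\mathbf{A}}(\mathbf{x}(t) + \mathbf{F}(t,h)) - \mathbf{P}(t,h)$, with initial vector $\mathbf{x}(t) + \mathbf{F}(t,h)$,
where $\mathbf{F}(t,h) = \mathbf{A}^{-1}\mathbf{b}(t) + \mathbf{A}^{-2}\,\frac{\mathbf{b}(t+h) - \mathbf{b}(t)}{h}$
$\mathbf{F}(t,h)$ changes when input sources cannot keep the previous trend
Many input current sources in a PDN make the initial vector change frequently, which triggers Krylov subspace generations and consumes runtime (trouble maker).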
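The slides use $\mathbf{P}(t,h)$ without defining it. The following derivation is a sketch under two assumptions that are not stated on the slides: $\mathbf{b}(t) = \mathbf{C}^{-1}\mathbf{B}\mathbf{u}(t)$, and the input is linear on $[t, t+h]$ (piecewise-linear sources) with $\mathbf{A}$ invertible. Under those assumptions it reproduces the $\mathbf{F}(t,h)$ above and yields one consistent form of $\mathbf{P}(t,h)$.

```latex
\begin{aligned}
\dot{\mathbf{x}}(t) &= \mathbf{A}\mathbf{x}(t) + \mathbf{b}(t),
  \qquad \mathbf{A} = -\mathbf{C}^{-1}\mathbf{G},\quad \mathbf{b}(t) = \mathbf{C}^{-1}\mathbf{B}\mathbf{u}(t),\\
\mathbf{b}(t+\tau) &= \mathbf{b}(t) + \tau\,\mathbf{s},
  \qquad \mathbf{s} = \frac{\mathbf{b}(t+h)-\mathbf{b}(t)}{h},\quad 0 \le \tau \le h,\\
\mathbf{x}(t+h) &= e^{h\mathbf{A}}\mathbf{x}(t)
  + \int_0^h e^{(h-\tau)\mathbf{A}}\bigl(\mathbf{b}(t)+\tau\,\mathbf{s}\bigr)\,d\tau\\
&= e^{h\mathbf{A}}\mathbf{x}(t) + \mathbf{A}^{-1}\bigl(e^{h\mathbf{A}}-\mathbf{I}\bigr)\mathbf{b}(t)
  + \bigl(\mathbf{A}^{-2}(e^{h\mathbf{A}}-\mathbf{I}) - h\mathbf{A}^{-1}\bigr)\mathbf{s}\\
&= e^{h\mathbf{A}}\bigl(\mathbf{x}(t)+\mathbf{F}(t,h)\bigr) - \mathbf{P}(t,h),\\
\mathbf{F}(t,h) &= \mathbf{A}^{-1}\mathbf{b}(t) + \mathbf{A}^{-2}\mathbf{s},
  \qquad \mathbf{P}(t,h) = \mathbf{A}^{-1}\mathbf{b}(t+h) + \mathbf{A}^{-2}\mathbf{s}.
\end{aligned}
```

As long as the slope $\mathbf{s}$ stays the same, $\mathbf{F}(t,h)$ does not depend on $h$, so the initial vector $\mathbf{x}(t) + \mathbf{F}(t,h)$ is unchanged; a change of slope at a transition spot changes it.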
25. Input sources, the trouble maker
A PDN with three input current sources.
26. Input sources, the trouble maker
A PDN with three input current sources.
27. Input sources, the trouble maker
Some definitions
Local Transition Spot (LTS): for one input source, its transition spots.
Global Transition Spot (GTS): the union of all LTS.
Snapshot: for one input source, the spot in GTS but not in LTS.
A PDN with three input current sources.
28. Input sources, the trouble maker
Some definitions
Local Transition Spot (LTS): for one input source, its transition spots.
Global Transition Spot (GTS): the union of all LTS.
Snapshot: for one input source, the spot in GTS but not in LTS.
A PDN with three input current sources.
When the circuit is simulated with all input sources as a whole, every GTS triggers a Krylov subspace generation.
29. Input sources, the trouble maker
How about simulating the circuit with each individual source, then summing the results later by superposition?
A PDN with three input current sources.
Some definitions
Local Transition Spot (LTS): for one input source, its transition spots.
Global Transition Spot (GTS): the union of all LTS.
Snapshot: for one input source, the spot in GTS but not in LTS.
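The three definitions can be made concrete with a small sketch. The source names and transition times (in ps) below are hypothetical, and `transition_spots` is an illustrative helper, not part of MATEX.

```python
def transition_spots(lts_per_source):
    """Return the GTS and, per source, its snapshots (spots in GTS but not in its LTS)."""
    gts = sorted(set().union(*lts_per_source.values()))
    snapshots = {}
    for name, lts in lts_per_source.items():
        own = set(lts)
        snapshots[name] = [t for t in gts if t not in own]
    return gts, snapshots

# Hypothetical transition times (ps) for a PDN with three input current sources.
lts = {
    "i1": [0, 10, 20, 50],
    "i2": [0, 30, 40],
    "i3": [0, 20, 60],
}
gts, snapshots = transition_spots(lts)
# gts is the union of all LTS; snapshots["i1"] holds the GTS points where
# source i1 itself has no transition but a solution is still needed.
```

In this example the GTS has seven points, while each source has only three or four of its own, which is the gap the decomposition exploits.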
30. Reduce the Krylov subspace generation chances and reuse subspace
For one input source, LTS is much smaller than GTS.
Meanwhile, the snapshot is needed to keep track for later superposition.
Compute snapshot without extra Krylov subspace generations.
31. Reduce the Krylov subspace generation chances and reuse subspace
Given a previous solution $\mathbf{x}(t)$
32. Reduce the Krylov subspace generation chances and reuse subspace
To compute the solutions at snapshots $\mathbf{x}(t+h_1)$ and $\mathbf{x}(t+h_2)$ without Krylov subspace generations
[Figure: timeline marking $\mathbf{x}(t+h_1)$ and $\mathbf{x}(t+h_2)$ at steps $h_1$ and $h_2$]
33. Reduce the Krylov subspace generation chances and reuse subspace
Generate $\mathbf{V}_m$ and $\mathbf{H}_m$ at $t$
34. Reduce the Krylov subspace generation chances and reuse subspace
Use $\mathbf{V}_m$, $\mathbf{H}_m$ and scale $h$ to $h_1$ and $h_2$ for MEVP, until reaching the next LTS
No matrix factorizations during this adaptive stepping!
$\mathbf{x}(t+h_1) = \|\mathbf{v}\|\,\mathbf{V}_m e^{h_1\mathbf{H}_m}\mathbf{e}_1 - \mathbf{P}(t, h_1)$
$\mathbf{x}(t+h_2) = \|\mathbf{v}\|\,\mathbf{V}_m e^{h_2\mathbf{H}_m}\mathbf{e}_1 - \mathbf{P}(t, h_2)$
41. Contributions
A new time-integration kernel is applied with improved Krylov subspace-based MEVP approximations for PDNs
Adaptive time stepping without matrix re-factorization during the transient (stepping) simulation
This feature cannot be achieved with low order approximation strategies, e.g., trapezoidal (TR), due to the explicitly embedded $h$ in $\left(\frac{\mathbf{C}}{h} + \frac{\mathbf{G}}{2}\right)$
Distributed computing framework
Decompose the simulation task based on LTS, then do superposition using GTS and snapshots to form the final solution.
Exploit the advantages of large time stepping, and also reduce and reuse Krylov subspaces.
Results on IBM PG benchmarks
Compared to TR with fixed time step (10 ps), the speedup of transient stepping is 13X on average.
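The decomposition-plus-superposition idea can be checked on a toy linear system. This is a sketch under stated assumptions: the 3-node matrix, the two pulse sources, and the helper names `step` and `simulate` are hypothetical, a dense `expm` stands in for the Krylov MEVP, and each subtask here simply steps through every GTS point rather than reusing a subspace for its snapshots.

```python
import numpy as np
from scipy.linalg import expm

def step(A, x, b_now, b_next, h):
    """Exact step of x' = A x + b(t) when b is linear on [t, t+h]."""
    slope = (b_next - b_now) / h
    A2s = np.linalg.solve(A, np.linalg.solve(A, slope))
    F = np.linalg.solve(A, b_now) + A2s
    P = np.linalg.solve(A, b_next) + A2s
    return expm(h * A) @ (x + F) - P

def simulate(A, x0, b, grid):
    """Step through the time grid and return the solution at every grid point."""
    xs = [x0]
    for t0, t1 in zip(grid[:-1], grid[1:]):
        xs.append(step(A, xs[-1], b(t0), b(t1), t1 - t0))
    return np.array(xs)

A = np.array([[-3.0, 1.0, 0.0], [1.0, -3.0, 1.0], [0.0, 1.0, -3.0]])
# Two hypothetical piecewise-linear pulse sources, each attached to one node.
lts1, val1 = [0.0, 1.0, 2.0, 5.0], [0.0, 1.0, 1.0, 0.0]
lts2, val2 = [0.0, 3.0, 4.0, 5.0], [0.0, 0.0, 2.0, 0.0]
b1 = lambda t: np.array([1.0, 0.0, 0.0]) * np.interp(t, lts1, val1)
b2 = lambda t: np.array([0.0, 0.0, 1.0]) * np.interp(t, lts2, val2)

gts = sorted(set(lts1) | set(lts2))       # union of both LTS
x0 = np.zeros(3)                          # zero initial state, so the responses simply add
whole = simulate(A, x0, lambda t: b1(t) + b2(t), gts)
parts = simulate(A, x0, b1, gts) + simulate(A, x0, b2, gts)
```

With a nonzero initial state, that state would have to be assigned to exactly one subtask so it is not counted twice in the sum.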