This document discusses strategies for parallelizing spectral methods. Spectral methods are global in nature due to their use of global basis functions, making them challenging to parallelize on fine-grained architectures. However, the document finds that spectral methods can be effectively parallelized. The main computational steps in spectral methods are the calculation of differential operators on functions and solving linear systems, both of which can exploit parallelism. Domain decomposition techniques may also help parallelize computations over non-Cartesian domains.
On Projected Newton Barrier Methods for Linear Programming and an Equivalence... - SSA KPI
This document describes projected Newton barrier methods for solving linear programming problems. It begins by reviewing classical barrier function methods for nonlinear programming which apply a logarithmic transformation to inequality constraints. For linear programs, the transformed problem can be solved using a "projected Newton barrier" method. This method is shown to be equivalent to Karmarkar's projective method for a particular choice of the barrier parameter. Details are then given of a specific barrier algorithm and its implementation, along with numerical results on test problems. Implications for future developments in linear programming are discussed.
Performance Assessment of Polyphase Sequences Using Cyclic Algorithm - rahulmonikasharma
Polyphase sequences (known as P1, P2, Px, and Frank sequences) exist for perfect-square lengths and have good autocorrelation properties, which makes them useful in several applications. Barker and binary sequences, by contrast, exist only for certain lengths and exhibit at most a two-digit merit factor. The integrated sidelobe level (ISL) is often used to quantify the quality of the autocorrelation properties of a given polyphase sequence. In this paper, we present the application of a cyclic algorithm (CA) that minimizes an ISL-related metric, which in turn improves the merit factor considerably, an important consideration in applications such as radar, sonar, and communications. To illustrate the performance of the P1, P2, Px, and Frank sequences when the cyclic algorithm is applied, we present a number of examples for integer lengths. The CA(Px) sequence exhibits the best merit factor among the polyphase sequences considered.
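The two quality measures named in this abstract can be illustrated with a short Python sketch. It assumes the common definitions ISL = 2 Σ |r_k|² over lags k = 1 .. N-1 and merit factor = |r_0|² / ISL, where r_k is the aperiodic autocorrelation. The function names are hypothetical and the cyclic algorithm itself is not reproduced here.

```python
import cmath

def frank_sequence(m):
    """Frank polyphase sequence of length m*m: s[p*m + q] = exp(2*pi*i*p*q/m)."""
    return [cmath.exp(2j * cmath.pi * p * q / m) for p in range(m) for q in range(m)]

def autocorrelation(seq):
    """Aperiodic autocorrelation r[k] for lags k = 0 .. N-1."""
    n = len(seq)
    return [sum(seq[i + k] * complex(seq[i]).conjugate() for i in range(n - k))
            for k in range(n)]

def isl(seq):
    """Integrated sidelobe level: energy in all nonzero lags (both signs)."""
    r = autocorrelation(seq)
    return 2.0 * sum(abs(v) ** 2 for v in r[1:])

def merit_factor(seq):
    """Ratio of mainlobe energy |r[0]|^2 to the ISL (assumed definition)."""
    r0 = abs(autocorrelation(seq)[0])
    return r0 ** 2 / isl(seq)

# Length-13 Barker code, used here only as a familiar binary reference sequence.
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

if __name__ == "__main__":
    print("Barker-13 merit factor:", merit_factor(BARKER_13))
    print("Frank-16 merit factor:", merit_factor(frank_sequence(4)))
```

An ISL-minimizing method such as CA would take a sequence like `frank_sequence(4)` as its starting point and iterate to lower `isl`, which raises `merit_factor`.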
Part 2 of a tutorial given at the Brazilian Physical Society meeting, ENFMC. Abstract: Density-functional theory (DFT) was developed 50 years ago, connecting fundamental quantum methods from the early days of quantum mechanics to our days of computer-powered science. Today DFT is the most widely used method in electronic structure calculations. It helps advance materials science from single atoms to nanoclusters and biomolecules, connecting solid-state physics, quantum chemistry, atomic and molecular physics, biophysics and beyond. In this tutorial, I will try to clarify this pathway from a historical viewpoint, presenting the pillars and building blocks of DFT, namely the Hohenberg-Kohn theorem, the Kohn-Sham scheme, the local density approximation (LDA) and the generalized gradient approximation (GGA). I would like to dispel the misconception that the method is a black box and present a more pedagogical and solid perspective on DFT.
COMPUTATIONAL PERFORMANCE OF QUANTUM PHASE ESTIMATION ALGORITHM - cscpconf
This paper discusses a quantum computation problem. Many of the features that make quantum computation superior to classical computation can be attributed to the quantum coherence effect, which depends on the phase of the quantum coherent state. The quantum Fourier transform, the most commonly used algorithm, is introduced, and one of its most important applications, phase estimation of a quantum state, is presented in detail. The flow of the phase estimation algorithm and its quantum circuit model are shown. The error of the output phase value and the probability of measurement are analysed. The probability distribution of the measured phase value is presented, and the computational efficiency is discussed.
COMPUTATIONAL PERFORMANCE OF QUANTUM PHASE ESTIMATION ALGORITHM - csitconf
This paper discusses a quantum computation problem. Many of the features that make quantum computation superior to classical computation can be attributed to the quantum coherence effect, which depends on the phase of the quantum coherent state. The quantum Fourier transform, the most commonly used algorithm, is introduced, and one of its most important applications, phase estimation of a quantum state, is presented in detail. The flow of the phase estimation algorithm and its quantum circuit model are shown. The error of the output phase value and the probability of measurement are analysed. The probability distribution of the measured phase value is presented, and the computational efficiency is discussed.
This document summarizes work exploring the use of CUDA GPUs and Cell processors to accelerate a gravitational wave source-modelling application called the EMRI Teukolsky code. The code models gravitational waves generated by a small compact object orbiting a supermassive black hole. The authors implemented the code on a Cell processor and Nvidia GPU using CUDA. They were able to achieve over an order of magnitude speedup compared to a CPU implementation by leveraging the parallelism of these hardware accelerators.
THE RESEARCH OF QUANTUM PHASE ESTIMATION ALGORITHM - IJCSEA Journal
This document discusses phase estimation in quantum computing. It begins by introducing quantum Fourier transforms and how they are important for algorithms like Shor's algorithm. It then describes the phase estimation algorithm in detail, including how it uses two registers to estimate the phase of a quantum state and how the inverse quantum Fourier transform improves this estimate. Simulation results are presented that show the probability distribution of the estimated phase converging to the true value and how the probability of success increases with more qubits while computational costs rise polynomially. The paper concludes that the optimal number of qubits balances high success probability and low costs for phase estimation.
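The probability distribution of the estimated phase can be illustrated with a small classical calculation. The Python sketch below assumes the textbook idealisation: the second register holds an exact eigenstate, all gates are noiseless, and the first register has t qubits. The function names are hypothetical and are not taken from the paper.

```python
import cmath

def qpe_distribution(phi, t):
    """Probability of each t-bit outcome k when the true phase is phi in [0, 1)."""
    size = 2 ** t
    probs = []
    for k in range(size):
        # Amplitude after the inverse QFT:
        # (1/2^t) * sum_j exp(2*pi*i*j*(phi - k/2^t))
        amp = sum(cmath.exp(2j * cmath.pi * j * (phi - k / size))
                  for j in range(size)) / size
        probs.append(abs(amp) ** 2)
    return probs

def best_estimate_probability(phi, t):
    """Probability of measuring the t-bit value closest to phi."""
    size = 2 ** t
    k = round(phi * size) % size
    return qpe_distribution(phi, t)[k]

if __name__ == "__main__":
    for t in (3, 5, 7):
        print(t, best_estimate_probability(0.3, t))
```

When phi is an exact t-bit fraction, all probability falls on one outcome. Otherwise the distribution peaks at the nearest t-bit value and spreads over its neighbours, while the classical cost of this direct evaluation grows as 4^t.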
A COMPREHENSIVE ANALYSIS OF QUANTUM CLUSTERING: FINDING ALL THE POTENTIAL MI... - IJDKP
Quantum clustering (QC) is a data clustering algorithm based on quantum mechanics, accomplished by substituting each point in a given dataset with a Gaussian. The width of the Gaussian is a σ value, a hyper-parameter which can be manually defined and manipulated to suit the application. Numerical methods are used to find all the minima of the quantum potential, as they correspond to cluster centers. Herein, we investigate the mathematical task of expressing and finding all the roots of the exponential polynomial corresponding to the minima of a two-dimensional quantum potential. This is an outstanding task because normally such expressions are impossible to solve analytically. However, we prove that if the points are all included in a square region of size σ, there is only one minimum. This bound is useful not only for the number of solutions to look for by numerical means; it also allows us to propose a new numerical approach "per block". This technique decreases the number of particles by approximating some groups of particles as weighted particles. These findings are useful not only for the quantum clustering problem but also for the exponential polynomials encountered in quantum chemistry, solid-state physics and other applications.
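As a rough illustration of the quantity being minimised, the Python sketch below evaluates the two-dimensional quantum potential in the form commonly used for quantum clustering, V(x) - E = -d/2 + (1 / (2σ²ψ)) Σ |x - x_i|² exp(-|x - x_i|² / (2σ²)) with d = 2. The brute-force grid search and the function names are assumptions made for illustration; this is not the paper's "per block" method.

```python
import math

def quantum_potential(x, y, points, sigma):
    """V - E at (x, y) for psi = sum of Gaussians of width sigma centred on points."""
    psi = 0.0
    weighted = 0.0
    for px, py in points:
        r2 = (x - px) ** 2 + (y - py) ** 2
        g = math.exp(-r2 / (2.0 * sigma ** 2))
        psi += g
        weighted += r2 * g
    # -d/2 with d = 2; psi can underflow to zero very far from every point.
    return -1.0 + weighted / (2.0 * sigma ** 2 * psi)

def grid_minimum(points, sigma, lo, hi, steps):
    """Brute-force search for the lowest potential on a square grid."""
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1):
            x = lo + (hi - lo) * i / steps
            y = lo + (hi - lo) * j / steps
            v = quantum_potential(x, y, points, sigma)
            if best is None or v < best[0]:
                best = (v, x, y)
    return best

if __name__ == "__main__":
    # Two points well inside a square of side sigma.
    print(grid_minimum([(-0.2, 0.0), (0.2, 0.0)], 1.0, -1.0, 1.0, 20))
```

For the two points above, which lie within a square of side σ, the grid search returns a single lowest value at their midpoint. A grid search of this kind only reports the global lowest grid point, so it cannot by itself confirm that no other local minima exist.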
Backstepping Controller Synthesis for Piecewise Polynomial Systems: A Sum of ... - Behzad Samadi
This document outlines a method for synthesizing backstepping controllers for piecewise polynomial (PWP) systems using sum of squares (SOS) optimization. It introduces PWP systems in strict feedback form and formulates controller synthesis as a convex feasibility problem by constructing SOS Lyapunov functions. An example application to a single-link flexible-joint robot partitions variables to create regions and designs a stabilizing PWP controller.
Fundamentals of quantum computing part i rev - PRADOSH K. ROY
This document provides an introduction to the fundamentals of quantum computing. It discusses computational complexity classes such as P and NP and essential matrix algebra concepts like Hermitian, unitary, and normal matrices. It also contrasts the classical and quantum worlds. In the quantum world, quantum systems can exist in superposition states and qubits can represent more than just binary 0s and 1s. The document introduces the concept of a qubit register and how multiple qubits can be represented using tensor products. It discusses characteristics of quantum systems like superposition, Born's rule for probabilities, and the measurement postulate which causes wavefunction collapse.
FEEDBACK LINEARIZATION AND BACKSTEPPING CONTROLLERS FOR COUPLED TANKS - ieijjournal
This paper investigates the use of advanced nonlinear control algorithms to control a nonlinear coupled tanks system. The first is feedback linearisation control (FLC), which was found to achieve global exponential asymptotic stability with a very short response time, no significant overshoot, and a negligible error norm. The second is backstepping control (BC), a recursive procedure that interlaces the choice of a Lyapunov function with the design of feedback control. Simulation results show that this method preserves tracking and robust control, and that it can often solve stabilization problems under less restrictive conditions than those encountered in other methods. Finally, both proposed control schemes guarantee asymptotic stability of the closed-loop system while meeting trajectory tracking objectives.
This document discusses computing canonical labelings of digraphs. It begins by reviewing key concepts like digraphs, adjacency matrices, and isomorphisms. It notes that while many algorithms exist for undirected graphs, computing canonical labelings of digraphs remains challenging. The document then presents several new theoretical concepts for digraph canonical labeling, including mix diffusion degree sequences. It proposes using these concepts to systematically compute canonical labelings and proves several theorems to guide the algorithm. It describes four algorithms for calculating the canonical labeling of a digraph and notes the algorithms have been preliminarily verified through software testing.
This document summarizes a study on the effect of parameters of a geometric multigrid method on CPU time for solving one-dimensional problems related to heat transfer and fluid flow. The parameters studied include coarsening ratio of grids, number of inner iterations, number of grid levels, and tolerances. Finite difference methods were used to discretize partial differential equations for problems involving Poisson, advection-diffusion, and heat transfer equations. Comparisons were made between multigrid and single grid methods like Gauss-Seidel and TDMA. Results confirmed some literature findings and presented some new results on the effect of parameters on CPU time.
Computer Science
Active and Programmable Networks
Active safety systems
Ad Hoc & Sensor Network
Ad hoc networks for pervasive communications
Adaptive, autonomic and context-aware computing
Advance Computing technology and their application
Advanced Computing Architectures and New Programming Models
Advanced control and measurement
Aeronautical Engineering,
Agent-based middleware
Alert applications
Automotive, marine and aero-space control and all other control applications
Autonomic and self-managing middleware
Autonomous vehicle
Biochemistry
Bioinformatics
BioTechnology(Chemistry, Mathematics, Statistics, Geology)
Broadband and intelligent networks
Broadband wireless technologies
CAD/CAM/CAT/CIM
Call admission and flow/congestion control
Capacity planning and dimensioning
Changing Access to Patient Information
Channel capacity modelling and analysis
Civil Engineering,
Cloud Computing and Applications
Collaborative applications
Communication application
Communication architectures for pervasive computing
Communication systems
Computational intelligence
Computer and microprocessor-based control
Computer Architecture and Embedded Systems
Computer Business
Computer Sciences and Applications
Computer Vision
Computer-based information systems in health care
Computing Ethics
Computing Practices & Applications
Congestion and/or Flow Control
Content Distribution
Context-awareness and middleware
Creativity in Internet management and retailing
Cross-layer design and Physical layer based issue
Cryptography
Data Base Management
Data fusion
Data Mining
Data retrieval
Data Storage Management
Decision analysis methods
Decision making
Digital Economy and Digital Divide
Digital signal processing theory
Distributed Sensor Networks
Drives automation
Drug Design,
Drug Development
DSP implementation
E-Business
E-Commerce
E-Government
Electronic transceiver device for Retail Marketing Industries
Electronics Engineering,
Embedded Computer System
Emerging advances in business and its applications
Emerging signal processing areas
Enabling technologies for pervasive systems
Energy-efficient and green pervasive computing
Environmental Engineering,
Estimation and identification techniques
Evaluation techniques for middleware solutions
Event-based, publish/subscribe, and message-oriented middleware
Evolutionary computing and intelligent systems
Expert approaches
Facilities planning and management
Flexible manufacturing systems
Formal methods and tools for designing
Fuzzy algorithms
Fuzzy logics
GPS and location-based app
A NEW PARALLEL ALGORITHM FOR COMPUTING MINIMUM SPANNING TREE - ijscmc
Computing the minimum spanning tree of a graph is one of the fundamental computational problems. In this paper, we present a new parallel algorithm for computing the minimum spanning tree of an undirected weighted graph with n vertices and m edges. The algorithm uses cluster techniques to reduce the number of processors by a fraction 1/f(n) and the parallel work by a fraction O(1/log(f(n))), where f(n) is an arbitrary function. In the case f(n) = 1, the algorithm runs in logarithmic time and uses superlinear work on the EREW PRAM model. In general, the proposed algorithm is the simplest one.
On selection of periodic kernels parameters in time series prediction - csandit
This document discusses parameter selection for periodic kernels used in time series prediction. Periodic kernels are a type of kernel function used in kernel regression to perform nonparametric time series prediction. The document examines how the parameters of two periodic kernels - the first periodic kernel (FPK) and second periodic kernel (SPK) - influence prediction error. It presents an easy methodology for finding parameter values based on grid search. This methodology was tested on benchmark and real datasets and showed satisfactory results.
Development, Optimization, and Analysis of Cellular Automaton Algorithms to S... - IRJET Journal
This document summarizes research on using cellular automaton algorithms to solve stochastic partial differential equations (SPDEs). It proposes a finite-difference method to approximate an SPDE modeling a random walk with angular diffusion. A Monte Carlo algorithm is also developed for comparison. Analysis finds a moderate correlation between the two methods, suggesting the finite-difference approach is reasonably accurate. It also identifies an inverse-square relationship between variables, linking to a foundational stochastic analysis concept. The research concludes the finite-difference method shows promise for approximating SPDEs while considering boundary conditions.
A Counterexample to the Forward Recursion in Fuzzy Critical Path Analysis Und... - ijfls
This document presents a counterexample demonstrating that the fuzzy forward recursion method for determining critical paths does not always produce results consistent with the extension principle when discrete fuzzy sets are used to represent activity durations.
The document first provides background on fuzzy sets and critical path analysis. It then presents a proposition stating that the membership function for fuzzy critical path lengths can be determined by taking the maximum of the minimum membership values across all activity durations in each configuration.
The document goes on to present a counterexample using a simple series-parallel network with 18 configurations. It shows that applying the fuzzy forward recursion produces a different membership value for one critical path length compared to directly applying the extension principle. This difference proves the fuzzy forward
Low Power Adaptive FIR Filter Based on Distributed Arithmetic - IJERA Editor
This paper aims at the implementation of an adaptive FIR filter based on distributed arithmetic (DA) with low power, high throughput, and low area. The Least Mean Square (LMS) algorithm is used to update the weights and decrease the mean square error between the current filter output and the desired response. The pipelined distributed arithmetic table reduces switching activity and hence power. Power consumption is further reduced by keeping the bit-clock used in carry-save accumulation much faster than the clock of the rest of the operations. We implemented the design in Quartus II and found reductions in total power and core dynamic power of 31.31% and 100.24% respectively, compared with the architecture without a DA table.
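The LMS weight update mentioned in this abstract can be sketched in a few lines of Python. This is a plain floating-point illustration, not the distributed-arithmetic hardware design: the step size `mu`, the noiseless system-identification setup, and the function name are assumptions chosen for the example.

```python
import random

def lms_identify(true_weights, mu=0.05, n_samples=2000, seed=0):
    """Adapt FIR weights with LMS so the filter output tracks a desired response."""
    rng = random.Random(seed)
    taps = len(true_weights)
    w = [0.0] * taps
    x = [0.0] * taps  # delay line, newest sample first
    for _ in range(n_samples):
        x = [rng.uniform(-1.0, 1.0)] + x[:-1]
        d = sum(t * xi for t, xi in zip(true_weights, x))  # desired response
        y = sum(wi * xi for wi, xi in zip(w, x))           # current filter output
        e = d - y                                          # instantaneous error
        w = [wi + mu * e * xi for wi, xi in zip(w, x)]     # LMS weight update
    return w

if __name__ == "__main__":
    print(lms_identify([0.5, -0.3]))
```

Here the desired response is generated by a known two-tap filter, so the adapted weights can be compared directly with `true_weights`. A DA-based design would replace the multiply-accumulate in `y` with table lookups and shift-accumulate steps.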
This poster was created in LaTeX on a Dell Inspiron laptop with a Linux Fedora Core 4 operating system. The background image and the animation snapshots are dxf meshes of elastic waveform solutions, rendered on a Windows machine using 3D Studio Max.
International Journal of Mathematics and Statistics Invention (IJMSI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJMSI publishes research articles and reviews across the whole field of mathematics and statistics, including new teaching methods, assessment, validation and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected for publication through double peer review to ensure originality, relevance, and readability. The articles published in the journal can be accessed online.
Computational Method to Solve the Partial Differential Equations (PDEs) - Dr. Khurram Mehboob
This document discusses various computational methods for solving partial differential equations (PDEs) using MATLAB. It begins by introducing three types of PDEs - elliptic, parabolic, and hyperbolic - and provides examples of each. It then describes explicit methods like the Forward Time Centered Space (FTCS) method, Lax method, and Crank-Nicolson (CTCS) method for solving the advection equation. The document provides MATLAB code implementing these methods for a test case of solving the advection equation modeling a square wave.
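A minimal sketch of the Lax method for the advection equation on a square-wave test case is given below. It is written in Python rather than MATLAB, and the periodic boundary conditions, the grid size, and the function names are assumptions made for this example rather than details taken from the document.

```python
def square_wave(n):
    """Initial condition: 1 on the second quarter of the grid, 0 elsewhere."""
    return [1.0 if n // 4 <= j < n // 2 else 0.0 for j in range(n)]

def lax_step(u, c):
    """One Lax step for u_t + a*u_x = 0 with Courant number c = a*dt/dx.

    Periodic boundaries: u[j - 1] wraps through Python's negative indexing,
    and u[(j + 1) % n] wraps on the right.
    """
    n = len(u)
    return [0.5 * (u[(j + 1) % n] + u[j - 1])
            - 0.5 * c * (u[(j + 1) % n] - u[j - 1])
            for j in range(n)]

def advect(u0, c, steps):
    """Apply `steps` Lax updates to the initial profile u0."""
    u = list(u0)
    for _ in range(steps):
        u = lax_step(u, c)
    return u

if __name__ == "__main__":
    u0 = square_wave(40)
    print(advect(u0, 0.5, 20))
```

With c = 1 each step shifts the profile by exactly one cell. With 0 < c < 1 the scheme remains stable but smears the edges of the square wave through numerical diffusion.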
HEATED WIND PARTICLE’S BEHAVIOURAL STUDY BY THE CONTINUOUS WAVELET TRANSFORM ... - cscpconf
Nowadays the continuous wavelet transform (CWT) and fractal analysis are generally used for signal and image processing applications. Our current work extends their field of application to the behavioural study of agitated wind particles. We show mathematically that the wind particle's movement exhibits "uncorrelated" characteristics during its convective flow. This is also demonstrated using the CWT and fractal analysis in MATLAB version 7.12.
Part 1 of a tutorial given at the Brazilian Physical Society meeting, ENFMC. Abstract: Density-functional theory (DFT) was developed 50 years ago, connecting fundamental quantum methods from the early days of quantum mechanics to our days of computer-powered science. Today DFT is the most widely used method in electronic structure calculations. It helps advance materials science from single atoms to nanoclusters and biomolecules, connecting solid-state physics, quantum chemistry, atomic and molecular physics, biophysics and beyond. In this tutorial, I will try to clarify this pathway from a historical viewpoint, presenting the pillars and building blocks of DFT, namely the Hohenberg-Kohn theorem, the Kohn-Sham scheme, the local density approximation (LDA) and the generalized gradient approximation (GGA). I would like to dispel the misconception that the method is a black box and present a more pedagogical and solid perspective on DFT.
This document proposes a stochastic modeling approach to analyze the time-domain variability of general linear systems with uncertain parameters. It uses a polynomial chaos expansion of the scattering parameters to build an "augmented system" described by a deterministic matrix. The Galerkin projection method is used to relate the polynomial chaos coefficients of the input/output port signals. A Vector Fitting algorithm then generates a stable and passive state-space model of the augmented system. This allows time-domain variability analysis to be performed with one simulation, demonstrating computational efficiency over conventional Monte Carlo methods. The approach is validated on a microstrip bandstop filter with random width and permittivity parameters.
A New Neural Network For Solving Linear Programming Problems - Jody Sullivan
This document summarizes a research paper that proposes a new type of neural network for solving linear programming problems in real time. The key points are:
1) The paper introduces a novel energy function that transforms linear programming into a system of nonlinear differential equations, allowing the problem to be solved online by a simplified neural network.
2) The proposed neural network architecture contains only one single artificial neuron with adaptive synaptic weights, making it suitable for low-cost analog VLSI implementations.
3) Extensive computer simulations demonstrate the correctness and performance of the proposed neural network approach for solving linear programming problems in real time.
This paper proposes a method for adapting the dictionary elements in kernel-based nonlinear adaptive filtering algorithms. The dictionary contains a subset of input vectors that are used to approximate the nonlinear system. Typically, elements are added to the dictionary but never removed or adapted. The proposed method considers dictionary elements as adjustable model parameters that can be optimized to minimize the instantaneous output error, while maintaining coherence to control complexity. Gradient-based adaptation is derived for polynomial and radial basis kernels. Dictionary adaptation is incorporated into Kernel Recursive Least Squares, Kernel Normalized Least Mean Squares, and Kernel Affine Projection algorithms. Experiments on simulated and real data demonstrate that dictionary adaptation can reduce error or dictionary size compared to non-adaptive methods.
Erwin. e. obermayer k._schulten. k. _1992: self-organising maps_stationary st... - ArchiLab 7
This document summarizes research investigating how the shape of neighborhood functions affects convergence rates and the presence of metastable states in Kohonen's self-organizing feature map algorithm. The key findings are:
1) For neighborhood functions that are convex over a large interval, there exist no metastable states, while other functions allow metastable states regardless of parameters.
2) For Gaussian functions, there is a threshold width above which metastable states cannot exist.
3) Convergence is fastest using functions that are convex over a large range but differ greatly between neighbors, such as Gaussian functions with width near the number of neurons. Metastable states and neighborhood function shape strongly influence convergence time.
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering & Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The Huxley equation is solved numerically using two finite difference methods: the explicit scheme and the Crank-Nicolson scheme. A comparison of the two methods showed that the explicit scheme is easier to implement and converges faster, while the Crank-Nicolson scheme is more accurate. In addition, the stability of both schemes is analyzed using the Fourier (von Neumann) method. The analysis showed that the first scheme is conditionally stable if r ≤ 2 − aβΔt, Δt ≤ 2(Δx)² and the second scheme is unconditionally stable.
Backstepping Controller Synthesis for Piecewise Polynomial Systems: A Sum of ...Behzad Samadi
This document outlines a method for synthesizing backstepping controllers for piecewise polynomial (PWP) systems using sum of squares (SOS) optimization. It introduces PWP systems in strict feedback form and formulates controller synthesis as a convex feasibility problem by constructing SOS Lyapunov functions. An example application to a single-link flexible-joint robot partitions variables to create regions and designs a stabilizing PWP controller.
Fundamentals of quantum computing part i revPRADOSH K. ROY
This document provides an introduction to the fundamentals of quantum computing. It discusses computational complexity classes such as P and NP and essential matrix algebra concepts like Hermitian, unitary, and normal matrices. It also contrasts the classical and quantum worlds. In the quantum world, quantum systems can exist in superposition states and qubits can represent more than just binary 0s and 1s. The document introduces the concept of a qubit register and how multiple qubits can be represented using tensor products. It discusses characteristics of quantum systems like superposition, Born's rule for probabilities, and the measurement postulate which causes wavefunction collapse.
FEEDBACK LINEARIZATION AND BACKSTEPPING CONTROLLERS FOR COUPLED TANKSieijjournal
This paper investigates the use of advanced nonlinear control algorithms to control a nonlinear coupled tanks system. The first control procedure is feedback linearisation control (FLC), which was found to be successful in achieving global exponential asymptotic stability, with a very short response time, no significant overshoot, and a negligible error norm. The second is backstepping control (BC), a recursive procedure that interlaces the choice of a Lyapunov function with the design of the feedback control. Simulation results show that this method preserves tracking and robustness, and that it can often solve stabilization problems under less restrictive conditions than those encountered in other methods. Finally, both proposed control schemes guarantee the asymptotic stability of the closed-loop system while meeting trajectory tracking objectives.
This document discusses computing canonical labelings of digraphs. It begins by reviewing key concepts like digraphs, adjacency matrices, and isomorphisms. It notes that while many algorithms exist for undirected graphs, computing canonical labelings of digraphs remains challenging. The document then presents several new theoretical concepts for digraph canonical labeling, including mix diffusion degree sequences. It proposes using these concepts to systematically compute canonical labelings and proves several theorems to guide the algorithm. It describes four algorithms for calculating the canonical labeling of a digraph and notes the algorithms have been preliminarily verified through software testing.
This document summarizes a study on the effect of parameters of a geometric multigrid method on CPU time for solving one-dimensional problems related to heat transfer and fluid flow. The parameters studied include coarsening ratio of grids, number of inner iterations, number of grid levels, and tolerances. Finite difference methods were used to discretize partial differential equations for problems involving Poisson, advection-diffusion, and heat transfer equations. Comparisons were made between multigrid and single grid methods like Gauss-Seidel and TDMA. Results confirmed some literature findings and presented some new results on the effect of parameters on CPU time.
Computer Science
Active and Programmable Networks
Active safety systems
Ad Hoc & Sensor Network
Ad hoc networks for pervasive communications
Adaptive, autonomic and context-aware computing
Advance Computing technology and their application
Advanced Computing Architectures and New Programming Models
Advanced control and measurement
Aeronautical Engineering,
Agent-based middleware
Alert applications
Automotive, marine and aero-space control and all other control applications
Autonomic and self-managing middleware
Autonomous vehicle
Biochemistry
Bioinformatics
BioTechnology(Chemistry, Mathematics, Statistics, Geology)
Broadband and intelligent networks
Broadband wireless technologies
CAD/CAM/CAT/CIM
Call admission and flow/congestion control
Capacity planning and dimensioning
Changing Access to Patient Information
Channel capacity modelling and analysis
Civil Engineering,
Cloud Computing and Applications
Collaborative applications
Communication application
Communication architectures for pervasive computing
Communication systems
Computational intelligence
Computer and microprocessor-based control
Computer Architecture and Embedded Systems
Computer Business
Computer Sciences and Applications
Computer Vision
Computer-based information systems in health care
Computing Ethics
Computing Practices & Applications
Congestion and/or Flow Control
Content Distribution
Context-awareness and middleware
Creativity in Internet management and retailing
Cross-layer design and Physical layer based issue
Cryptography
Data Base Management
Data fusion
Data Mining
Data retrieval
Data Storage Management
Decision analysis methods
Decision making
Digital Economy and Digital Divide
Digital signal processing theory
Distributed Sensor Networks
Drives automation
Drug Design,
Drug Development
DSP implementation
E-Business
E-Commerce
E-Government
Electronic transceiver device for Retail Marketing Industries
Electronics Engineering,
Embedded Computer System
Emerging advances in business and its applications
Emerging signal processing areas
Enabling technologies for pervasive systems
Energy-efficient and green pervasive computing
Environmental Engineering,
Estimation and identification techniques
Evaluation techniques for middleware solutions
Event-based, publish/subscribe, and message-oriented middleware
Evolutionary computing and intelligent systems
Expert approaches
Facilities planning and management
Flexible manufacturing systems
Formal methods and tools for designing
Fuzzy algorithms
Fuzzy logics
GPS and location-based app
A NEW PARALLEL ALGORITHM FOR COMPUTING MINIMUM SPANNING TREEijscmc
Computing the minimum spanning tree of a graph is one of the fundamental computational problems. In this paper, we present a new parallel algorithm for computing the minimum spanning tree of an undirected weighted graph with n vertices and m edges. This algorithm uses cluster techniques to reduce the number of processors by the fraction 1/f(n) and the parallel work by the fraction O(1/log(f(n))), where f(n) is an arbitrary function. In the case f(n) = 1, the algorithm runs in logarithmic time and uses superlinear work on the EREW PRAM model. In general, the proposed algorithm is the simplest one.
On selection of periodic kernels parameters in time series predictioncsandit
This document discusses parameter selection for periodic kernels used in time series prediction. Periodic kernels are a type of kernel function used in kernel regression to perform nonparametric time series prediction. The document examines how the parameters of two periodic kernels - the first periodic kernel (FPK) and second periodic kernel (SPK) - influence prediction error. It presents an easy methodology for finding parameter values based on grid search. This methodology was tested on benchmark and real datasets and showed satisfactory results.
Development, Optimization, and Analysis of Cellular Automaton Algorithms to S...IRJET Journal
This document summarizes research on using cellular automaton algorithms to solve stochastic partial differential equations (SPDEs). It proposes a finite-difference method to approximate an SPDE modeling a random walk with angular diffusion. A Monte Carlo algorithm is also developed for comparison. Analysis finds a moderate correlation between the two methods, suggesting the finite-difference approach is reasonably accurate. It also identifies an inverse-square relationship between variables, linking to a foundational stochastic analysis concept. The research concludes the finite-difference method shows promise for approximating SPDEs while considering boundary conditions.
A Counterexample to the Forward Recursion in Fuzzy Critical Path Analysis Und...ijfls
This document presents a counterexample demonstrating that the fuzzy forward recursion method for determining critical paths does not always produce results consistent with the extension principle when discrete fuzzy sets are used to represent activity durations.
The document first provides background on fuzzy sets and critical path analysis. It then presents a proposition stating that the membership function for fuzzy critical path lengths can be determined by taking the maximum of the minimum membership values across all activity durations in each configuration.
The document goes on to present a counterexample using a simple series-parallel network with 18 configurations. It shows that applying the fuzzy forward recursion produces a different membership value for one critical path length compared to directly applying the extension principle. This difference proves the fuzzy forward
Low Power Adaptive FIR Filter Based on Distributed ArithmeticIJERA Editor
This paper aims at implementation of a low power adaptive FIR filter based on distributed arithmetic (DA) with
low power, high throughput, and low area. Least Mean Square (LMS) Algorithm is used to update the weight
and decrease the mean square error between the current filter output and the desired response. The pipelined
Distributed Arithmetic table reduces switching activity and hence it reduces power. The power consumption is
reduced by keeping bit-clock used in carry-save accumulation much faster than clock of rest of the operations.
We have implemented it in Quartus II and found that there is a reduction in the total power and the core dynamic
power by 31.31% and 100.24% respectively when compared with the architecture without DA table
This poster was created in LaTeX on a Dell Inspiron laptop with a Linux Fedora Core 4 operating system. The background image and the animation snapshots are dxf meshes of elastic waveform solutions, rendered on a Windows machine using 3D Studio Max.
International Journal of Mathematics and Statistics Invention (IJMSI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJMSI publishes research articles and reviews across the whole field of mathematics and statistics, including new teaching methods, assessment, validation, and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected through double peer review to ensure originality, relevance, and readability. The articles published in the journal can be accessed online.
Computational Method to Solve the Partial Differential Equations (PDEs)Dr. Khurram Mehboob
This document discusses various computational methods for solving partial differential equations (PDEs) using MATLAB. It begins by introducing three types of PDEs - elliptic, parabolic, and hyperbolic - and provides examples of each. It then describes explicit methods like the Forward Time Centered Space (FTCS) method, Lax method, and Crank-Nicolson (CTCS) method for solving the advection equation. The document provides MATLAB code implementing these methods for a test case of solving the advection equation modeling a square wave.
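As a rough illustration of the Lax method mentioned above, the following Python sketch advects a square wave on a periodic grid. It is not the MATLAB code from the summarized document: the function names, the grid size, the periodic boundary treatment, and the use of the Courant number `courant = c*dt/dx` as the only parameter are all assumptions made for this example.

```python
def lax_step(u, courant):
    """Advance the advection equation u_t + c u_x = 0 by one Lax step.

    courant is the Courant number c*dt/dx (assumed constant);
    the grid is treated as periodic.
    """
    n = len(u)
    new_u = []
    for j in range(n):
        left = u[j - 1]            # index -1 wraps to the last cell
        right = u[(j + 1) % n]
        # Lax: replace u[j] by the neighbour average, then apply the
        # centred difference for the advection term.
        new_u.append(0.5 * (right + left) - 0.5 * courant * (right - left))
    return new_u


def advect_square_wave(num_points=50, courant=1.0, steps=50):
    """Advect a square wave (hypothetical test case) and return (initial, final)."""
    u = [1.0 if num_points // 4 <= j < num_points // 2 else 0.0
         for j in range(num_points)]
    initial = list(u)
    for _ in range(steps):
        u = lax_step(u, courant)
    return initial, u
```

With a Courant number of exactly 1 the update reduces to a pure shift by one cell per step, so the wave keeps its shape; for smaller Courant numbers the averaging term smears the edges of the square wave.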
HEATED WIND PARTICLE’S BEHAVIOURAL STUDY BY THE CONTINUOUS WAVELET TRANSFORM ...cscpconf
Continuous Wavelet Transform (CWT) and fractal analysis are now commonly used in signal and image processing applications. This work extends their field of application to the behavioural study of agitated wind particles. We show mathematically that the movement of an agitated wind particle exhibits "uncorrelated" characteristics during convective flow. This is also demonstrated using the CWT and fractal analysis in MATLAB version 7.12.
Part 1 of a tutorial given in the Brazilian Physical Society meeting, ENFMC. Abstract: Density-functional theory (DFT) was developed 50 years ago, connecting fundamental quantum methods from early days of quantum mechanics to our days of computer-powered science. Today DFT is the most widely used method in electronic structure calculations. It helps moving forward materials sciences from a single atom to nanoclusters and biomolecules, connecting solid-state, quantum chemistry, atomic and molecular physics, biophysics and beyond. In this tutorial, I will try to clarify this pathway under a historical view, presenting the DFT pillars and its building blocks, namely, the Hohenberg-Kohn theorem, the Kohn-Sham scheme, the local density approximation (LDA) and generalized gradient approximation (GGA). I would like to open the black box misconception of the method, and present a more pedagogical and solid perspective on DFT.
This document proposes a stochastic modeling approach to analyze the time-domain variability of general linear systems with uncertain parameters. It uses a polynomial chaos expansion of the scattering parameters to build an "augmented system" described by a deterministic matrix. The Galerkin projection method is used to relate the polynomial chaos coefficients of the input/output port signals. A Vector Fitting algorithm then generates a stable and passive state-space model of the augmented system. This allows time-domain variability analysis to be performed with one simulation, demonstrating computational efficiency over conventional Monte Carlo methods. The approach is validated on a microstrip bandstop filter with random width and permittivity parameters.
A New Neural Network For Solving Linear Programming ProblemsJody Sullivan
This document summarizes a research paper that proposes a new type of neural network for solving linear programming problems in real time. The key points are:
1) The paper introduces a novel energy function that transforms linear programming into a system of nonlinear differential equations, allowing the problem to be solved online by a simplified neural network.
2) The proposed neural network architecture contains only one single artificial neuron with adaptive synaptic weights, making it suitable for low-cost analog VLSI implementations.
3) Extensive computer simulations demonstrate the correctness and performance of the proposed neural network approach for solving linear programming problems in real time.
TEST GENERATION FOR ANALOG AND MIXED-SIGNAL CIRCUITS USING HYBRID SYSTEM MODELSVLSICS Design
In this paper we propose an approach for testing time-domain properties of analog and mixed-signal circuits. The approach is based on an adaptation of a recently developed test generation technique for hybrid systems and a new concept of coverage for such systems. The approach is illustrated by its application to some benchmark circuits.
Robust Image Denoising in RKHS via Orthogonal Matching PursuitPantelis Bouboulis
We present a robust method for the image denoising task based on kernel ridge regression and sparse modeling. Added noise is assumed to consist of two parts. One part is impulse noise assumed to be sparse (outliers), while the other part is bounded noise. The noisy image is divided into small regions of interest, whose pixels are regarded as points of a two-dimensional surface. A kernel based ridge regression method, whose parameters are selected adaptively, is employed to fit the data, whereas the outliers are detected via the use of the increasingly popular orthogonal matching pursuit (OMP) algorithm. To this end, a new variant of the OMP rationale is employed that has the additional advantage to automatically terminate, when all outliers have been selected.
APPLICATION OF PARTICLE SWARM OPTIMIZATION TO MICROWAVE TAPERED MICROSTRIP LINEScseij
This document discusses using Particle Swarm Optimization (PSO) to design a tapered microstrip transmission line to match an arbitrary load to a 50Ω line. PSO was used to optimize the impedances of a three section tapered line to minimize reflections. Simulations found impedances that gave good matching at 5GHz. PSO converged to solutions in under 1000 iterations. This demonstrates PSO's effectiveness in solving multi-objective microwave engineering optimization problems.
Application of particle swarm optimization to microwave tapered microstrip linescseij
Application of metaheuristic algorithms has been of continued interest in the field of electrical engineering
because of their powerful features. In this work special design is done for a tapered transmission line used
for matching an arbitrary real load to a 50Ω line. The problem at hand is to match this arbitrary load to 50
Ω line using three section tapered transmission line with impedances in decreasing order from the load. So
the problem becomes optimizing an equation with three unknowns with various conditions. The optimized
values are obtained using Particle Swarm Optimization. It can easily be shown that PSO is very strong in
solving this kind of multiobjective optimization problems.
The document discusses the Least-Mean Square (LMS) algorithm. It begins by introducing LMS as the first linear adaptive filtering algorithm developed by Widrow and Hoff in 1960. It then describes the filtering structure of LMS, modeling an unknown dynamic system using a linear neuron model and adjusting weights based on an error signal. Finally, it summarizes the LMS algorithm, outlines its virtues like computational simplicity and robustness, and notes its primary limitation is slow convergence for high-dimensional problems.
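The weight adjustment described above can be sketched in a few lines of Python. This is a minimal illustration of the standard LMS update w ← w + μ·e·x, not code from the summarized document; the tap count, the step size, and the "unknown" system in `identify_demo_system` are hypothetical values chosen for the example.

```python
import random


def lms_filter(inputs, desired, num_taps=4, step_size=0.05):
    """Adapt FIR weights with the LMS rule w <- w + mu * e * x."""
    weights = [0.0] * num_taps
    errors = []
    for n in range(len(inputs)):
        # Most recent num_taps samples, newest first, zero-padded at the start.
        window = [inputs[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        output = sum(w * x for w, x in zip(weights, window))
        error = desired[n] - output          # error signal driving the update
        weights = [w + step_size * error * x for w, x in zip(weights, window)]
        errors.append(error)
    return weights, errors


def identify_demo_system(num_samples=5000, seed=0):
    """Identify a hypothetical 4-tap system from noiseless input/output data."""
    rng = random.Random(seed)
    true_system = [0.5, -0.3, 0.2, 0.1]      # assumed, for illustration only
    x = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    d = [sum(h * (x[n - k] if n - k >= 0 else 0.0)
             for k, h in enumerate(true_system))
         for n in range(num_samples)]
    return lms_filter(x, d)
```

Each update costs only a handful of multiplications per tap, which reflects the computational simplicity noted above. A smaller `step_size` makes adaptation slower but less sensitive to noise in the desired signal.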
An Algorithm For Vector Quantizer DesignAngie Miller
The document presents an algorithm for designing vector quantizers. The algorithm is efficient, intuitive, and can be used for quantizers with general distortion measures and large block lengths. It is based on Lloyd's approach but does not require differentiation, making it applicable even when the data distribution has discrete components. The algorithm finds quantizers that meet necessary optimality conditions. Examples show it converges well and finds near-optimal quantizers for memoryless Gaussian sources. It is also used successfully to quantize LPC speech parameters with a complicated distortion measure.
The document presents research on using a binary reproducing kernel Hilbert space (RKHS) approach to solve a Wick-type stochastic Korteweg-de Vries (KdV) equation with variable coefficients. It introduces the stochastic KdV equation model and discusses previous work analyzing it. The research aims to formulate white noise functional solutions for the stochastic KdV equations by applying Hermite transform, white noise theory, and binary RKHS. It explores representing the exact solution in a reproducing kernel space and investigating uniform convergence of approximate solutions.
Haar wavelet method for solving coupled system of fractional order partial d...nooriasukmaningtyas
This paper deals with a numerical method, based on the operational matrices of Haar wavelet orthonormal functions, for approximating solutions to a class of coupled systems of time-fractional order partial differential equations (FPDEs). The fractional derivative is taken in the Caputo sense. To avoid tedious calculations and to make the study of wavelets accessible to beginners, we use the integration property of this method together with the aforesaid orthogonal matrices, which convert the coupled system under consideration into a simple algebraic system of Lyapunov or Sylvester equation type. The advantages of the present method include simple computation, computer-oriented implementation, low storage requirements, and time efficiency, and it can be applied to integer (fractional) order partial differential equations. Some specific illustrative examples are given; figures are used to show the efficiency and accuracy of the approximate results. All numerical calculations in this paper were carried out with MATLAB.
Oscar Nieves (11710858) Computational Physics Project - Inverted PendulumOscar Nieves
This document describes a numerical simulation of an inverted pendulum system created in MATLAB using a 4th order Runge-Kutta algorithm. The simulation models an inverted pendulum attached to a horizontally moving cart. Forces like air drag and friction are included, and parameters like mass, pendulum length, and initial conditions can be varied. Small changes to initial conditions can lead to large differences in motion, demonstrating the system's chaotic behavior. The document also outlines the methodology for adapting the Runge-Kutta algorithm to solve systems of coupled differential equations.
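A classical 4th order Runge-Kutta step for a system of coupled first-order equations might look like the Python sketch below. It does not reproduce the cart-and-pendulum equations of the summarized project, which are not given here; the `oscillator` right-hand side is a stand-in chosen only because its exact solution is known, and all function names are assumptions.

```python
import math


def rk4_step(f, t, y, h):
    """One classical RK4 step for y' = f(t, y), with y a list of state variables."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def integrate(f, y0, t0, t1, steps):
    """Integrate from t0 to t1 with a fixed number of RK4 steps."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


def oscillator(t, y):
    """Stand-in system: x'' = -x written as two coupled first-order equations."""
    position, velocity = y
    return [velocity, -position]


# One full period of the oscillator should return close to the initial state.
final_state = integrate(oscillator, [1.0, 0.0], 0.0, 2 * math.pi, 1000)
```

A second-order equation of motion is handled by stacking position and velocity into one state list, which is the usual way to adapt the method to coupled equations.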
The document describes efficient algorithms for projecting a vector onto the l1-ball (sum of absolute values being less than a threshold). It presents two methods: 1) An exact projection algorithm that runs in expected O(n) time, where n is the dimension. 2) A method for vectors with k perturbed elements outside the l1-ball, which projects in O(k log n) time. It demonstrates these algorithms outperform interior point methods on various learning tasks, providing models with high sparsity.
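For orientation, here is a short Python sketch of Euclidean projection onto the l1-ball. Note that it uses the simpler sort-based variant, which runs in O(n log n) time, rather than the expected O(n) algorithm or the O(k log n) method described in the document; the function name and the `radius` parameter are assumptions for this example.

```python
import math


def project_onto_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto {w : sum(|w_i|) <= radius}, via sorting."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    magnitudes = [abs(x) for x in v]
    if sum(magnitudes) <= radius:
        return list(v)                       # already inside the ball
    # Find the soft-threshold theta from the sorted magnitudes.
    cumulative = 0.0
    theta = 0.0
    for j, u in enumerate(sorted(magnitudes, reverse=True), start=1):
        cumulative += u
        candidate = (cumulative - radius) / j
        if u - candidate > 0:                # keep the last index that qualifies
            theta = candidate
    # Shrink every magnitude by theta, clip at zero, restore the signs.
    return [math.copysign(max(m - theta, 0.0), x)
            for m, x in zip(magnitudes, v)]
```

The projection soft-thresholds every coordinate by the same amount, so small entries become exactly zero, which is where the sparsity of the resulting models comes from.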
A Mixed Binary-Real NSGA II Algorithm Ensuring Both Accuracy and Interpretabi...IJECEIAES
In this work, a Neuro-Fuzzy Controller network, called NFC, that implements a Mamdani fuzzy inference system is proposed. This network includes neurons able to perform fundamental fuzzy operations. Connections between neurons are weighted through binary and real weights. A mixed binary-real Non-dominated Sorting Genetic Algorithm II (NSGA II) is then used to optimize both the accuracy and the interpretability of the NFC by minimizing two objective functions: one relates to the number of rules, for compactness, while the other is the mean square error, for accuracy. In order to preserve interpretability of fuzzy rules during the optimization process, some constraints are imposed. The approach is tested on two control examples: a single input single output (SISO) system and a multivariable (MIMO) system.
The document discusses optimizing a variational quantum classifier (VQC) using the quantum natural simultaneous perturbation stochastic algorithm (QN-SPSA). It provides background on VQCs, describing their typical circuit architecture including an encoding circuit, ansatz, measurement, and optimization. It also reviews the classical SPSA optimizer and describes how the QN-SPSA approximates the quantum natural gradient using fewer evaluations than other natural gradient methods by estimating the quantum Fisher information matrix. The purpose of the work is to explore how many steps it takes for the QN-SPSA to converge on a small-scale VQC compared to similar optimizers like SPSA.
Advanced control scheme of doubly fed induction generator for wind turbine us...IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Gas agency management system project report.pdfKamal Acharya
PARALLELISM IN SPECTRAL METHODS
C. CANUTO (1)
ABSTRACT - Several strategies of parallelism for spectral algorithms are discussed.
The investigation shows that, despite the intrinsic lack of locality of spectral
methods, they are amenable to parallel implementations, even on fine grain
architectures. Typical algorithms for the spectral approximation of the viscous,
incompressible Navier-Stokes equations serve as examples in the discussion.
SUMMARY (translated from the Italian) - Several strategies for parallelizing
spectral algorithms are discussed. The analysis shows that spectral methods can
be implemented effectively on parallel architectures, even fine grain ones,
despite their non-local character. Some well-known spectral algorithms for the
approximation of the viscous, incompressible Navier-Stokes equations are used
as examples in the discussion.
Introduction.
Since their origin in the late sixties, spectral methods in their modern form have
been designed and developed with the aim of solving problems which could not
be tackled by more conventional numerical methods (finite differences, and later
finite elements). The direct simulation of turbulence for incompressible flows is
the best known example of such applications: the range of phenomena
amenable to a satisfactory numerical simulation has widened over the years
under the twofold effect of the increase in computer power and the development
of sophisticated algorithms of spectral type. The simulation of the same
phenomena by other techniques would have required computer power larger by
orders of magnitude; hence, it would not have been feasible on the currently
available machines (a discussion of the most significant achievements of spectral
methods in fluid dynamics can be found, e.g., in Chapter 1 of ref. [1]).
(1) Dipartimento di Matematica, Università di Parma, 43100 Parma, Italy and
Istituto di Analisi Numerica del C.N.R., Corso C. Alberto, 5 - 27100 Pavia, Italy.
Invited paper at the International Symposium on "Vector and Parallel Processors
for Scientific Computation - 2", held by the Accademia Nazionale dei Lincei and
IBM, Rome, September 1987.
Since spectral methods have been constantly used in "extreme" applications,
their implementation has taken place on state-of-the-art computer architectures.
The vectorization of spectral algorithms was a fairly easy task. Nowadays, spectral
codes for fluid dynamics run on vector supercomputers such as the Cray
family or the Cyber 205, taking full advantage of their pipeline architectures and
reaching rates of vectorization well above 80% (we refer, e.g., to Appendix B in
ref. [1]).
On the contrary, the implementation of spectral algorithms on parallel com-
puters is still in its infancy. This is partly due to the fact that multiprocessor
supercomputers are only now becoming available to the scientific community.
But there is also a deeper motivation: it is not yet clear whether and how the
global character of spectral methods will efficiently fit into a highly granular
parallel architecture. Thus, a deep investigation - of both a theoretical and ex-
perimental nature - is needed. As a testimony of the present uncertainty on this
topic, we quote the point of view of researchers working at the development of a
multipurpose parallel supercomputer, especially tailored for fluid-dynamics ap-
plications, known as the Navier-Stokes Computer (NSC). This is a joint project
between Princeton University and the NASA Langley Research Center, aimed at
building a parallel supercomputer made up of a fairly small number of powerful
nodes. Each node has the performance of a class VI vector supercomputer; the
initial configuration will have 64 of such nodes. Despite the superior accuracy of
spectral methods over finite difference methods, the scientists involved in this
project have chosen to employ low-order finite differences at least in the initial
investigation on how well transition and turbulence algorithms can exploit the
NSC architecture. Indeed, "the much greater communication demands of the
global discretization may well tip the balance in favor of the less accurate, but
simpler local discretizations" ([12]).
Currently, a number of implementations of spectral algorithms on parallel
architectures are documented. Let us refer here to the work done at the IBM
European Center for Scientific and Engineering Computing (ECSEC) in Rome,
at the NASA Langley Research Center by Erlebacher, Bokhari and Hussaini [5],
and at ONERA (France) by Leca and Sacchi-Landriani [11]. The IBM
contributions are described in detail by P. Sguazzero in a paper in this volume. The
latter contributions will be briefly reviewed in the present paper.
The purpose of this paper is to discuss where and to what extent it is possible
to introduce parallelism in spectral algorithms. We will also try to indicate which
communication networks are suitable for the implementation of spectral methods
on fine grain, local memory architectures.
1. Basic aspects of spectral methods.
Let us briefly review the fundamental properties of spectral methods for the
approximation of boundary value problems. We will focus on those aspects of the
methods which are more related to their implementation in a multiprocessor
environment. For complete details we refer, e.g., to refs. [1], [6], [15].
Let us assume that we are interested in approximating a boundary value
problem, which we write as
$$L(u) = f \ \text{ in } \Omega, \qquad \text{+ boundary conditions on } \partial\Omega, \tag{1.1}$$
in a d-dimensional box $\Omega = \prod_{i=1}^{d} (a_i, b_i)$. We approximate the solution u by a
finite expansion
$$u_N = \sum_{|k| \le N} \hat u_k \, \phi_k(x), \tag{1.2}$$
where
$$k = (k_1, \dots, k_d), \qquad \phi_k(x) = \prod_{i=1}^{d} \phi^{(i)}_{k_i}(x_i). \tag{1.3}$$
Each $\phi^{(i)}_m$ is a smooth global basis function on $(a_i, b_i)$, satisfying the orthogonality
condition
$$\int_{a_i}^{b_i} \phi^{(i)}_m(x)\, \phi^{(i)}_n(x)\, w(x)\, dx = c_m\, \delta_{mn} \tag{1.4}$$
with respect to a weight function $w$. In most applications, the one dimensional
basis functions are trigonometric polynomials in the space directions where a
periodicity boundary condition is enforced, and orthogonal algebraic polynomials
(Chebyshev, or Legendre polynomials) in the remaining directions.
The boundary value problem is discretized by a suitable projection process,
which can be represented as
$$u_N \in X_N, \qquad (L_N(u_N), v)_N = (f, v)_N \quad \forall v \in Y_N. \tag{1.5}$$
Here $X_N$ is the space of trial functions, $Y_N$ is the space of test functions, $L_N$ is an
approximation of the differential operator $L$ and $(u, v)_N$ is an inner product,
which may depend upon the cut-off number N. In general, when $X_N = Y_N$ and
the inner product is the $L^2(\Omega)$ inner product we speak of a Galerkin method; this is
quite common for periodic boundary value problems. Otherwise, for non-periodic
boundary conditions, we have a tau method when the inner product is the
$L^2(\Omega)$ inner product and $Y_N$ is a space of test functions which do not individually
satisfy the boundary conditions, or a collocation method when the inner product is
an approximation of the $L^2(\Omega)$ inner product based on a Gaussian quadrature
rule.
In order to have a genuine spectral method, the basis functions in the expansion
(1.2) must satisfy a supplementary property, in addition to the orthogonality
condition (1.4): if one expands a smooth function according to this basis, the
"Fourier" coefficients of the function should decay at a rate which is a monotonically
increasing divergent function of the degree of regularity of the function. This
occurs if we approximate a periodic function by the trigonometric system (if
$u \in C^s(0, 2\pi)$, then $\hat u_k = O(|k|^{-s})$). The same property holds if we expand a
non-periodic function according to the eigenfunctions of a singular Sturm-Liouville
problem (such as Jacobi polynomials). The above mentioned property is known
as the spectral accuracy property. When it is satisfied, one is in a position to prove
an error estimate for the approximation (1.5) of problem (1.1) of the form
$$\| u - u_N \|_{H^m} \le C(m, r)\, N^{m-r}\, \| u \|_{H^r} \quad \text{for all } r \ge r_0 \text{ fixed}, \tag{1.6}$$
where the spaces $H^r$ form a scale of Hilbert spaces in which the regularity of u is
measured. Estimate (1.6) gives theoretical evidence of the fundamental property
of spectral methods, namely, they guarantee an accurate representation of
smooth, although highly structured, phenomena by a "minimal" number of
unknowns.
Spectral methods owe their success to the availability of "fast algorithms" to
handle complex problems. The discrete solution $u_N$ is determined by the set of its
"Fourier coefficients" $\{\hat u_k : |k| \le N\}$ according to the expansion (1.2), but it can
also be uniquely defined by the set of its values $\{u_j = u_N(x_j) : |j| \le N\}$ at a selected
set $G_N = \{x_j : |j| \le N\}$ in $\bar\Omega$. The points in $G_N$ are usually the nodes of Gaussian
formulae in $\bar\Omega$, such as the points $x_j = j\pi/N$, $j = 0, \dots, 2N-1$ in $[0, 2\pi]$ for the
trigonometric system, or the points $x_j = \cos(j\pi/N)$, $j = 0, \dots, N$ in $[-1, 1]$ for the
Chebyshev system. Thus, we have a double representation of $u_N$, one in transform
space, the other in physical space. The discrete transform
$$\{\hat u_k\} \Longleftrightarrow \{u_j\} \tag{1.7}$$
is a global transformation (each $\hat u_k$ depends upon all the $u_j$'s, and conversely). For
the Fourier and Chebyshev systems, fast transform algorithms are available to
carry out the transformation cheaply. Thus, one can use either representation
of the discrete solution within a spectral scheme, depending upon which is
the most appropriate and efficient.
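The double representation (1.7) can be illustrated for the Fourier system with a few lines of NumPy. This is only a sketch: the function names `to_transform_space` and `to_physical_space` are hypothetical, the normalization (division by the number of points) is one common convention rather than something fixed by the text, and NumPy stores the wavenumbers in its own FFT order rather than from -N to N-1.

```python
import numpy as np

def to_transform_space(u_phys):
    """Grid values u_j -> Fourier coefficients u_hat_k (in NumPy FFT order)."""
    return np.fft.fft(u_phys) / u_phys.size

def to_physical_space(u_hat):
    """Fourier coefficients u_hat_k -> grid values u_j (inverse of the above)."""
    return np.fft.ifft(u_hat) * u_hat.size

N = 8
x = np.pi * np.arange(2 * N) / N            # x_j = j*pi/N, j = 0..2N-1
u = np.cos(3 * x) + 0.5 * np.sin(x)         # a sample trigonometric polynomial

u_hat = to_transform_space(u)
# Integer wavenumber attached to each entry of u_hat: 0, 1, ..., N-1, -N, ..., -1
k = np.fft.fftfreq(2 * N, d=1.0 / (2 * N)).astype(int)
u_back = to_physical_space(u_hat).real      # round trip back to physical space
```

Every coefficient is a weighted sum of all the grid values, which is the global character noted above; the FFT only reduces the cost of that sum, not its data dependence.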
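Numerical differentiation, a crucial ingredient of any numerical method for
differential problems, can be executed within a spectral method either in transform
space or in physical space. Let us confine ourselves to one dimensional
Fourier or Chebyshev methods.
If $u(x) = \sum_{k=-N}^{N-1} \hat u_k e^{ikx}$ is a trigonometric polynomial, then
$$\frac{du}{dx}(x) = \sum_{m=-N}^{N-1} \hat u^{(1)}_m e^{imx}, \qquad \hat u^{(1)}_m = i m\, \hat u_m. \tag{1.8}$$
In physical space, setting $x_j = j\pi/N$, $j = 0, \dots, 2N-1$, we have
$$\frac{du}{dx}(x_l) = \sum_{j=0}^{2N-1} d_{lj}\, u(x_j), \qquad 0 \le l \le 2N-1,$$
with
$$d_{lj} = \begin{cases} \dfrac{1}{2} (-1)^{l+j} \cot\dfrac{(l-j)\pi}{2N}, & l \ne j, \\[1ex] 0, & l = j. \end{cases} \tag{1.9}$$
On the other hand, if $u(x) = \sum_{k=0}^{N} \hat u_k T_k(x)$ ($T_k(x)$ denoting the k-th Chebyshev
polynomial of the first kind), then
$$\frac{du}{dx}(x) = \sum_{m=0}^{N} \hat u^{(1)}_m T_m(x), \quad \text{with} \quad \hat u^{(1)}_m = \frac{2}{c_m} \sum_{\substack{k=m+1 \\ k+m \text{ odd}}}^{N} k\, \hat u_k \tag{1.10}$$
(here $c_0 = 2$, $c_m = 1$ for $m \ge 1$). In physical space, setting $x_j = \cos(j\pi/N)$, $j = 0, \dots, N$,
we have
$$\frac{du}{dx}(x_l) = \sum_{j=0}^{N} d_{lj}\, u(x_j), \qquad 0 \le l \le N,$$
with
$$d_{lj} = \begin{cases} \dfrac{c_l}{c_j}\, \dfrac{(-1)^{l+j}}{x_l - x_j}, & l \ne j, \\[1ex] -\dfrac{x_j}{2(1 - x_j^2)}, & 1 \le l = j \le N-1, \\[1ex] \dfrac{2N^2 + 1}{6}, & l = j = 0, \\[1ex] -\dfrac{2N^2 + 1}{6}, & l = j = N \end{cases} \tag{1.11}$$
(here $c_0 = c_N = 2$, $c_j = 1$ for $1 \le j \le N-1$).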
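The physical-space formulas (1.9) and (1.11) can be checked numerically. The sketch below builds both matrices with NumPy and applies them to functions whose derivatives are known; the function names are hypothetical and the dense construction is meant for illustration only, not as an efficient implementation.

```python
import numpy as np

def fourier_diff_matrix(N):
    """Matrix (1.9) on the grid x_j = j*pi/N, j = 0..2N-1."""
    M = 2 * N
    l = np.arange(M)[:, None]
    j = np.arange(M)[None, :]
    sign = np.where((l + j) % 2 == 0, 1.0, -1.0)      # (-1)^(l+j)
    with np.errstate(divide="ignore"):
        D = 0.5 * sign / np.tan((l - j) * np.pi / M)  # 0.5 * (-1)^(l+j) * cot(...)
    np.fill_diagonal(D, 0.0)                          # d_ll = 0
    return D

def chebyshev_diff_matrix(N):
    """Matrix (1.11) on the grid x_j = cos(j*pi/N), j = 0..N."""
    x = np.cos(np.pi * np.arange(N + 1) / N)
    c = np.ones(N + 1)
    c[0] = c[N] = 2.0
    l = np.arange(N + 1)[:, None]
    j = np.arange(N + 1)[None, :]
    sign = np.where((l + j) % 2 == 0, 1.0, -1.0)
    dx = x[:, None] - x[None, :]
    np.fill_diagonal(dx, 1.0)                         # placeholder, diagonal set below
    D = (c[:, None] / c[None, :]) * sign / dx
    inner = np.arange(1, N)
    D[inner, inner] = -x[inner] / (2.0 * (1.0 - x[inner] ** 2))
    D[0, 0] = (2.0 * N**2 + 1.0) / 6.0
    D[N, N] = -(2.0 * N**2 + 1.0) / 6.0
    return D, x

N = 8
Df = fourier_diff_matrix(N)
xf = np.pi * np.arange(2 * N) / N
Dc, xc = chebyshev_diff_matrix(N)
```

Both matrices are full, so each derivative value depends on every grid value along the line; for example `Df @ np.sin(2 * xf)` reproduces `2 * np.cos(2 * xf)` and `Dc @ xc**3` reproduces `3 * xc**2` up to rounding.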
The previous relations show that spectral differentiation - like a discrete
transform - is again a global transformation (with the lucky exception of Fourier
differentiation in transform space). The global character of spectral methods is
consistent with the global structure of the basis functions which are used in the
expansion.
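To make the global coupling in transform space concrete, the following sketch evaluates the Chebyshev coefficient formula (1.10) by direct summation and compares it with NumPy's own Chebyshev derivative routine. The function name `cheb_coeff_derivative` is hypothetical; a production code would normally use the equivalent backward recurrence instead of the double loop implied here.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def cheb_coeff_derivative(u_hat):
    """Chebyshev coefficients of du/dx from those of u, by direct use of (1.10)."""
    N = len(u_hat) - 1
    d_hat = np.zeros(N + 1)
    for m in range(N):
        c_m = 2.0 if m == 0 else 1.0
        ks = np.arange(m + 1, N + 1, 2)      # k = m+1, m+3, ... so that k + m is odd
        d_hat[m] = (2.0 / c_m) * np.sum(ks * u_hat[ks])
    return d_hat                             # the last entry stays 0 (degree drops by one)

u_hat = np.array([1.0, -2.0, 0.5, 3.0, 0.0, 1.5, -1.0])
d_hat = cheb_coeff_derivative(u_hat)
d_ref = C.chebder(u_hat)                     # reference values from NumPy
```

Each output coefficient with index m depends on all input coefficients of higher index and opposite parity, i.e. the operator is an upper triangular matrix rather than a diagonal one as in the Fourier case (1.8).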
Globality is the first feature of spectral methods we have to cope with in
discussing vectorization and parallelization. If we represent the previous trans-
forms in a matrix-times-vector form, they can be easily implemented on a vector
computer, and they take advantage of this architecture because matrices are
either diagonal, or upper triangular, or full. When the transforms are realized
through the Fast Fourier Transform algorithm, one can use efficiently vectorized
FFT's (see, e.g., [14]).
Conversely, if we are concerned with parallelization, globality implies greater
communication demand among processors. This may not be a major problem
on coarse grain, large shared memory architectures, such as the now commercially
available supercomputers (e.g., Cray XMP, Cray 2, ETA-10, ...). We expect
difficulties on the future fine grain, local memory architectures, where information
will be spread over the memories of tens or hundreds of processors.
In order to make our analysis more precise, let us focus on perhaps the most
significant application of spectral methods given so far, i.e., the numerical
simulation of a viscous, incompressible flow. Let us assume we want to discretize
the time-dependent Navier-Stokes equations in primitive variables
$$\begin{cases} u_t - \nu \Delta u + \nabla p + (u \cdot \nabla) u = f & \text{in } \Omega \times (0, T], \\ \operatorname{div} u = 0 & \text{in } \Omega \times (0, T], \\ u = g \ (\text{or } u \text{ periodic}) & \text{on } \partial\Omega \times (0, T], \\ u(x, 0) = u_0(x) & \text{in } \Omega, \end{cases} \tag{1.12}$$
in a bounded domain $\Omega \subset \mathbb{R}^d$ (d = 2 or 3).
So far, nearly all the methods which have been proposed in the literature
(see, e.g., Chapter 7 in [1] for a review) use a spectral method for the discretization
in space, and a finite difference scheme to advance the solution in time.
Typically, the convective term is advanced by an explicit scheme (e.g., second order
Adams-Bashforth, or fourth order Runge-Kutta) for two reasons: the stability
limit is always larger than the accuracy limit required to preserve overall spectral
accuracy, and the nonlinear terms are easily handled by the pseudospectral technique
(see below). Conversely, the viscous and pressure terms are advanced by an
implicit scheme (e.g., Crank-Nicolson), in order to avoid too strict stability limits.
Thus, at each time level, one has to
i) evaluate the convective term $(u \cdot \nabla)u$ for one, or several, known velocity fields.
The computed terms appear on the right-hand side G of a Stokes-like problem
$$\begin{cases} \alpha u - \nu \Delta u + \nabla p = G & \text{in } \Omega, \\ \operatorname{div} u = 0 & \text{in } \Omega, \\ u = g \ (\text{or } u \text{ periodic}) & \text{on } \partial\Omega, \end{cases} \tag{1.13}$$
where $\alpha = 1/\Delta t$;
ii) solve the spectral discretization of problem (1.13).
In most cases, problem (1.13) is reduced to a sequence of Helmholtz problems.
These, in turn, are solved by a direct method or an iterative one. In the
latter case, one has to evaluate residuals of spectral approximations of Helmholtz
problems.
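The explicit/implicit splitting described above can be sketched on a much simpler model than (1.12): the one-dimensional periodic viscous Burgers equation $u_t + u u_x = \nu u_{xx}$, which has a convective and a viscous term but no pressure and no divergence constraint. The code below is an illustrative assumption, not the scheme of any of the cited papers: it uses second order Adams-Bashforth for the convective term, Crank-Nicolson for the viscous term, and hypothetical function names.

```python
import numpy as np

def convective_term_hat(u_hat, ik):
    """Transform of u * du/dx, evaluated pseudospectrally (product in physical space)."""
    u = np.fft.ifft(u_hat).real
    du = np.fft.ifft(ik * u_hat).real
    return np.fft.fft(u * du)

def burgers_ab2_cn(u0, nu, dt, n_steps):
    """Advance u_t + u u_x = nu u_xx on [0, 2*pi) with AB2 (explicit) + CN (implicit)."""
    M = u0.size
    k = np.fft.fftfreq(M, d=1.0 / M)          # integer wavenumbers in FFT order
    ik = 1j * k
    ik[M // 2] = 0.0                          # drop the unpaired Nyquist mode in d/dx
    alpha = 1.0 / dt
    u_hat = np.fft.fft(u0)
    conv_old = convective_term_hat(u_hat, ik) # makes the first step a forward Euler step
    for _ in range(n_steps):
        conv = convective_term_hat(u_hat, ik)
        rhs = (alpha - 0.5 * nu * k**2) * u_hat - 1.5 * conv + 0.5 * conv_old
        u_hat = rhs / (alpha + 0.5 * nu * k**2)   # implicit solve, one Fourier mode at a time
        conv_old = conv
    return np.fft.ifft(u_hat).real

M = 64
x = 2 * np.pi * np.arange(M) / M
u0 = np.sin(x)
u_end = burgers_ab2_cn(u0, nu=0.1, dt=1e-3, n_steps=200)
```

In this periodic toy problem the implicit step is a division per Fourier mode; in the non-periodic Navier-Stokes setting the corresponding step is the Stokes-like problem (1.13), which requires an actual linear solve.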
We conclude that the main steps in a spectral algorithm are:
A) calculation of differential operators on given functions;
B) solution of linear systems.
When the geometry of the physical domain is not Cartesian, one first has to
reduce the computational domain to a simple geometry. In this case, one has to
resort to one of the existing
C) domain decomposition techniques.
In the next sections, we will examine these three steps in some detail in view
of their implementation on a multiprocessor architecture.
2. Spectral calculation of differential operators.
Let us consider the following problem: "given the representation of a finite-dimensional
velocity field u, either in Physical or in Transform space, compute
the representation of $(u \cdot \nabla)u$ in the same space". We recall that by representation
of a given function v in Transform space we mean the finite set of its
"Fourier" coefficients according to the expansion (1.2); this set will be denoted by
$\hat v$. Similarly, by representation of v in Physical space we mean the set of its values
on a grid $G_N$ in the physical domain, which uniquely determines v; this representation
will be denoted by v.
Each component of $(u \cdot \nabla)u$ is a sum of terms of the form
(2.1) vDw,
where v and w are components of u and D denotes differentiation along one space
direction.
An "approximate" representation of (2.1) in Transform space can be computed
by the so-called pseudospectral technique, which can be described as follows
(a trailing ^ marks a representation in transform space):
(2.2)
    v^ ----------------------> v  ........
                                          :....> vDw ----> (vDw)^
    w^ - - -> (Dw)^ ---------> Dw ........:
The solid line denotes discrete transform (computable by FFT), the dashed
line indicates differentiation in transform space, and the dotted line means pointwise
multiplication in physical space. The result of (2.2) is not the exact representation
of (2.1) in transform space due to the presence of an aliasing error; however,
it is possible to eliminate such an error, using again transformations similar to
the previous ones (see, e.g., Chapter 3 in [1]).
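The pseudospectral evaluation (2.2) and the aliasing error it introduces can be reproduced for the one-dimensional Fourier case. This is a minimal sketch with hypothetical helper names (`forward`, `backward`, `pseudospectral_vDw_hat`) and a normalized FFT convention chosen for readability; no de-aliasing is attempted.

```python
import numpy as np

def forward(u):
    """Physical space -> transform space (normalized Fourier coefficients)."""
    return np.fft.fft(u) / u.size

def backward(u_hat):
    """Transform space -> physical space."""
    return np.fft.ifft(u_hat) * u_hat.size

def pseudospectral_vDw_hat(v_hat, w_hat, k):
    v = backward(v_hat)                 # discrete transform of v^
    Dw = backward(1j * k * w_hat)       # differentiate in transform space, then transform
    return forward(v * Dw)              # pointwise product, then transform back

N = 8
M = 2 * N
x = 2 * np.pi * np.arange(M) / M        # same grid as x_j = j*pi/N
k = np.fft.fftfreq(M, d=1.0 / M)        # wavenumber of each coefficient

# Resolved case: cos(x) * d/dx sin(2x) = cos(x) + cos(3x), all modes fit on the grid.
res = pseudospectral_vDw_hat(forward(np.cos(x)), forward(np.sin(2 * x)), k)

# Aliased case: cos(5x) * d/dx sin(5x) = 2.5 + 2.5 cos(10x); on 16 points the
# unresolved mode 10 is indistinguishable from mode 6 and contaminates it.
ali = pseudospectral_vDw_hat(forward(np.cos(5 * x)), forward(np.sin(5 * x)), k)
```

In the aliased case the coefficient at wavenumber 6 is nonzero although the exact product has no such component, which is the error the text refers to.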
The representation of (2.1) in physical space is as follows (with the same
meaning of the arrows as before):
(2.3)
    v  ..............................................
                                                     :....> vDw
    w ----> w^ - - -> (Dw)^ ----> Dw ................:
We are now in a position to discuss how to introduce parallelism in the
calculation of $(u \cdot \nabla)u$. Two conceptually different forms of parallelism can be
considered:
a) Mathematical Parallelism: assign the calculation of different terms vDw to
different processors;
b) Numerical Parallelism: assign different portions of the computational domain (in
Physical space or in Transform space) to different processors, with the same
mathematical task.
The mathematical parallelism is the simplest to conceive and even to code;
however, it suffers from a number of drawbacks. Since the same component of u
may be needed by different processors, a large shared memory is necessary, and/or
large data transfers occur. Different processors may require the same data at the
same time, leading to severe memory bank conflicts. Furthermore, problems of
synchronisation and balancing may arise if the different mathematical terms do
not require the same computational effort, or if their number is not a multiple of
the number of processors. The strategy becomes definitely useless on fine grain
architectures. However, it can represent a first level of parallelism, if a hierarchy
of parallelisms is available.
Leca and Sacchi-Landriani [11] report their experience of parallelization of
a mixed Fourier-Chebyshev Navier-Stokes algorithm, known as the Kleiser-Schumann
method (see the next section). They use a multi-AP system at
ONERA (France), with four AP-120 processors having access to a "large" shared
memory (compared to the "local" memories). Starting from a pre-existing
single-processor code, Leca and Sacchi-Landriani simply send different subroutines -
computing different contributions to the equation - to different processors. The
largest observed speed-up is 2.78 out of the maximal 4.
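As a quick check of what this figure means, and assuming the usual definition of parallel efficiency as speed-up divided by the number of processors (an assumption here, since the definition is not spelled out at this point of the text), the reported value corresponds to
$$E_p = \frac{S_p}{p} = \frac{2.78}{4} \approx 0.70,$$
i.e., roughly 70% of the ideal four-processor performance.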
From now on, we will discuss strategy b) of parallelization, i.e., Numerical
Parallelism. The question is: how do we split the computational domain among
the processors in order to get the highest degree of parallelism with the lowest
communication costs? The first answer to this question comes from the following
fundamental observation:
Spectral methods for multi-dimensional boundary value problems are inherently tensor products
of one-dimensional spectral methods.
This means that the orthogonal basis functions and the computational grids
which define a multidimensional spectral method are obtained by taking tensor
products of suitable orthogonal basis functions and grids on intervals of the real
line.
It follows that the elementary transformations (discrete transforms, dif-
ferentiation, pointwise product, ...) which constitute a spectral method can be
obtained as a sequence (a cascade) of one dimensional transformations of the
same nature. Each of these transformations (e.g., differentiation in the x direc-
tion) can be carried out in parallel over parallel rows or columns of the computa-
tional domain (either in Physical space or in Transform Space).
Therefore, the simplest strategy of domain decomposition will consist of
assigning "slices" of the computational domain (i.e., groups of contiguous rows or
columns, in a two dimensional geometry) to different processors. Once again we
stress that we consider slices both in Physical Space (i.e., rows/columns of grid
values) and in Transform Space (i.e., rows/columns of "Fourier" coefficients). After
a transformation along one space direction has been completed, one has to transpose
the computational lattice in order to carry out transformations along the
other directions. Transposition should not be a major problem on architectures
with large shared memory or wide-band buses.
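The slice-and-transpose strategy can be emulated serially to see where the data movement occurs. In the sketch below each list element stands for the slice owned by one processor; the function name `fft2_by_slices` and the use of `np.array_split` to form the slices are illustrative assumptions, and the list comprehensions mark the loops that would run concurrently.

```python
import numpy as np

def fft2_by_slices(u, n_proc):
    """2D discrete Fourier transform as a cascade of 1D transforms over slices."""
    # Step 1: each "processor" owns a group of contiguous rows and transforms along x.
    row_slices = np.array_split(u, n_proc, axis=0)
    stage1 = np.vstack([np.fft.fft(s, axis=1) for s in row_slices])
    # Step 2: transpose the lattice; this is the only step that moves data
    # between slices (all-to-all traffic on a distributed memory machine).
    transposed = stage1.T.copy()
    # Step 3: the former columns are now contiguous rows; transform along y.
    col_slices = np.array_split(transposed, n_proc, axis=0)
    stage2 = np.vstack([np.fft.fft(s, axis=1) for s in col_slices])
    return stage2.T

rng = np.random.default_rng(0)
u = rng.standard_normal((8, 12))
u_hat = fft2_by_slices(u, n_proc=3)
```

Within steps 1 and 3 no slice needs data from another slice, so the work is perfectly parallel; the cost of the approach is concentrated in the transposition.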
Erlebacher, Bokhari and Hussaini [5] report preliminary experiences of coding
a Fourier-Chebyshev method for compressible Navier-Stokes simulations on
a 20 processor Flex/32 computer at the NASA Langley Research Center. Since
the time marching scheme is fully explicit, almost all the work is spent in computing
convective or diffusive terms by the spectral technique. Parallelization is
achieved by the strategy described above. The physical variables on the computational
domain are stored in shared memory; slices of them are sent to the
processors, which write the results of their computation in shared memory. The
authors' conclusions are summarized in Table 1, where speed-ups (Sp) and
efficiencies (Ep) are documented for different choices of the computational grid.
According to the authors, moving variables between shared and local memory
should not cause major overheads even on such a supercomputer as the ETA-10.
Indeed, quoting from [5], "a good algorithm [on the ETA-10] should perform at
least 5 floating point operations per word transferred one way from common
memory". This minimum work is certainly achieved within a spectral code:
think, for instance, of differentiation in physical space via FFT.
Transposition of the computational lattice will eventually become prohibitive
on fine grain, local memory architectures. In this case, small portions of the
computational domain will be permanently resident in local memories, and
intercommunication among processors will be the major issue. In order to understand
the communication needs of a spectral method, let us observe that if L(u) is any
differential operator (of any order, with variable coefficients or non-linear terms,
etc.) then one can compute a spectral approximation to L(u) at a point P of the
computational domain using only information at the points lying on the rows and
columns meeting at P (see Figure 1.a). This means that spectral methods,
although global methods in one space dimension, exhibit a precise sparse structure
in multidimensional problems.
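This sparse structure can be verified on a small two-dimensional example. The sketch below assembles a Chebyshev collocation Laplacian from Kronecker products of one-dimensional matrices and inspects the row associated with one grid point P; the helper `cheb`, which builds the first-derivative matrix in a compact algebraically equivalent way, and the chosen indices are illustrative assumptions.

```python
import numpy as np

def cheb(N):
    """Chebyshev first-derivative matrix on x_j = cos(j*pi/N), j = 0..N."""
    x = np.cos(np.pi * np.arange(N + 1) / N)
    c = np.ones(N + 1)
    c[0] = c[N] = 2.0
    c = c * (-1.0) ** np.arange(N + 1)
    X = x[:, None] - x[None, :]
    D = np.outer(c, 1.0 / c) / (X + np.eye(N + 1))   # off-diagonal entries
    D -= np.diag(D.sum(axis=1))                       # diagonal from zero row sums
    return D

N = 6
n = N + 1                                   # grid points per direction
D2 = cheb(N) @ cheb(N)                      # full 1D second-derivative matrix
I = np.eye(n)
lap = np.kron(D2, I) + np.kron(I, D2)       # acts on u[i, j] flattened row by row

i0, j0 = 2, 4                               # the point P
row = lap[i0 * n + j0].reshape(n, n)        # weight of each u[i, j] in the value at P
outside_cross = np.ones((n, n), dtype=bool)
outside_cross[i0, :] = False                # grid row through P
outside_cross[:, j0] = False                # grid column through P
n_used = np.count_nonzero(row)
```

All weights outside the row and column through P are exactly zero, so at most 2n - 1 of the n^2 grid values enter the approximation at P, even though each one-dimensional operator is full.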
[Figure 1.a - The grid points used by the spectral approximation at a point P of the
computational domain: those on the row and the column meeting at P.]
Let us confine ourselves to the 2-dimensional case. If we assume that the
computational domain is partitioned among an array of $m^2$ processors (see Figure 1.b),
then processor $P_{i_0 j_0}$ will need to exchange data only with processors $P_{i_0 j}$ (j
varying) and $P_{i j_0}$ (i varying). Thus, information will be exchanged among at most
O(m) processors.
Note that differentiation in Fourier space and evaluation of non-linear terms
in physical space require no communication among processors. Thus, the
communication demand in the spectral calculation of differential operators is dictated
by the two following one dimensional transformations:
(2.4) Fast Fourier transforms;
(2.5) Differentiation in Chebyshev transform space, according to (1.10).
[Figure 1.b - Communications in a lattice of processors.]
3. Solution of linear systems.
Hereafter we will discuss two examples of solution of linear systems arising from
spectral approximations.
3.1. Solving a Stokes problem via an influence matrix technique.
We consider the Kleiser-Schumann algorithm for solving problem (1.13) in
the infinite slab $\Omega = \mathbb{R}^2 \times (-1, 1)$, with g = 0 and u $2\pi$-periodic in the x and y
directions (see [10] for the details). The basic idea is that (1.13) is equivalent to
$$\begin{cases} \alpha u - \nu \Delta u + \nabla p = G & \text{in } \Omega, \\ \Delta p = \operatorname{div} G & \text{in } \Omega, \\ u = 0, \quad \operatorname{div} u = 0 & \text{on } \partial\Omega; \end{cases} \tag{3.1}$$
this, in turn, is equivalent to
$$\begin{cases} \Delta p = \operatorname{div} G & \text{in } \Omega, \\ p = \lambda & \text{on } \partial\Omega, \end{cases} \qquad \begin{cases} \alpha u - \nu \Delta u = G - \nabla p & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{3.2}$$
provided $\lambda$ is chosen in such a way that div u = 0 on $\partial\Omega$. If we project (3.2) on
each Fourier mode in the x and y directions, we get a family of one dimensional
Helmholtz problems in the interval (-1, 1), where the unknowns are the Fourier
coefficients of p and u along the chosen mode. The boundary values $\lambda_+$ and $\lambda_-$ for
the Fourier coefficient of p are obtained by solving a 2x2 linear system, whose
matrix - the influence matrix - is computed once and for all in a pre-processing
stage.
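Thus, one is reduced to solving Helmholtz problems of the form
$$\begin{cases} -w'' + \beta w = h & \text{for } -1 < z < 1, \quad \beta \ge 0, \\ w(1) = a, \quad w(-1) = b. \end{cases} \tag{3.3}$$
A Chebyshev approximation $w_N(z) = \sum_{k=0}^{N} \hat w_k T_k(z)$ to w(z) is defined through
the tau method as
$$\begin{cases} -\hat w^{(2)}_m + \beta \hat w_m = \hat h_m, & 0 \le m \le N-2, \\ \displaystyle\sum_{m=0}^{N} \hat w_m = a, \qquad \sum_{m=0}^{N} (-1)^m \hat w_m = b. \end{cases} \tag{3.4}$$
Here
$$\hat w^{(2)}_m = \frac{1}{c_m} \sum_{\substack{k=m+2 \\ k+m \text{ even}}}^{N} k (k^2 - m^2)\, \hat w_k$$
is the m-th Chebyshev coefficient of $w_N''$.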
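The tau system (3.4) can be assembled and solved directly for a small N. The sketch below uses a dense solve, which ignores the special structure of the system and is meant only to check the formulas; the function name `tau_helmholtz` and the manufactured test solution $w(z) = z^4 - z$ are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def tau_helmholtz(h_hat, beta, a, b):
    """Chebyshev tau approximation (3.4) of -w'' + beta*w = h, w(1) = a, w(-1) = b."""
    N = len(h_hat) - 1
    A = np.zeros((N + 1, N + 1))
    for m in range(N - 1):                       # equations m = 0..N-2
        c_m = 2.0 if m == 0 else 1.0
        for k in range(m + 2, N + 1, 2):         # k + m even
            A[m, k] = -k * (k**2 - m**2) / c_m   # minus the m-th coefficient of w_N''
        A[m, m] += beta
    A[N - 1, :] = 1.0                            # w_N(1) = a, since T_k(1) = 1
    A[N, :] = (-1.0) ** np.arange(N + 1)         # w_N(-1) = b, since T_k(-1) = (-1)^k
    rhs = np.concatenate([h_hat[: N - 1], [a, b]])
    return np.linalg.solve(A, rhs)

N, beta = 8, 2.0
# Manufactured solution w = z^4 - z, so h = -w'' + beta*w = -12 z^2 + 2 z^4 - 2 z.
w_exact = np.zeros(N + 1)
w_exact[:5] = C.poly2cheb([0.0, -1.0, 0.0, 0.0, 1.0])
h_hat = np.zeros(N + 1)
h_hat[:5] = C.poly2cheb([0.0, -2.0, -12.0, 0.0, 2.0])
w_hat = tau_helmholtz(h_hat, beta, a=0.0, b=2.0)
```

Since the manufactured solution is a polynomial of degree less than N, the tau solution reproduces its Chebyshev coefficients up to rounding. In the first N-1 equations a coefficient of index m is coupled only to coefficients of the same parity.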
Several levels of parallelism can be exploited in the Kleiser-Schumann algor-
ithm. The most obvious one consists of splitting the Fourier modes among the
processors. There is no communication needed to solve (3.2), and a perfect ba-
lance of work among processors can be easily achieved. This strategy has been
followed by Leca and Sacchi-Landriani [11]. The next level of parallelism origin-
ates from the observation that in each tau system (3.4), the odd Chebyshev mod-
es are uncoupled from the even ones. Hence, the task of solving (3.4) can be split
over two processors. Finally, each of the resulting linear systems can be written in
tridiagonal form (see e.g., [1], Chapter 5 for more details).
The last property also holds for tau approximations of the Poisson equation
in several space dimensions, provided a preliminary diagonalization has been
carried out (see, again, [1], Chapter 5). Thus, the communication demand when
solving linear systems arising from tau approximations is essentially related to the
(3.5) solution of tridiagonal systems.
3.2. Solving an elliptic boundary value problem by an iterative technique.
Let us assume we want to solve the model problem
$$-\Delta u = f \quad \text{in the cube } \Omega = (-1,1)^3, \qquad u = 0 \quad \text{on } \partial\Omega, \tag{3.6}$$
by a Chebyshev collocation method. For a fixed N>0, we define the Chebyshev
grid $G_N = \{(x_i, y_j, z_k) \mid 0 \le i, j, k \le N\}$, with $x_l = y_l = z_l = \cos(l\pi/N)$, $0 \le l \le N$. We seek
a polynomial $u^N$ of degree N in each space variable, such that
$$-\Delta u^N(x_i, y_j, z_k) = f(x_i, y_j, z_k) \quad \forall (x_i, y_j, z_k) \in G_N \cap \Omega,$$
$$u^N(x_i, y_j, z_k) = 0 \quad \forall (x_i, y_j, z_k) \in G_N \cap \partial\Omega. \tag{3.7}$$
Setting $u = \{u^N(x_i, y_j, z_k) \mid 1 \le i, j, k \le N-1\}$ and $f = \{f(x_i, y_j, z_k) \mid 1 \le i, j, k \le N-1\}$,
let us write (3.7) in matrix form as
$$L_{sp}\, u = f. \tag{3.8}$$
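As an illustration of the tensor product structure behind (3.8), the following sketch applies the collocation operator to a grid function one direction at a time, without ever assembling the full matrix. It is not the transform-based evaluation of the previous section: the one-dimensional operator is assumed here to be the square of the standard Chebyshev collocation derivative matrix, and the helper names `cheb` and `apply_L_sp` are hypothetical.

```python
import numpy as np

def cheb(N):
    """Chebyshev points x_l = cos(l*pi/N) and the first-derivative collocation matrix."""
    x = np.cos(np.pi * np.arange(N + 1) / N)
    c = np.ones(N + 1)
    c[0] = c[N] = 2.0
    c *= (-1.0) ** np.arange(N + 1)
    dX = x[:, None] - x[None, :] + np.eye(N + 1)
    D = np.outer(c, 1.0 / c) / dX
    D -= np.diag(D.sum(axis=1))        # diagonal entries from the zero row-sum property
    return D, x

def apply_L_sp(U, D2):
    """Minus the collocation Laplacian of U[i, j, k], one 1D operator per direction."""
    return -(np.einsum("ia,ajk->ijk", D2, U)
             + np.einsum("ja,iak->ijk", D2, U)
             + np.einsum("ka,ija->ijk", D2, U))

N = 20
D, x = cheb(N)
D2 = D @ D                             # 1D second-derivative matrix (dense, not banded)
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
U = np.sin(np.pi * X) * np.sin(np.pi * Y) * np.sin(np.pi * Z)   # vanishes on the boundary
F = 3.0 * np.pi**2 * U                                          # exact -Laplacian of U
interior = (slice(1, N),) * 3
residual = np.abs(apply_L_sp(U, D2) - F)[interior].max()
```

Each of the three contractions acts along a single index and is independent across the other two, which is where the one-dimensional transformations can be distributed among processors.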
The resulting matrix, built up from the one-dimensional matrices (1.11), is
not banded. Hence, one has to resort to an iterative technique in order to solve
(3.8). Among the most popular schemes is the preconditioned Richardson
iterative method
$$u^{n+1} = u^n - \alpha_n A^{-1} [L_{sp} u^n - f], \quad n = 0, 1, 2, \dots, \tag{3.9}$$
where $\alpha_n > 0$ is an acceleration parameter which can be dynamically chosen at
each iteration, and $A^{-1}$ is a preconditioning matrix. Note that the spectral
residual $r^n = L_{sp} u^n - f$ can be efficiently computed by the transform techniques
described in the previous section, for which several strategies of parallelism have
been discussed.
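A minimal one-dimensional sketch of iteration (3.9) is given below. Several choices are assumptions made for illustration only: the problem is -u'' = f on (-1,1) rather than the three-dimensional model problem, the preconditioner A is a second-order finite difference matrix on the Chebyshev nodes, the parameter is kept fixed at the classical value 2/(lambda_min + lambda_max), and the extreme eigenvalues are computed numerically here, whereas in practice they would be estimated. The helper names are hypothetical.

```python
import numpy as np

def cheb(N):
    """Chebyshev points and first-derivative collocation matrix."""
    x = np.cos(np.pi * np.arange(N + 1) / N)
    c = np.ones(N + 1)
    c[0] = c[N] = 2.0
    c *= (-1.0) ** np.arange(N + 1)
    dX = x[:, None] - x[None, :] + np.eye(N + 1)
    D = np.outer(c, 1.0 / c) / dX
    D -= np.diag(D.sum(axis=1))
    return D, x

def fd_laplacian(x):
    """Second-order finite differences for -d2/dx2 at the interior nodes of x."""
    n = len(x) - 2
    A = np.zeros((n, n))
    for i in range(1, n + 1):
        hl = abs(x[i] - x[i - 1])
        hr = abs(x[i + 1] - x[i])
        if i > 1:
            A[i - 1, i - 2] = -2.0 / (hl * (hl + hr))
        A[i - 1, i - 1] = 2.0 / (hl * hr)
        if i < n:
            A[i - 1, i] = -2.0 / (hr * (hl + hr))
    return A

N = 16
D, x = cheb(N)
L_sp = -(D @ D)[1:N, 1:N]               # spectral operator on interior nodes
A = fd_laplacian(x)                     # tridiagonal preconditioner
f = np.pi**2 * np.sin(np.pi * x[1:N])   # exact solution: sin(pi x)

lam = np.linalg.eigvals(np.linalg.solve(A, L_sp)).real
alpha = 2.0 / (lam.min() + lam.max())   # fixed acceleration parameter

u = np.zeros(N - 1)
for n in range(100):
    r = L_sp @ u - f                    # spectral residual
    u = u - alpha * np.linalg.solve(A, r)

u_direct = np.linalg.solve(L_sp, f)
ratio_precond = lam.max() / lam.min()
ev = np.abs(np.linalg.eigvals(L_sp))
ratio_plain = ev.max() / ev.min()
```

Comparing `ratio_precond` with `ratio_plain` shows why the preconditioner matters: the first stays of order one while the second grows rapidly with N.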
The matrix A is an easily «invertible» approximation of $L_{sp}$, such that
$\frac{\lambda_{\max}}{\lambda_{\min}}(A^{-1} L_{sp}) \sim 1$, as opposed to $\frac{\lambda_{\max}}{\lambda_{\min}}(L_{sp}) = O(N^4)$. (Here $\lambda_{\max}$, resp. $\lambda_{\min}$,
denotes the largest, resp. the smallest, eigenvalue of the indicated matrix.) This is
achieved, for instance, if A is the matrix of a low order finite difference or finite
element method for the Laplace operator on the Chebyshev grid $G_N$. Multilinear
finite elements (Deville and Mund [3]) guarantee exceedingly good
preconditioning properties.
The direct solution of the finite difference or finite element system at each
Richardson iteration may be prohibitive for large problems. An approximate
solution is usually enough for preconditioning purposes. Most of the algorithms
proposed in the literature (see, e.g., [1], Chapter 5 for a review) are global
sequential algorithms (say, an incomplete LU factorization).
Recently, Pietra and the author [2] have proposed solving approximately the
trilinear finite element system by a small number of ADI iterations. They use an
efficient ADI scheme for tensor product finite elements in dimension three
introduced by Douglas [4]. The method can be easily extended to handle general
variable coefficients. As usual, efficiency in an ADI scheme is gained by cycling
the parameters. The ADI parameters can be automatically chosen in such a way
that the cycle length $l_c(\varepsilon)$ needed to reduce the error by a factor $\varepsilon$ satisfies
$$l_c(\varepsilon) \simeq \log\left(\frac{\lambda_{\max}}{\lambda_{\min}}\right) \simeq \log(N^4) = 4 \log N.$$
It follows that for a fixed $\varepsilon$, $0 < \varepsilon < 1$, there exists a cycle length $l_c(\varepsilon)$,
satisfying $l_c(\varepsilon) = C(\varepsilon) \log N$, such that
$$\frac{\lambda_{\max}}{\lambda_{\min}}(\tilde A^{-1} L_{sp}) \le \frac{1}{1-\varepsilon}\, \frac{\lambda_{\max}}{\lambda_{\min}}(A^{-1} L_{sp}),$$
where A is the exact finite element matrix and $\tilde A$ is its ADI approximation
corresponding to a length $l_c(\varepsilon)$ of the parameter cycle. In other words, one can get nearly
the same preconditioning power as that of the exact finite element matrix,
provided the number of ADI iterations is increased with N at a mere logarithmic rate.
The choice of ADI iterations is quite appropriate in conjunction with spectral
methods. Indeed, ADI shares with spectral methods the tensor product structure,
which is the basis for the alternate sweeps in the rows and the columns of the
computational domain. Furthermore, the Douglas version of ADI for finite
elements reduces each sweep to the solution of a set of independent tridiagonal
systems of linear equations, the same kind of system which originates from a
one-dimensional tau spectral method. Thus, ADI and spectral methods share most of
the pros and the cons with respect to the problem of their parallelization.
Johnsson, Saad and Schultz [9] discuss highly efficient implementations of ADI
methods on several parallel architectures.
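The role of the parameter cycle can be illustrated with a small experiment. The sketch below is not the Douglas scheme for trilinear finite elements used in [2]: as a simpler stand-in it applies the classical Peaceman-Rachford ADI iteration to the five-point finite difference Laplacian on a uniform grid in two dimensions, with a geometric parameter cycle whose length is taken as the logarithm of the eigenvalue ratio. The grid size, the parameter formula and all names are assumptions chosen for illustration.

```python
import numpy as np

n = 31                                   # interior points per direction
h = 2.0 / (n + 1)                        # uniform mesh on (-1, 1)
T = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
I = np.eye(n)

# Extreme eigenvalues of the 1D operator T (known in closed form).
lam_min = 4.0 / h**2 * np.sin(0.5 * np.pi / (n + 1)) ** 2
lam_max = 4.0 / h**2 * np.cos(0.5 * np.pi / (n + 1)) ** 2

# Geometric parameter cycle; its length grows like log(lam_max / lam_min).
J = int(np.ceil(np.log(lam_max / lam_min)))
params = lam_min * (lam_max / lam_min) ** ((np.arange(J) + 0.5) / J)

def adi_cycle(U, F, params):
    """One full parameter cycle for the discrete problem T U + U T = F."""
    for r in params:
        # Sweep in the first direction: the columns of the right-hand side are
        # independent systems with the same tridiagonal matrix r*I + T.
        U = np.linalg.solve(r * I + T, r * U - U @ T + F)
        # Sweep in the second direction: same structure on the transposed array.
        U = np.linalg.solve(r * I + T, (r * U - T @ U + F).T).T
    return U

rng = np.random.default_rng(1)
U_exact = rng.standard_normal((n, n))
F = T @ U_exact + U_exact @ T            # manufactured right-hand side

U = np.zeros((n, n))
for cycle in range(6):
    U = adi_cycle(U, F, params)
rel_err = np.linalg.norm(U - U_exact) / np.linalg.norm(U_exact)
```

A dense solver is used for brevity; in an actual implementation each sweep would be a batch of independent tridiagonal solves, which is exactly the kernel shared with the tau method.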
3.3 Communication needs
We have explored the structure of several spectral type algorithms, pointing out
the most significant features in view of their implementation on parallel
architectures. We first stressed the tensor product structure of spectral methods,
next we indicated the one-dimensional transformations which most frequently
occur in these methods: they are given in (2.4), (2.5) and (3.5).
It is outside the scope of this paper to discuss in detail the implementation of
these transformations on specific parallel architectures. Here, we simply recall
the most suitable interconnection networks for each of these transformations re-
ferring for a deeper analysis to classical books on parallel computers such as [8],
or to review papers such as [13].
Fast Fourier Transforms play a fundamental role in spectral methods. The
Perfect Shuffle interconnection network (Pease (1968), Stone (1971)) is the
optimal communication scheme for this class of transforms.
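For readers unfamiliar with the perfect shuffle, the following sketch shows the permutation itself, not an FFT implementation. It interleaves the two halves of an array, so that items which were N/2 apart become neighbours (the pairing used in a radix-2 butterfly), and it checks that on N = 2^n items the shuffle acts as a cyclic left rotation of the binary index. The function names are hypothetical.

```python
import numpy as np

def perfect_shuffle(x):
    """Interleave the first and second halves of x (len(x) must be even)."""
    half = len(x) // 2
    out = np.empty_like(x)
    out[0::2] = x[:half]
    out[1::2] = x[half:]
    return out

def rotate_left(i, nbits):
    """Cyclic left rotation of the nbits-bit binary representation of i."""
    return ((i << 1) | (i >> (nbits - 1))) & ((1 << nbits) - 1)

nbits = 4
N = 1 << nbits
shuffled = perfect_shuffle(np.arange(N))   # shuffled[p] = item now sitting at position p

# Applying the shuffle nbits times brings every item back to its place.
restored = np.arange(N)
for _ in range(nbits):
    restored = perfect_shuffle(restored)
```

Because the same wiring is reused at every stage, a processor array connected this way needs only one fixed communication pattern.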
Differentiation in Chebyshev transform space essentially amounts to a
matrix-vector multiplication, where the matrix is upper triangular Toeplitz (see
(1.10)). Thus, it can be written as a recursion relation as follows
$$c_m \hat u_m^{(1)} = \hat u_{m+2}^{(1)} + 2(m+1)\, \hat u_{m+1}, \quad m = N-1, \dots, 0; \qquad \hat u_{N+1}^{(1)} = \hat u_N^{(1)} = 0.$$
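The backward recursion can be written out directly. The sketch below is sequential, assumes the normalization c_0 = 2 and c_m = 1 for m >= 1, and uses a hypothetical function name; it is compared against NumPy's own Chebyshev derivative routine as a sanity check.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def cheb_coeff_derivative(u_hat):
    """Chebyshev coefficients of u' from those of u, by backward recursion."""
    N = len(u_hat) - 1
    d = np.zeros(N + 2)                   # d[N+1] = d[N] = 0 start the recursion
    for m in range(N - 1, -1, -1):
        c_m = 2.0 if m == 0 else 1.0      # assumed normalization
        d[m] = (d[m + 2] + 2.0 * (m + 1) * u_hat[m + 1]) / c_m
    return d[:N]

rng = np.random.default_rng(2)
u_hat = rng.standard_normal(12)
du_hat = cheb_coeff_derivative(u_hat)
reference = C.chebder(u_hat)
```

Each step depends on the value computed two steps earlier, so the loop as written is inherently serial; this dependency is what the parallel recursion algorithms are designed to break.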
Cyclic Reduction (Golub-Hockney, 1965) or Cyclic Elimination (Heller,
1976) are the implementations of recursive algorithms which are suggested for
parallel architectures. Several interconnection networks have been proposed for
these transformations (see again [8], [13]).
The tridiagonal systems arising from tau methods or finite-order
preconditioners can be efficiently solved on parallel machines by a variety of
substructuring algorithms, which include Cyclic Reduction or Cyclic Elimination.
Johnsson, Saad and Schultz [9] discuss the implementation of ADI methods on the
hypercube architecture. Often, it is advisable to invert the tridiagonal matrix
once and for all in a preprocessing stage and then solve the linear systems by
matrix-vector multiplication. In this case, the Nearest Neighbor Network
provides the optimal communication scheme.
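A compact sketch of cyclic (odd-even) reduction for one tridiagonal system is given below. It is a recursive textbook variant written for clarity, with a hypothetical function name and storage convention; the vectorized NumPy operations at each level stand in for work that would be spread over processors, since all odd-indexed equations are reduced independently.

```python
import numpy as np

def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i], with a[0] = c[-1] = 0."""
    n = len(b)
    if n == 1:
        return d / b
    if n % 2 == 0:
        # Pad with the trivial equation x[n] = 0 so every odd row has two neighbours.
        a, c, d = (np.append(v, 0.0) for v in (a, c, d))
        b = np.append(b, 1.0)
    odd = np.arange(1, len(b), 2)
    even = np.arange(0, len(b), 2)
    # Eliminate the even-indexed unknowns from every odd-indexed equation.
    alpha = a[odd] / b[odd - 1]
    gamma = c[odd] / b[odd + 1]
    x_odd = cyclic_reduction(
        -alpha * a[odd - 1],
        b[odd] - alpha * c[odd - 1] - gamma * a[odd + 1],
        -gamma * c[odd + 1],
        d[odd] - alpha * d[odd - 1] - gamma * d[odd + 1],
    )
    # Back-substitute for the even-indexed unknowns (again mutually independent).
    xp = np.zeros(len(b) + 2)             # xp[i + 1] holds x[i]; both ends stay zero
    xp[odd + 1] = x_odd
    xp[even + 1] = (d[even] - a[even] * xp[even] - c[even] * xp[even + 2]) / b[even]
    return xp[1 : n + 1]

def random_system(n, seed):
    """Diagonally dominant test system in the storage convention used above."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(-1.0, 1.0, n)
    c = rng.uniform(-1.0, 1.0, n)
    a[0] = 0.0
    c[-1] = 0.0
    b = 4.0 + rng.uniform(0.0, 1.0, n)
    d = rng.standard_normal(n)
    dense = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
    return a, b, c, d, dense

a, b, c, d, dense = random_system(13, seed=3)
x = cyclic_reduction(a, b, c, d)
```

The number of reduction levels grows like log2 of the system size, which is the source of the logarithmic parallel depth usually quoted for this algorithm.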
It is clear from the previous discussion that several intercommunication
paths should co-exist in order to allow an optimal implementation of spectral
algorithms on parallel architectures. The union of the Perfect Shuffle Network
with the Nearest Neighbor Network (PSNN) is an example of a multi-path
scheme, quite appropriate for spectral methods. The PSNN was first proposed by
Grosch [7] for an efficient parallel implementation of fast Poisson solvers.
4. Domain decompositions in spectral methods.
The parallel implementation of different domain decomposition techniques
for general boundary value problems is discussed in the paper by A. Quarteroni
in this volume; we refer to it for the details. Hereafter, we confine ourselves to
some basic considerations about the use of a domain decomposition strategy with
spectral methods.
Partitioning the domain is an absolute need if the geometry of the domain is
complex, i.e., if it cannot be easily mapped into a Cartesian region. In this case,
one breaks the domain into simple pieces, and sets up a separate scheme in each
subdomain; suitable continuities are enforced at the interfaces, usually by an
iterative procedure. (We refer to [1], Chapter 13, for a review of the existing
domain decomposition techniques for spectral methods).
The same strategy can be applied, even on a simple geometry, with the
primary purpose of splitting the computational effort over several processors.
This «route to parallelism», which is quite successful when the discretization
scheme is of finite order, may contain a potential source of inefficiency if used in
the context of spectral methods. Indeed, it leads to breaking the globality of the
expansion, which - as we know - is a necessary ingredient in order to have high
accuracy for regular solutions. We stress here one of the crucial differences be-
tween local, finite order approximations and spectral approximations to the same
boundary value problem, produced by a domain decomposition technique. In the
former case, the solution obtained at convergence of the iterative procedure coin-
cides with the solution obtained by a single-domain method of the same type,
which employs the union of the grids on the subdomains. In the latter case, the
single-domain solution is a global polynomial function, whereas the final multi-
domain solution is merely a piecewise polynomial function, with finite order
smoothness at the interfaces. Although this does not prevent asymptotic spectral
accuracy for the multi-domain solution, its actual accuracy may be severely de-
graded if compared to that of the single-domain solution defined by the same
total number of degrees of freedom.
Let us illustrate the situation with a model problem, taken from [2]. Consider
the Dirichlet problem for the Poisson equation in the square $(-1, 1)^2$, whose
exact solution is $u(x, y) = \cos 2\pi x \cos 2\pi y$. We divide the domain into four equal
squares, on each of which we set a Chebyshev collocation method, and we
enforce $C^1$ continuity at the interfaces. The results are compared with those
produced by a Chebyshev collocation method on the original square, which uses the
same total number of unknowns. The relative $L^\infty$ errors are reported in Table 2.
Table 2. Relative maximum-norm errors for a Chebyshev collocation method (from
[2]), for the exact solution $u(x, y) = \cos 2\pi x \cos 2\pi y$.

Domains    Grid per domain    Relative error
4 DOM      4x4                .62 E0
1 DOM      8x8                .35 E-1
4 DOM      8x8                .12 E-2
1 DOM      16x16              .11 E-6
4 DOM      16x16              .49 E-10
1 DOM      32x32              .38 E-14
Note the loss of four orders of magnitude in replacing the single domain with
16x16 nodes by the four domains, each with an 8x8 grid. Of course, if we have
four processors and we can reach the theoretical speed-up of four in the domain
decomposition technique, we can run four 16x16 subdomains in parallel at the
cost of a single 16x16 domain on a single processor, and gain four orders of
magnitude in accuracy. However, if we seek parallelism through the splitting
techniques described in Sections 2 and 3, and we maintain a speed-up of four, we can
run for the same cost a 32x32 grid on the single domain, yielding a superior
accuracy again by a factor of $10^{-4}$. Thus, it appears that it is better to keep the
spectral expansion as global as possible, and look for parallelism at the level of the
solution of the algebraic system originated from the discretization method.
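The comparisons in this paragraph are simple ratios of entries in Table 2, and can be reproduced as follows. The dictionary layout and the helper name are illustrative choices; the numbers are those of the table.

```python
import math

# Relative maximum-norm errors copied from Table 2: (domains, grid per domain).
errors = {
    (4, "4x4"): 0.62e0,
    (1, "8x8"): 0.35e-1,
    (4, "8x8"): 0.12e-2,
    (1, "16x16"): 0.11e-6,
    (4, "16x16"): 0.49e-10,
    (1, "32x32"): 0.38e-14,
}

def orders_of_magnitude(worse, better):
    """Base-10 logarithm of the error ratio between two configurations."""
    return math.log10(errors[worse] / errors[better])

# Same number of unknowns: four 8x8 subdomains versus one 16x16 domain.
loss_from_splitting = orders_of_magnitude((4, "8x8"), (1, "16x16"))
# Four processors with domain decomposition: four 16x16 subdomains versus one 16x16 domain.
gain_from_decomposition = orders_of_magnitude((1, "16x16"), (4, "16x16"))
# Four processors on the global method: one 32x32 domain versus four 16x16 subdomains.
gain_from_global = orders_of_magnitude((4, "16x16"), (1, "32x32"))
```

With the tabulated values the three quantities come out at roughly 4.0, 3.4 and 4.1, which is the «four orders of magnitude» quoted in the text.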
We conclude by going back to domain decompositions in which a spectral
scheme is set up on each «simple» subdomain, supplemented by suitable continuity
conditions at the interfaces. Deville and Mund [3] indicated that this can be done by
an iterative procedure such as (3.9), where $A^{-1}$ is a «global preconditioner», i.e.,
an approximation of the differential problem over the whole domain. If the
preconditioner is of finite element type, the interface conditions can be incorporated
in the variational formulation, as shown in [2]. Thus, at each iteration, one has to
compute the spectral residuals separately on each subdomain. This can be done
in parallel. Next, one has to (approximately) solve a finite element system. Again,
this can be carried out in parallel form using one of the existing domain
decomposition techniques for finite element methods. Note that in principle the
domain decomposition used at this stage may be totally independent of the one
introduced for setting the spectral approximation.
REFERENCES
[1] C. CANUTO, M. Y. HUSSAINI, A. QUARTERONI, T. A. ZANG, Spectral Methods in
Fluid Dynamics, Springer-Verlag, New York, 1988.
[2] C. CANUTO, P. PIETRA, Boundary and interface conditions within a finite element
preconditioner for spectral methods, I.A.N.-C.N.R. Report n. 555, Pavia, 1987.
[3] M. DEVILLE, E. MUND, Chebyshev pseudospectral solution of second-order elliptic
equations with finite element preconditioning, J. Comput. Phys., 60 (1985), 517-533.
[4] J. DOUGLAS, JR., Alternating direction methods for three space variables, Numer. Math.,
4 (1962), 41-63.
[5] G. ERLEBACHER, S. BOKHARI, M. Y. HUSSAINI, Three dimensional compressible
transition on a 20 processor Flex/32 multicomputer, preprint, NASA Langley Research
Center, 1987.
[6] D. GOTTLIEB, S. A. ORSZAG, Numerical Analysis of Spectral Methods: Theory and
Applications, SIAM-CBMS, Philadelphia, 1977.
[7] C. E. GROSCH, Performance analysis of Poisson solvers on array computers, (1979)
Infotech State of the Art Report: Supercomputers (C. Jesshope and R. Hockney,
eds.), Infotech, Maidenhead, 147-181.
[8] R. HOCKNEY, C. JESSHOPE, Parallel Computers: Architecture, Programming and
Algorithms, Adam Hilger, Bristol, 1981.
[9] S. L. JOHNSSON, Y. SAAD, M. H. SCHULTZ, Alternating direction methods on
multiprocessors, Report YALEU/DCS/RR-382, October 1985.
[10] L. KLEISER, U. SCHUMANN, Treatment of incompressibility and boundary conditions in
3-D numerical spectral simulations of plane channel flows, Proc. 3rd GAMM
Conf. Numerical Methods in Fluid Mechanics (E. H. Hirschel, ed.),
Vieweg Verlag, Braunschweig, 1980, 165-173.
[11] P. LECA, G. SACCHI-LANDRIANI, Parallélisation d'un algorithme de matrice d'influence
pour la résolution des équations de Navier-Stokes par méthodes spectrales, La Recherche
Aérospatiale, 6 (1987), 35-42.
[12] D. M. NOSENCHUCK, S. E. KRIST, T. A. ZANG, On multigrid methods for the
Navier-Stokes Computer, paper presented at the 3rd Copper Mountain Conference
on Multigrid Methods, Copper Mountain, Colorado, April 6-10, 1987.
[13] J. M. ORTEGA, R. G. VOIGT, Solution of partial differential equations on vector and
parallel computers, SIAM Review, 27 (1985), 149-240.
[14] C. TEMPERTON, Self-sorting mixed-radix fast Fourier transforms, J. Comput. Phys., 52
(1983), 1-23.
[15] R. G. VOIGT, D. GOTTLIEB, M. Y. HUSSAINI (eds.), Spectral Methods for Partial
Differential Equations, SIAM, Philadelphia, 1984.