This document summarizes a paper on indexing techniques for sparse matrices. It begins by defining sparse matrices and explaining why indexing is important for efficiently storing and manipulating large sparse matrices. It then describes five main indexing schemes - bit map, address map, row-column, threaded list, and diagonal/band indexing. For each scheme, it discusses how the nonzero elements are stored and accessed. The document concludes by comparing the schemes and recommending which may be best suited to different applications based on required processing and constraints like memory usage and execution time.
SVD BASED LATENT SEMANTIC INDEXING WITH USE OF THE GPU COMPUTATIONS (ijscmcj)
This article assesses the usefulness of Graphics Processing Unit (GPU) computation for implementing the Latent Semantic Indexing (LSI) reduction of a term-by-document matrix. The reduction is based on Singular Value Decomposition (SVD). The high computational complexity of SVD, O(n^3), makes reducing a large indexing structure a difficult task. The article compares the time complexity and accuracy of algorithms implemented in two environments: the first uses the CPU and MATLAB R2011a, the second uses graphics processors and the CULA library. The calculations were carried out on publicly available benchmark matrices, which were combined to produce a large resulting matrix. In both environments, computations were performed on double- and single-precision data.
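A minimal sketch of the rank-k SVD reduction behind LSI: the 5x4 term-by-document matrix below is a tiny invented example, not one of the paper's benchmark matrices.

```python
import numpy as np

# Toy term-by-document matrix (terms x documents); values are illustrative only.
A = np.array([
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
])

def lsi_reduce(A, k):
    """Keep only the k largest singular triplets: A ~ U_k S_k Vt_k."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

U_k, s_k, Vt_k = lsi_reduce(A, 2)
A_k = U_k @ np.diag(s_k) @ Vt_k       # best rank-2 approximation of A
print(A_k.shape)                      # (5, 4)
print(np.linalg.matrix_rank(A_k) <= 2)  # True
```

The same call dominates the cost in both environments; only the SVD backend (MATLAB on the CPU versus CULA on the GPU) differs.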
Dimensionality Reduction Techniques for Document Clustering - A Survey (IJTET Journal)
Abstract: Dimensionality reduction techniques are applied to remove inessential terms, such as redundant and noisy terms, from documents. This paper presents a systematic study of seven dimensionality reduction methods: Latent Semantic Indexing (LSI), Random Projection (RP), Principal Component Analysis (PCA), CUR decomposition, Latent Dirichlet Allocation (LDA), Singular Value Decomposition (SVD), and Linear Discriminant Analysis (LDA).
AN EFFECT OF USING A STORAGE MEDIUM IN DIJKSTRA ALGORITHM PERFORMANCE FOR IDE... (ijcsit)
The graph model is widely used to represent connected objects within a specific area. The objects are defined as nodes, and the connections are represented as arcs called edges. Finding the shortest path between two nodes is one of the problems that most attracts researchers' attention, and many algorithms with different structural approaches have been developed to reduce the shortest-path cost. The most widely used is Dijkstra's algorithm, which has seen various structural refinements for this purpose. This paper presents the idea of using a storage medium to store the solution path produced by Dijkstra's algorithm and then using it to find the implicit path at an ideal time cost. The performance of Dijkstra's algorithm improves when an appropriate data structure is used, and our results show that search time through the given data structure is reduced across different graph models.
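The baseline being improved can be sketched as standard heap-based Dijkstra whose predecessor map plays the role of the stored "solution path"; the graph below is a made-up example, not one of the paper's test graphs.

```python
import heapq

def dijkstra(graph, source):
    """Classic Dijkstra with a binary heap; returns (distance, predecessor) maps."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def path_to(prev, source, target):
    """Reconstruct the stored solution path from the predecessor map."""
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

graph = {
    'A': [('B', 1), ('C', 4)],
    'B': [('C', 2), ('D', 6)],
    'C': [('D', 3)],
}
dist, prev = dijkstra(graph, 'A')
print(dist['D'])                # 6  (A->B->C->D = 1+2+3)
print(path_to(prev, 'A', 'D'))  # ['A', 'B', 'C', 'D']
```

Persisting `prev` to a storage medium lets later queries replay `path_to` without re-running the search, which is the effect the paper measures.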
An Efficient Clustering Method for Aggregation on Data Fragments (IJMER)
Clustering is an important step in data analysis, with applications in numerous fields. Clustering ensembles have emerged as a powerful technique for combining different clustering results to obtain a quality clustering. Existing clustering aggregation algorithms are applied directly to the data points and become inefficient when the number of points is large. This project defines an efficient approach to clustering aggregation based on data fragments, where a data fragment is any subset of the data. To increase efficiency, clustering aggregation is performed directly on data fragments under a comparison measure and normalized mutual information measures, and enhanced versions of the clustering aggregation algorithms (Agglomerative, Furthest, and Local Search) are described that keep computational complexity minimal while increasing accuracy.
Fault diagnosis using genetic algorithms and principal curves (eSAT Journals)
Abstract: Several applications of nonlinear principal component analysis (NPCA) have recently appeared in process monitoring and fault diagnosis. In this paper a new approach is proposed for fault detection based on principal curves and genetic algorithms. The principal curve, introduced by Hastie, is a generalization of the linear principal component (PCA): a parametric curve that passes satisfactorily through the middle of the data. Existing principal-curve algorithms use the first principal component of the data as the initial estimate of the curve; however, this dependence on the initial line leads to a lack of flexibility, and the final curve is satisfactory only for specific problems. This paper extends that work in two ways. First, we propose a new method based on genetic algorithms to find the principal curve, in which lines are fitted and connected to form polygonal lines (PL). Second, a potential application of principal curves is discussed. An example illustrates fault diagnosis of a nonlinear process using the proposed approach. Index Terms: Principal curve, Genetic algorithm, Nonlinear principal component analysis, Fault detection.
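The starting point that existing algorithms use, the first principal component line through the data mean, can be sketched in a few lines; the 2-D data below are invented for illustration.

```python
import numpy as np

# Initial estimate used by classical principal-curve algorithms:
# the first principal component line through the data mean.
rng = np.random.default_rng(0)
t = rng.uniform(-1, 1, size=100)
X = np.column_stack([t, 0.5 * t]) + rng.normal(scale=0.01, size=(100, 2))

mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
direction = Vt[0]                     # first principal axis

# Points on the initial line are mean + s * direction for scalar s.
expected = np.array([1.0, 0.5]) / np.linalg.norm([1.0, 0.5])
print(abs(abs(direction @ expected) - 1.0) < 1e-2)  # aligned with the data axis
```

The paper's genetic-algorithm variant replaces this single straight line with fitted and connected line segments (polygonal lines), removing the dependence on this initial estimate.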
In this video from the 2015 HPC User Forum in Broomfield, Barry Bolding from Cray presents: HPC + D + A = HPDA?
"The flexible, multi-use Cray Urika-XA extreme analytics platform addresses perhaps the most critical obstacle in data analytics today — limitation. Analytics problems are getting more varied and complex but the available solution technologies have significant constraints. Traditional analytics appliances lock you into a single approach and building a custom solution in-house is so difficult and time consuming that the business value derived from analytics fails to materialize. In contrast, the Urika-XA platform is open, high performing and cost effective, serving a wide range of analytics tools with varying computing demands in a single environment. Pre-integrated with the Hadoop and Spark frameworks, the Urika-XA system combines the benefits of a turnkey analytics appliance with a flexible, open platform that you can modify for future analytics workloads. This single-platform consolidation of workloads reduces your analytics footprint and total cost of ownership."
Learn more: http://www.cray.com/products/analytics/urika-xa
Watch the video presentation: http://wp.me/p3RLEV-3yR
Textual Data Partitioning with Relationship and Discriminative Analysis (IJMTER)
Data partitioning methods partition data values by similarity, and similarity measures are used to estimate transaction relationships. Hierarchical clustering produces tree-structured results, while partitional clustering produces results in a grid format. Text documents are unstructured data with high-dimensional attributes. Document clustering groups unlabeled text documents into meaningful clusters. Traditional clustering methods require the cluster count (K) for the document grouping process, and clustering accuracy degrades drastically when an unsuitable cluster count is chosen.
Textual data elements are divided into two types: discriminative words and nondiscriminative words. Only discriminative words are useful for grouping documents; the involvement of nondiscriminative words confuses the clustering process and leads to poor clustering solutions. A variational inference algorithm is used to infer the document collection structure and the partition of document words at the same time. The Dirichlet Process Mixture (DPM) model is used to partition documents; it exploits both the data likelihood and the clustering property of the Dirichlet Process (DP). The Dirichlet Process Mixture Model for Feature Partition (DPMFP) is used to discover the latent cluster structure based on the DPM model, and DPMFP clustering is performed without requiring the number of clusters as input.
Document labels are used to guide the discriminative word identification process. Concept relationships are analyzed with ontology support, and a semantic weight model is used for document similarity analysis. The system improves scalability by using labels and concept relations in the dimensionality reduction process.
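The DPMFP model itself is variational and beyond a short snippet, but the intuition that discriminative words concentrate in few groups while nondiscriminative words spread across all of them can be sketched with a simple label-entropy score; the toy documents and labels below are invented for illustration and are not the paper's method.

```python
import math
from collections import Counter

# Toy labelled documents (invented). Words concentrated in one group score
# low (discriminative); words spread across groups score high
# (nondiscriminative). A much simpler proxy than the paper's DPMFP model.
docs = [
    ("sports", "goal match team goal data"),
    ("sports", "team win match"),
    ("tech",   "gpu kernel code gpu data"),
    ("tech",   "code compiler kernel"),
]

def word_entropy(docs):
    per_label = {}
    for label, text in docs:
        per_label.setdefault(label, Counter()).update(text.split())
    words = set().union(*per_label.values())
    scores = {}
    for w in words:
        counts = [c[w] for c in per_label.values()]
        total = sum(counts)
        probs = [n / total for n in counts if n]
        scores[w] = -sum(p * math.log(p) for p in probs)
    return scores

scores = word_entropy(docs)
print(scores["gpu"] == 0.0)            # True: appears only in "tech" docs
print(scores["data"] > scores["gpu"])  # True: "data" spans both groups
```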
USING ADAPTIVE AUTOMATA IN GRAMMAR-BASED TEXT COMPRESSION TO IDENTIFY FREQUEN... (ijcsit)
Compression techniques reduce the data storage space required by applications dealing with large amounts of data by increasing the information entropy of its representation. This paper presents an adaptive rule-driven device, the adaptive automaton, as the mechanism for identifying recurring sequences of symbols to be compressed in a grammar-based lossless data compression scheme.
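As a rough illustration of the grammar-based side of such schemes (not the paper's adaptive-automata device), here is a single Re-Pair-style step that replaces the most frequent recurring pair of symbols with a new rule symbol:

```python
from collections import Counter

def compress_once(seq, next_symbol):
    """One grammar step: replace the most frequent adjacent pair with a rule."""
    pairs = Counter(zip(seq, seq[1:]))
    pair, count = pairs.most_common(1)[0]
    if count < 2:                      # nothing recurs; no rule to add
        return seq, None
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(next_symbol)    # emit the new rule symbol
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out, (next_symbol, pair)

reduced, rule = compress_once(list("abababc"), "R")
print("".join(reduced))   # RRRc
print(rule)               # ('R', ('a', 'b'))
```

Repeating the step until no pair recurs yields a grammar whose rules encode the frequent sequences; the paper's contribution is using adaptive automata to detect those sequences.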
A Novel Penalized and Compensated Constraints Based Modified Fuzzy Possibilis... (ijsrd.com)
A cluster is a group of objects that are similar to each other within the cluster and dissimilar to objects in other clusters. Similarity is typically calculated on the basis of the distance between two objects or clusters: objects belong to the same cluster only if they are close to each other by that distance. The major objective of clustering is to discover collections of comparable objects based on a similarity metric. Fuzzy Possibilistic C-Means (FPCM) is an effective clustering algorithm for unlabeled data that produces both membership and typicality values during the clustering process. In this approach, the efficiency of FPCM is enhanced by using penalized and compensated constraints (PCFPCM). The proposed PCFPCM approach differs from conventional clustering techniques by imposing a possibilistic reasoning strategy on fuzzy clustering, with penalized and compensated constraints for updating the grades of membership and typicality. The performance of the proposed approach is evaluated on University of California, Irvine (UCI) machine learning repository datasets, namely Iris, Wine, Lung Cancer and Lymphography. The parameters used for evaluation are clustering accuracy, Mean Squared Error (MSE), execution time and convergence behavior.
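PCFPCM's update rules are not given in the abstract; as a baseline, the standard fuzzy c-means membership update that FPCM and its penalized variants build on can be sketched as follows (the points and centres are invented toy values):

```python
import numpy as np

# Standard fuzzy c-means membership update with fuzzifier m:
#   u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))
# FPCM/PCFPCM extend this with typicality and penalty/compensation terms.
def fcm_memberships(X, centers, m=2.0):
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)                    # guard against zero distances
    power = 2.0 / (m - 1.0)
    ratio = (d[:, :, None] / d[:, None, :]) ** power
    return 1.0 / ratio.sum(axis=2)

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
U = fcm_memberships(X, centers)
print(np.allclose(U.sum(axis=1), 1.0))   # True: memberships sum to one
print(U[2, 1] > 0.99)                    # True: third point near 2nd centre
```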
DEVELOPING A NOVEL MULTIDIMENSIONAL MULTIGRANULARITY DATA MINING APPROACH FOR... (cscpconf)
Data mining is one of the most significant tools for discovering association patterns useful in many knowledge domains, yet existing mining techniques have drawbacks. Three main weaknesses of current data-mining techniques are: 1) the entire database must be re-scanned whenever new attributes are added; 2) an association rule may hold at a certain granularity but fail at a smaller one, and vice versa; and 3) current methods can find either frequent rules or infrequent rules, but not both at the same time. This research proposes a novel data schema and an algorithm that address these weaknesses while improving the efficiency and effectiveness of data mining strategies. The crucial mechanisms in each step are clarified in this paper. Finally, experimental results regarding efficiency, scalability, information loss, etc. are presented to demonstrate the approach's advantages.
Data reduction techniques for high dimensional biological data (eSAT Journals)
Abstract
High-dimensional biological datasets have been growing rapidly in recent years. Extracting knowledge from and analyzing high-dimensional biological data is one of the key challenges, with variety and veracity as its two distinct characteristics. The questions that arise are how to perform dimensionality reduction for this heterogeneous data, how to develop a high-performance platform to analyze high-dimensional biological data efficiently, and how to find useful information in the data. To discuss this issue in depth, the paper begins with a brief introduction to the data analytics available for biological data, followed by a discussion of big data analytics and a survey of various data reduction methods for biological data. We propose a dense clustering algorithm for standard high-dimensional biological data.
Keywords: Big Data Analytics, Dimensionality Reduction
In the machine learning community there is a trend of constructing a nonlinear version of a linear algorithm through the 'kernel method', for example kernel principal component analysis, kernel Fisher discriminant analysis, Support Vector Machines (SVMs), and recent kernel clustering algorithms. Typically, in unsupervised kernel clustering algorithms, a nonlinear mapping is first applied to map the data into a much higher-dimensional feature space, and clustering is then performed there. A drawback of these kernel clustering algorithms is that the cluster prototypes reside in the high-dimensional feature space and therefore lack intuitive, clear descriptions unless an additional approximate projection from the feature space back to the data space is used, as done in the literature. This paper uses the 'kernel method' to propose a novel clustering algorithm, founded on the conventional fuzzy c-means algorithm (FCM) and called the kernel fuzzy c-means algorithm (KFCM). The method adopts a new kernel-induced metric in the data space to replace the original Euclidean norm, so the cluster prototypes still reside in the data space and the clustering results can be interpreted in the original space. This property is also exploited for clustering incomplete data. Experiments on synthetic data illustrate that KFCM achieves better and more robust clustering performance than other variants of FCM for clustering incomplete data.
Semi-Supervised Discriminant Analysis Based On Data Structure (iosrjce)
Optimal Convergecast Methods for Tree-Based WSNs (IJMER)
Wireless sensor networks, clustering, Energy efficient protocols, Particles S... (IJMIT Journal)
Wireless sensor networks (WSNs) are composed of a large number of small nodes with limited functionality. The most important issue in this type of network is the energy constraint. Several studies have addressed this area, among which clustering is one of the most effective solutions. The goal of clustering is to divide the network into sections, each with a cluster head (CH) that undertakes the tasks of collection, data aggregation and transmission to the base station. In this paper, we introduce a new approach for clustering sensor networks based on the Particle Swarm Optimization (PSO) algorithm with an optimal fitness function, which aims to extend network lifetime. The parameters used in this algorithm are residual energy density, the distance from the base station, and the intra-cluster distance from the cluster head. Simulation results show that the proposed method is more effective than protocols such as LEACH, CHEF and PSO-MV in terms of network lifetime and energy consumption.
Extended pso algorithm for improvement problems k means clustering algorithmIJMIT JOURNAL
The clustering is a without monitoring process and one of the most common data mining techniques. The
purpose of clustering is grouping similar data together in a group, so were most similar to each other in a
cluster and the difference with most other instances in the cluster are. In this paper we focus on clustering
partition k-means, due to ease of implementation and high-speed performance of large data sets, After 30
year it is still very popular among the developed clustering algorithm and then for improvement problem of
placing of k-means algorithm in local optimal, we pose extended PSO algorithm, that its name is ECPSO.
Our new algorithm is able to be cause of exit from local optimal and with high percent produce the
problem’s optimal answer. The probe of results show that mooted algorithm have better performance
regards as other clustering algorithms specially in two index, the carefulness of clustering and the quality
of clustering.
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTIAEME Publication
In this paper, we investigated a queuing model of fuzzy environment-based a multiple channel queuing model (M/M/C) ( /FCFS) and study its performance under realistic conditions. It applies a nonagonal fuzzy number to analyse the relevant performance of a multiple channel queuing model (M/M/C) ( /FCFS). Based on the sub interval average ranking method for nonagonal fuzzy number, we convert fuzzy number to crisp one. Numerical results reveal that the efficiency of this method. Intuitively, the fuzzy environment adapts well to a multiple channel queuing models (M/M/C) ( /FCFS) are very well.
WIRELESS SENSOR NETWORK CLUSTERING USING PARTICLES SWARM OPTIMIZATION FOR RED...IJMIT JOURNAL
Wireless sensor networks (WSN) is composed of a large number of small nodes with limited functionality. The most important issue in this type of networks is energy constraints. In this area several researches have been done from which clustering is one of the most effective solutions. The goal of clustering is to divide network into sections each of which has a cluster head (CH). The task of cluster heads collection, data aggregation and transmission to the base station is undertaken. In this paper, we introduce a new approach for clustering sensor networks based on Particle Swarm Optimization (PSO) algorithm using the optimal fitness function, which aims to extend network lifetime. The parameters used in this algorithm are residual energy density, the distance from the base station, intra-cluster distance from the cluster head. Simulation results show that the proposed method is more effective compared to protocols such as (LEACH, CHEF, PSO-MV) in terms of network lifetime and energy consumption.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
AN INVERTED LIST BASED APPROACH TO GENERATE OPTIMISED PATH IN DSR IN MANETS –...Editor IJCATR
In this paper, we design and formulate the inverted list based approach for providing safer path and effective
communication in DSR protocol.Some nodes in network can participate in network more frequenctly whereas some nodes are not
participating. Because of this there is the requirement of such an approach that will take an intelligent decision regarding the sharing of
bandwidth or the resource to a node or the node group. Dynamic source routing protocol (DSR) is an on-demand, source routing
protocol , whereby all the routing information is maintained (continually updated) at mobile nodes.
Similar to A survey of indexing techniques for sparse matrices (20)
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfPeter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofsAlex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Welcome to the first live UiPath Community Day Dubai! Join us for this unique occasion to meet our local and global UiPath Community and leaders. You will get a full view of the MEA region's automation landscape and the AI Powered automation technology capabilities of UiPath. Also, hosted by our local partners Marc Ellis, you will enjoy a half-day packed with industry insights and automation peers networking.
📕 Curious on our agenda? Wait no more!
10:00 Welcome note - UiPath Community in Dubai
Lovely Sinha, UiPath Community Chapter Leader, UiPath MVPx3, Hyper-automation Consultant, First Abu Dhabi Bank
10:20 A UiPath cross-region MEA overview
Ashraf El Zarka, VP and Managing Director MEA, UiPath
10:35: Customer Success Journey
Deepthi Deepak, Head of Intelligent Automation CoE, First Abu Dhabi Bank
11:15 The UiPath approach to GenAI with our three principles: improve accuracy, supercharge productivity, and automate more
Boris Krumrey, Global VP, Automation Innovation, UiPath
12:15 To discover how Marc Ellis leverages tech-driven solutions in recruitment and managed services.
Brendan Lingam, Director of Sales and Business Development, Marc Ellis
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
UiPath Test Automation using UiPath Test Suite series, part 4
A Survey of Indexing Techniques for Sparse Matrices

UDO W. POOCH AND AL NIEDER
Texas A&M University,* College Station, Texas
A sparse matrix is defined to be a matrix containing a high proportion of elements that
are zeros. Sparse matrices of large order are of great interest and application in science
and industry; for example, electrical networks, structural engineering, power
distribution, reactor diffusion, and solutions to differential equations.
While conclusions within this paper are primarily drawn considering orders of
greater than 1000, much is applicable to sparse matrices of smaller orders in the
hundreds.
Because of the increasing use of large order sparse matrices and the tendency to
attempt to solve larger order problems, great attention must be focused on core
storage and execution time. Every effort should be made to optimize both computer
memory allocation and execution times, as these are the limiting factors that most
often dictate the practicality of solving a given problem.
Indexing algorithms are the subject of this paper, as they are generally recognized
as the most important factor in fast and efficient processing of large order sparse
matrices.
Indexing schemes of main interest are the bit map, address map, row-column, and
the threaded list. Major variations of the indexing techniques mentioned above are
noted, as well as the particular indexing scheme inherent in diagonal or band matrices.
The concluding section of the paper compares the types of methods, discusses their
suitability for different types of processing, and makes suggestions concerning the
adaptability and flexibility of the major existing methods of indexing algorithms for
application to user problems.
Key Words and Phrases: matrix, sparse matrix, matrix manipulation, indexing.
CR Categories: 5.14, 5.19
I. INTRODUCTION
Computations involving sparse matrices
have been of widespread use since the 1950s,
becoming increasingly popular with the advent of faster cycle times and larger
computer memories. (One cycle time is the time
required for the central processing unit to
send and to receive a data signal from main
memory.) Systems applications for sparse
matrices include electrical networks and
power distribution, structural engineering,
reactor diffusion, and solutions to differential equations.
A sparse matrix is a matrix having few
nonzero elements. Matrix density is defined
as the number of nonzero elements of the
matrix divided by the total number of elements
in the full matrix. Most available references
utilizing sparse matrices for calculations [1-8] consider matrices of order 50 or
more [9, 10], with densities ranging from 15%
to 25% and decreasing steadily as the order
increases. This paper will accept these
boundary conditions as a strict definition of
a sparse matrix. Brayton, Gustavson, and
Willoughby [8] say that a typical large (implied to be in the hundreds) order sparse
matrix has 2 to 10 nonzero entries per row.

* Department of Industrial Engineering.
Hays [5] says that an average of 20 nonzero
elements per row is not an unreasonably
small number in quite large (implied to be
around 100 and greater) order. Livesley [1]
indicates that an average of 3 or 4 elements
per row in a large (implied to be around
1000) order structural problem is a good
estimate.

CONTENTS
I. Introduction (109-112)
II. Bit Map Scheme (112-114)
III. Address Map Scheme (114-116)
IV. Row-Column Scheme (116-119)
V. Threaded List Scheme (119-122)
VI. Diagonal or Band Indexing Scheme (122-123)
VII. Conclusion (123-127)
Appendix A (127-132)
  Algorithm 1. Bit Map Scheme
  Algorithm 2. Address Map Scheme
  Algorithm 3. Address Map Scheme
Bibliography (132-133)

Copyright © 1973, Association for Computing Machinery, Inc. General permission
to republish, but not for profit, all or part of this material is granted, provided
that ACM's copyright notice is given and that reference is made to this publication,
to its date of issue, and to the fact that reprinting privileges were granted by
permission of the Association for Computing Machinery.

Computing Surveys, Vol. 5, No. 2, June 1973
If the order I of the matrix is reasonably
small, i.e., about order 50 or less, it would
make little difference if the full matrix were
kept in core. However, if the sparse matrix
is of larger order than about 50, it becomes
efficient in terms of execution time and core
allocation to store only the nonzero entries
of the matrix.
The efficiency of retaining only the nonzero elements becomes obvious in the example of a 500 X 500 matrix with 10% density.
With one word of storage allocated for each
element, the matrix requires 250,000 words,
which is very often more than is physically
available. Storing only the nonzero elements
requires 25,000 words. If the full matrix were
multiplied by a similar full matrix, a minimum of 500 X 500 X 500 = 125 X 10^6
arithmetic operations are required, compared
to a minimum of (500 X 10%)^3 = 125 X 10^3
arithmetic operations when only the nonzero
elements are retained. If both 500 X 500
matrices were to be retained in core as full
matrices, core allocation and execution time
would be prohibitive on many computers,
and the problem would be abandoned as infeasible for computer solution.
By storing the nonzero elements in some
reasonable manner, and using logical operations to decide when arithmetic operations
are necessary, Brayton, et al. [8] relate that
both the storage requirements and the required amount of arithmetic can often, in
practice, be decreased by a factor of I over
the full matrix.
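The storage and operation counts above can be checked with a short sketch (Python here, purely illustrative of the paper's arithmetic; the 500 X 500 matrix and 10% density are the figures from the text):

```python
# Storage and operation-count estimates for a 500 x 500 matrix at 10% density,
# assuming one word of storage per element.
order = 500
density = 0.10

full_words = order * order                   # 250,000 words for the full matrix
sparse_words = int(order * order * density)  # 25,000 words for nonzeros only

full_mults = order ** 3                      # 125 x 10^6 operations, full x full
sparse_mults = int(order * density) ** 3     # 125 x 10^3 operations on nonzeros
```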
Sparse matrices are classified generally by
the arrangement of the nonzero elements.
When the matrix is in random form, nonzero
elements appear in no specific pattern. A
matrix is said to be a band matrix, or in band
form, if its elements a_ij = 0 for |i - j| > m
(where m is a small integer, and usually
m << I) and where the nonzero elements form
a band along the main diagonal. The band
width is the number of nonzero elements
that appear in one row of a band matrix
(i.e., 2m + 1). A block-diagonal form occurs
when submatrices of nonzero elements appear along the matrix diagonal. In block
form, the matrix has submatrices of nonzero
elements that occur in no specific pattern
throughout the full matrix. The block dimension is the order of a submatrix in a block or
block-diagonal matrix.
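The band-form definition can be sketched as a minimal test (Python, illustrative only; the helper name is_band_matrix is ours, not the paper's):

```python
# Check the band-matrix definition: a[i][j] == 0 whenever |i - j| > m,
# where m is the half-bandwidth, so the band width is 2m + 1.
def is_band_matrix(a, m):
    n = len(a)
    return all(a[i][j] == 0
               for i in range(n) for j in range(n)
               if abs(i - j) > m)

# A 5 x 5 tridiagonal matrix is a band matrix with m = 1 (band width 2*1 + 1 = 3).
tridiag = [[1 if abs(i - j) <= 1 else 0 for j in range(5)] for i in range(5)]
```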
In electrical network and power distribution problems, the matrix is generally in
random, band, or block-diagonal form, with
the elements representing circuit voltages,
currents, impedances, power sources, or users
[9-10]; in structural engineering applications,
the sparse matrix is generally of band or
block form, with the band width or block
dimension representing the number of joints
per floor [3, 11]; in reactor diffusion problems
and differential equations, the band form of
matrix is most common, with the band width
being the number of points used in a point-difference formula [12-14].
This paper, while not concerned with the
actual mathematical manipulations of sparse
matrices, is primarily concerned with the indexing algorithms employed in such calculations. If the sparse matrix is stored in a haphazard manner, elements can only be retrieved by a search of all the data, which
takes much time. If the sparse matrix is
stored in some very convenient form, execution time will be much less. Conservation of
execution time is of major importance in
selecting an indexing algorithm.
Another major consideration in selecting
a particular indexing method is the amount
of fast core the method requires in addition
to that used for the storage of the nonzero
data elements. For most applications, a small
difference in core allocation between two
methods is not a critical factor. In this case,
the critical consideration is the execution
time difference between the two methods.
Since execution times vary greatly with the
methods of indexing, an exact comparison of
execution times must reflect the type of
mathematical manipulation that is to be
performed on the sparse matrix.
One last major aspect of indexing algorithm selection concerns the adaptability
and flexibility of programming the selected
scheme. This depends in great part on the
type of machine, business or scientific; machine configuration; operating system capabilities; number of bits per word; access
times for peripheral devices; average instruction times; availability of the required instructions; the maximum row or column size
to be used; the expected matrix density; and
the availability and size of buffers.
As with most applications, the use of a
high-level programming language may provide relative ease of implementation for a
selected indexing scheme, but such use is frequently accompanied by penalties in execution time and storage requirements. However, on the positive side, use of high-level
languages may well result in a minimum of
elapsed time for problem solution with a
given programming staff, as well as overall
minimum cost, considering both personnel
and computer usage. Problems involving
large order sparse matrices focus their attention on core storage utilization and execution
time minimization, and therefore all but
eliminate the employment of high-level languages for indexing schemes.
In subsequent sections of this paper, current indexing schemes will be examined in an
attempt to isolate a "fast" indexing algorithm, with "fast" being defined as producing an optimization of execution time and
core storage for sparse matrices of large order. Particular advantages and disadvantages of each major type of indexing discussed will be brought to the attention of
the reader. Parts II through VI discuss aspects of particular indexing schemes, while
Part VII compares the requirements and advantages of the various schemes. Part VII, in
conclusion, also makes recommendations
concerning the adaptability and flexibility of
the major existing indexing algorithms for
application to user problems.
The authors have attempted, as much as
possible, to make their discussions machine
independent. However, the authors made use
of an IBM System 360/65 Model I in their
research and certain basic aspects of this machine, such as the 32-bit word, are alluded to
in the succeeding pages. The interested
reader should have little difficulty in adapting the concepts presented to machines of
differing architecture.
II. BIT MAP SCHEME
In a bit map scheme, a Boolean form of the
matrix M is the basic indexing reference.
Whenever a nonzero entry occurs in the
sparse matrix, a 1 bit is placed in the bit map,
with null entries remaining as zeros in the
bit map. The position of each successive nonzero entry is found by counting over to the
next 1 bit in the map.
More rapid access to any element of a row
is achieved by providing an additional row
index vector, where each element of that
vector is the address of the first nonzero element of each row [16]. An additional column
index vector may also be applied for more
rapid column access, but this will also necessitate storing each nonzero entry twice. It
should be noted, however, that any machine
based on word, rather than bit, addressing
techniques will give much slower access in
one dimension of the matrix than in the
other.
As an example, the following matrix M, its
associated bit map BM, and the reduced Z-vector are given.

M  =  0 3 0 0        BM  =  0 1 0 0
      2 0 5 0               1 0 1 0
      4 0 0 7               1 0 0 1
      0 1 0 8               0 1 0 1

Z = [3, 2, 5, 4, 7, 1, 8]

Figure 1 demonstrates a sample bit map supplemented with the row index vector V; the
Z elements are the nonzero elements of the
matrix.

[FIG. 1. Sample bit map: each row index value V(i) indicates the first nonzero
element (Z-vector value) for row i.]

The bit map in Figure 1 is a matrix conception of the bit map. To conserve core, instead of using one word for each row of the
bit map, all four rows (16 bits) are compacted into one word, as shown in Figure 2,
with byte (8-bit) boundaries marked.

[FIG. 2. Bit map of Figure 1 in core: | 0100 1010 | 1001 0101 | with byte
boundaries marked.]
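The storage and retrieval just described can be sketched in a modern language (Python here, purely illustrative; the 4 X 4 matrix follows the example above, and the helper name fetch is ours, not the paper's):

```python
# Bit map scheme sketch: store a Boolean map of the matrix plus a packed
# Z-vector of nonzero values; recover M[i][j] by counting 1 bits.
M = [[0, 3, 0, 0],
     [2, 0, 5, 0],
     [4, 0, 0, 7],
     [0, 1, 0, 8]]

bit_map = [[1 if v != 0 else 0 for v in row] for row in M]
Z = [v for row in M for v in row if v != 0]   # the reduced Z-vector

def fetch(i, j):
    """Return M[i][j] using only bit_map and Z (0-based indices)."""
    if not bit_map[i][j]:
        return 0
    # Count the 1 bits preceding position (i, j) in row-major order;
    # that count is the index of the element in the Z-vector.
    flat = [b for row in bit_map for b in row]
    k = sum(flat[: i * len(M[0]) + j])
    return Z[k]
```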
From Figure 2, it is simple to see that the
bit map, being the Boolean form of the matrix, will, in fast core, require at least W =
I*J/B words, where I and J are the dimensions of the matrix and B is the number of
bits per word; W is rounded up to the nearest
integer. The bit map uses at minimum

    E_BitMap = (100/B)%

of the storage requirements of the full matrix
for indexing. The additional row index vector
adds W = I*A/B more words, where A is
the number of bits required for an address.
Supplemented with the row index vector,

    E_BitMap+RowIndex = (100/B)(1 + (A/J))%

of the full matrix is required for the indexing.
Now, if the sparse matrix has less than
65,536 nonzero elements, then A can be 16
bits in excess 32,768 notation. In a 32-bit-word machine, for example, 16 bits may be
conveniently accessed if the instruction set
has a complement of half-word instructions.
Attention should be given to the number of
bits required for an address to range through
the maximum core size. If this number of
bits is not conveniently manipulated, it will
be necessary to use more than the minimum
amount of core to gain an execution advantage. Execution times for full word instructions are often less than execution times for
half-word instructions. Therefore, when
choosing a convenient number of bits for A,
the number of bits used for an address, it is
important to realize the tradeoff between
core conservation and access time.
Using B = 32 bits (word length), and
A = 16 bits (half-word length), for a 500 X
500 matrix the bit map and row index vector
require 8313 words, or 3.325% of the 250,000
words for the full matrix; if the matrix is
only 5% dense, another 12,500 words are
required for the nonzero elements; the total
is 20,813 words, or 8.325% of the full matrix.
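These figures can be reproduced with a short sketch (Python, illustrative only; note that the 8313-word total corresponds to storing the row index one address per word, since packing two 16-bit addresses per 32-bit word would give 8063 words instead):

```python
import math

# Bit map overhead for an I x J matrix on a machine with B bits per word,
# following the text's 500 x 500 example (B = 32; one index word per row).
I = J = 500
B = 32

map_words = math.ceil(I * J / B)     # words for the Boolean map (7813)
index_words = I                      # row index vector, one word per row
overhead = map_words + index_words   # 8313 words

full_words = I * J                   # 250,000 words for the full matrix
pct = 100 * overhead / full_words    # about 3.325% of the full matrix

nonzero_words = int(0.05 * I * J)    # 12,500 words at 5% density
total = overhead + nonzero_words     # 20,813 words, about 8.325%
```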
In order to reference the M_ij element, it is
necessary to physically count across to the
jth element in the ith "row" of the bit map.
The correct bit will lie in the

S = ((i − 1)·J + j + (B − 1))/B

word of the bit map (integer division). To
isolate the required bit, it will be necessary
either to shift the word the necessary
number of bits or to mask all the other bits by
a logical operation. If a shift is used, then
repeated shifts perform a row operation when
the bit map is stored by rows. Algorithm 1
(see Appendix) isolates the correct beginning
word of a row in the bit map; a segment of
the code shifts through one entire row, in
preparation for a mathematical manipulation of the row.
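The word and bit arithmetic just described can be sketched in C (a sketch, not the paper's Algorithm 1; a 32-bit word, 1-based i and j, and bits numbered from the left of each word are assumed):

```c
#include <stdint.h>

#define B 32  /* bits per word */

/* Test the bit of a row-stored bit map for element (i, j), 1-based,
   where J is the number of columns of the matrix. */
int bitmap_test(const uint32_t *map, int J, int i, int j)
{
    long pos  = (long)(i - 1) * J + (j - 1);  /* 0-based bit position */
    long word = pos / B;                      /* word S - 1 in the text's 1-based terms */
    int  bit  = pos % B;                      /* offset within that word */
    /* isolate the bit by shifting, as described above */
    return (map[word] >> (B - 1 - bit)) & 1;  /* bits counted from the left */
}
```

The text's S = ((i − 1)·J + j + (B − 1))/B is the same word index expressed 1-based with ceiling division.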
Algorithm 1 with slight alteration will accommodate
matrices up to order 100,000.
The restriction occurs in statement 06, where
the multiplication must not result in loss of
significant bits due to exceeding word size.
In practice, the algorithm is limited either by
the index vector being half-words, as indexing
is provided for only 65,536 nonzero elements;
or by 4095 rows or columns, the
maximum number used in the indexing in
statement 02.
When the bit map is stored by rows, as in
the algorithm above, then to perform a column operation it is necessary to count to the
correct j bit for all I rows. This means executing virtually the entire algorithm I times.
If more than a few column operations are to
be performed, then execution time will become an important factor. The execution
time is dependent on the density of the
sparse matrix, the order of the sparse matrix,
and the number of column operations to be
performed. The time factor is exemplified by
the following:
EXAMPLE 1: A 500 × 500 matrix
exists, and it is necessary to perform 10
column operations when the matrix is
5% dense. The average column
execution time will be that of the 250th
column. Assuming the entire algorithm
is executed for each row, the execution
time will be approximately:

500 rows × 10 column operations ×
  [(time to locate beginning of each row)
  + .05 density × 500/2 columns × (time to process 1 bits)
  + (1 − .05 density) × 500/2 columns × (time to process 0 bits)
  + 500/2 columns × (time to locate bit in bit map)
  + 500/2 words × (time to locate word in bit map)]
which is about 10 seconds on the IBM
360/65, with additional microseconds incorporated
for the mathematical operation
not listed in the coding. Had the same procedure
been carried out on the transpose of
the bit map, that is, with the bit map
column-oriented instead of row-oriented,
then the execution time would have been cut
by a factor of about 500, a considerable time
savings. Not taken into consideration is any
further computer processing, such as updating
an index register after each 4095
characters or bytes, if necessary.
If the bit map of the sparse matrix can be
transposed and the data rearranged in less
time than the difference between the column
and row execution times, then the transpose
operation will conserve execution time. In
the above example, the difference between
column and row execution times is about 9.7
seconds.
For certain types of operations the bit map
is ideal. Being in Boolean form, which means
elements are either 1 or 0, true or false, or
plus or minus, the bit map is the most compact
form for logical operations, such as
AND, OR, or EXCLUSIVE OR. Thus, if
matrices MA and MB exist, and it is necessary
to determine which elements are nonzero
in both matrices, it is necessary only to
AND each word of bit map MA with the corresponding
word of bit map MB. If the result
is zero, both are not present; if the result is
nonzero, the indicated elements appear in
both matrices. An EXCLUSIVE OR determines
which elements are present in either,
but not both, of the matrices; an OR determines
which elements appear in either or
both of the matrices. Logical operations performed
on the bit map require about 1/32 of
the execution time for the same logical operation
on the full scale matrix, because the
bit map on a 32-bit-word machine condenses
32 pieces of data into 1 word. Additionally,
Computing Surveys, Vol. 5, No. 2, June 1973
U. W. Pooch and A. Nieder
and often most importantly, the bit map
conserves core storage.
To determine how many elements will be
present in the sum of two rows, and their
order, an OR is performed on the two rows
of the bit map. Using similar techniques, the
feasibility of rearranging the matrix in a
form more convenient for the user, such as
diagonal form, where nonzero elements appear all along the diagonal, is determined.
Kettler and Weil [15] discuss some of the
aspects of such a rearrangement algorithm.
Many references are found to endorse or
suggest the use of a bit map scheme for
sparse matrices [7, 15-20], but it is particularly
difficult to ascertain the exact algorithms
utilized, as most authors do not
include these in their papers.
While a bit map scheme appears convenient
and fast, it is restricted by the amount
of fast core available for the bit map. In the
case where the sparse matrix is less dense
than the percentage of the full matrix that
the bit map scheme occupies, core storage
will be conserved by switching to an alternative
method of indexing.
Givens [21] has suggested that the bit map
scheme would be more attractive to users if
some special instructions were designed and
implemented, to further decrease execution
times. One such instruction Givens references
is CLEAR TO ZERO, which would clear a
large block of core, e.g., the bit map, from a
first to a last address. Another instruction
would be LOAD NEXT NONZERO, which
would fetch the address of the next nonzero
entry of the bit map, given the previous nonzero
element, thereby eliminating the necessity
of counting through all the zero bits.
These special instructions would be implemented
as microprogrammed subroutines
[21]. To define a microprogram, it is necessary
to understand that the execution of each
assembly language instruction involves a
specific sequence of transfers of information
from one register in the processor to another;
some of the transfers take place directly, and
some through an adder or other logical circuit.
Each of these steps defines a microinstruction,
and the complete set of steps necessary
to execute the assembly language instruction
constitutes a microprogram [22].
III. ADDRESS MAP SCHEME
The address map is similar in form to the
bit map, the main difference being that the
address map stores an address or address displacement for each matrix element. If the
matrix element is zero then a zero address is
stored. The bit map requires only one bit for
each matrix element.
Since an address or address displacement
requires more than one bit for each matrix
element, the address map scheme will require
N times more core storage than the bit map
scheme, where N is the number of bits used
for an address or address displacement. If
address displacements instead of full-length
addresses are used, then the address map
must be augmented by a row index vector,
as with the bit map.
Assuming there are less than 256 nonzero
entries per row, for example, an address displacement would require only 8 bits (a
common character size). If a particular computer allows character operations that are
faster than the access time to an individual
bit map entry, the improved column access
time of the address map can warrant the
increased core expenditure. On a system with
6 bit characters, up to 64 nonzero row entries
can be accommodated.
The overall percentage storage requirement
of the full matrix required for the address
map with the row index vector will be

E(Address Map) = 100/B (C + A/J) %

where B is the number of bits per word; C is
the number of bits used for an address displacement;
A is the number of bits used for
an element of the row index vector; and J is
the number of columns of the matrix. Using
C = 8 bits; A = 32 bits; B = 32 bits; and
J = 1000 columns, the address map and row
index vector require 25.1% of the full matrix,
that is, 251,000 words compared to 1
million for the full matrix. In addition, if the
matrix is 5% dense, an additional 50,000
words are required for the storage of the
nonzero elements.
In order to isolate the M_ij element, it is
necessary to access the S = (i − 1)·J + j
character (or byte). In terms of words, this is the

S = {C[(i − 1)·J + (j − 1)] + B}/B

word (integer division), where i and j are
respectively the row and column of interest.
If the Sth character (byte) is zero, it is a null
entry; otherwise, the content of the Sth
character (byte) is added to the row index
element to give the address of the nonzero
element.
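A sketch of the address-map lookup in C. The layout here is hypothetical: one 8-bit displacement per matrix element, stored by rows, a per-row base address supplied in `row_base` (playing the role of the row index vector), and a zero byte marking a null element.

```c
#include <stdint.h>
#include <stddef.h>

/* Return a pointer to element (i, j) of the matrix, 1-based,
   or NULL if the element is zero.  J is the number of columns. */
const double *addr_map_get(const uint8_t *map, int J,
                           const double *const *row_base, int i, int j)
{
    /* character (byte) S = (i-1)*J + j in the text's 1-based terms */
    size_t s = (size_t)(i - 1) * J + (j - 1);
    uint8_t disp = map[s];
    if (disp == 0)
        return NULL;                   /* null entry */
    /* the displacement is added to the row index element; here a
       displacement of d means the dth stored element of the row */
    return row_base[i - 1] + (disp - 1);
}
```

With character (byte) addressing, no shifting or masking is needed, which is why the text finds this scheme so much faster for column access than the bit map.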
The address map scheme is subject to
many of the same limitations of the bit map
scheme, and requires a larger amount of core
storage for indexing. A sample coding, Algorithm 2,
which has the same characteristics
as the example used in the bit map method
(Algorithm 1), illustrates that fewer arithmetic
operations than the bit map method
are required when the computer is equipped
with character addressing capabilities. If the
computer used does not allow convenient
arithmetic manipulation of individual characters,
then the coding enclosed in brackets
in Algorithm 2 must be added to overcome
this difficulty. The bracketed coding requires
much of the algorithm time, so if a computer
has built-in arithmetic character manipulation,
the algorithm becomes considerably faster.
With an example similar to Example 1, we
find that the execution time, with the bracketed coding included, is drastically different
from the bit map time. This is primarily because of the easy access to any character. To
access by column instead of by row, only the
first row location of the correct column need
be found. To find the correct location of the
character in row 2, it is sufficient to add just
the column dimension. This process is continued until the end of the matrix is encountered.
For a column manipulation, then, we easily obtain Algorithm 3, similar to Algorithm 2.
EXAMPLE 2: As in Example 1, a 500 ×
500 matrix exists with 5% density, and
it is necessary to perform 10 column
operations. It is therefore necessary to
execute Algorithm 3, 10 times, so the
execution time will be approximately:

10 column operations ×
  [(initialization time to locate beginning of each row)
  + 500 rows × (time to locate bit in bit map)
  + (1 − .05 density) × (time to process 0 bits)
  + .05 density × (time to process 1 bits)]
which is about 30 msec on the IBM 360/65,
and has incorporated 2 additional μsec that
were included for the mathematical operation
not listed in the coding. As with Algorithm 1,
the limitations are due to the use of half-words
for the index vector, and to the use of
an index register. Note that there is a considerable
time savings, but at the expense of
computer memory. Again, not taken into
consideration is any further computer processing,
other than the above coding, such as
updating index registers, which may be necessary
and require more time.
Unlike the bit map scheme, where the entire
row of the bit map up to the desired element
must be scanned for nonzero entries
before data manipulation can occur, the address
map method requires only a reference
to the desired element. Because the storage
location of a data element is found independently
of all except the desired address
displacement, the address map method
blends well with the concept of parallel
processing. Parallel processing involves the
simultaneous execution of a sequence of operations
by independent central processing
units. Thus, using the address map method,
4 separate central processing units could simultaneously
execute the required arithmetic
on 4 different elements of the matrix;
at best, using the bit map method, different
steps in the execution of 1 matrix element
would be shared by the 4 central processing
units. Employing the address map method,
the processing units could work independently,
except for the final results; while the
bit map method would require transfers of
information from one processing unit to the
other processing units to execute the shared
steps, which introduces an additional time
lag.
While no references have been found to
explicitly endorse or suggest this method,
and comparatively large core requirements
exist, the address map scheme may prove
useful with some future computer that features
both very fast core of a few million
characters and a multitude of parallel processing
units. Hoffman and McCormick [22]
state that at present the value of parallel
processing on a large scale is debatable as
far as manipulating sparse matrices, as there
are virtually no available computers with
more than just a few parallel central processing
units, and the field is quite unexplored.

FIG. 3. Sample matrix.
FIG. 4. Indexing with row and column designators.
FIG. 5. Indexing with row indicator (sign bit) and column designation.
FIG. 6. Indexing with row vector and column index vector.

IV. ROW-COLUMN SCHEME

Row-column indexing schemes refer to methods
relying on paired vectors of some type;
generally one vector contains the nonzero
elements, which are most often ordered by
rows or columns, and the other vector maintains
the indexing information. Row-column
indexing schemes are sometimes referred to
as block index, row, or column packing
schemes, depending on the author's description
of how the indexing algorithm works
[7, 15, 17, 20, 23-24].

In the simplest, but not the most core- and
time-efficient form, each nonzero element of
the matrix has a corresponding index word
that contains a specified number of bits for
the row designation and another specified
number of bits for the column designation
(Figure 4).

If computations are to be performed in a
row manner, it is highly practical and efficient
to order the nonzero entries first by
rows and then by columns. Ordering the entries
by rows makes it unnecessary to maintain
the row index for every nonzero element;
only the row need be identified for the first
nonzero element of each row, as it is known
that all the following entries up to the next
row indicator belong to the same row. In
order to create the row marker, a check bit,
such as a minus sign bit, can be set in the
first column index word of each row (Figure 5),
or as is usually done, an additional and
separate row index vector can be created
(Figure 6). The row index element generally
contains the address or index number of the
first column index for the row. The same
system may be applied to ordering the entries
by columns if column operations are to be
performed.
Figures 3 through 6 depict sample vectors
for the row-column schemes described above.
The index vectors are V and VR; the nonzero entries are contained in vector Z. The
data matrix used in Figures 4 through 6 is
displayed in Figure 3. The nonzero entries
of the data matrix are stored by rows, in
order of increasing column number. All index
vectors are full words unless otherwise noted.
From the above figures it is evident that
a wide possibility of variation exists in the
row-column scheme of indexing. Further
variations and adaptations can occur as a
result of optimizing for peculiar computer
characteristics, or as a result of making calculations
on special forms of sparse matrices,
such as block matrices.
However, caution is advised, for such
optimizations may result in a useless program
whenever system changes occur, and
should therefore only be used when they
provide critical economies in the calculations.
In the instance of computer peculiarities,
Smith [17] states that a particular type of
second generation IBM computer did not
utilize the bits of the second word in extended-precision floating-point calculations
that were normally used as the exponent
bits in single precision floating-point calculations. A sparse matrix row-column indexing algorithm was developed that employed these otherwise wasted 8 to 9 bits as
the row or column indices, and could accommodate matrices up to order 255 and 511
respectively.
For the case of a special sparse matrix, the
row-column indexing scheme for a block-diagonal
matrix could become a blocked
indexing scheme. The blocked indexing
scheme would be identical to the row-column
method, except that the large sparse matrix
is partitioned into several smaller submatrices
(blocks). Then each submatrix is
identified with a separate row-column
scheme of some sort.
A blocked indexing scheme may also be
used to refer to combining several column
indices into one block (word). For example,
one 64-bit word would contain 4 column
indices, each index of 16 bits. When a row
operation is performed, then, 4 nonzero
elements can be readied for processing at
the expense of a loading time for only one
block [17].
It should be noted that for many computers
and algorithms more time is required
to load a referenced word for arithmetic
processing than is required to perform the
necessary arithmetic to isolate the required
bits of the referenced word. Likewise, more
time is required to load extended-precision
words than ordinary words. Also, since
most computers are geared to utilize arithmetic
data primarily by words, more time
is required to load a half-word for arithmetic
processing than is required to load a
full word.
Another major variation, known as delta
or displacement indexing, is also popular,
and is somewhat similar to the address map
form of indexing. For one particular example
of a delta indexing scheme, one 64-bit extended-precision word contains one 16-bit
index and six 8-bit displacements to the index. Therefore, the column indices of 7
elements can be referred to by loading and
processing one extended-precision word,
which can result in both a considerable time
and core savings. For a delta of 8 bits, it is
possible for 2 nonzero entries of the same
row to be a maximum of 255 columns apart.
If elements can appear farther apart than
255 columns, then a greater number of bits
must be allocated for each delta or the
method must be abandoned. To determine
the column number of the first element
paired with the 64-bit index word, the first
16 bits of the index word are used. In order
to determine subsequent column numbers
for any other element paired with the 64-bit
index word, the appropriate delta is added
to the first 16 bits, together with the sum of
the deltas in between.
Smith [17] also states that delta indexing
is more efficient for large order (implied
order about 250) sparse matrices than a
blocked index form. Figures 7 and 8 depict
the blocked and delta indexed word mentioned above, and are equivalent.
EXAMPLE 3: From Figure 7, column
index 3 = 1078. From Figure 8, column
index 3 = 1027 + 20 + 31 = 1078.
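Unpacking the two 64-bit index words of Figures 7 and 8 can be sketched in C (a sketch; fields packed high-order first within the word are assumed):

```c
#include <stdint.h>

/* Blocked index word: four 16-bit column indices, high bits first. */
uint16_t blocked_index(uint64_t w, int k)      /* k = 1..4 */
{
    return (uint16_t)(w >> (16 * (4 - k)));
}

/* Delta index word: one 16-bit column index followed by six 8-bit
   deltas; column k is the base index plus the deltas in between. */
uint16_t delta_index(uint64_t w, int k)        /* k = 1..7 */
{
    uint16_t col = (uint16_t)(w >> 48);        /* first 16 bits */
    for (int d = 1; d < k; d++)
        col += (uint8_t)(w >> (48 - 8 * d));   /* add the sum of deltas */
    return col;
}
```

For the word of Figure 8, `delta_index` with k = 3 reproduces Example 3: 1027 + 20 + 31 = 1078.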
FIG. 7. Blocked index word: four 16-bit column indices, 1027, 1047, 1078, and 1095.
For the row-column indexing method,
using a column index for each nonzero
entry and a row index vector, there is a
required minimum for indexing of

W = I/B (J·T·D + V) words

where I is the number of rows; J is the number of
columns; T is the number of bits used
for a column index element; D is the
density of the matrix; V is the number
of bits used for a row index element; and
B is the number of bits per word. In
reality, however, for matrices up to
order 65,535 (in excess 32,768 notation),
half-words may be most conveniently
and efficiently used for all the row and
column indices. Half-word indices are
used to increase core savings at a
generally tolerable increase in execution
time; few if any matrices of order 30,000
or greater have been of notable use.
Using half-word indices, then, the above-mentioned
indexing scheme requires a
minimum core storage of

E(Row-Column) = 100/2 (D + 1/J) %

of the full matrix for indexing.
To access an M_ij element, it is necessary
to refer to the ith row index, which points
to the first nonzero element of the ith row.
The column indices between the ith and
(i + 1)st row indices are searched for j. If the
column indices searched do not contain j,
the M_ij element is zero; otherwise the data
element paired with the j column index is
fetched and processed.
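The access procedure just described is, in modern terms, a compressed-sparse-row lookup. A sketch in C (assumptions: 0-based storage arrays, and a `row_start` vector of I + 1 entries playing the role of the row index vector, with a sentinel entry closing the last row):

```c
/* Return element (i, j), 1-based, or 0.0 if it is not stored.
   row_start has I+1 entries; col and val hold the nonzeros by rows,
   ordered within each row by increasing column number. */
double rc_get(const int *row_start, const int *col, const double *val,
              int i, int j)
{
    /* search the column indices between the ith and (i+1)st row indices */
    for (int k = row_start[i - 1]; k < row_start[i]; k++)
        if (col[k] == j)
            return val[k];
    return 0.0;   /* no stored column index j in row i: element is zero */
}
```

Because the entries of a row are contiguous and ordered, a row operation touches only `row_start[i] − row_start[i−1]` entries, which is why row access is so fast in this scheme.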
For row operations, as long as the matrix
remains ordered, execution time is very fast.
For more than a few column operations,
however, on a matrix of order greater than
about 200, it is almost always more convenient and efficient to transpose the entire
matrix and reorder all the data elements
before performing the desired arithmetic.
Again, the same situation exists as with the
bit map; if the data and indexing scheme
can be transposed in less time than the difference between the column and row execution times, then the transpose operation will
conserve execution time.
Unlike the bit map and address map
schemes, which have constant core requirements
for indexing, the row-column method
has a core requirement for indexing directly
proportional to the matrix density. Since
each nonzero element has a paired column
index, only the number of elements in the
row index vector is constant. For example,
adding two 50 × 50 sparse matrices, MA and
MB, does not in general produce the result
that the total number of resulting nonzero
elements is the sum of the nonzero elements
for each matrix before the matrix addition:
if MA has 250 data elements and MB has
450, the sum of matrices MA and MB will
not, in most cases, have 700 elements. In
the sum of matrices MA and MB, the only
surety is that there will still be 50 row index
elements. A variable amount of core for indexing
creates core allocation difficulties
that may not be readily acceptable to the
user.
In comparison to the bit map method, the
row-column indexing method is noted for its
fast execution time, when data elements are
properly ordered, and its ease of programming,
even for matrices of very large
order (in the thousands). A wide variety of
references endorse (or imply an endorsement
of) a row-column technique for indexing [15,
17, 25-30], or a block-diagonal method [31-34],
especially for particular applications, as
noted in the Introduction, or for special
matrices, such as symmetric matrices. It
should be noted that a symmetric matrix
FIG. 8. Delta index word: one 16-bit column index, 1027, followed by six 8-bit deltas (20, 31, 17, ...).
decreases by almost 50 % the core requirements in the row-column technique, both for
the data elements and for the indexing
elements.
Two of the more general sets of algorithms
encountered for processing random, and
some special, sparse matrices and employing
the row-column indexing technique are
MATLAN [29], an IBM product, and Algorithm 408
[30], a more recent private effort.
As these algorithms are readily available
and are of general interest, a particular
coding example is not given for the rowcolumn indexing technique. Both these
algorithms were intended for use on sparse
matrices of order less than about 32,700, and
are more efficient for orders less than (about)
1,000.
MATLAN is a programming system, operating
under the control of Operating
System/360, and has a very wide applicability.
MATLAN includes many supplementary features,
such as different versions
for an all-core problem and for a segmented
problem, three overlay structures for core
storage, and options on precision. A segmented
problem exists when portions of the
problem under consideration are stored in
core and on tapes or disks; an all-core problem
exists when the storage requirement is
such that the entire problem is stored in fast
memory. Because of the variable precision
option and the all-core or segmented feature,
it is difficult to assess execution times. Array
dimensions are limited to 32,756, which indicates
half-words are used for indexing
purposes.
Algorithm 408 uses a variation of the indexing algorithm depicted in Figure 6.
Instead of having the row index vector contain the address or index number of the first
column index for the row, the row index
vector contains the number of stored elements in the row. In addition, the row index
vector is appended to the column index
vector by using the same array name, M.
While the scope of Algorithm 408 is not as
broad as MATLAN's, Algorithm 408 has the
distinct advantage of being readily alterable:
a section of the reference is devoted to
possible alterations, such as combining three
or more indices into a word of the M array.
Because of the great variation in coding,
at present it is not considered economically
worthwhile to compare actual core storage
and execution times to determine which of
the many different existing algorithms employing the row-column method is the most
efficient or optimal.
A good basis for examining some of the row-column
indexing scheme characteristics rests
on using half-word indices, with a row index
vector, for calculations. At worst, the
method (as typified by Algorithm 408) will
utilize less core than the full matrix up to a
density of slightly over 66%. Conservation
of core allocation and execution time increases
as the density decreases.
It has been noted that the bit map method
employs approximately 4% of the full matrix
for indexing. Therefore, it can easily be seen
that when the matrix density falls below
about 4%, the row-column method will
conserve more core than the bit map scheme.
In addition, the advantage of the faster indexing
into the data by the row-column
method in this case almost excludes the use
of the bit map, except for special cases, such
as a Boolean problem.
V. THREADED LIST SCHEME
A threaded, or linked list, scheme contains
one element of an array in core for each nonzero element of the sparse matrix. Each
array element in a linked list method has at
least three components: one component
contains the row and column indices; another contains the matrix element (data);
and the third contains the address of, or a
pointer to, the next array element.
If the third component of an array element
were not present, the linked list scheme
would have, at an absolute minimum, the
same core requirement for indexing as the
row-column method. The third component
adds W = I·J·D·A/B more words for indexing,
which gives a minimum total of

W = I/B (J·D(T + A) + V) words

for indexing a threaded list scheme; where I is the number
of rows; J is the number of columns; D is
the density of the matrix; T is the number of
bits used for a column index; V is the number
of bits used for a row index; A is the number
of bits required for an address to range
through the entire amount of core used to
contain the complete threaded list; and B is
the number of bits per word. For any practical
application, however, both the row and
column indices must be retained, which
gives an overall minimum core allocation
for indexing of W = I·J·D (T + V + A)/B
words.

As in the previously discussed methods of
indexing, half-words (16 bits) are used in
practice for both the row and column indices,
which give capabilities of a matrix of order
65,535 (in excess 32,768 notation). In addition,
because of the great difficulty and
great time involved in manipulating addresses
of less than full word size (refer to
Bit Map Scheme), full words (32 bits) are
conveniently used for addresses. These considerations
now require for the overall minimum
core storage for indexing, W = 2·I·J·D
words. As a percentage (E) of the full matrix, this is

E(Linked List) = 2·D %

necessary for indexing.
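The three components of a threaded-list element can be sketched as a C structure (a sketch; half-word row and column indices and a full-word pointer, as discussed above, with the pointer widened to the machine's native size):

```c
#include <stdint.h>
#include <stddef.h>

/* One element of the threaded (linked) list. */
struct link_elem {
    uint16_t row, col;       /* half-word row and column indices */
    double   data;           /* the matrix element itself        */
    struct link_elem *next;  /* address of the next list element */
};

/* Insert a new element after 'prev' without moving any stored data,
   which is the key advantage of the scheme. */
void link_insert_after(struct link_elem *prev, struct link_elem *e)
{
    e->next = prev->next;
    prev->next = e;
}
```

Deletion is symmetric: unlinking an element only rewrites one address component, after which the freed address is appended to the table of available addresses described below.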
In order to reference an M_ij element, the
entire threaded list must be searched if the
nonzero elements are stored in a random
manner. Elements can be stored, except for
updates, and accessed more efficiently by
rows and columns, which can reduce access
time to particular elements or rows of elements.
Elements need not be stored contiguously
for reasonably efficient processing.
In one particular application of a threaded
list scheme, data elements were initially
stored by rows and columns, and a table of
pointers was kept. Each pointer addressed
the beginning element of a group of 8 elements. Any particular item, or row of items,
could be found by a binary search on the
list of pointers. Example 4 typifies the search
for a particular matrix element in this application of linked list indexing.
EXAMPLE 4. Matrix elements are
stored by rows and columns. The
element to be found is in the middle row
of the matrix, so the pointer in the
middle of the pointer list is selected.
The contents of the pointer word
Computing Surveys, Vol 5, No 2, June 1973
addresses an element of the linked list.
The element is then examined, to
compare the row and column components with the required row and
column numbers. Three separate cases
can now occur:
(1) If the row and column numbers
match, the correct element has been
found.
(2) The rest of the elements in the
group of 8 are searched, and if the
row matches, but not the column, it
is known that the correct group can
probably be found by a search on
the next few pointers about the
pointer last used. If the pointer
indexed an element whose column
number was greater than required,
then the next lower pointer is used.
(3) The rest of the elements in the
group of 8 are searched, and if the
row doesn't match, then a binary
search on the pointers is continued.
In a binary search, if the pointer
indexed an element whose row
number was greater than required,
the next pointer to be selected is
the one halfway between the last
pointer (upper bound pointer in
this case) and the lower bound
pointer (the first pointer in this
case).
When the procedure is iterated, (2) above,
and the appropriate groups are searched, but
the correct row and column cannot be found,
then it is known that the required matrix
element is the null element.
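The binary search over the pointer table in Example 4 can be sketched in C (a sketch; only the row comparison of case (3) is shown, and `first_row` is a hypothetical companion array holding the row number of the first element addressed by each pointer):

```c
/* Binary-search the pointer table for the last group whose first
   element's row is <= the required row; return its index, or -1 if
   the required row precedes every group. */
int find_group(const int *first_row, int ngroups, int row)
{
    int lo = 0, hi = ngroups - 1, best = -1;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        if (first_row[mid] <= row) {  /* this or a later group may hold it */
            best = mid;
            lo = mid + 1;
        } else {                      /* row number greater than required: */
            hi = mid - 1;             /* move toward the lower pointers    */
        }
    }
    return best;
}
```

The group returned (and, per case (2), perhaps its neighbors) is then scanned linearly for the required row and column; if they are absent, the element is null.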
It should be noted that unless the data
elements are in reasonable order, the binary
search on the pointers is almost useless. The
particular value of a linked list is that there
is no longer the requirement that data elements
be stored contiguously: updates, insertions,
and deletions of matrix elements
are performed by altering the address component
of the appropriate linked elements.
However, a linked list expansion or contraction results in some pointer groups
having a greater number of link elements,
and some other pointer groups having fewer
link elements. The alterable number of link
elements in each pointer group necessitates
a periodic updating of the pointer table. A
pointer table update is vital to the efficiency of the binary search, and may require
a great amount of execution time. The
amount of execution time required for a
pointer table update depends directly on
the number of link elements to be grouped,
as each link element must be inspected in
order to find each successive link element.
For peak efficiency of the binary search,
every group should have the same number
of linked list elements.
Using the additional pointer table to
combat the otherwise slow execution time of
the linked list scheme, one pointer exists
for each 8 nonzero matrix elements. Employing
a full word for each pointer, which
is an address, we now have a minimum indexing
core requirement of W = 2⅛·I·J·D
words, for

E(Linked List) = 2.125·D %

of the full matrix. This is a much greater
core requirement than the row-column
methods of the previous section require for
any matrix of order greater than three.
Figure 9 depicts a few elements of a linked
list, and the correlation between elements.
A pointer table is not included.
Not previously mentioned is the practical
necessity of maintaining a table of available
addresses, so that core allocation remains
conservative during the insertion and deletion of matrix elements.

FIG. 9. Linked list elements.

When matrix
elements are deleted, the address of the
deleted link element must be appended to
the table of available addresses. Not only
must the table be maintained in fast core
but the threaded list scheme additionally
requires a buffer area to be used for the inserted and/or deleted link elements. If such
a buffer area is not used or kept, then core
will not be conserved and the prime advantage of the threaded list will have been
discarded.
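The bookkeeping just described amounts to a free list. A minimal sketch (Python, hypothetical names) of the table of available addresses: deleted link elements return their slots, and insertions reuse those slots before claiming new core.

```python
# Illustrative sketch of the "table of available addresses"; not the
# paper's implementation.
class LinkStore:
    def __init__(self):
        self.core = []        # simulated fast core: one slot per link element
        self.free = []        # table of available addresses

    def allocate(self, element):
        if self.free:                     # reuse a freed slot first
            addr = self.free.pop()
            self.core[addr] = element
        else:                             # otherwise extend core
            addr = len(self.core)
            self.core.append(element)
        return addr

    def release(self, addr):
        self.core[addr] = None            # delete the link element...
        self.free.append(addr)            # ...and record its address as available
```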
Few references endorse, or suggest endorsement of, the linked list scheme as a
practical method for indexing sparse matrices [15, 34-37]. Only a few sources [15,
38-40] found in the literature survey actually utilized the threaded list scheme;
while the actual algorithms were seldom
described in great detail, the scheme basically followed the designs of Example 4.
Overall, the threaded list technique of indexing into sparse matrices requires a significant amount of execution time for processing
indices, in addition to the core requirements
of a buffer and two separate tables. Inherent
in the method, then, are considerable execution times for processing and considerable
core expenditure, in comparison with the bit
map and row-column schemes for identical
matrices. Offsetting these disadvantages,
however, the linked list scheme has the
distinct advantage of not requiring a significant amount of execution time to update the
linked list by insertion or deletion of single
matrix elements or series of matrix elements.
All other previously discussed indexing
techniques require a shifting of data when
an update is performed, which will take a
great amount of execution time when
numerous matrix elements have to be shifted
to make the appropriate word available for
the update. The linked list scheme is slow for
random processing of matrix elements; however, in many applications items are accessed sequentially by row or column. In
these applications, proper chains of pointers
speed up processing greatly. As with previous methods, a definite symmetry of the
sparse matrix reduces proportionately the
core requirements for indexing.
Computing Surveys, Vol. 5, No. 2, June 1973
U. W. Pooch and A. Nieder
VI. DIAGONAL OR BAND INDEXING SCHEME
Band and diagonal matrices are special
types of matrices that occur frequently in
electrical engineering, structural engineering, nuclear engineering and physics,
solutions to differential equations, and a
host of other fields, as mentioned in the
Introduction. Band and diagonal matrices,
while of frequent occurrence, should not be
mistaken as a general case of sparse matrices.
When band or diagonal matrices occur, a
special effort on the part of the user should
be made to adapt his processing and/or indexing algorithms to the case at hand. This
adaptation should be made because of the
inherent simplicity of processing, manipulating, and solving band matrices, and also
because of the opportunity to minimize core
allocation and execution time.
In most cases, band or diagonal matrices
are processed either wholly by rows or columns, and little or no processing of single
elements occurs. For a band matrix, a common manipulation involves decreasing the
band width. In such a manipulation, it is
normal procedure for one entire row (column) to operate on the row (column) immediately above or below it (or to either
side). With such a simple processing sequence, it is evident that only a few rows
(columns) need be maintained in fast core
for immediate use.
If data transmission rates are comparable
to the rate with which rows (columns) are
manipulated, then rows (columns) not in
immediate use can be stored on slower access
devices, such as tapes or disks. Storing data
on tapes or disks frees the more expensive
fast core. In most machine configurations
there is a much larger amount of memory
available in the slower devices. When slow
devices can be used efficiently for processing
band matrices, the capability of manipulating large order sparse matrices is limited
by the maximum allowable execution time
and the desired accuracy limits of the results,
and not by the order of the matrix involved.
To further conserve execution time, but at
the expense of fast memory, the entire band
matrix can be stored in fast core.

FIG. 10. Band matrix.

Preserving the entire matrix in fast core eliminates the
transmission times between fast core and
auxiliary devices, as well as the time required to restore elements in fast core,
which is done prior to data manipulation
and processing. Another prime advantage
directly involved with data transmission is
the use of overlapping channels in burst or
select mode. However, when the matrix is
fully maintained in fast core, channels will
then be available to other users on multi-user
computers.
If the band matrix has full bands, that is,
no row has any zero elements within the
band, then the total number of elements to
be stored is the band width multiplied by
the number of rows in the matrix. Figure 10
depicts a band matrix with full bands (a
band width of 3 here):
EXAMPLE 5. Figure 10 is the resulting
9 X 9 matrix obtained by using a central
difference approximation (3 points) to
solve the boundary-value differential
equation 2 + 3t^2 = y + y' + y'' using
10 intervals between the points y(t =
0) = 0 and y(t = 1) = 1.
A 5-point interpolation would yield a
band width of 5; 50 intervals would
result in a 49 X 49 matrix. Note that
the augment column, a constant
associated with each row of the matrix,
is not considered here as an integral part
of the sparse matrix. Accuracy of results
depends on the number of intervals,
number of points in the interpolation
formula, and computer round off.
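Example 5 can be reproduced in outline. The sketch below (Python, illustrative; the exact coefficient layout of Fig. 10 may differ) assembles the 9 X 9 tridiagonal system from a 3-point central-difference approximation of y'' + y' + y = 2 + 3t^2 with 10 intervals and the stated boundary values.

```python
# Central-difference assembly sketch for Example 5; a plausible reading,
# not a verified reproduction of Fig. 10.
def assemble(n=10):
    h = 1.0 / n
    rows = n - 1                       # one unknown per interior point
    A = [[0.0] * rows for _ in range(rows)]
    rhs = [0.0] * rows
    for i in range(rows):
        t = (i + 1) * h
        lo = 1.0 / h**2 - 1.0 / (2 * h)    # coefficient of y[i-1]
        mid = 1.0 - 2.0 / h**2             # coefficient of y[i]
        hi = 1.0 / h**2 + 1.0 / (2 * h)    # coefficient of y[i+1]
        if i > 0:
            A[i][i - 1] = lo
        A[i][i] = mid
        if i < rows - 1:
            A[i][i + 1] = hi
        rhs[i] = 2.0 + 3.0 * t * t
        # boundary values move to the right-hand side
        if i == 0:
            rhs[i] -= lo * 0.0             # y(0) = 0
        if i == rows - 1:
            rhs[i] -= hi * 1.0             # y(1) = 1
    return A, rhs
```

With h = 0.1 the diagonal entries come out to 1 - 2/h^2 = -199, matching the diagonal visible in Fig. 10.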
In one particular application of processing
a band matrix by rows (columns), it is convenient and efficient to store elements in full
vectors, one vector for each super- or sub-
diagonal of the band matrix. Since the
diagonal has the greatest number of elements,
the vector for the diagonal will be the largest
vector. To avoid double indexing, which
takes greater execution time, an additional
table of addresses is created. Each element
of the address table contains the address of
the first element of the respective vector.
The indexing scheme in the algorithm used
to arithmetically manipulate the band
matrix is then altered to suit the storage
scheme.
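The diagonal-vector storage just described can be sketched as follows (Python; the offset keys and flat store are assumptions of this sketch, not the paper's layout): one vector per diagonal, packed end to end, with an address table giving where each diagonal's vector begins, so a single index reaches any element.

```python
# Sketch of diagonal-wise band storage with an address table.
class DiagonalStore:
    def __init__(self, n, offsets):
        # offsets: 0 for the main diagonal, +k for super-, -k for sub-diagonals
        self.n = n
        self.offsets = sorted(offsets)
        self.store = []                  # all diagonal vectors, back to back
        self.addr = {}                   # address table: offset -> start index
        for k in self.offsets:
            self.addr[k] = len(self.store)
            self.store.extend([0.0] * (n - abs(k)))

    def set(self, i, j, v):
        self.store[self.addr[j - i] + min(i, j)] = v

    def get(self, i, j):
        k = j - i
        if k not in self.addr:
            return 0.0                   # outside the band: null element
        return self.store[self.addr[k] + min(i, j)]
```

This is essentially the layout modern sparse-matrix libraries call DIA storage.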
If, for some reason, it is more convenient
to store elements in a row or column form,
e.g., because of a very difficult or time-consuming arithmetic manipulation, most of
the advantage of employing a band scheme
is lost, and other methods of indexing should
be considered.
Band matrices, as noted above, are unusual from an indexing standpoint because
of the very slight core requirements for
indexing. For the application described
above, only W = I * V / B words are required
for indexing, where I is the number of rows;
V is the number of bits used for a row index
element; and B is the number of bits per
word. As a percentage (E) of the full matrix,
this indexing requirement is

E_Band = 100 / J %

where J is the number of columns in the
matrix, when full words are used for the
table of addresses. If half-words are adequate,
this requirement decreases further by one-half.
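The overhead figures quoted in this survey reduce to one-line computations. The sketch below (modern notation, illustrative only) evaluates the threaded-list overhead of 2.125 * D % of the full matrix and the band-scheme overhead of 100 / J %:

```python
# Worked check of the two indexing-overhead formulas in the text.
def linked_list_overhead(density_pct):
    # 2.125 * D %: two-word link elements plus 1/8 word of pointer per nonzero
    return 2.125 * density_pct

def band_overhead(cols):
    # 100 / J %: one full-word address-table entry per row of J columns
    return 100.0 / cols
```

For example, a matrix of 4 % density costs 8.5 % of the full matrix to index as a threaded list, while a 50-column band matrix costs only 2 %.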
It should be brought to the attention of
the user that in the instance where bands do
contain zero elements, a decision should be
made whether to employ a band scheme,
which may not be very efficient in use of core
if a large number of null entries exists, or
some other particular scheme, such as a
block-diagonal scheme, which may not conserve execution time.
Many papers [4, 10, 34, 40-43] are concerned with band matrices, primarily, as
said, because of the prevalence of band
matrices in many specific fields of interest.
Also, many algorithms are readily available
for processing band matrices; FORTRAN M
[44] being one of the more recent programming packages.
VII. CONCLUSION
In the previous sections four major types of
indexing methods were discussed, three of
which are in general use: the bit map scheme,
the row-column scheme, and the threaded
list scheme. Each major type, of course, has
many variations (the address map method is
not in general use at present, so no variations
occur). The important special case of the
band matrix is discussed as a separate entity,
because it is not a general case of a sparse
matrix, even though it has wide application.
As stated in the Introduction, one of the
major considerations in selecting a particular
indexing method is the amount of fast core
the method requires, in addition to the data
elements. The indexing in the bit map
method requires a fast core allocation of
approximately 4 % of the full matrix; in the
address map method indexing requires about
25 % of the full matrix. The row-column and
threaded list schemes have no definite core
requirements for indexing, and fast memory
for indexing is directly proportional to the
sparse matrix density. The percentage of the
full matrix required for indexing a row-column scheme is about one times the matrix
density, and about twice the density is required for a threaded list scheme.
Previous discussion indicated that an
exact comparison of execution times must
reflect the type of mathematical manipulation being performed on the sparse matrix.
For example, the bit map method is of particular use when the matrix is used to produce an "optimal" ordering, so the matrix
inverse will not have a greatly increased
density. In contrast, the row-column method
is faster than other methods when manipulations involve one row (column) acting on
other rows (columns).
The second important aspect of indexing
scheme selection is the conservation of
execution time. If arithmetic operations are
to be performed on the data, primary consideration should first be given to a row-column method; if Boolean arithmetic or
reordering algorithms are to be performed,
the bit map scheme should be considered
first; and if a great number of data elements
are to be reordered, created, or annihilated,
a threaded list scheme deserves first consideration.
The bit map scheme has a definite core
allocation for indexing, offers a reasonable
row access time, is quite fast in execution
time when row operations are performed, is
core efficient when the matrix density is
greater than 4 %, and allows very fast manipulation of logical (Boolean) operations.
Logical operations can be conveniently used
to determine when arithmetic operations are
to be executed.
As to its disadvantages: the bit map
scheme has extremely poor column access
time when elements are ordered by rows,
which in most cases requires transposing
the bit map and reordering the data elements; it makes poor use of parallel processing, requires considerable time to reorder
data elements, and is not core efficient when
matrix density falls below 4 %.
The address map proves advantageous
when character addressing is available,
makes very efficient use of parallel processors, provides ready access to any element,
does not require an extensive amount of execution time (in comparison to the bit map
scheme) to reorder data elements, and exhibits a reasonable row and column execution time.
The primary disadvantages of the address
map method are: a large fast core requirement for indexing; and the relatively large
execution time, in comparison with the
threaded list scheme, to reorder matrix
elements.
Both bit and address maps require significant execution times to transpose the matrix: the map must be transposed, and all
the data elements must be reordered. Execution time to transpose the matrix is
directly proportional to the order of the
matrix and the matrix density.
Primary advantages of the row-column
schemes are: a very fast row access time in
comparison with the bit and address maps;
a relatively fast column access time in comparison to all other methods; conservation of
core with matrices of less than 4% density
when compared to the bit map method; an
increase in efficiency as the order of the
matrix increases, as more complex variations
become more efficient; and faster reordering
than the bit map or address map methods.
The main disadvantages of the row-column scheme are that column access time
and the time required to reorder elements
greatly increase as the matrix order and/or
matrix density increases.
The threaded list technique is the sole
technique that allows a simple and fast executing method of reordering, adding, or
annihilating data elements.
The threaded list scheme exhibits a
variety of disadvantages, the primary ones
being a large core requirement for indexing
in comparison with the row-column method,
a slow access time for rows when elements
are stored by rows, and an even slower access
time for columns compared with the row-column method. The inclusion of orthogonal
links, as discussed by Knuth [35], removes
some of the column access difficulties, but
only at the price of additional storage.
For the special case of band matrices, a
scheme similar to the one described in Part
VI should be used unless either half or more
of the elements within the band width are
null, or the nature of the mathematical
operations to be performed dictates otherwise (as described in Part VI). If the band
matrix scheme cannot be utilized, the user
must decide which characteristics of the
other types of indexing are considered vital
to the solution, and select a method on this
basis.
A final major aspect of indexing the user
must consider concerns the adaptability and
flexibility of programming the selected
scheme, which depends upon the factors
enumerated in the Introduction. The following suggestions and comments concerning
programming flexibility and adaptability
are offered.
None of the major types of indexing
schemes requires double indexing. Double
indexing involves using one register (adder)
to index across the row, and another register
to index down the column. Double indices
have at least three drawbacks: they require
more time than single indices; the computer
may have a built-in limit on the number of
characters or words that can be indexed by
one or both of the registers before a new
index (base) register must be designated; and
registers are at a premium, because of the
extremely fast register to register operation
time, and should be used for more vital arithmetic. In the last analysis, the increased
time involved in double indexing is the
critical factor.
In general, the larger the order of the
matrix, the lower the matrix density. Because of this the row-column method is
preferred for matrices with orders of 1000 or
more, especially when arithmetic manipulations or operations are to be performed.
As the order of the matrix increases, it
becomes more efficient to employ more complex variations of the major types. For instance, the delta indexing scheme (as described in Part IV) conserves a considerable
amount of fast core compared with the
simpler row-column schemes, without a
great increase in execution time, when the
order approaches 1000.
If the matrix requires more fast core than
is available, the user must decide either to
segment the matrix between fast and slow
core, or to reduce the complexity of the
problem. If the problem can be simplified,
or the matrix condensed or partitioned
(blocked), then it is not necessary to segment the matrix between fast and slow core.
Simplifying the matrix involves the real
consideration of whether or not it is economically feasible to reorder rows and/or
columns to produce a new matrix that can
be more efficiently processed. Many schemes
have been developed [7, 16, 18, 27] to attempt such an optimal ordering of matrix
elements. Condensing the matrix involves
the elimination of data elements that produce insignificant or negligible change in the
results. Such condensing can often be done
with reasonable competence by somebody
skilled in the nature of the problem to be
solved. If the matrix is of block-diagonal
form, each block can be processed as a
separate entity to produce a composite result.
The availability of a virtual memory
•
125
processor might lead the user to the erroneous conclusion that the benefits of a
proper indexing algorithm are negated. This
is not so; at some time during the processing
of a sparse matrix the matrix must reside in
physical memory. It then follows that the
fewer the number of pages occupied by the
sparse matrix, the fewer the page faults
generated, and therefore the less time involved in moving the matrix to and from
peripheral paging devices. In other words,
the same benefits accruing from indexing in
an ordinary processor apply in a virtual
memory processor.
When updating of data files is anticipated, the user should designate buffer
storage. When new matrix elements are
introduced, they should be stored in the
buffer area. When a considerable number of
corrections to the data elements exist (about
5%), then the matrix is reordered. The
threaded list scheme requires no separate
buffer area, as a buffer is inherent in the indexing scheme.
The segments of coding that contain the
actual indexing algorithm should be programmed in a low-level language, such as
assembly language, to conserve execution
time. High-level languages, such as FORTRAN,
utilize a compiler, which may not produce
the most efficient coding. For instance, if a
division by 32,768 is necessary, the high-level
language may simply create a division by
32,768 in assembly language. If the high-level compiler, however, recognized that a
division by 32,768 is identical to shifting an
accumulator right 15 bits, the assembly
language version would be a shift right
logical or shift right double logical. The first
version would require significantly more
execution time than the more efficient assembly language program version. A considerable savings is realized when the computation is performed perhaps as many as
several million times in a program.
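The compiler example can be made concrete. In the sketch below (Python, illustrative), dividing a nonnegative integer by 32,768 = 2**15 is exactly a right shift by 15 bits, which is the substitution a good compiler would make:

```python
# Shift-for-divide substitution: valid for nonnegative integers and
# power-of-two divisors.
def div_by_32768(x):
    return x >> 15        # equivalent to x // 32768 for x >= 0
```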
The user should avoid making the indexing algorithm in a subroutine form,
especially in a high-level language, because
of the added linkage time during program
execution.
While a "fast" algorithm for indexing into
arbitrarily sparse matrices would allow very
efficient core storage allocation and execution
times for matrix manipulations, it is also
evident that no such single algorithm exists,
at least at present. The advent of array
processors and pipeline computers may
eliminate the desire to handle sparse matrices in any special manner whatsoever.
However, it also appears that no matter how
large, or how fast and sophisticated, computing machines become, users will continue to strive for core storage conservation
and faster execution times. It remains to be
seen if sufficiently sophisticated indexing
algorithms will be developed to accomplish
those goals in array or pipeline machines;
or whether such machines will come into
general use and provide an environment
conducive to developing sparse matrix
indexing schemes.
For the present, the choice of an indexing
algorithm depends upon many considerations, with each major type of indexing
discussed here having particular advantages
and disadvantages. Careful selection of an
algorithm can satisfactorily achieve the
goals of conservation of core memory and
execution time. In addition, whenever there
exists some pattern to the nonzero entries,
the possibility of reorganizing the calculations as a means to handle some sparse
matrices should be carefully considered.
APPENDIX

ALGORITHM 1: BIT MAP SCHEME

Statement                               Meaning
01 ROW <- i                             i is the row number that will be manipulated
02 RINDEX <- v(i)                       v is the row index vector
03 BITS <- b                            b = number of bits/word
04 ROW <- ROW - 1                       (i - 1)
05 COLS <- J                            J is the number of columns in the matrix
06 ROW <- ROW * COLS                    (i - 1) * J
07 SAVE <- ROW                          Save (i - 1) * J
08 ROW <- ROW + BITS - 1                (((i - 1) * J) + b - 1)
09 ROW <- ROW / BITS                    S(i) = (((i - 1) * J) + b - 1)/b; this word contains the first bit of the required row
10 ROWEND <- COLS                       End of row counter (J)
11 START <- ROW                         Starting word of the row
12 ROWEND <- ROWEND AND MASK            Determine correct number of displacement bits; MASK = mask for maximum displacement bits
13 START <- START * 2 ** SAVE           Shift to eliminate incorrect bits (from previous row)
14 ROWEND <- ROWEND - SAVE              Correct for eliminated bits
15 GO TO ROWSCAN                        Branch to code to scan row in bit map for 1 bits
16 COUNT   ROWEND <- BITS               Following code scans one entire row of a bit map. After the first word of the row is scanned, the bit counter (ROWEND) = b
17 ROW <- ROW + 1                       Increment bit map word address by one
18 WORD <- bit word from map            Word of bit map
19 ROWSCAN WORDBIT <- bit from WORD     Pick up high order bit from bit word (WORD)
20 COLNUM <- COLNUM + 1                 Increment column number
21 WORDBIT = 1?                         Is the bit non-zero? (Following statements are branch controls)
22 IF YES, GO TO MATH                   Yes, an element exists
23 ENDROW  COLNUM = COLS?               Is the column counter equal to the row counter?
24 IF YES, GO TO END1                   Yes, end of row
25 COLNUM = ROWEND?                     Have we shifted completely through the bit map word?
26 IF YES, GO TO COUNT                  Yes, fetch another word
27 GO TO ROWSCAN                        No, scan next bit in word
28 MATH    RINDEX <- RINDEX + 1         RINDEX = address of nonzero element; COLNUM = column number of non-zero element. Perform required operation on element
29 GO TO ENDROW                         Return
30 END1    STOP                         End of operation on the row
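For readers who prefer a compact modern rendering, the following Python sketch performs the same job as Algorithm 1: given a bit map packed into b-bit words, it scans row i and reports the column numbers of the nonzero elements. Names and layout are illustrative, not the paper's register usage.

```python
# Row scan of a bit map stored as a flat list of b-bit integer words,
# bits packed row-major, high-order bit first.
def scan_row(bitmap_words, i, J, b):
    cols = []
    start_bit = i * J                         # first bit of the required row
    for colnum in range(J):
        bit = start_bit + colnum
        word = bitmap_words[bit // b]         # word containing this bit
        if (word >> (b - 1 - bit % b)) & 1:   # test the bit
            cols.append(colnum)
    return cols
```

Unlike the word-aligned sketch in Algorithm 1, this version also handles rows that begin mid-word, which is what the SAVE and MASK bookkeeping accomplishes there.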
ALGORITHM 2: ADDRESS MAP SCHEME

Statement                               Meaning
01 ROW <- i                             i = row
02 RINDEX <- v(i)                       v = row index vector
03 ROW <- ROW - 1                       (i - 1)
04 COLS <- J                            J = # columns
05 ROW <- ROW * COLS                    J * (i - 1)
06 ROW <- ROW - 1                       (J * (i - 1)) - 1
07 START   ROW <- ROW + 1               Increment across row
08 COLNUM <- COLNUM + 1                 Increment column #
09 COLNUM > COLS?                       End of row?
10 IF YES, GO TO ENDROW                 Yes, done
11 BYTE <- byte from address map        Pick up partial word
12 BYTE = 0?                            Is byte zero?
13 IF YES, GO TO START                  Reenter scan process
14 CHECK <- 0                           Zero work area
15 CHECK <- BYTE                        Byte to work area
16 MATH    CHECK <- CHECK + RINDEX      Points to non-zero element; required operations performed here
17 GO TO START                          Reenter scan process
18 ENDROW  STOP                         Finish
19 END
ALGORITHM 3: ADDRESS MAP SCHEME

Statement                               Meaning
01 BEGIN <- address of address map      Pointer
02 BEGIN <- BEGIN + J                   J = column #
03 BEGIN <- BEGIN - 1
04 COLS <- J                            J = # columns
05 ROWS <- I                            I = # rows
06 BEGIN <- BEGIN - COLS
07 START   BEGIN <- BEGIN + COLS        Increment address
08 RINDEX <- v(I)                       Row index vector
09 ROWCTR <- ROWCTR + 1                 Increment row counter
10 ROWCTR > ROWS?                       Passed end of matrix?
11 IF YES, GO TO ENDROW                 Yes, passed end
12 BYTE <- byte from address map        Pick up partial word
13 BYTE = 0?                            Is byte zero?
14 IF YES, GO TO START                  Reenter scan process
15 CHECK <- 0                           Zero work area
16 CHECK <- BYTE                        Byte to work area
17 MATH    CHECK <- CHECK + RINDEX      Points to non-zero element; required operations performed here
18 GO TO START                          Reenter scan process
19 ENDROW  STOP                         Finish
20 END
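A Python sketch in the spirit of Algorithms 2 and 3 (the byte layout here is an assumption, not the paper's): the address map holds one byte per matrix position, zero marking the null element and a nonzero byte locating the packed data element; a row scan walks consecutive bytes, while a column scan steps by the row length J.

```python
# Address-map scan sketch. Here a nonzero byte is a 1-based index into the
# packed data vector; the paper instead adds the byte to a base address.
def entries(amap, start, step, count, data):
    out = []
    for k in range(count):
        byte = amap[start + k * step]
        if byte != 0:                         # zero byte marks the null element
            out.append((k, data[byte - 1]))
    return out

def row_entries(amap, J, i, data):
    return entries(amap, i * J, 1, J, data)   # Algorithm 2: step across a row

def col_entries(amap, J, I, j, data):
    return entries(amap, j, J, I, data)       # Algorithm 3: step down a column
```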
FIG. A1. Flowchart: Algorithm 1, bit map scheme.
FIG. A2. Flowchart: Algorithm 2, address map scheme.
FIG. A3. Flowchart: Algorithm 3, address map scheme.
BIBLIOGRAPHY
1. Brayton, R., Gustavson, F., and Willoughby, R. "Some results on sparse matrices." RC2332, IBM Watson Research Center (February 1969) 37-46.
2. Larsen, L. "A modified inversion procedure for product form of the inverse-linear programming codes." Comm. ACM 5, 7 (July 1962) 382-383.
3. Livesley, R. "An analysis of large structural systems." Comp. J. 3 (1960) 34-39.
4. McCormick, C. W. "Application of partially banded matrix methods to structural analysis." Sparse Matrix Proceedings, R. Willoughby, Ed., IBM Watson Research Center, RA11707 (March 1969) 155-158.
5. Orchard-Hays, W. Advanced Linear Programming Techniques. McGraw-Hill, New York, 1968, 73-82.
6. Tewarson, R. "On the product form of inverse of sparse matrices." SIAM Review 8 (1966) 336-342.
7. Tewarson, R. "Row column permutation of sparse matrices." Comp. J. 10 (1967/68) 300-305.
8. Brayton, R., Gustavson, F., and Willoughby, R. "Some results on sparse matrices." (Introduction), RC2332, IBM Watson Research Center (February 1969) 1-3.
9. Bashkow, T. "Network analysis." Mathematical Methods for Digital Computers, A. Ralston and H. S. Wilf, Eds., Vol. I, John Wiley and Sons, New York, 1967, 280-290.
10. Tinney, W. F. "Comments on using sparsity techniques for power system problems." Sparse Matrix Proceedings, R. Willoughby, Ed., IBM Watson Research Center, RA11707 (March 1969) 25-34.
11. Palacol, E. L. "The finite element method of structural analysis." Sparse Matrix Proceedings, R. Willoughby, Ed., IBM Watson Research Center, RA11707 (March 1969) 101-105.
12. Ralston, A. "Numerical integration methods for the solution of ordinary differential equations." Mathematical Methods for Digital Computers, A. Ralston and H. S. Wilf, Eds., Vol. I, John Wiley and Sons, New York, 1967, 95-109.
13. Romanelli, M. "Runge-Kutta methods for the solution of ordinary differential equations." Mathematical Methods for Digital Computers, A. Ralston and H. S. Wilf, Eds., Vol. I, John Wiley and Sons, New York, 1967, 110-120.
14. Wachspress, E. "The numerical solution of boundary value problems." Mathematical Methods for Digital Computers, A. Ralston and H. S. Wilf, Eds., Vol. I, John Wiley and Sons, New York, 1967, 121-127.
15. Weil, R., Jr., and Kettler, P. "An algorithm to provide structure for decomposition." Sparse Matrix Proceedings, R. Willoughby, Ed., IBM Watson Research Center, RA11707 (March 1969) 11-24.
16. Gustavson, F., Liniger, W., and Willoughby, R. "Symbolic generation of an optimal Crout algorithm for sparse systems of linear equations." Sparse Matrix Proceedings, R. Willoughby, Ed., IBM Watson Research Center, RA11707 (March 1969) 1-10.
17. Smith, D. M. "Data logistics for matrix inversion." Sparse Matrix Proceedings, R. Willoughby, Ed., IBM Watson Research Center, RA11707 (March 1969) 127-132.
18. Spillers, W. R., and Hickerson, N. "Optimal elimination for sparse symmetric systems as a graph problem." Quar. Appl. Math. 26 (1968) 425-432.
19. Steward, D. V. "On an approach to a technique for the analysis of the structure of large systems of equations." SIAM Rev. 4 (1962) 321-342.
20. Tewarson, R. P. "The Gaussian elimination and sparse systems." Sparse Matrix Proceedings, R. Willoughby, Ed., IBM Watson Research Center, RA11707 (March 1969) 35-42.
21. Givens, W., McCormick, Hoffman, et al. "Panel discussion on new and needed work and open questions." (Chairman: P. Wolfe), Sparse Matrix Proceedings, R. Willoughby, Ed., IBM Watson Research Center, RA11707 (March 1969) 159-180.
22. Wilkes, M. V. "The growth of interest in microprogramming: a literature survey." Comp. Surveys 1, 3 (September 1969) 139-145.
23. Orchard-Hays, W. "MP systems technology for large sparse matrices." Sparse Matrix Proceedings, R. Willoughby, Ed., IBM Watson Research Center, RA11707 (March 1969) 59-64.
24. Chang, A. "Application of sparse matrix methods in electric power system analysis." Sparse Matrix Proceedings, R. Willoughby, Ed., IBM Watson Research Center, RA11707 (March 1969) 113-122.
25. Brayton, R., Gustavson, F., and Willoughby, R. "Some results on sparse matrices." IBM Watson Research Center, RC2332 (February 1969) 21-22.
26. Chartres, B. A., and Geuder, J. C. "Computable error bounds for direct solution of linear equations." J. ACM 14, 1 (January 1967) 63-71.
27. Forsythe, G. E. "Crout with pivoting." Comm. ACM 3 (1960) 507-508.
28. Jennings, A. "A compact storage scheme for the solution of symmetric linear simultaneous equations." Comput. J. 9 (1966/67) 281-285.
29. System/360 Matrix Language (MATLAN), Application Description, IBM H20-0479; Program Description Manual, IBM H20-0564.
30. McNamee, J. M. "Algorithm 408: a sparse matrix package." (Part I), Comm. ACM 14, 4 (April 1971) 265-273.
31. Dulmage, A. L., and Mendelsohn, N. S. "On the inversion of sparse matrices." Math. Comp. 16 (1962) 494-496.
32. Mayoh, B. H. "A graph technique for inverting certain matrices." Math. Comp. 19 (1965) 644-646.
33. Roth, J. P. "An application of algebraic topology: Kron's method of tearing." Quar. Appl. Math. 17 (1959) 1-24.
34. Swift, G. "A comment on matrix inversion by partition." SIAM Rev. 2 (1960) 132-133.
35. Knuth, D. E. The Art of Computer Programming, Vol. I, Addison-Wesley, Reading, Mass., 1968, 299-304, 554-556.
36. Berztiss, A. T. Data Structures: Theory and Practice. Academic Press, New York, 1971, 276-279.
37. Larcombe, M. "A list processing approach to the solution of large sparse sets of matrix equations and the factorization of the overall matrix." In Large Sparse Sets of Linear Equations, J. K. Reid, Ed., Academic Press, London, 1971.
38. Weil, R. L., and Kettler, P. C. "Rearranging matrices to block-angular form for decomposition (and other) algorithms." Management Science 18, 1 (September 1971) 98-108.
39. Gustavson, F. G. "Some basic techniques for solving sparse systems of linear equations." In Sparse Matrices and Their Applications, D. J. Rose and R. A. Willoughby, Eds., Plenum Press, New York, 1972, 41-52.
40. Fike, C. T. PL/I for Scientific Programmers. Prentice-Hall, Englewood Cliffs, N.J., 1970, 108, 180.
41. Willoughby, R. A. "A survey of sparse matrix technology." IBM Watson Research Center, RC3872 (May 1972).
42. Cuthill, E. "Several strategies for reducing the band-width of matrices." In Sparse Matrices and Their Applications, D. J. Rose and R. A. Willoughby, Eds., Plenum Press, New York, 1972, 34-38.
43. Tewarson, R. P. "Computations with sparse matrices." SIAM Rev. 12, 4 (October 1970) 527-543.
44. Petty, J. S. "FORTRAN M: programming package for band matrices and vectors." Aerospace Research Labs., Wright-Patterson AFB, Ohio, ARL-69-0064 (April 1969).
45. Spillers, W. R. "On Diakoptics: Tearing an arbitrary system." Quar. Appl. Math. 23 (1965) 188-190.
46. IBM System/360 Model 65 Functional Characteristics, IBM A22-6884-3, File No. S360-01.