When they are applied to spatial data, prediction accuracy is low because the spatial dependencies among the samples are disregarded. Solving the SAR model in the general case requires a maximum likelihood estimation procedure which, for real-world applications, involves a large number of matrix inversions and determinant computations over very large matrices.
Computing eigenvalues is computationally expensive, so we claim that there are quicker but approximate methods for solving the SAR model.
The term in the dotted box is the one added to linear regression to obtain spatial auto-regression.
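For reference, the model can be written out in the usual SAR notation. This is a sketch that assumes the standard symbols (y for the dependent variable, X for the covariates, W for the neighborhood matrix, ρ for the spatial autoregression parameter, β for the regression coefficients, ε for the error); the slide's own figure is not reproduced here.

```latex
\[
  y \;=\; \underbrace{\rho W y}_{\text{spatial auto-regression term}} \;+\; X\beta \;+\; \varepsilon,
  \qquad \varepsilon \sim N(0, \sigma^{2} I)
\]
```

Setting ρ = 0 removes the extra term and recovers ordinary linear regression.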
There is only one parallel implementation of the SAR model: the one from [Li, 1996], which performs maximum likelihood estimation by eigenvalue computation.
The random number generator can generate very long sequences of normal random numbers with a desired mean and standard deviation. Such algorithms are rarely found.
The shortcut comes from doing two matrix-vector multiplications instead of two matrix-matrix multiplications: D^(-1/2) is stored as a vector instead of a diagonal matrix. W_tilde is symmetric and has the same eigenvalues as W.
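The shortcut can be illustrated with a small NumPy sketch. It assumes that W is the row-normalized matrix D^(-1) C, where C is a symmetric binary adjacency matrix and D holds its row sums, so that W_tilde = D^(-1/2) C D^(-1/2). The function name `symmetrize_neighborhood` and the 4-cell chain are hypothetical and chosen only for illustration.

```python
import numpy as np

def symmetrize_neighborhood(C):
    """Return (W, W_tilde) for a symmetric binary adjacency matrix C.

    W       = D^-1 C            (row-normalized, generally not symmetric)
    W_tilde = D^-1/2 C D^-1/2   (symmetric, similar to W)
    Assumes every cell has at least one neighbor (no zero row sums).
    """
    C = np.asarray(C, dtype=float)
    d = C.sum(axis=1)                  # diagonal of D kept as a vector
    d_inv_sqrt = 1.0 / np.sqrt(d)      # D^(-1/2) kept as a vector
    W = C / d[:, None]                 # divide row i by d_i
    # Scale rows and columns by the vector instead of forming
    # two products with a dense diagonal matrix.
    W_tilde = d_inv_sqrt[:, None] * C * d_inv_sqrt[None, :]
    return W, W_tilde

# Chain of 4 cells: 0-1-2-3 (end cells have one neighbor, inner cells two).
C = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
W, W_tilde = symmetrize_neighborhood(C)

eig_W = np.sort(np.linalg.eigvals(W).real)   # general eigenvalue solver
eig_W_tilde = np.linalg.eigvalsh(W_tilde)    # symmetric solver, ascending order
print(eig_W)
print(eig_W_tilde)
```

Because W_tilde is symmetric, a symmetric eigenvalue solver can be used on it, while the eigenvalues it returns are those of W.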
Please go back to slide #10 for the second term in the log-likelihood function.
Please refer to slide #10 for the function to optimize.
AHPCRC SPATIAL DATA-MINING TUTORIAL on Scalable Parallel Formulations of Spatial Auto-Regression (SAR) Models for Mining Regular Grid Geospatial Data Shashi Shekhar, Barış M. Kazar, David J. Lilja EECS Department @ University of Minnesota Army High Performance Computing Research Center (AHPCRC) Minnesota Supercomputing Institute (MSI) 05.14.2003
There are a number of sequential algorithms for computing the SAR model, most of which are based on maximum likelihood estimation, which solves for the spatial autoregression parameter (ρ) and the regression coefficients (β).
As the problem size gets bigger, the sequential methods become incapable of solving this problem due to
the extensive number of computations and
the large memory requirement.
The new parallel formulation proposed in this study will outperform the previous parallel implementation in terms of:
The logarithm of the maximum likelihood function is called the log-likelihood function.
The ML estimates of the SAR parameters:
The function to optimize:
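The formulas on these slides did not survive extraction. As a hedged reconstruction, the textbook form of the Gaussian SAR log-likelihood, the estimates it yields for a fixed ρ, and the resulting one-dimensional function are sketched below. The notation is an assumption (n samples, λ_i the eigenvalues of W) and may differ from the original slides in constants or scaling.

```latex
\begin{align*}
\ell(\rho, \beta, \sigma^{2})
  &= \ln\lvert I - \rho W \rvert - \frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^{2}
     - \frac{1}{2\sigma^{2}}\,(y - \rho W y - X\beta)^{T}(y - \rho W y - X\beta) \\[4pt]
\hat{\beta}(\rho)
  &= (X^{T}X)^{-1}X^{T}(I - \rho W)\,y \\[4pt]
\hat{\sigma}^{2}(\rho)
  &= \frac{1}{n}\,\bigl\lVert (I - \rho W)\,y - X\hat{\beta}(\rho) \bigr\rVert^{2} \\[4pt]
\hat{\rho}
  &= \arg\min_{\lvert\rho\rvert < 1}
     \left[ \frac{n}{2}\ln\hat{\sigma}^{2}(\rho) - \sum_{i=1}^{n}\ln(1 - \rho\lambda_{i}) \right]
\end{align*}
```

The last line uses the identity ln|I − ρW| = Σ ln(1 − ρλ_i), which is why the eigenvalues of W only need to be computed once.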
System Diagram
A. Pre-processing step: compute the eigenvalues of W from the symmetric eigenvalue-equivalent neighborhood matrix.
B. Golden section search to find the ρ that minimizes the ML function, calculating the ML function at each step.
C. Compute β and σ² given the best estimate of ρ, using least squares.
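The three stages of the system diagram can be sketched sequentially in NumPy. This is a minimal illustration under stated assumptions, not the parallel formulation of the tutorial: W is taken to be the row-normalized form of a symmetric binary adjacency matrix C, the objective is the standard concentrated negative log-likelihood, and that objective is assumed unimodal on the search interval. The names `golden_section_min` and `fit_sar`, the search bounds, and the synthetic 1-D chain data are all hypothetical.

```python
import numpy as np

def golden_section_min(f, a, b, tol=1e-6):
    """Minimize a scalar function f, assumed unimodal on [a, b]."""
    inv_phi = (np.sqrt(5.0) - 1.0) / 2.0
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                      # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                            # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def fit_sar(y, X, C, bounds=(-0.99, 0.99)):
    """Hypothetical helper: ML fit of y = rho*W*y + X*beta + eps, W = D^-1 C."""
    C = np.asarray(C, dtype=float)
    deg = C.sum(axis=1)
    W = C / deg[:, None]
    W_tilde = C / np.sqrt(np.outer(deg, deg))  # symmetric, same eigenvalues as W
    n = y.size

    # Stage A: eigenvalues, computed once in a pre-processing step.
    lam = np.linalg.eigvalsh(W_tilde)
    Wy = W @ y

    def least_squares(rho):
        # beta and sigma^2 for a fixed rho.
        z = y - rho * Wy
        beta, *_ = np.linalg.lstsq(X, z, rcond=None)
        resid = z - X @ beta
        return beta, resid @ resid / n

    def neg_loglik(rho):
        # Concentrated negative log-likelihood, constants dropped.
        _, sigma2 = least_squares(rho)
        return 0.5 * n * np.log(sigma2) - np.sum(np.log(1.0 - rho * lam))

    # Stage B: golden section search over rho.
    rho_hat = golden_section_min(neg_loglik, *bounds)
    # Stage C: least squares at the best rho.
    beta_hat, sigma2_hat = least_squares(rho_hat)
    return rho_hat, beta_hat, sigma2_hat

# Synthetic check on a 1-D chain of 100 cells (the simplest regular grid).
rng = np.random.default_rng(0)
n, rho_true, beta_true = 100, 0.5, np.array([1.0, -2.0])
C = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
W = C / C.sum(axis=1, keepdims=True)
X = rng.normal(size=(n, 2))
eps = 0.01 * rng.normal(size=n)
y = np.linalg.solve(np.eye(n) - rho_true * W, X @ beta_true + eps)

rho_hat, beta_hat, sigma2_hat = fit_sar(y, X, C)
print(rho_hat, beta_hat, sigma2_hat)
```

Each evaluation of the objective reuses the precomputed eigenvalues, so the search itself costs only a least-squares solve and a sum per step; the dense eigenvalue computation in stage A dominates for large grids.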
[ICP03] Baris Kazar, Shashi Shekhar, and David J. Lilja, "Parallel Formulation of Spatial Auto-Regression", submitted to ICPP 2003 [under review]
[IEE02] S. Shekhar, P. Schrater, R. Vatsavai, W. Wu, and S. Chawla, "Spatial Contextual Classification and Prediction Models for Mining Geospatial Data", IEEE Transactions on Multimedia (special issue on Multimedia Databases), 2002