2. The term “Modern Geographical Knowledge” about India can be approached from
the following perspective:
knowledge of geography acquired not by
traditional means but by modern tools, like
geoinformatics, and analyzed by geostatistics.
What does it mean?
all geographical data layers are current, updated, and
scientifically acquired, structured, and arranged for
sophisticated analysis and geovisualization.
Altogether 3 connotations —
‘modern’ → ‘current’ geographical knowledge
‘modern’ → ‘current’ tools of geographical analysis
‘modern’ → ‘current’ tools of geovisualization
3. Whatever the perspective, the issue is
clear:
‘its role in national development’
The hidden agenda comes to the forefront:
‘geography has a role to play in the
national development’
A very interesting and ambitious issue.
Thanks are due to the Organizers. At long
last I feel proud of pursuing geography as
my academic career.
4. Geographical Knowledge → knowledge about
Habitat
Economy
Society
of a place, area, region.
Habitat: Physical Environment
Geology
Hydrology
Climatology
Geomorphology
Pedology
Botany
Zoology
Natural Hazards / Disasters
6. Society
Trend of population growth
Spatial pattern of population
concentration
Education and pattern of literacy
Religion and ethnicity
Age composition
Occupational pattern
Social groups
Residential pattern
Cultural behaviour
Level of urbanisation
7. Development
Economic, Social, and Environmental
Development needs ‘timely and adequate inputs’
in ‘problem areas’ identified in terms of certain
‘economic’, ‘social’, and ‘environmental’ attributes.
Input and Execution are components of
Management Strategy:
these belong to the domain of the Planners.
Geographers: help identify problem
areas, deficit areas, and backward regions using
a ‘current information base’ and ‘modern tools’.
Hence, the Relation between the two.
8. Order out of Chaos
Spatial Order / Regularity → The Spatial Pattern of Elements over
the Earth's Surface:
This can be defined, identified and analysed with a scientific
understanding of geographical knowledge.
In space – time frame, it can be measured, monitored,
mapped and modelled.
It is this that forms the philosophical foundation of the discipline
of Geography.
Naturally, it is the Geographer who discovers this Spatial Order.
Spatial Order → Order-forming Processes → Order-forming
Factors for scientific geographical explanation.
Areal Differentiation
10. Data Acquisition
Physical Database
application of RS / GIS technology
Socio-economic Database
GDM using attribute Data
Mapping
Thematic Data Layers (physical)
Thematic Data Layers (social)
Thematic Data Layers (economic)
Data Integration
using RS / GIS technology adopting
appropriate project design and
management with proper process models.
11. Statistical Techniques: Exploratory Techniques
Analysis of ‘dependence’
Multiple Regression
Analysis of ‘interdependence’
Principal Component Analysis
Factor Analysis
‘Classification’
Discriminant Analysis
Cluster Analysis
(using PC Scores / Factor Scores)
Statistical Packages are now readily available to
Geographers for such applications.
12. Multivariate Techniques / Methods
1. These allow us to consider changes in several properties
simultaneously in order to explore the properties of
dependence, interdependence, and classification.
2. Virtually all geographical events or objects are inherently
multivariate in character.
3. These allow the researcher to handle more variables than
could be assimilated unaided.
4. However, the inherent problem is the conceptualisation and
graphical representation of the data. It is impossible to draw or
imagine the distribution of (say) 12 variables in 12 dimensions.
5. Hence, one of the main functions of such methods is to reduce
the dimensionality of the data to the imaginable and plottable
dimensions (viz., 2D or 3D).
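Point 5 above can be sketched numerically: principal component analysis projects a high-dimensional cloud onto its directions of maximum variance, yielding plottable 2-D scores. This is a minimal sketch assuming NumPy is available; the 43 × 12 dataset is synthetic (two hypothetical hidden factors), not the study's data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for a multivariate dataset: 43 observations of 12
# variables driven by two hidden "order-forming" factors (hypothetical data).
latent = rng.normal(size=(43, 2))
loadings = rng.normal(size=(2, 12))
X = latent @ loadings + 0.1 * rng.normal(size=(43, 12))

# Standardize, then project onto the two directions of maximum variance.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
order = np.argsort(eigvals)[::-1]
scores_2d = Z @ eigvecs[:, order[:2]]   # 12-D cloud reduced to 2-D
explained = eigvals[order[:2]].sum() / eigvals.sum()

print(scores_2d.shape)       # (43, 2) — plottable dimensions
print(round(explained, 2))   # share of total variance retained in 2-D
```

Because the synthetic data really are two-dimensional underneath, the first two components retain most of the variance; with real data the retained share tells you whether a 2-D or 3-D view is adequate.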
13. Significance
1. Most of the problems in geography involve complex and interacting
forces, which are impossible to isolate and study individually.
2. Since lab studies of this kind are not feasible, the complex of
variables needs to be studied as a whole. This is because changes in one
variable may produce changes in other variables at different rates,
directly or indirectly, making it very difficult to isolate pairs of
strongly related variables. The systems approach is therefore the
natural paradigm in geography.
3. In order to understand systems, it is necessary to use multivariate
analysis, as static deterministic principle of one-event-one-factor is
meaningless;
4. Hence, the best course of action is to examine as many facets of a
problem as possible and sort out, a posteriori, the major factors in
order to identify the factors (order-forming) and eventually to
scientifically explain the processes (order-forming).
14. Exploratory techniques suggest, rather than test,
hypotheses.
1. They provide extra information when variables are
correlated with each other.
2. They bring out the structure of the data
scatter in multivariate space.
3. If there is no significant correlation, the
variables can be dealt with separately.
However, these techniques are commonly overlooked, mainly because of:
1. blind adherence to traditional procedures,
2. inadequate knowledge of mathematics and statistics,
3. reluctance to risk data exploration.
Modern Geographers are better equipped with basic knowledge
of mathematics, statistics, and computing.
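The correlation check in point 3 above can be sketched in a few lines, assuming NumPy; the three variables here are hypothetical, built so that two are related and one is not.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical trio of variables: two related, one unrelated.
x1 = rng.normal(size=100)
x2 = 0.8 * x1 + 0.3 * rng.normal(size=100)   # strongly correlated with x1
x3 = rng.normal(size=100)                    # independent of both

R = np.corrcoef(np.vstack([x1, x2, x3]))
print(np.round(R, 2))
# A high |r| between x1 and x2 argues for joint (multivariate) treatment;
# x3, nearly uncorrelated with the rest, can be dealt with separately.
```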
15. Types of Multivariate Analysis

Method | Aim | Objective
Multivariate Generalisations of Univariate Statistics | to make statistical statements about population parameters | to test for equivalence of population means
Multiple Regression | to make statistical statements about the dependency relations | to find and test a best-fit equation relating one dependent variable to any number of independent variables
Principal Components and Factor Analysis | to make statistical statements about the independency relations | to find the directions of maximum variance in the data, to use these to ordinate data in 1, 2, 3 or 4 dimensions, and to interpret them as factors influencing the data
Discriminant Functions | to make statistical statements about the discriminating functions | to find the equation of a line that best separates two or more user-defined (a priori) sub-groups within the dataset and to allocate new data to one or other of the a priori groups on this basis
Similarity Coefficients and Cluster Analysis | to make statistical statements about the level of similarity between pairs of objects | to find the magnitude of similarity between pairs of objects or observations and to use this to produce an empirical classification
16. Example – 1: Analysis of Dependence
Objective: To determine the relationship between a variable
of interest and a set of exploratory variables.
Multivariate / Multiple Linear Regression Model (MLRM)
It involves the specification and identification of the type and nature of
dependence of a single variable upon a set of controlling, predictor or
explanatory variables.
The basic postulate is that the variation in the Dependent Variable is made up
of two parts —
one, deterministically related to the explanatory variables
included in the regression model,
and the other, random, reflecting variables not included in the model
together with the effects of measurement error (random variation).
Hence, the random term (called the disturbance term when the regression
model is applied to a population, and the residual term when applied to a
sample) is often assumed to be normally distributed.
17. Mathematical Foundation
It describes the linear relationship between a random vector variable y and a set of
explanatory variables x1, x2, x3,............, xk. These explanatory variables are sometimes
known as independent variables, predictor variables, or controlling variables.
The general form of the model is given by —
yi = β0 + β1.xi1 + β2.xi2 + β3.xi3 +...... + βk.xik + εi (i = 1, 2, 3, ....., n)
where, εi is the disturbance term associated with the ith observed value of y.
If x0 is a unit vector, the equation can be rewritten as —
yi = β0.xi0 + β1.xi1 + β2.xi2 + β3.xi3 + ..... + βk.xik + εi (i = 1, 2, 3, ....., n), or
yi = ∑βj.xij + εi (j = 0, 1, 2, ....., k), or
y = X.β + ε
where X is the matrix with columns x0, x1, x2, .............., xk.
The β's are the parameters of the model; their least-squares estimates are linear
functions of the yi.
The term β0 is the constant term / intercept (the level of y in the absence of any control
by the x's).
Each remaining βi gives the change in y when the corresponding xi is increased by one
unit, independent of the level of the other x's. These are, therefore, termed partial
regression coefficients.
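The model above can be fitted by ordinary least squares in matrix form, y = Xβ + ε. A minimal sketch assuming NumPy; the data and the β values are synthetic, chosen only to show that the estimation recovers known parameters.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 43, 3                                  # e.g., 43 observations, 3 predictors
X = rng.normal(size=(n, k))                   # explanatory variables x1..xk
beta_true = np.array([0.5, 1.0, -0.7, 0.3])   # assumed β0, β1, β2, β3
y = beta_true[0] + X @ beta_true[1:] + 0.05 * rng.normal(size=n)

# Prepend the unit vector x0 so the intercept β0 is estimated with the slopes.
X0 = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(X0, y, rcond=None)
residuals = y - X0 @ beta_hat                 # the sample residual terms
print(np.round(beta_hat, 2))
```

With an intercept column in X0, the residuals sum to zero by construction, in line with assumption 1 on the next slide.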
18. Since the x's are measured on different scales, the values of the βi's are not directly
comparable. Hence, each βi is standardized by— β(s)i = βi.si/sy,
where, si is the standard deviation of xi and sy the standard deviation of y.
In the model, the population parameters are estimated by the method of least squares,
achieving a satisfactory level of goodness of fit.
In demanding situations, multivariate non-linear regression of different types may also
be fitted and accordingly dependency relations explored.
The best linear unbiased estimates (BLUE) are found provided the following
assumptions are satisfied —
1. the mean of ε is 0, i.e., no important explanatory variable has been omitted,
2. the variance of ε is constant at each level of the x's, i.e., the variance of y is
constant over all the x values (homoscedasticity),
3. the explanatory variables are non-random and are measured error-free,
4. the explanatory variables are not perfectly linearly related,
5. n > k.
6. the values of εi should be independent of each other, i.e., the
variance-covariance matrix of the εi is σ².I (I the identity matrix).
7. if the statistical tests of significance are to be used, the conditional distribution
of y for given x should be approximately normal.
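The standardization rule β(s)i = βi·si/sy can be illustrated directly. A sketch assuming NumPy, with two hypothetical predictors deliberately measured on very different scales:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 43
x1 = rng.normal(scale=10.0, size=n)   # variable on a coarse measurement scale
x2 = rng.normal(scale=0.1, size=n)    # variable on a fine measurement scale
y = 0.05 * x1 + 4.0 * x2 + 0.1 * rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Standardize each slope: β(s)i = βi · si / sy, removing the scale effect.
sy = y.std(ddof=1)
bs = np.array([b[1] * x1.std(ddof=1) / sy,
               b[2] * x2.std(ddof=1) / sy])
print(np.round(b[1:], 3))   # raw slopes: incomparable (scales differ)
print(np.round(bs, 3))      # standardized slopes: directly comparable
```

The raw slopes differ by a factor of roughly eighty purely because of units, while the standardized slopes show the two predictors contributing on a comparable footing.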
19. Descriptive Measures: 43 Sub-basins of the Dulung basin

Parameter | Minimum | Maximum | Mean | Std. Deviation | Variance | Skewness | Kurtosis
HI: Hypsometric integral | 0.154 | 0.630 | 0.370 | 0.130 | 0.017 | 0.132 | -1.058
L/W ratio | 1.207 | 3.260 | 2.057 | 0.534 | 0.285 | 0.553 | -0.164
CR: Circularity ratio | 0.364 | 0.847 | 0.549 | 0.104 | 0.011 | 0.385 | 0.082
ER: Elongation ratio | 0.473 | 0.793 | 0.624 | 0.064 | 0.004 | 0.024 | 0.740
CC: Compactness coefficient | 1.087 | 1.659 | 1.368 | 0.131 | 0.017 | 0.270 | -0.525
FF: Form factor | 0.176 | 0.494 | 0.309 | 0.063 | 0.004 | 0.418 | 0.947
BR: Basin relief (m) | 7.000 | 343.000 | 105.802 | 86.620 | 7503.0 | 1.105 | 0.177
θ: Basin slope (degree) | 0.009 | 0.190 | 0.038 | 0.037 | 0.001 | 2.153 | 5.588
DI: Dissection index | 0.163 | 0.940 | 0.498 | 0.176 | 0.031 | 0.228 | -0.269
RI: Ruggedness index | 0.012 | 0.635 | 0.161 | 0.167 | 0.028 | 1.265 | 0.848
SF: Stream frequency (No./sq km) | 0.139 | 5.893 | 1.563 | 1.301 | 1.693 | 1.136 | 1.241
Dd: Drainage density (km/sq km) | 0.416 | 2.677 | 1.369 | 0.640 | 0.410 | 0.379 | -1.053
DT: Drainage texture | 0.058 | 13.521 | 2.878 | 3.219 | 10.361 | 1.382 | 1.616
21. Model Summary
Correlation Coefficient, r = 0.84
Goodness of Fit, R2 = 0.71
Standard Error of Estimate, SE = 0.076
Durbin – Watson Coefficient = 1.275
ANOVA

Source | Sum of Squares | df | Mean Square | F | Sig.
Regression | 0.509757 | 7 | 0.072822 | 12.5394 | 6.45E-08
Residual | 0.203262 | 35 | 0.005807 | |
Total | 0.713019 | 42 | | |
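The ANOVA figures above hang together arithmetically: R² is the regression sum of squares over the total, and F is the ratio of the two mean squares. A quick check in plain Python:

```python
# Reproducing the ANOVA arithmetic from the table above (pure Python).
ss_reg, df_reg = 0.509757, 7
ss_res, df_res = 0.203262, 35
ss_tot = ss_reg + ss_res       # 0.713019, matching the Total row

ms_reg = ss_reg / df_reg       # mean square, regression
ms_res = ss_res / df_res       # mean square, residual
f_stat = ms_reg / ms_res       # F-ratio
r2 = ss_reg / ss_tot           # goodness of fit, R²
r = r2 ** 0.5                  # correlation coefficient

print(round(f_stat, 4))        # ≈ 12.54, as in the table
print(round(r2, 2))            # 0.71, as in the Model Summary
print(round(r, 3))             # ≈ 0.846; the slide's r = 0.84 reflects
                               # rounding R² to 0.71 before taking the root
```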
22. Regression Parameters

Variable | β (Unstandardized) | Std. Error | βs (Standardized) | t | Significance
(Constant) | -0.13975 | 1.16389 | | -0.12007 | 0.90511
CR | -0.16784 | 0.71854 | -0.13445 | -0.23359 | 0.81666
CC | 0.26056 | 0.56572 | 0.26146 | 0.46057 | 0.64795
FF | 1.00431 | 0.29835 | 0.48252 | 3.36618 | 0.00186
BR | -0.00205 | 0.00055 | -1.35962 | -3.70638 | 0.00072
θ | 0.82220 | 0.53227 | 0.23536 | 1.54469 | 0.13141
DI | 0.25929 | 0.16109 | 0.34968 | 1.60959 | 0.11647
RI | -0.05855 | 0.15242 | -0.07488 | -0.38416 | 0.70318

The multivariate linear regression model is represented by the equation —
HI = -0.13975 - 0.13445 CR + 0.26146 CC + 0.48252 FF - 1.35962 BR + 0.23536 θ + 0.34968 DI - 0.07488 RI
23. Example – 2: Analysis of Interdependence
It is performed via two approaches — principal components
analysis (PCA) and factor analysis (FA).
PCA provides a means of eliminating redundancies from a set of
interrelated variables and the resulting principal components
are uncorrelated.
FA, on the other hand, is a method of investigating the
correlation structure of a multivariate system.
Thus, it is an attempt to find groups of variables (factors)
measuring a single important aspect of the system.
As these factors are not necessarily uncorrelated, a method of
transforming the factors (called rotation) is applied.
This involves a prior hypothesis that the system has a simple
structure and the factors are rotated to fit this as closely as
possible.
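The central claim above, that PCA eliminates redundancy and yields uncorrelated components, can be verified directly. A sketch assuming NumPy, on a deliberately redundant hypothetical system (four noisy copies of one signal):

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical interrelated variables: four noisy copies of one signal,
# i.e., a deliberately redundant multivariate system.
f = rng.normal(size=(200, 1))
X = np.hstack([f + 0.3 * rng.normal(size=(200, 1)) for _ in range(4)])

Z = (X - X.mean(axis=0)) / X.std(axis=0)
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
components = Z @ eigvecs[:, order]     # principal component scores

# PCA removes the redundancy: the components are mutually uncorrelated.
C = np.corrcoef(components, rowvar=False)
off_diag = np.abs(C - np.diag(np.diag(C))).max()
print(round(off_diag, 6))              # 0.0

# Loadings (eigenvectors scaled by √eigenvalue) play the role of Λ in FA;
# here every variable loads heavily on the single dominant component.
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
```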
24. Factor Analysis (FA)
It interprets the structure of the variance-covariance matrix from a collection of
multivariate observations.
As the variables measured may not all be directly comparable, all of them are
converted to standardized form. Hence, the transformed values have zero mean
and unit variance.
In geographic research, it is the most important technique in multivariate problems,
as —
1. ideas summarising the relationships among the components of a system of
interacting variables can be formed,
2. the common characteristics of the variables, which cause their intercorrelation
and explain their differences, can be identified,
3. eigenvalues and eigenvectors can be extracted,
4. structure of the variance - covariance matrix can be efficiently interpreted.
5. the most diagnostic and significant variable(s) in terms of factor loadings in
the multivariate system can be identified and
6. factor score values can be used as a criterion of differentiation between and
among the samples in multivariate space.
25. Mathematical Foundation
In factor analysis the relationship within a set of p variables reflects the correlation of
each of the variables with k mutually uncorrelated underlying factors; the usual
assumption is that, k < p. Variance in the p variables is, therefore, derived from
variance in the k factors; in addition, a contribution is made by unique sources that
independently affect the p original variables. The k underlying factors are the common
factors, while the independent contributions are the unique factors. The FA model is given
by:
X = F.Λ' + ε
where X represents a (n x p) matrix, Λ is the "factor loading matrix" that defines the
co-ordinates of the points representing variables relative to the axes of a
k - dimensional space, i. e., a (p x k) matrix. F is a "factor score matrix", which gives
the co-ordinates of the observations in the k-dimensional space defined by Λ.
The influence of the factors on the individual cases is expressed by the elements of F.
ε represents the "residual matrix" that expresses the effects of the specific factors
affecting the variables together with measurement error. The individual observations
are seen to be the product of two matrices, Λ and F, plus the associated disturbance
or residual terms:
Xij = ∑ fir.λjr + eij (r = 1 to k)
where fir is the score of the ith observation on the rth common factor, k is the
number of specified factors, λjr is the loading of the jth variable on the rth factor
(i.e., the loading on the corresponding component), and eij is the random variation
unique to the observation Xij.
26. The basic assumptions in the factor model are —
1. E (fi) = E(ei) = 0
2. the fi and ei are independent,
3. the elements eij are independent of one another,
4. the variance - covariance matrix of the e's is diagonal and
non - singular,
5. the variance - covariance matrix of the x's has rank k,
6. k < p, and
7. each xi is correlated with at least one other of the x's
FA reduces the dimensionality of a multivariate problem to manageable size.
The extraneous orthogonal axes are eliminated through a variety of
rotational schemes, of which Kaiser's varimax scheme is most popular. In
this, each factor axis is moved to a position such that the projections from each
variable onto the factor axes lie either near the extremities or near the
origin. The method operates by adjusting the factor loadings so they are
either near 1 or near 0.
Thus, for each factor, there will be a few significantly high loadings and many
insignificant loadings. This makes interpretation much easier.
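Kaiser's varimax scheme can be sketched compactly with the standard SVD-based iteration, assuming NumPy. The toy loading matrix is hypothetical: a known simple structure deliberately mixed by a 30-degree rotation, which varimax should approximately undo.

```python
import numpy as np

def varimax(L, tol=1e-6, max_iter=100):
    """Rotate a (p x k) loading matrix so each factor keeps a few large
    loadings and many near-zero ones (Kaiser's varimax criterion)."""
    p, k = L.shape
    R = np.eye(k)              # accumulated orthogonal rotation
    crit = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        # One step of the standard SVD-based varimax iteration.
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / p))
        R = u @ vt
        if s.sum() < crit * (1 + tol):
            break
        crit = s.sum()
    return L @ R, R

# Toy loadings with a known simple structure, deliberately mixed by a
# 30-degree rotation; varimax should approximately undo the mixing.
L0 = np.array([[0.9, 0.0], [0.8, 0.1], [0.1, 0.85], [0.0, 0.9]])
t = np.pi / 6
G = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
L_rot, R = varimax(L0 @ G)
print(np.round(L_rot, 2))
```

Because R is orthogonal, the rotation changes which axis each loading sits on but preserves each variable's communality (its row sum of squared loadings).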
27. The fundamental postulate of FA is given by:
∑ = ΛΦΛ' + Ψ²
where, ∑ is the variance - covariance matrix derived from X.
The off-diagonal elements of ∑ can be reproduced from knowledge of the
factor loading and the correlation between the factors.
The elements of the diagonal of the ∑ are the sum of two variances —
one derived from or attributable to the common factors, and
the other of the residuals.
The first part of the variance of a variable is the communality of the
particular variable and the second is its uniqueness.
If k, the number of common factors, is chosen correctly and if the factor
model holds, the "residual correlations" given by the off-diagonal elements
of the residual matrix should be randomly distributed about a mean of zero,
and the reduced matrix ∑ - Ψ² has exactly k non-zero eigenvalues.
33. Output of Factor Analysis
1. The data comprise 21 socio-economic attributes of 61 GPs of
the Dulung basin.
2. The correlation matrix shows significant relations at the 0.01
level.
3. First four Factors emerged significant, together explaining
77.87% of the total variance.
4. Initially, x12, x5, x4, x9 had high positive loading and x11 high
negative loading on Factor – 1; x13 had high positive loading
and x22, x6 had high negative loading on Factor – 2.
34. After Varimax rotation,
1. x9, x12 and x22 have high positive loading and x11 high
negative loading on Factor – 1;
2. x10, xs, x2 and x3 have high positive loading on Factor – 2;
3. x6 and x7 have high positive loading on Factor – 3;
4. x19 has high positive loading on Factor – 4.
With respect to Factor – 1,
1. very high positive scores emerged in the case of 9 GPs;
2. GPs with positive scores = 35;
3. GPs with negative scores = 26; and
4. GPs with very high negative scores = 9.
With respect to Factor – 2,
1. very high positive scores emerged in the case of 9 GPs;
2. GPs with positive scores = 28;
3. GPs with negative scores = 33; and
4. GPs with very high negative scores = 7.
35. Factor Score – 1 may form the basis of a Numerical
Classification of the GPs in terms of the 21
variables.

Range of Factor Score – 1 | No. of Gram Panchayats | Gram Panchayat ID | Remarks
> 2 | 1 | 61 | Highly Developed
1 to 2 | 8 | 7, 55, 17, 45, 22, 27, 31, 26 | Fairly Developed
0 to 1 | 26 | 2, 34, 23, 25, 11, 10, 32, 12, 13, 53, 38, 18, 35, 48, 19, 37, 16, 20, 15, 52, 28, 5, 58, 36, 4, 14 | Developed
-1 to 0 | 17 | 6, 42, 44, 47, 29, 51, 59, 60, 33, 56, 24, 21, 40, 3, 54, 8 | Backward
-2 to -1 | 6 | 41, 46, 39, 49, 1, 50 | Fairly Backward
< -2 | 3 | 43, 57, 30 | Very Backward
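The classification rule in the table above amounts to simple binning of factor scores. A minimal sketch in plain Python; the boundary handling (e.g., where exactly 0 falls) is assumed, since the table's ranges do not specify open or closed ends, and the sample scores are hypothetical, not the study's actual values.

```python
# Binning Gram Panchayats by Factor Score - 1, following the table's classes.
def classify(score):
    if score > 2:
        return "Highly Developed"
    if score > 1:
        return "Fairly Developed"
    if score > 0:
        return "Developed"
    if score > -1:
        return "Backward"
    if score > -2:
        return "Fairly Backward"
    return "Very Backward"

# Hypothetical scores for illustration (not the study's actual values).
scores = {61: 2.4, 7: 1.3, 34: 0.6, 42: -0.4, 41: -1.5, 43: -2.6}
labels = {gp: classify(s) for gp, s in scores.items()}
print(labels[61])   # Highly Developed
print(labels[43])   # Very Backward
```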