2012-07 IMS-APRM @ Tsukuba

Ellipsoidal Representations Regarding Correlations

    r[X,Y] = Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ) / ( √( Σ_{i=1}^{n} (X_i − X̄)² ) · √( Σ_{i=1}^{n} (Y_i − Ȳ)² ) )
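The definition above can be sketched directly, term by term, in a few lines (a minimal illustration using only the standard library):

```python
# Pearson's r computed straight from the definition: the cross sum of
# deviations divided by the square roots of the two sums of squares.
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx = sum(xs) / n                       # X-bar
    my = sum(ys) / n                       # Y-bar
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# A perfectly linear relation gives r = 1.
assert abs(pearson_r([1, 2, 3], [2, 4, 6]) - 1.0) < 1e-12
```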
Toshiyuki SHIMONO
PhD (Info. Sci. and Tech.), freelance
tshimono@05.alumni.u-tokyo.ac.jp

The author would like to present certain basic theories that may touch upon the very roots of statistics. One is the geometric representation of ρ; the other is the mysterious robustness of ρ. Both are simple, but their implications are very deep, and may even affect epistemology, i.e. how humans sense and judge this ambiguous real world.

Please read "ρ" below as the correlation coefficient, developed since the 1880s (F. Galton, K. Pearson).
Background

    Ŷ = a₁X₁ + ⋯ + a_p X_p + b₀

Despite being fundamental, the outcomes of multiple regression analysis are hard to interpret!!
-> multicollinearity: how does it happen?
-> un-intuitive sign inversion: by choosing the explanatory variables intentionally among many candidate variables, you may freely invert the sign of the regression coefficient of any variable you want to invert.
-> perturbation by finite sampling.

Main Findings
0. A spherical triangle gives a view (the "spherical model"): the cosines of the inner angles give the partial-ρ.
1. The "correlation ellipsoid" directly gives the multiple-ρ, the partial-ρ, and the regression coefficients.
2. The mysterious robustness of ρ, not yet fully developed.

Prospects
• Ellipsoidal representation of correlations and regression by computer software would be useful.
• Influences on statistics and data analysis from these geometric views.
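The sign inversion can be made concrete with the closed form for the standardized two-predictor coefficients (a sketch with hypothetical correlation values, assuming standardized variables):

```python
# With standardized Y, X1, X2 and pairwise correlations
# r1 = r[Y,X1], r2 = r[Y,X2], r12 = r[X1,X2], the regression
# coefficients have the closed form
#   a1* = (r1 - r2*r12) / (1 - r12**2)
# so X1, positively correlated with Y on its own, can receive a
# NEGATIVE coefficient once a suitable X2 is included.
def std_coefs(r1, r2, r12):
    """Standardized coefficients of Y ~ X1 + X2 from pairwise correlations."""
    d = 1 - r12 ** 2
    return (r1 - r2 * r12) / d, (r2 - r1 * r12) / d

r1, r2, r12 = 0.3, 0.5, 0.8      # hypothetical but valid correlation values
a1, a2 = std_coefs(r1, r2, r12)
assert r1 > 0 and a1 < 0         # the sign of X1's coefficient is inverted
```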
§1. An important empirical fact: the mysterious robustness regarding ρ

Quite often, |ρ[X:Y] − ρ[f(X):g(Y)]| < 0.05 if X, Y, f(X), g(Y) have no outliers and f, g are increasing functions. Even strong deformations of this kind cause very small effects on ρ.

This robustness is comparable to the sampling error of ρ with N > 500; recall Fisher's z transformation,

    z = (1/2) · log( (1 + r) / (1 − r) ).

Note: people often judge things from only N = 1, 2, or 10 observations to see relations.

The relation between ρ and Spearman's rank ρ, for (infinitely many pairs of) X and Y forming a 2-dim Gaussian distribution, is shown in red along 0 ≤ ρ ≤ 1: the difference is often as small as less than 0.05. Theoretical development is much more required, especially for higher-dimensional cases.
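A simulation sketch of both claims (illustrative numbers, not from the poster): an increasing deformation of bivariate-Gaussian data barely moves Pearson's r, Spearman's rank ρ stays close to it, and Fisher's z puts the N = 500 sampling error at the same ~0.05 order:

```python
# Bivariate Gaussian sample with correlation rho; then compare
#  - Pearson's r on the raw data,
#  - Pearson's r after a monotone deformation (tanh on X),
#  - Pearson's r on the ranks (= Spearman's rank rho, no ties here).
import math, random

def pearson(xs, ys):
    n = len(xs); mx = sum(xs) / n; my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(v):
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

random.seed(0)
rho, n = 0.6, 50_000
xs, ys = [], []
for _ in range(n):
    x = random.gauss(0, 1)
    xs.append(x)
    ys.append(rho * x + math.sqrt(1 - rho ** 2) * random.gauss(0, 1))

r_raw  = pearson(xs, ys)
r_def  = pearson([math.tanh(x) for x in xs], ys)   # increasing deformation
r_rank = pearson(ranks(xs), ranks(ys))             # Spearman's rank rho
assert abs(r_raw - r_def)  < 0.05
assert abs(r_raw - r_rank) < 0.05

# Fisher's z = (1/2)log((1+r)/(1-r)) = atanh(r); its sampling s.d. is
# about 1/sqrt(N-3), i.e. ~0.045 for N = 500: the same order as above.
z = 0.5 * math.log((1 + r_raw) / (1 - r_raw))
assert abs(z - math.atanh(r_raw)) < 1e-12
```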
[Figure: observations (観察) of X, Y are coded (符号化) into ±1 signs or into ranks; the raw correlation r1 and the correlations r2, r3 after coding are compared. For Gaussian data the ±1-coded correlation classically equals (2/π)·arcsin r1.]
When ρ is not so strong, deformations of the observations may be a very minor issue; "N" (the sample size) is what matters. These facts deeply affect how a human recognizes relationships between/among multiple phenomena.
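The ±1-coding panel can be reproduced by Monte Carlo (a sketch with a hypothetical ρ = 0.6), using the classical result of Sheppard that the correlation of the coded signs of a bivariate Gaussian is (2/π)·arcsin ρ:

```python
# Code each Gaussian pair (x, y) into the pair of signs (+-1, +-1)
# and estimate the correlation of the codes; since each sign has
# mean ~0 and variance 1, that correlation is the mean sign product.
import math, random

random.seed(1)
rho, n = 0.6, 100_000
total = 0
for _ in range(n):
    x = random.gauss(0, 1)
    y = rho * x + math.sqrt(1 - rho ** 2) * random.gauss(0, 1)
    total += 1 if (x > 0) == (y > 0) else -1   # product of the two signs
r_sign = total / n                             # correlation of the +-1 codes

# Sheppard's theorem: the sign correlation is (2/pi)*arcsin(rho).
assert abs(r_sign - (2 / math.pi) * math.asin(rho)) < 0.02
```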
§2. The Correlation Ellipse & Correlation Ellipsoid

[Example: the 6 teams of the Central League played 130 games in each of the past 31 years. Each dot corresponds to one team in one year (N = 186 = 6 × 31); the axes are "ability" and "result". Where does the champion come from? The winner is approximately ρ times as strong as the truly strongest team.]

The two variables are assumed to form a zero-centered 2-dim Gaussian distribution. Note: quite many distributions cannot be distinguished from the Gaussian distribution by the Kolmogorov–Smirnov test unless N > 30 or 100.

▲ The "correlation ellipse" for a given ρ: the ellipse touches the unit square at (±1, ±ρ) and (±ρ, ±1).

▲ The "correlation ellipsoid" for a ρ-matrix: the ellipsoid touches the unit cube at (±1, ±ρ12, ±ρ13), (±ρ12, ±1, ±ρ23), (±ρ13, ±ρ23, ±1). In general, the (hyper-)ellipsoid touches the unit (hyper-)cube at the 2k points ±(ρ_j1, ρ_j2, …, ρ_jk) for j = 1, 2, …, k (where ρ_jj = 1).
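The two-dimensional tangency claim can be checked numerically (a sketch with a hypothetical ρ = 0.7), writing the correlation ellipse as { v : vᵀR⁻¹v = 1 } for R = [[1, ρ], [ρ, 1]]:

```python
# The ellipse v'R^{-1}v = 1 expands to (x^2 - 2*rho*x*y + y^2)/(1-rho^2) = 1.
# Check: (1, rho) lies on it, the tangent there is the facet x = 1, and the
# whole ellipse stays inside the unit square.
import math

rho = 0.7

def quad(x, y):                      # v' R^{-1} v
    return (x * x - 2 * rho * x * y + y * y) / (1 - rho ** 2)

# (1, rho) is on the ellipse ...
assert abs(quad(1.0, rho) - 1.0) < 1e-12
# ... and the gradient there, (2x - 2*rho*y, 2y - 2*rho*x)/(1-rho^2),
# has zero y-component, so the tangent line is the facet x = 1.
grad_y = (2 * rho - 2 * rho * 1.0) / (1 - rho ** 2)
assert abs(grad_y) < 1e-12
# The whole ellipse lies in the unit square: parametrize v = L*u with
# L the Cholesky factor of R and u running over the unit circle.
for t in range(1000):
    u1 = math.cos(t * 2 * math.pi / 1000)
    u2 = math.sin(t * 2 * math.pi / 1000)
    x = u1
    y = rho * u1 + math.sqrt(1 - rho ** 2) * u2
    assert abs(quad(x, y) - 1.0) < 1e-9
    assert abs(x) <= 1.0 + 1e-12 and abs(y) <= 1.0 + 1e-12
```

By symmetry the same check works at (ρ, 1) against the facet y = 1, and the k-dimensional case follows the same way since R⁻¹·(j-th column of R) is the j-th unit vector.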
§3. How are the multiple/partial ρ and the coefficients a_i drawn?

[Trivial example: the correspondence between the shape of a distribution of N ≤ 5 points and its correlation ellipse. Coincidentally, Spearman's rank ρ is 0.1 times an integer when all the variable values are different.]
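The N = 5 remark can be verified exhaustively: with 5 distinct values, ρ_S = 1 − 6·Σd_i²/(n(n²−1)) has denominator 120, and Σd_i² is always even, so ρ_S is a multiple of 0.1.

```python
# Enumerate all 120 rank permutations of (1..5) and confirm that
# Spearman's rank rho is always an integer multiple of 0.1.
from itertools import permutations

n = 5
for perm in permutations(range(1, n + 1)):
    d2 = sum((perm[i] - (i + 1)) ** 2 for i in range(n))   # sum of d_i^2
    rho_s = 1 - 6 * d2 / (n * (n * n - 1))                 # n*(n^2-1) = 120
    assert abs(round(10 * rho_s) - 10 * rho_s) < 1e-9      # 10*rho_s is integral
```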
Assume the correlations ρ between the variables Y, X1, X2 are given.

• The partial ρ (偏相関係数) is read by a graduated ruler: the k-th partial ρ is read on a straight line parallel to the k-th axis of the space.

      r′ = (r1 − r2·r12) / ( √(1 − r2²) · √(1 − r12²) )

• The multiple ρ is the similarity ratio (重相関係数は相似比).

• The standardized partial regression coefficients a_i* ((標準化)偏回帰係数) are read by a linear scalar field inside the hyper-ellipsoid, valued ±1 at the tangential points to the facets x_k = ±1 and valued 0 at the tangential points to the other facets.

      a1* = (r1 − r2·r12) / (1 − r12²)

The sign (plus/minus) can be determined by either of these geometric methods. You may consider how the coefficient a_i changes its value as the number of variables X_i increases. For SEM, how |a_i*| > 1 happens is explained.

ANY REFERENCE?
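The two closed forms above fit together; a sketch with hypothetical correlation values checks that the a_i* solve the normal equations and that the partial ρ relates to a1* by r′ = a1*·√((1 − r12²)/(1 − r2²)):

```python
# Partial correlation and standardized coefficients from the pairwise
# correlations r1 = r[Y,X1], r2 = r[Y,X2], r12 = r[X1,X2].
import math

def partial_rho(r1, r2, r12):
    return (r1 - r2 * r12) / (math.sqrt(1 - r2 ** 2) * math.sqrt(1 - r12 ** 2))

def std_coefs(r1, r2, r12):
    d = 1 - r12 ** 2
    return (r1 - r2 * r12) / d, (r2 - r1 * r12) / d

r1, r2, r12 = 0.6, 0.4, 0.5          # hypothetical but valid correlations
a1, a2 = std_coefs(r1, r2, r12)
# the coefficients satisfy the normal equations R a = r:
assert abs(a1 + r12 * a2 - r1) < 1e-12
assert abs(r12 * a1 + a2 - r2) < 1e-12
# relation between a1* and the partial correlation coefficient:
assert abs(partial_rho(r1, r2, r12)
           - a1 * math.sqrt((1 - r12 ** 2) / (1 - r2 ** 2))) < 1e-12
```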
[Example: Y is a color (色), with r1 = ρ[Y,X1] = 0.5, r2 = ρ[Y,X2] = 0.0, r12 = ρ[X1,X2] = −0.8, giving R = 0.833. The figure marks the points (0.5, 0), (0.8, −1), and (1, −0.8); the ratio of the red arrow to the whole red line section is equal to the multiple correlation coefficient.]

Although X2 and the color Y are not correlated, the determination coefficient of Y from X1 and X2 together is much larger than that of Y from X1 alone: a case where a zero correlation is not useless.

Now you can grasp, with a bird's-eye view, how Y depends on *multiple* variables X_i: how multicollinearity occurs, how un-intuitive sign inversion occurs, etc.

When the correlations between/among the explanatory variables are zero, the multiple correlation coefficient becomes √(r1² + r2²), or in general √(r1² + ⋯ + rp²): the determination coefficients are additive.

The opposite case is when X2 is useless to explain Y because X1 conceals the effect of X2; this happens when r2 = r1·r12, and thus R = r1.

Prospects: How can one distinguish things from multiple features? What is distinguishing in the world? There seem to be several theoretical principles to be developed, e.g. a theoretical framework telling whether an action causes a positive or a negative effect in daily/social reality.

Those above are basically my original works, except the definitions of correlations and regression analysis. The author sincerely welcomes any related literature information, and patrons!

[Photo: the Great Buddha of Nara]
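The three cases above can be re-derived from the closed forms of §3, using R² = a1*·r1 + a2*·r2 (a minimal numeric check):

```python
# Multiple correlation coefficient of Y ~ X1 + X2 from the pairwise
# correlations, via the standardized coefficients.
import math

def std_coefs(r1, r2, r12):
    d = 1 - r12 ** 2
    return (r1 - r2 * r12) / d, (r2 - r1 * r12) / d

def multiple_R(r1, r2, r12):
    a1, a2 = std_coefs(r1, r2, r12)
    return math.sqrt(a1 * r1 + a2 * r2)

# The worked example: r1 = 0.5, r2 = 0.0, r12 = -0.8  ->  R = 0.833...
R = multiple_R(0.5, 0.0, -0.8)
assert abs(R - 0.8333333333) < 1e-6
a1, _ = std_coefs(0.5, 0.0, -0.8)
assert a1 > 1                        # an |a_i*| > 1 case, as noted for SEM

# Additivity: with r12 = 0, R = sqrt(r1^2 + r2^2).
assert abs(multiple_R(0.3, 0.4, 0.0) - 0.5) < 1e-12

# Concealment: with r2 = r1*r12, X2 adds nothing and R = r1.
r1, r12 = 0.5, 0.6
assert abs(multiple_R(r1, r1 * r12, r12) - r1) < 1e-12
```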