Ellipsoidal Representations about correlations (2011-11, Tsukuba, Kakenhi-Symposium)
Ellipsoidal representations about correlations (Towards general correlation theory)
Toshiyuki Shimono
email@example.com
KAKENHI* Symposium (*Grant-in-Aid for Scientific Research)
University of Tsukuba, 2011-11-8
My profile
• My work is mainly building algorithms on large amounts of data, such as:
  o web access logs
  o newspaper articles
  o POS (Point of Sales) data
  o tags of millions of pictures
  o links among billions of pages
  o psychology test results of a human resources company
  o data produced for use in recommendation engines
  o data produced by an original search engine
• This presentation touches on those above.
Background
1. Paradoxes of real-world data:
  o Even an elaborate regression analysis mostly gives ρ < 0.7. (This holds when the observations are not very accurate; the threshold 0.7 is arbitrary.) -> So how should we deal with such data?
  o Data accuracy seems unimportant for seeing ρ if ρ < 0.7. -> Details are shown later.
2. My tentative answer:
  o Correlations are very important, so we need methods to interpret them.
  o The ellipsoids will give you insights.
3. Then we will:
  o understand the real world, which is dominated by weak correlations.
  o hopefully find new rules and findings across broad areas of science.
Main contents
§1. What is ρ?
  o Shape of ellipse/ellipsoid
  o Mysterious robustness
§2. Geometry of regression
  o Similarity ratio of ellipses/ellipsoids
  o Graduated rulers
  o Linear scalar fields
§1. What is ρ? (ρ : the correlation coefficient)
It was developed by Karl Pearson from a similar but slightly different idea introduced by Francis Galton in the 1880s. (quoted from en.wikipedia.org)
The shapes of correlation ellipses (1)
Each panel of the figure on the left shows a 2-dimensional Gaussian distribution, with ρ changing from -1 to +1 in steps of 0.1. (5000 points are plotted for each.)
The shapes of correlation ellipses (2)
[Formula: the density function of the standardized 2-dim Gaussian distribution, with a note for higher dimensions]
The ellipse is inscribed in the unit square (sides x = ±1 and y = ±1) and touches it at the 4 points (±1,±ρ) and (±ρ,±1).
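For reference, the density of a standardized 2-dimensional Gaussian can be written in its textbook form as follows. The symbols x and y are notation chosen here and may differ from the slide's own.

```latex
f(x,y) \;=\; \frac{1}{2\pi\sqrt{1-\rho^{2}}}\,
  \exp\!\left( -\,\frac{x^{2} - 2\rho\,xy + y^{2}}{2\,(1-\rho^{2})} \right),
\qquad
\text{contour ellipse: } x^{2} - 2\rho\,xy + y^{2} = 1-\rho^{2}.
```

Setting x = 1 in the contour equation gives (y - ρ)² = 0, so the ellipse meets the line x = 1 only at the single point (1, ρ). The other three touching points follow by symmetry.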
The shapes of correlation ellipses (3)
• Displacement and axial rescaling are allowed. (Rotation, or rescaling along any other direction, is prohibited.)
When you draw the ellipses above,
1. draw an ellipse with the height and width of √(1±ρ),
2. rotate it by 45 degrees,
3. apply a parallel shift and axial rescaling.
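The three drawing steps can be sketched in Python as below. This is a minimal sketch, assuming that √(1+ρ) and √(1-ρ) are read as the semi-axes (so that the result touches the square with sides ±1). The function names `correlation_ellipse` and `place` are hypothetical.

```python
import numpy as np

def correlation_ellipse(rho, n=361):
    """Return n points (x, y) on the standardized correlation ellipse."""
    t = np.linspace(0.0, 2.0 * np.pi, n)
    # Step 1: axis-aligned ellipse; semi-axes sqrt(1+rho) and sqrt(1-rho)
    # (assumption: the slide's "height and width" are read as semi-axes).
    u = np.sqrt(1.0 + rho) * np.cos(t)
    v = np.sqrt(1.0 - rho) * np.sin(t)
    # Step 2: rotate by 45 degrees.
    c = s = np.sqrt(0.5)
    x = c * u - s * v
    y = s * u + c * v
    return x, y

def place(x, y, mean=(0.0, 0.0), sd=(1.0, 1.0)):
    """Step 3: parallel shift and axial rescaling only (no further rotation)."""
    return mean[0] + sd[0] * x, mean[1] + sd[1] * y
```

For example, `correlation_ellipse(0.6)` gives a curve whose rightmost point lies near (1, 0.6), and `place(...)` moves it to data with arbitrary means and standard deviations.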
The shapes of correlation ellipses (4) [Baseball example]
The 6 teams of the Central League played 130 games in each of the past 31 years. Each dot corresponds to one team in one year (N = 186 = 6 × 31).
[Figure: four scatter plots]
• x : total score lost (L), y : - rank, ρ = -0.471
• x : total score gained (G), y : - rank, ρ = 0.419
• x : total score gained, y : total score lost, ρ = 0.423
• x : - rank prediction from both G & L, y : - rank, ρ = -0.828
(The prediction is obtained through multiple regression analysis.)
Correlation ellipsoid (higher dimension)
The ρ-matrix herein is
  1    0.3  0.5
  0.3  1    0.7
  0.5  0.7  1
[Figure: the ellipsoid in (x, y, z), touching the cube at ( 1 , 0.3 , 0.5 ), ( 0.3 , 1 , 0.7 ), ( 0.5 , 0.7 , 1 ) and at the opposite points (-1,-0.3,-0.5), (-0.3,-1,-0.7), (-0.5,-0.7,-1)]
For the 3-dim case, the probability ellipsoid touches the unit cube at the 6 points ±( ρi1 , ρi2 , ρi3 ), where i = 1,2,3.
(For k dimensions, the hyper-ellipsoid touches the unit hyper-cube at the 2×k points ±( ρi1 , ρi2 ,.., ρik ), where i = 1,2,..,k.)
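The touching-point claim can be checked numerically for the 3×3 ρ-matrix on this slide. The sketch below assumes the ellipsoid is the set of points p with pᵀR⁻¹p = 1. The variable names are hypothetical.

```python
import numpy as np

# rho-matrix taken from the slide
R = np.array([[1.0, 0.3, 0.5],
              [0.3, 1.0, 0.7],
              [0.5, 0.7, 1.0]])
R_inv = np.linalg.inv(R)

def on_ellipsoid(p):
    """Value of p^T R^{-1} p; equals 1 exactly on the ellipsoid surface."""
    return float(p @ R_inv @ p)

# Candidate touching points: the columns of R (and their negatives).
touch_values = [on_ellipsoid(R[:, i]) for i in range(3)]

# Sample the surface: p = L s with |s| = 1, where R = L L^T (Cholesky).
rng = np.random.default_rng(0)
s = rng.normal(size=(100000, 3))
s /= np.linalg.norm(s, axis=1, keepdims=True)
L = np.linalg.cholesky(R)
pts = s @ L.T

# No coordinate of any surface point should exceed 1 in absolute value.
max_abs = np.abs(pts).max(axis=0)
```

Here `touch_values` should all be 1 up to rounding, and `max_abs` should approach but not exceed 1 on each axis, which is what "touches the unit cube" means.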
The mysterious robustness (1)
ρ[X:Y] and ρ[f(X):g(Y)] seem to differ only slightly from each other
• when f and g are both increasing functions,
• unless X, Y, f(X) or g(Y) contains outlier(s).
(Sampling fluctuations of ρ are much larger than the effect of non-linearity or of the error ε.)
* A function f(・) is increasing iff f(x) ≦ f(y) holds for any x ≦ y.
The mysterious robustness (2)
(x,y) = (u, 0.5*u + 0.707*v) with (u,v) from a uniform square.
• ρ[X:Y] = 0.557
• ρ[X²:Y] = 0.519 (X squared)
• ρ[X:Y²] = 0.536 (Y squared)
• ρ[X:log(Y)] = 0.539 (Y log-transformed)
• ρ[Xrank:Yrank] = 0.537 (X, Y converted to ranks)
• ρ[X(7):Y(7)] = 0.524 (X, Y discretized to 7 levels)
• ρ[X(5):Y(5)] = 0.507 (X, Y discretized to 5 levels)
• The deformations have less effect on ρ.
• Even N=200 ≫ 1 causes bigger fluctuations of ρ.
Even N=200 makes the sample correlations fluctuate rather widely, whereas the X marks from the experiments are rather concentrated.
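An experiment of this kind can be reproduced roughly as follows. This is a sketch under assumptions not stated on the slide: (u, v) is drawn from the square (0,1)×(0,1) so that log(Y) is defined, and the 7-level and 5-level versions use equal-frequency bins. The helper names are hypothetical, and the numbers will not match the slide exactly because the sample differs.

```python
import numpy as np

def corr(a, b):
    """Pearson correlation coefficient of two 1-d arrays."""
    return float(np.corrcoef(a, b)[0, 1])

def ranks(a):
    """Ranks 0..n-1 (ties are not expected for continuous data)."""
    return np.argsort(np.argsort(a)).astype(float)

def levels(a, k):
    """Discretize into k equal-frequency levels (assumed binning rule)."""
    return np.floor(ranks(a) * k / len(a))

def deformation_table(n=200, seed=0):
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 1.0, n)   # assumption: unit square (0,1)^2
    v = rng.uniform(0.0, 1.0, n)
    x, y = u, 0.5 * u + 0.707 * v
    return {
        "X:Y": corr(x, y),
        "X^2:Y": corr(x**2, y),
        "X:Y^2": corr(x, y**2),
        "X:log(Y)": corr(x, np.log(y)),
        "rank": corr(ranks(x), ranks(y)),
        "7 levels": corr(levels(x, 7), levels(y, 7)),
        "5 levels": corr(levels(x, 5), levels(y, 5)),
    }
```

Calling `deformation_table(seed=s)` for several seeds shows both effects at once: within one sample the rows stay close together, while the whole table shifts from seed to seed.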
The mysterious robustness (3)
The sampled ρ are perturbed according to the sample size, N=30 (blue) or N=300 (red). The effect of the deformation by f( ) is smaller.
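The size of the sampling fluctuation can be estimated by simulation. The sketch below assumes a standardized 2-dim Gaussian population, and the function name `sample_rho_sd` is hypothetical. As a rough guide, the large-sample approximation for the standard deviation of the sample ρ in the Gaussian case is (1 - ρ²)/√N.

```python
import numpy as np

def sample_rho_sd(rho, n, reps=2000, seed=0):
    """Standard deviation of the sample correlation over `reps` samples of size n."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    # xy has shape (reps, n, 2): reps independent samples of n points each.
    xy = rng.multivariate_normal([0.0, 0.0], cov, size=(reps, n))
    x, y = xy[..., 0], xy[..., 1]
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    r = (xc * yc).sum(axis=1) / np.sqrt((xc**2).sum(axis=1) * (yc**2).sum(axis=1))
    return float(r.std())
```

Comparing `sample_rho_sd(0.5, 30)` with `sample_rho_sd(0.5, 300)` shows how much wider the spread is for the smaller sample.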
Where does the champion come from?
• The champion of a game is often not the true champion (the one with the highest potential ability).
• If ρ of the game is not close to 1, the true champion cannot be expected to win.
• The winner is approximately ρ times as strong as the true champion. (This holds if the results and abilities form a 2-dim 0-centered Gaussian.)
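The "ρ times as strong" statement can be illustrated by simulation. This is a sketch under the slide's Gaussian assumption, where the expected ability given a result equals ρ times that result. The number of players and games, and the name `champion_ratio`, are hypothetical choices.

```python
import numpy as np

def champion_ratio(rho, players=100, games=5000, seed=0):
    """Mean ability of the winner divided by mean ability of the truly best player."""
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=(games, players))
    noise = rng.normal(size=(games, players))
    # Result and ability are standardized and correlated with coefficient rho.
    result = rho * ability + np.sqrt(1.0 - rho**2) * noise
    rows = np.arange(games)
    winner_ability = ability[rows, result.argmax(axis=1)]  # ability of each game's winner
    best_ability = ability.max(axis=1)                     # ability of each game's true best
    return float(winner_ability.mean() / best_ability.mean())
```

For instance, `champion_ratio(0.6)` should come out near 0.6, up to simulation noise.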
Summary of §1. What is ρ?
• ρ is recognizable as an ellipse.
• The ρ-matrix is recognizable as an ellipsoid.
• ρ seems robust against axial deformations unless outliers exist.
• ρ of a game is suggested by its champions.
§2. Geometry of Regression
The figures herein show the possible region where (x,y,z) = (ρ[Y:Z], ρ[Z:X], ρ[X:Y]) can exist.
Multiple-ρ is the similarity ratio of ellipses
[ Formulation of MRA ] [ Multiple-ρ ]
The multiple-ρ (≦ 1) is the similarity ratio of the ellipses.
(When X・ is k-dimensional, the hyper-ellipsoid is determined by the k×k matrix whose elements are ρ[Xi:Xj], and the inner point is the k-dimensional vector whose elements are ρ[Xi:Y].)
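One way to read the similarity ratio numerically is sketched below, using the standard identity (multiple-ρ)² = rᵀR⁻¹r. The interpretation assumed here is that the outer ellipse is xᵀR⁻¹x = 1 and the inner, similar ellipse is the one passing through the point r. The function name `multiple_rho` is hypothetical, and the input numbers are those of the baseball example.

```python
import numpy as np

def multiple_rho(R, r):
    """Scale factor c such that r lies on the ellipse x^T R^{-1} x = c^2."""
    R = np.asarray(R, dtype=float)
    r = np.asarray(r, dtype=float)
    return float(np.sqrt(r @ np.linalg.solve(R, r)))

# Baseball example: X1 = score gained, X2 = score lost, Y = -rank
R = [[1.0, 0.423],
     [0.423, 1.0]]       # rho[X1:X2]
r = [0.419, -0.471]      # (rho[Y:X1], rho[Y:X2])
value = multiple_rho(R, r)
```

With these rounded inputs, `value` agrees with the multiple-ρ of 0.828 quoted for the baseball example to within rounding.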
Examples : Multiple-ρ from the ellipses
Many interesting phenomena would be systematically explained.
Partial-ρ is read by a ruler in the ellipse
The partial correlation r1 comes from the idea of the correlation between X1 and Y with X2 held fixed.
The red ruler, which is
• parallel to the corresponding axis,
• passing through (r1,r2),
• fully extended inside the ellipse,
• graduated linearly over the range ±1,
reads the partial-ρ. r1 = 0.75 in this case.
r2 is read in the same way by turning the ruler to the vertical direction.
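The ruler reading can be compared with the usual partial correlation formula. The sketch below assumes the ellipse x² - 2ρxy + y² = 1 - ρ² with ρ = ρ[X1:X2], and a horizontal ruler at height r2. Both function names are hypothetical, and the example values come from the baseball data rather than from the figure on this slide.

```python
import numpy as np

def partial_rho_by_ruler(r1, r2, rho12):
    """Read the position of r1 on the horizontal chord at height r2, graduated -1..+1."""
    half = np.sqrt((1.0 - rho12**2) * (1.0 - r2**2))  # half-length of the chord
    center = rho12 * r2                                # midpoint of the chord
    left, right = center - half, center + half
    return float(2.0 * (r1 - left) / (right - left) - 1.0)

def partial_rho_formula(r1, r2, rho12):
    """Textbook partial correlation of Y and X1 with X2 held fixed."""
    return float((r1 - rho12 * r2) / np.sqrt((1.0 - rho12**2) * (1.0 - r2**2)))

ruler = partial_rho_by_ruler(0.419, -0.471, 0.423)
formula = partial_rho_formula(0.419, -0.471, 0.423)
```

The two values coincide, which is the algebraic content of the ruler construction. A partial-ρ of zero corresponds to the point sitting at the midpoint of the ruler.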
Standardized partial regression coefficients
• ai are called the partial regression coefficients.
• Assume X1, X2, Y are standardized.
Make a scalar field inside the ellipse that is
• 1 on the plus-side boundary of the k-th axis,
• 0 on the boundary of the other axis,
• linearly interpolated between the assigned values.
Then ak is read as the value at (r1,r2).
Note:
• Extension to higher dimensions is easy.
• Each facet has a single boundary point.
• This pictorialization may be useful for SEM (Structural Equation Modeling).
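A linear field with these boundary values can be written as f_k(p) = (R⁻¹p)_k, and its value at r is the k-th standardized coefficient. The sketch below assumes that "boundary" means the point where the ellipse touches the corresponding side of the square. The function names are hypothetical, and the inputs are the baseball example values.

```python
import numpy as np

def std_coefficients(R, r):
    """Standardized partial regression coefficients a = R^{-1} r."""
    return np.linalg.solve(np.asarray(R, dtype=float), np.asarray(r, dtype=float))

def scalar_field(R, k):
    """Linear field f_k(p) = (R^{-1} p)_k on the plane of (r1, r2)."""
    R_inv = np.linalg.inv(np.asarray(R, dtype=float))
    return lambda p: float(R_inv[k] @ np.asarray(p, dtype=float))

R = [[1.0, 0.423], [0.423, 1.0]]   # rho[X1:X2]
r = [0.419, -0.471]                # (rho[Y:X1], rho[Y:X2])
a = std_coefficients(R, r)
f0 = scalar_field(R, 0)
```

Here `f0([1.0, 0.423])` is 1 at the touching point on the plus side of the first axis, `f0([0.423, 1.0])` is 0 at the touching point of the other axis, and `f0(r)` reproduces `a[0]`.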
The elliptical depiction for the baseball example
(This page was added after the symposium.)
• Red : the multiple-ρ (0.828)
• Blue : the two partial-ρ
• Magenta : the partial regression coefficients
Each value corresponds to the ratio of the length of the bold part to the whole line segment of the same color.
X1 : annual total score gained, X2 : annual total score lost, Y : zero minus annual ranking
( ρ[Y:X1] , ρ[Y:X2] ) = (0.419, -0.471) is plotted inside the ellipse slanted with ρ[X1:X2] = 0.423.
-> The meaning of the numbers becomes clearer.
Summary and findings of §2 Geometry of regression
• Multiple-ρ is the similarity ratio of two ellipses/ellipsoids.
• Partial-ρ is read by a graduated ruler in the ellipse/ellipsoid.
• Each regression coefficient is given by the scalar field.
So far, the numbers derived from MRA (Multiple Regression Analysis) have often been said to be hard to interpret. But this situation can be changed.
Summary as a whole
[ Main results ] Using the ellipse or hyper-ellipsoid,
• any correlation matrix is wholly pictorialized.
• multiple regression is translated into geometric quotients.
[ Sub results ]
• ρ seems quite robust against axial deformations unless outliers exist.
• (Spherical trigonometry may give you insights.) <- Not covered today.
[ Next steps ]
• treat the parameter/sampling perturbations
• systematize interesting statistical phenomena
• produce further new theories
• give new twists to other research areas
• make useful applications to real-world cases
• organize a new logic system for this ambiguous world.
Refs
1. 岩波数学辞典 (Iwanami Encyclopedic Dictionary of Mathematics), The Mathematical Society of Japan
2. R, http://www.r-project.org/
3. 共分散構造分析 [事例編] (Covariance Structure Analysis [Case Studies])
The author sincerely welcomes any related literature.
Background of this presentation (SKIP)
1. We make judgements from related things in daily or social life, but this real world is noisy and filled with exceptions. e.g. "Do better posture and mental concentration cause better performance?"
2. Real-world data causes paradoxes:
  o Even an elaborate regression analysis mostly gives ρ < 0.7. How should we deal with it?
  o Data accuracy is not important when ρ < 0.7. Details are shown later.
  o Why does subjective sense work in the real world?
3. Geometric interpretations of multiple regression analysis may be useful
  o that wholly take in any correlation matrix,
  o that are geometric, using ellipsoids
to observe and analyze the background phenomena in detail.
4. Then we will understand the weak correlations that dominate our world.
A primitive question (SKIP)
Question: Why (how) is data analysis important?
My answer: It gives you inspiration and updates your understanding of the real world. Knowing the numbers μ, σ, ρ, ranking, VaR* from the phenomena you have met is crucially important for deciding your next action in your daily, social or business life!
* average, standard deviation, correlation coefficient, rank order, Value at Risk
And so, the interpretation of the numbers is necessary. (And I provide you with that of ρ today!)
Main ideas in more detail (SKIP)
Using the ellipse or hyper-ellipsoid,
• 2nd order moments can be completely visualized in a picture.
• the numbers from multiple regression can also be visualized.
1. (Pearson's) Correlation Coefficient
  • basic to statistics (as you know)
  • may change considerably when outliers are present
  • however, changes only slightly under a monotone map
  • depicted as a correlation ellipse
2. Multiple Regression Analysis
  • (Spherical Surface Interpretation)
  • Ellipse Interpretation
Main ideas (SKIP)
1. What is the correlation coefficient after all?
2. Geometric interpretations of Multiple Regression Analysis.
The mysterious robustness (3) (SKIP)
Front figures: x is the original sample correlation; y is the correlation calculated after conversion to 3 levels.
Back figures: sample of 100.
Summary of §1. What is ρ? (REDUNDANT)
• A correlation ρ is recognizable as an ellipse.
• A correlation matrix is also recognizable as an ellipsoid.
• ρ seems robust against axial deformations unless outliers exist.
• You can guess ρ of a game from the champion.
When partial-ρ is zero. (SKIP)
The condition partial-ρ = 0 ⇔
• The inner angle of the spherical triangle is 90 degrees.
• The two hyper-planes cross at 90 degrees at the hyper-axis. The axis corresponds to the fixed variables, and each of the planes contains one of the two variables.
• On the ellipse/ellipsoid, the characteristic point is at the midpoint of the ruler.
Multiple-ρ is the similarity ratio of ellipses (REDUNDANT)
[ Formulation of MRA ] [ Multiple-ρ ]
The multiple-ρ (≦ 1) is the similarity ratio of the ellipses.
(When X・ is k-dimensional, the hyper-ellipsoid is determined by the k×k matrix whose elements are ρ[Xi:Xj], and the inner point is the k-dimensional vector whose elements are ρ[Xi:Y].)
For the case of an arbitrary number of variables, you calculate:
the inverse of the correlation matrix
→ the reciprocal of each of the diagonal elements
→ 1 minus each of them
→ the square root of each
→ each result is the multiple-ρ of the corresponding variable from the remaining variables.
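The inverse-matrix recipe can be written as a short function. This is a sketch in which the function name `multiple_rho_each` is hypothetical, and the 3×3 matrix is assembled here from the three baseball correlations in the order (X1, X2, Y).

```python
import numpy as np

def multiple_rho_each(C):
    """Multiple-rho of each variable from all the remaining variables."""
    C_inv = np.linalg.inv(np.asarray(C, dtype=float))  # inverse of the correlation matrix
    recip = 1.0 / np.diag(C_inv)                       # reciprocal of each diagonal element
    return np.sqrt(1.0 - recip)                        # 1 minus each, then the square root

# Order of variables: X1 (score gained), X2 (score lost), Y (-rank)
C = [[1.0,   0.423,  0.419],
     [0.423, 1.0,   -0.471],
     [0.419, -0.471, 1.0]]
rho_each = multiple_rho_each(C)
```

The last entry, `rho_each[2]`, is the multiple-ρ of Y from X1 and X2. With these rounded inputs it agrees with the quoted 0.828 to within rounding.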
Summary and findings of §2 Geometry of regression (REDUNDANT)
• Multiple-ρ is the similarity ratio of two ellipses/ellipsoids.
• Partial-ρ is read by a graduated ruler in the ellipse/ellipsoid.
• Each regression coefficient is given by the scalar field.
• (Spherical trigonometry)
So far, the numbers derived from MRA have often been said to be hard to interpret. But this situation can be changed.
Introduction
(This page was added after the symposium. This page may need intensive proofreading by the author.)
There is a Japanese word "kaizen", which means improvement. The real world is, however, so ambiguous that it is often hard to know whether a kaizen action would have a positive effect or not. Sometimes your action may have a negative effect or zero effect in an averaged sense, even if you believe your action is a good one. Assume a situation in which you can control a variable to have some effect on the outcome variable (the number of control variables will increase in what follows).
The author's hypothetical proposition is that the correlation coefficient indeed plays an important role. One reason is that when the correlation is positive, your rational action is simply to increase the value of the control variable. And it seems very reasonable that you should select a variable strongly correlated with the output variable.
The problems still existing today are as follows:
- The meaning of the correlation value is not yet well known.
- The meaning of multiple regression analysis is also not yet well known (although, when the correlation is weak, the reasonable choice of analysis is multiple regression analysis or its elaborate derivatives).
The author found that correlation is very robust against any "axial deformations" unless the variables contain outliers. Rather, the sample correlation coefficient is perturbed much more in many cases when N is less than 1000. The author also found the geometrical background of the correlations in multiple regression analysis, which is producing many insights. (Perhaps R. A. Fisher already knew it, but nobody around me knew it.)
(The robustness is not yet well analyzed at this moment; there are only some pieces of analysis and numerical examples. The geometrical background has been analyzed in its basic points, so the author is considering further investigation of parameter perturbations.)