
Different kind of distance and Statistical Distance

A brief overview of distance and statistical distance, which is at the core of multivariate analysis. You will find here some simple concepts about distances and statistical distance.


1. WELCOME TO MY PRESENTATION ON STATISTICAL DISTANCE
2. Md. Menhazul Abedin, M.Sc. Student, Dept. of Statistics, Rajshahi University. Mob: 01751385142. Email: menhaz70@gmail.com
3. Objectives • To understand the meaning of statistical distance and its relation to, and difference from, the general (Euclidean) distance.
4. Content • Definition of Euclidean distance • Concept & intuition of statistical distance • Definition of statistical distance • Necessity of statistical distance • Concept of Mahalanobis distance (population & sample) • Distribution of Mahalanobis distance • Mahalanobis distance in R • Acknowledgement
5. Euclidean Distance from the origin
[Figure: point (X, Y) with horizontal leg X and vertical leg Y from the origin (0, 0).]
6. Euclidean Distance
[Figure: right triangle with vertices O(0,0), X, and P(X, Y).]
By Pythagoras, d(O, P) = √(X² + Y²).
7. Euclidean Distance
[Figure: two specific points.]
8. We see two specific points in each picture. Our problem is to determine the distance between the two points. But how? Assume that the pictures are placed in a two-dimensional space and the points are joined by a straight line.
9. Let the 1st point be (x1, y1) and the 2nd point be (x2, y2). Then the distance is D = √((x1 − x2)² + (y1 − y2)²). What happens when the dimension is three?
10. Distance in R³
11. Distance is given by
• For points (x1, x2, x3) and (y1, y2, y3): d = √((x1 − y1)² + (x2 − y2)² + (x3 − y3)²).
12. For p dimensions it can be written as the following expression, named the Euclidean distance: for P = (x1, x2, …, xp) and Q = (y1, y2, …, yp),
d(P, Q) = √((x1 − y1)² + (x2 − y2)² + ⋯ + (xp − yp)²).
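A minimal R sketch of this formula, checked against base R's dist(); the vectors p and q below are made-up examples, not from the slides:
euclid <- function(x, y) sqrt(sum((x - y)^2))   ## Euclidean distance in p dimensions
p <- c(1, 2, 3); q <- c(4, 6, 3)
euclid(p, q)         ## 5
dist(rbind(p, q))    ## same value from base R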
13. Properties of Euclidean Distance and Mathematical Distance
• The usual human concept of distance is Euclidean distance.
• Each coordinate contributes equally to the distance: d(P, Q) = √((x1 − y1)² + ⋯ + (xp − yp)²) for P = (x1, …, xp) and Q = (y1, …, yp).
• Mathematicians, generalizing its three properties, define distance on any set: 1) d(P,Q) = d(Q,P); 2) d(P,Q) = 0 if and only if P = Q; and 3) d(P,Q) ≤ d(P,R) + d(R,Q) for all R.
14. [Figure: points P(x1, y1), Q(x2, y2), and R(z1, z2), illustrating the triangle inequality.]
15. Taxicab Distance: notion
[Figure] Red: Manhattan distance. Green: diagonal, straight-line distance. Blue, yellow: equivalent Manhattan distances.
16. • The Manhattan distance is the simple sum of the horizontal and vertical components, whereas the diagonal distance can be computed by applying the Pythagorean theorem.
17. • Red: Manhattan distance. • Green: diagonal, straight-line distance. • Blue, yellow: equivalent Manhattan distances.
18. • Manhattan distance: 12 units. • Diagonal, straight-line (Euclidean) distance: √(6² + 6²) = 6√2 ≈ 8.49 units. We observe that the Euclidean distance is less than the Manhattan distance.
19. Taxicab/Manhattan distance: definition
[Figure: points (p1, p2) and (q1, q2) with horizontal leg |p1 − q1| and vertical leg |p2 − q2|.]
20. Manhattan Distance
• The taxicab distance between (p1, p2) and (q1, q2) is |p1 − q1| + |p2 − q2|.
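A small R sketch of the taxicab formula next to the Euclidean one; the points reproduce the 6-block example from slide 18:
manhattan <- function(p, q) sum(abs(p - q))   ## taxicab distance
p <- c(0, 0); q <- c(6, 6)
manhattan(p, q)        ## 12 blocks
sqrt(sum((p - q)^2))   ## 6*sqrt(2), about 8.49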
21. Relationship between Manhattan & Euclidean distance
[Figure: routes of 7 blocks and 6 blocks on a street grid.]
22. Relationship between Manhattan & Euclidean distance
• It now seems that the distance from A to C is 7 blocks, while the distance from A to B is 6 blocks.
• Unless we choose to go off-road, B is now closer to A than C.
• Taxicab distance is sometimes equal to Euclidean distance, but otherwise it is greater than Euclidean distance: Euclidean distance ≤ taxicab distance. Is this always true? Does it hold for n dimensions?
23. Proof: for two points P = (x1, x2) and Q = (y1, y2),
d_T² = (|x1 − y1| + |x2 − y2|)² = (x1 − y1)² + (x2 − y2)² + 2|x1 − y1||x2 − y2|.
Absolute values guarantee that the cross term is non-negative.
24. Continued: dropping the non-negative cross term and using the addition property of inequality,
d_T² ≥ (x1 − y1)² + (x2 − y2)² = d_E².
25. Continued: both sides are non-negative, so taking square roots gives d_E ≤ d_T.
26. For high dimensions it also holds:
Σ(xi − yi)² ≤ Σ(xi − yi)² + 2 Σ_{i<j} |xi − yi||xj − yj| = (Σ|xi − yi|)²,
which implies √(Σ(xi − yi)²) ≤ Σ|xi − yi|, i.e. d_E ≤ d_T.
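A quick numeric spot-check of d_E ≤ d_T in higher dimensions (random 5-dimensional points, illustrative only):
set.seed(1)
checks <- replicate(1000, {
  x <- rnorm(5); y <- rnorm(5)
  sqrt(sum((x - y)^2)) <= sum(abs(x - y))   ## d_E <= d_T ?
})
all(checks)   ## TRUE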
27. Statistical Distance
• Weight coordinates subject to a great deal of variability less heavily than those that are not highly variable.
[Figure: two points at the same Euclidean distance from the origin; which is nearer to the data set, if the data set were a point?]
28. • Here the variability along the x1 axis > the variability along the x2 axis.
• Is the same distance from the origin meaningful? Ans: No.
• But how do we take the different variability into account? Ans: Give different weights to the axes.
29. Statistical Distance for Uncorrelated Data
Standardization (the weights): x1* = x1/√s11, x2* = x2/√s22.
For P = (x1, x2) and O = (0, 0): d(O, P) = √(x1*² + x2*²) = √(x1²/s11 + x2²/s22).
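A minimal R sketch of this weighted distance, with assumed sample variances s11 = 4 and s22 = 1 (hypothetical values): two points at the same Euclidean distance from the origin get different statistical distances.
stat_dist <- function(x, s) sqrt(sum(x^2 / s))   ## statistical distance, uncorrelated case
s <- c(4, 1)            ## s11 > s22
stat_dist(c(2, 0), s)   ## 1: lies along the high-variance axis
stat_dist(c(0, 2), s)   ## 2: same Euclidean distance, farther statistically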
30. All points with coordinates (x1, x2) at a constant squared distance c² from the origin must satisfy x1²/s11 + x2²/s22 = c². But how do we choose c? Choose c so that 95% of the observations fall inside this area. Note that s11 > s22 ⇒ 1/s11 < 1/s22.
31. Ellipse of Constant Statistical Distance for Uncorrelated Data
[Figure: ellipse centered at 0 with half-lengths c√s11 along x1 and c√s22 along x2.]
32. • This expression can be generalized to the statistical distance from an arbitrary point P = (x1, x2) to any fixed point Q = (y1, y2): d(P, Q) = √((x1 − y1)²/s11 + (x2 − y2)²/s22). For p dimensions: d(P, Q) = √((x1 − y1)²/s11 + (x2 − y2)²/s22 + ⋯ + (xp − yp)²/spp).
33. Remark:
1) The distance from P to the origin O is obtained by setting all yi = 0.
2) If all the sii are equal, the Euclidean distance formula is appropriate.
34. Scatter Plot for Correlated Measurements
35. • How do you measure the statistical distance of the above data set?
• Ans: First make it uncorrelated.
• But why, and how?
• Ans: Rotate the axes, keeping the origin fixed.
36. [Figure: scatter plot for correlated measurements, shown with rotated axes.]
37. Rotation of axes keeping the origin fixed
[Figure: point P(x1, x2) with original axes x1, x2 and rotated axes x̃1, x̃2 at angle θ; auxiliary points O, M, N, Q, R.]
38. x1 = OM = OR − MR = x̃1 cos θ − x̃2 sin θ … (i)
x2 = MP = QR + NP = x̃1 sin θ + x̃2 cos θ … (ii)
39. • Solving equations (i) and (ii) gives the rotated coordinates in terms of the original ones:
x̃1 = x1 cos θ + x2 sin θ, x̃2 = −x1 sin θ + x2 cos θ.
40. Choice of θ
What θ will you choose? How will you do it?
Data matrix → centered data matrix → covariance matrix of the data → eigenvectors.
θ = the angle between the 1st eigenvector and [1,0], or the angle between the 2nd eigenvector and [0,1].
41. Why is θ the angle between the 1st eigenvector and [1,0], or between the 2nd eigenvector and [0,1]?
Ans: Let B be a (p × p) positive definite matrix with eigenvalues λ1 ≥ λ2 ≥ ⋯ ≥ λp > 0 and associated normalized eigenvectors e1, e2, …, ep. Then
max_{x≠0} x′Bx/x′x = λ1, attained when x = e1;
min_{x≠0} x′Bx/x′x = λp, attained when x = ep.
42. max_{x ⊥ e1, …, ek} x′Bx/x′x = λ_{k+1}, attained when x = e_{k+1}, for k = 1, 2, …, p − 1.
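A numeric illustration of this maximization lemma (B below is an arbitrary made-up positive definite matrix, not from the slides):
set.seed(1)
B <- crossprod(matrix(rnorm(9), 3, 3))   ## t(C) %*% C is symmetric positive definite
ev <- eigen(B)
rayleigh <- function(x) c(t(x) %*% B %*% x)/sum(x^2)   ## x'Bx / x'x
rayleigh(ev$vectors[,1])   ## equals ev$values[1], the largest eigenvalue
rayleigh(ev$vectors[,3])   ## equals ev$values[3], the smallest
max(replicate(10000, rayleigh(rnorm(3)))) <= ev$values[1]   ## TRUE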
43. Choice of θ
#### Exercise 16, page 309: heights in inches (x) & weights in pounds (y). An Introduction to Statistics and Probability, M. Nurul Islam ####
x=c(60,60,60,60,62,62,62,64,64,64,66,66,66,66,68,68,68,70,70,70);x
y=c(115,120,130,125,130,140,120,135,130,145,135,170,140,155,150,160,175,180,160,175);y
############
V=eigen(cov(cdata))$vectors;V   ## cdata (the mean-centred data) is built on the next slide
as.matrix(cdata)%*%V
plot(x,y)
44. data=data.frame(x,y);data
as.matrix(data)
colMeans(data)
xmv=c(rep(64.8,20));xmv   ### x mean vector
ymv=c(rep(144.5,20));ymv   ### y mean vector
meanmatrix=cbind(xmv,ymv);meanmatrix
cdata=data-meanmatrix;cdata   ### mean-centred data
plot(cdata)
abline(h=0,v=0)
cor(cdata)
45. ##################
cov(cdata)
eigen(cov(cdata))
xx1=c(1,0);xx1
xx2=c(0,1);xx2
vv1=eigen(cov(cdata))$vectors[,1];vv1   ## first eigenvector
vv2=eigen(cov(cdata))$vectors[,2];vv2   ## second eigenvector
46. ################
theta=acos(sum(xx1*vv1)/(sqrt(sum(xx1*xx1))*sqrt(sum(vv1*vv1))));theta   ## angle between [1,0] and vv1
theta=acos(sum(xx2*vv2)/(sqrt(sum(xx2*xx2))*sqrt(sum(vv2*vv2))));theta   ## angle between [0,1] and vv2
###############
xx=cdata[,1]*cos(1.41784)+cdata[,2]*sin(1.41784);xx
yy=-cdata[,1]*sin(1.41784)+cdata[,2]*cos(1.41784);yy
plot(xx,yy)
abline(h=0,v=0)
47. V=eigen(cov(cdata))$vectors;V
tdata=as.matrix(cdata)%*%V;tdata   ### transformed data
cov(tdata)
round(cov(tdata),14)
cor(tdata)
plot(tdata)
abline(h=0,v=0)
round(cor(tdata),16)
48. ################ comparison of both methods ############
comparison=tdata - as.matrix(cbind(xx,yy));comparison
round(comparison,4)
49. ########### using the package: md from the original data #####
md=mahalanobis(data,colMeans(data),cov(data),inverted=F);md   ## md = Mahalanobis distance
######## Mahalanobis distance from the transformed data ########
tmd=mahalanobis(tdata,colMeans(tdata),cov(tdata),inverted=F);tmd
###### comparison ############
md-tmd
50. Mahalanobis distance: manually
mu=colMeans(tdata);mu
incov=solve(cov(tdata));incov
md1=t(tdata[1,]-mu)%*%incov%*%(tdata[1,]-mu);md1
md2=t(tdata[2,]-mu)%*%incov%*%(tdata[2,]-mu);md2
md3=t(tdata[3,]-mu)%*%incov%*%(tdata[3,]-mu);md3
…
md20=t(tdata[20,]-mu)%*%incov%*%(tdata[20,]-mu);md20
The md values from the package and from the manual calculation are equal.
51. tdata
s1=sd(tdata[,1]);s1
s2=sd(tdata[,2]);s2
xstar=c(tdata[,1])/s1;xstar
ystar=c(tdata[,2])/s2;ystar
md1=sqrt((-1.46787309)^2 + (0.1484462)^2);md1
md2=sqrt((-1.22516896)^2 + (0.6020111)^2);md2
…
These are not equal to the distances above. Why? The mean must be taken into account.
52. Statistical Distance under a Rotated Coordinate System
x̃1 = x1 cos θ + x2 sin θ, x̃2 = −x1 sin θ + x2 cos θ
d(O, P) = √(x̃1²/s̃11 + x̃2²/s̃22) = √(a11x1² + 2a12x1x2 + a22x2²),
where s̃11 and s̃22 are the sample variances of the rotated coordinates x̃1 and x̃2.
53. • After some manipulation this can be written in terms of the original variables:
d(O, P) = √(a11x1² + 2a12x1x2 + a22x2²), where
a11 = cos²θ/s̃11 + sin²θ/s̃22, a22 = sin²θ/s̃11 + cos²θ/s̃22, a12 = cos θ sin θ (1/s̃11 − 1/s̃22).
54. Proof:
s̃11 = (1/(n−1)) Σ(x̃1 − mean(x̃1))² = (1/(n−1)) Σ(x1 cos θ + x2 sin θ − x̄1 cos θ − x̄2 sin θ)² = cos²θ·s11 + 2 sin θ cos θ·s12 + sin²θ·s22
s̃22 = (1/(n−1)) Σ(x̃2 − mean(x̃2))² = (1/(n−1)) Σ(−x1 sin θ + x2 cos θ + x̄1 sin θ − x̄2 cos θ)² = cos²θ·s22 − 2 sin θ cos θ·s12 + sin²θ·s11
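A numeric check of this identity in R (it holds for any θ, since var(aX + bY) = a²·var(X) + 2ab·cov(X,Y) + b²·var(Y); the data below are made up):
set.seed(1)
x1 <- rnorm(200); x2 <- 0.6*x1 + rnorm(200, sd = 0.5)   ## correlated toy data
S <- cov(cbind(x1, x2))                                  ## s11, s12, s22
theta <- pi/6                                            ## an arbitrary angle
xt1 <- x1*cos(theta) + x2*sin(theta)                     ## rotated first coordinate
var(xt1)                                                 ## sample variance of the rotated coordinate
cos(theta)^2*S[1,1] + 2*sin(theta)*cos(theta)*S[1,2] + sin(theta)^2*S[2,2]   ## identical value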
55. Continued:
d(O, P) = √((x1 cos θ + x2 sin θ)²/s̃11 + (−x1 sin θ + x2 cos θ)²/s̃22)
56. Continued: expanding the squares and collecting the x1², x1x2, and x2² terms yields the coefficients a11, a12, and a22 given on slide 53.
57. General Statistical Distance
For P = (x1, x2, …, xp), O = (0, 0, …, 0), and Q = (y1, y2, …, yp):
d(O, P) = √(a11x1² + a22x2² + ⋯ + appxp² + 2a12x1x2 + 2a13x1x3 + ⋯ + 2a_{p−1,p}x_{p−1}xp)
d(P, Q) = √(a11(x1 − y1)² + a22(x2 − y2)² + ⋯ + app(xp − yp)² + 2a12(x1 − y1)(x2 − y2) + 2a13(x1 − y1)(x3 − y3) + ⋯ + 2a_{p−1,p}(x_{p−1} − y_{p−1})(xp − yp))
58. • The above distances are completely determined by the coefficients (weights) a_ik, i, k = 1, 2, …, p. These can be arranged in a square array, and this array (matrix) must be symmetric and positive definite.
59. Why positive definite?
Let A be a positive definite matrix. Then A = C′C, and
x′Ax = x′C′Cx = (Cx)′(Cx) = y′y ≥ 0.
This obeys all the distance properties, so x′Ax is a (squared) distance; different choices of A give different distances.
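A short R sketch of this argument (C is an arbitrary made-up matrix):
set.seed(1)
C <- matrix(rnorm(9), 3, 3)
A <- crossprod(C)        ## A = t(C) %*% C
x <- rnorm(3)
c(t(x) %*% A %*% x)      ## the quadratic form x'Ax
sum((C %*% x)^2)         ## the same non-negative number, ||Cx||^2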
60. • Why a positive definite matrix?
• Ans: Spectral decomposition: the spectral decomposition of a k×k symmetric matrix A is A = λ1e1e1′ + λ2e2e2′ + ⋯ + λkekek′, where (λi, ei), i = 1, 2, …, k, are the pairs of eigenvalues and eigenvectors, with λ1 ≥ λ2 ≥ λ3 ≥ ⋯. If A is positive definite, each λi > 0 and A is invertible.
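A minimal check of the spectral decomposition in R (A is a made-up positive definite matrix):
set.seed(1)
A <- crossprod(matrix(rnorm(9), 3, 3))
ev <- eigen(A)
ev$vectors %*% diag(ev$values) %*% t(ev$vectors)   ## rebuilds A from the (lambda_i, e_i) pairs
all(ev$values > 0)                                 ## positive definite, so all eigenvalues positive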
61. [Figure: scatter of data with the principal axes along the eigenvectors e1 and e2 and the eigenvalues λ1, λ2 marked.]
62. • Suppose p = 2. The set of points at constant distance c from the origin satisfies x′Ax = c². By the spectral decomposition, this is an ellipse with half-lengths c/√λ1 and c/√λ2 along the axes e1 and e2.
63. Another property: A⁻¹ = (1/λ1)e1e1′ + ⋯ + (1/λk)ekek′. Thus x′A⁻¹x = Σ (x′ei)²/λi. We use this property in the Mahalanobis distance.
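The same kind of toy matrix can confirm the inverse property numerically (an illustrative sketch only):
set.seed(1)
A <- crossprod(matrix(rnorm(9), 3, 3))
ev <- eigen(A)
A_inv <- ev$vectors %*% diag(1/ev$values) %*% t(ev$vectors)   ## sum of (1/lambda_i) e_i e_i'
all.equal(A_inv, solve(A))                                    ## TRUE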
64. Necessity of Statistical Distance
[Figure: a cluster of points with its center of gravity, and another point outside the cluster.]
65. • Consider the Euclidean distances from the point Q to the point P and to the origin O.
• Obviously d(P,Q) > d(Q,O).
• But P appears to be more like the points in the cluster than the origin does.
• If we take into account the variability of the points in the cluster and measure distance by statistical distance, then Q will be closer to P than to O.
66. Mahalanobis distance
• The Mahalanobis distance is a descriptive statistic that provides a relative measure of a data point's distance from a common point. It is a unitless measure introduced by P. C. Mahalanobis in 1936.
67. Intuition of Mahalanobis Distance
• Recall the equation d(O, P) = √(x′Ax), so d²(O, P) = x′Ax, where x = (x1, x2)′ and A = [a11 a12; a21 a22].
68. Intuition of Mahalanobis Distance
d(O, P) = √(x′Ax), d²(O, P) = x′Ax, where x′ = (x1, x2, x3, …, xp) and A is the p×p symmetric positive definite matrix of weights a_ik.
69. Intuition of Mahalanobis Distance
d²(P, Q) = (x − y)′A(x − y), where x′ = (x1, x2, …, xp), y′ = (y1, y2, …, yp), and A = [a_ik] as before.
70. Mahalanobis Distance
• Mahalanobis used the inverse of the covariance matrix Σ in place of A. Thus d²(O, P) = x′Σ⁻¹x … (1)
• And used μ (the center of gravity) in place of y: d²(P, Q) = (x − μ)′Σ⁻¹(x − μ) … (2), the Mahalanobis distance.
71. Mahalanobis Distance
• The above equations are nothing but the Mahalanobis distance.
• For example, suppose we took a single observation from a bivariate population with variables X and Y, which had the following characteristics.
72. • For a single observation with X = 410 and Y = 400, the Mahalanobis distance for that single value is:
73. • = 1.825
74. • Therefore, our single observation would have a distance of 1.825 standardized units from the mean (the mean is at X = 500, Y = 500).
• If we took many such observations, graphed them, and colored them according to their Mahalanobis values, we could see the elliptical Mahalanobis regions come out.
75. • The points are actually distributed along two primary axes:
76. If we calculate Mahalanobis distances for each of these points and shade them according to their distance value, we see clear elliptical patterns emerge:
77. • We can also draw actual ellipses at regions of constant Mahalanobis values: [Figure: ellipses containing 68%, 95%, and 99.7% of the observations.]
78. • Which ellipse do you choose? Ans: Use the 68-95-99.7 rule:
1) about two-thirds (68%) of the points should be within 1 unit of the origin (along each axis);
2) about 95% should be within 2 units;
3) about 99.7% should be within 3 units.
79. If the data are normal: [Figure.]
80. Sample Mahalanobis Distance
• The sample Mahalanobis distance is obtained by replacing Σ with S and μ with X̄,
• i.e. (X − X̄)′S⁻¹(X − X̄).
81. Distribution of the Mahalanobis distance: for a sample, (X − X̄)′S⁻¹(X − X̄) ≤ χ²p(α).
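A short R sketch of this cutoff rule on simulated normal data (illustrative values only; note that mahalanobis() returns the squared distance):
set.seed(1)
X <- matrix(rnorm(400), ncol = 2)            ## n = 200 points, p = 2
d2 <- mahalanobis(X, colMeans(X), cov(X))    ## squared sample Mahalanobis distances
mean(d2 <= qchisq(0.95, df = 2))             ## close to 0.95
which(d2 > qchisq(0.95, df = 2))             ## observations flagged as unusually far out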
82. Distribution of the Mahalanobis distance
Let X1, X2, X3, …, Xn be independent observations from any population with mean μ and finite (nonsingular) covariance Σ. Then
• √n(X̄ − μ) is approximately Np(0, Σ), and
• n(X̄ − μ)′S⁻¹(X̄ − μ) is approximately χ²p, for n − p large.
This is nothing but the central limit theorem.
83. Mahalanobis distance in R
########### Mahalanobis Distance ##########
• x=rnorm(100);x
• dm=matrix(x,nrow=20,ncol=5,byrow=F);dm   ## dm = data matrix
• cm=colMeans(dm);cm   ## cm = column means
• cov=cov(dm);cov   ## cov = covariance matrix
• incov=solve(cov);incov   ## incov = inverse of covariance matrix
84. Mahalanobis distance in R
####### MAHALANOBIS DISTANCE: MANUALLY ######
@@@ Mahalanobis distance of the first observation @@@
• ob1=dm[1,];ob1   ## first observation
• mv1=ob1-cm;mv1   ## deviation of the first observation from the center of gravity
• md1=t(mv1)%*%incov%*%mv1;md1   ## Mahalanobis distance of the first observation from the center of gravity
85. Mahalanobis distance in R
@@@ Mahalanobis distance of the second observation @@@
• ob2=dm[2,];ob2   ## second observation
• mv2=ob2-cm;mv2   ## deviation of the second observation from the center of gravity
• md2=t(mv2)%*%incov%*%mv2;md2   ## Mahalanobis distance of the second observation from the center of gravity
…
86. Mahalanobis distance in R
…
@@@ Mahalanobis distance of the 20th observation @@@
• ob20=dm[20,];ob20   ## 20th observation (row 20, i.e. dm[20,], not dm[,20])
• mv20=ob20-cm;mv20   ## deviation of the 20th observation from the center of gravity
• md20=t(mv20)%*%incov%*%mv20;md20   ## Mahalanobis distance of the 20th observation from the center of gravity
87. Mahalanobis distance in R
####### MAHALANOBIS DISTANCE: PACKAGE ########
• md=mahalanobis(dm,cm,cov,inverted=F);md   ## md = Mahalanobis distance
• md=mahalanobis(dm,cm,cov);md   ## same result: inverted defaults to FALSE
88. Another example
• x <- matrix(rnorm(100*3), ncol = 3)
• Sx <- cov(x)
• D2 <- mahalanobis(x, colMeans(x), Sx)
89. • plot(density(D2, bw = 0.5), main="Squared Mahalanobis distances, n=100, p=3")
• qqplot(qchisq(ppoints(100), df = 3), D2, main = expression("Q-Q plot of Mahalanobis" * ~D^2 * " vs. quantiles of" * ~ chi[3]^2))
• abline(0, 1, col = 'gray')
• ??mahalanobis
90. Acknowledgement
Prof. Mohammad Nasser; Richard A. Johnson & Dean W. Wichern; and others.
91. THANK YOU ALL
92. Necessity of Statistical Distance
[Figure: analogy: at home, the mother; in the mess, a female maid; a student in the mess.]
