Multivariate statistical models in research on Women and Children

Transcript of "Statistics "

  1. 1. INTRODUCTION TO MULTIVARIATE STATISTICS Dr. Debdulal Dutta Roy, Ph.D. (Psy.) Psychology Research Unit Indian Statistical Institute Kolkata – 700 108 E-mail:ddroy@isical.ac.in (o) [email_address] http://www.isical.ac.in/~ddroy/abstract.html
  2. 2. The Institute
     - The Indian Statistical Institute (ISI) is a unique institution devoted to the research, teaching and application of statistics, natural sciences and social sciences. Founded by Professor P.C. Mahalanobis in Kolkata on 17th December, 1931, the institute gained the status of an Institution of National Importance by an act of the Indian Parliament in 1959.
     - Research in Statistics and related disciplines is the primary activity of the Institute. Teaching activities are undertaken mainly in Kolkata, Delhi and Bangalore.
  3. 3. MYTHS
     - Statistical treatment of more than 2 variables is multivariate statistics.
       - No. We can use multivariate statistics when more than 2 variables are interrelated with each other.
  4. 4. Myth 2
     - The purpose of multivariate statistics is to establish correlation among sets of variables.
       - True, but its purpose is not limited to determining relations among sets of variables. It also controls for the effect of intervening variables on the relationship among sets of variables.
  5. 5. Myth 3
     - Loss of original score
       - Accepted, if the analysis extracts more latent properties within the variable.
  6. 6. What is MVS?
     - MVS refers to the set of statistical tools used to find the pattern of relationship among a set of variables: independent, dependent and intervening variables.
     - The definition suggests that MVS cannot be used when the variables are not correlated with each other.
     - Therefore, before going for MVS, it is necessary to examine the correlations among the variables.
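The screening step on this slide (check correlations before applying MVS) can be sketched in plain Python. This is a minimal illustration only: the variable names and ratings below are hypothetical, not data from the talk.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return a nested dict of pairwise correlations between named columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical ratings from five respondents (assumed for illustration).
data = {
    "classroom": [1, 2, 3, 4, 5],
    "toilet": [2, 4, 6, 8, 10],
    "games": [5, 3, 4, 1, 2],
}
corr = correlation_matrix(data)
for name, row in corr.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

If most off-diagonal entries are close to zero, the variables share little common structure and a multivariate tool is unlikely to add much.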
  7. 7. List of Multivariate Statistical Tools
     - Can we assess their perception, beliefs and attitudes?
  8. 8. List of Multivariate Statistical Tools
     - Determining differences among groups:
       - ANOVA with factorial design
       - MANOVA
       - Discriminant function analysis
     - Determining structure of relationship:
       - Multiple regression
       - Canonical correlation
       - Principal component analysis
       - Correspondence analysis
       - Cluster analysis
  9. 9. List of Multivariate Research Questions on Women and Child development
  10. 10. Multivariate Research Questions for Women & Child Development (Difference Perspective)
     - Multiple regression
       - What is the predictive strength of poverty, inequality, war, criminal networks, ruthless demand for cheap labour and commercial sexual exploitation in predicting motivation for human trafficking?
     - Factorial design
       - Does the eating habit (DV) of women vary with age, education and socio-economic status (IV)?
     - MANOVA
       - Do the food beliefs of pregnant mothers vary with religion?
     - Discriminant function analysis
       - What is the predictive capacity of a food attitude questionnaire to classify students in terms of whether they take the mid-day meal?
  11. 11. Multivariate Research Questions for Women & Child Development (Relation Perspective)
     - Canonical correlation
       - Is there any relation between awareness of nutrition and motivation to follow good food-taking habits?
     - Principal component analysis
       - What is the meaning of good food-taking motivation?
     - Correspondence analysis
       - Can we map different districts of one state in terms of human trafficking?
     - Cluster analysis
       - Is it possible to classify states in terms of immunization?
  12. 12. Some studies on Application of Multivariate statistics
  13. 13. Principal Component Analysis
     - Principal component analysis is a technique (1) to reduce the number of variables and (2) to detect structure in the relationships between variables, that is, to classify variables. Therefore, PCA is applied as a data reduction or structure detection method. In principal component analysis, we seek linear composites of the original variables that display certain desirable properties, namely, scores that exhibit maximal variance, subject to being uncorrelated with previously computed composites.
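The definition above (linear composites with maximal variance, each uncorrelated with the previous ones) can be illustrated with a short NumPy sketch. It assumes the components are extracted from the correlation matrix of standardized variables. The simulated data and the function name `pca_from_correlation` are hypothetical and not part of the talk.

```python
import numpy as np

def pca_from_correlation(data, n_components):
    """PCA via eigendecomposition of the correlation matrix.

    data: array of shape (n_samples, n_variables).
    Returns (eigenvalues, loadings, scores) for the leading components.
    """
    # Standardize each variable so the covariance of z is the correlation matrix.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # re-sort to descending variance
    eigvals = eigvals[order][:n_components]
    eigvecs = eigvecs[:, order][:, :n_components]
    loadings = eigvecs * np.sqrt(eigvals)     # variable-component correlations
    scores = z @ eigvecs                      # the linear composites
    return eigvals, loadings, scores

# Simulated data (assumed): three variables share one latent factor, two are noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
data = np.hstack(
    [latent + 0.3 * rng.normal(size=(200, 1)) for _ in range(3)]
    + [rng.normal(size=(200, 2))]
)
eigvals, loadings, scores = pca_from_correlation(data, n_components=2)
print("eigenvalues:", np.round(eigvals, 2))
print("loadings:\n", np.round(loadings, 2))
```

The variance of each column of `scores` equals the corresponding eigenvalue, and the columns are mutually uncorrelated, which is the property the slide describes.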
  14. 14. Study on Principal Component Analysis
     - Purpose: To determine the principal components of school infrastructure perception.
     - Assumption: School infrastructure perception encompasses a set of 13 variables: students' perceptions of Classroom, Drinking Water, Toilet, Blackboard, Teachers, Book, Teaching learning materials, Friends, Games, Cultural Programs, Book bank, Mid-day Meal, and Health Check-up. Most of these infrastructures are available in rural schools. It is assumed that there is some latent structure in the perception of the 13 variables.
  15. 15. Significant correlations suggest possible latent structure of relationship signifying latent meaning

     No.  Variable              1       2       3       4       5       6       7       8       9       10      11      12      13
     1    Classroom             1
     2    Drinking Water        0.25**  1
     3    Toilet                0.28**  0.61**  1
     4    Black board           0.33**  0.09*   0.22**  1
     5    Teaching              0.38**  0.22**  0.27**  0.38**  1
     6    Book                  0.36**  0.11*   0.23**  0.41**  0.43**  1
     7    TLM                   0.13**  -0.06   0.05    0.20**  0.24**  0.23**  1
     8    Friend                0.32**  0.30**  0.34**  0.21**  0.29**  0.31**  0.24**  1
     9    Games                 0.08*   0.09*   0.11*   0.14**  0.24**  0.21**  0.33**  0.31**  1
     10   Cultural programme    0.11*   0.05    0.07    0.14**  0.34**  0.21**  0.24**  0.27**  0.30**  1
     11   Book bank             0       0.17**  0.20**  -0.02   -0.01   -0.04   0.07    0.18**  0.15**  0.05    1
     13   Health checkup        0.33**  0.21**  0.29**  0.20**  0.22**  0.11*   0.01    0.25**  0.04    -0.03   0.08*   0.19**  1

     (Row 12, Mid-day meal, does not appear in the transcript.)
  16. 16. Extraction of Factors using PCA

     Infrastructures       Basic Infrastructure   Supportive Infrastructure   Activity based Infrastructure
     Class room            0.62                   0.35                        -0.03
     Drinking Water        0.06                   0.81                        -0.03
     Toilet                0.2                    0.79                        0.03
     Black board           0.7                    0.06                        0.1
     Teaching              0.63                   0.19                        0.29
     Book                  0.68                   0.07                        0.25
     TLM                   0.21                   -0.09                       0.63
     Friend                0.24                   0.49                        0.44
     Games                 0.07                   0.11                        0.74
     Cultural Programme    0.19                   -0.01                       0.63
     Book bank             -0.35                  0.47                        0.36
     Mid-day meal          0.49                   -0.02                       0.26
     Health checkup        0.38                   0.46                        -0.17
     Eigen Value           3.5                    1.67                        1.34
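The three eigenvalues on this slide can be turned into percentages of variance explained. The sketch below assumes the PCA was run on the correlation matrix of the 13 standardized variables, so that the total variance equals 13. The slide does not state this, so treat the percentages as illustrative.

```python
# Eigenvalues as reported on the slide.
eigenvalues = {
    "Basic Infrastructure": 3.5,
    "Supportive Infrastructure": 1.67,
    "Activity based Infrastructure": 1.34,
}

# Assumption: correlation-matrix PCA, so total variance = number of variables.
n_variables = 13

explained = {name: 100 * value / n_variables for name, value in eigenvalues.items()}
cumulative = sum(explained.values())

for name, pct in explained.items():
    print(f"{name}: {pct:.1f}% of total variance")
print(f"Three components together: {cumulative:.1f}%")
```

Under that assumption the three retained components account for roughly half of the total variance in the 13 perception variables.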
  17. 17. Plot of Eigenvalues
  18. 18. Limitation of PCA
     - PCA is applicable when variables are measured on interval or ratio scales.
     - When variables are measured on a nominal or categorical scale, correspondence analysis is a useful statistical tool.
  19. 19. Correspondence Analysis
     - Correspondence analysis is an exploratory multivariate technique that converts frequency table data into graphical displays in which rows and columns are depicted as points. It provides a method for comparing row or column proportions in a two-way or multiway table. CA investigates the magnitude and the substantive nature of the association between the row and column categories of a cross tabulation, rather than confirming or rejecting hypotheses about the underlying process.
     - These methods were originally developed in France by Jean-Paul Benzécri in the early 1960s and 1970s, and gained importance through the classic text by Greenacre (1984).
     - Other names: correspondence mapping, perceptual mapping, social space analysis, correspondence factor analysis, principal components analysis of qualitative data, and dual scaling.
     - Types: simple and multiple.
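A minimal sketch of simple correspondence analysis follows, using the standard formulation via the singular value decomposition of the standardized residuals of a two-way frequency table. The 3 x 3 table and the function name `correspondence_analysis` are hypothetical. This is not the analysis reported in the talk.

```python
import numpy as np

def correspondence_analysis(table):
    """Simple CA of a two-way frequency table.

    Returns (row_coords, col_coords, inertia): principal coordinates for the
    row and column points, and the principal inertia of each dimension.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)            # proportions expected under independence
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return row_coords, col_coords, sv ** 2

# Hypothetical counts: 3 tasks (rows) by 3 rating categories (columns).
table = [[20, 10, 5],
         [8, 15, 12],
         [3, 9, 25]]
row_coords, col_coords, inertia = correspondence_analysis(table)
print("principal inertias:", np.round(inertia, 4))
print("row points (first two dimensions):\n", np.round(row_coords[:, :2], 3))
```

Plotting the first two columns of `row_coords` and `col_coords` on the same axes gives the correspondence map. The total inertia equals the chi-square statistic of the table divided by the grand total.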
  20. 20. Study on CA
     - Purpose: To determine the correspondence between computer programming tasks and their relative use.
     - Assumption: Uses of the 14 computer programming tasks vary.
     - Data sets: 14 programming tasks (row variables) and 5 ratings of use (column variables).
  21. 21. INPUT TABLE FOR CA
     - Cross tabulation of 5 columns (rating categories) x 14 rows (computer programming tasks).
     - Assumption: Some tasks are related to each other; some are used more frequently and some less frequently.
  22. 22. Correspondence Map of 14 Row and 5 Col. variables
  23. 23. Cluster Analysis
     - Cluster analysis helps to identify similar entities on the basis of the characteristics they possess. It helps to classify objects or variables having functional homogeneity. The resulting object clusters should exhibit high internal homogeneity (within a cluster) and high external heterogeneity (between any two clusters). It is an inductive treatment and a purely empirical method of classification.
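As an illustration of grouping similar entities, here is a small pure-Python sketch of agglomerative clustering. The slide does not name a linkage rule or distance measure. Single linkage with Euclidean distance is assumed here, and the six two-variable profiles are made up.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Cluster distance is the smallest distance between any pair of members
    (single linkage). Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge j into i
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical profiles: two visibly separated groups of three entities.
profiles = [(1.0, 1.2), (0.9, 1.0), (1.1, 0.8),
            (5.0, 5.1), (5.2, 4.9), (4.8, 5.3)]
groups = single_linkage(profiles, n_clusters=2)
print(groups)
```

Recording the distance at each merge, rather than stopping at a fixed number of clusters, gives the information needed to draw a tree diagram (dendrogram).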
  24. 24. Tree diagram based on cluster analysis
  25. 25. MANOVA
     - MANOVA is a tool to determine significant differences among groups on a set of correlated dependent variables.

     Variables   Wilks' Lambda   Rao's R   df        P-value
     Gender      0.97            1.76      9,558     NS
     Religion    0.92            5.06      9,532     0
     S-E-S       0.9             3.03      9,242     0
     District    0.35            14.44     452,499   0
     Blocks      0.9             6.49      9,562     0
     School      0.77            5.13      9,153     0
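The Wilks' Lambda column in this table is, in the usual one-way MANOVA formulation, the ratio of the determinant of the within-group sums-of-squares-and-cross-products matrix to that of the total matrix. The NumPy sketch below computes it for made-up data. The function name `wilks_lambda` and the two small groups are hypothetical.

```python
import numpy as np

def wilks_lambda(groups):
    """Wilks' Lambda = det(W) / det(T) for a one-way MANOVA.

    groups: list of arrays, each of shape (n_i, p), one per group.
    W is the pooled within-group SSCP matrix, T the total SSCP matrix.
    """
    all_data = np.vstack(groups)
    centered_total = all_data - all_data.mean(axis=0)
    T = centered_total.T @ centered_total
    W = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
    return np.linalg.det(W) / np.linalg.det(T)

# Hypothetical scores on two dependent variables for two groups.
group_a = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
group_b = group_a + 10.0          # same spread, very different means

lam_separated = wilks_lambda([group_a, group_b])
lam_identical = wilks_lambda([group_a, group_a.copy()])
print(round(lam_separated, 4), round(lam_identical, 4))
```

Values near 1 mean the group means barely differ on the set of dependent variables, while values near 0 indicate strong separation. This matches the table, where Gender (0.97) is not significant and District (0.35) shows the largest effect.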
  26. 26. Fisher's Linear Discriminant Functions for differentiating Schools with Good and Poor Infrastructure

     Attitudinal Dimensions   Good Infrastructure   Poor Infrastructure
     Cleanliness              5.05                  -0.46
     Safety                   -1.27                 -0.45
     Comfort                  4.93                  3.76
     Reliability              6.2                   4.51
     Equal Opportunity        2.58                  2.03
     Constant                 -26.27                -15.57

     Eigen Values   Canonical Correlation   Wilk's Lambda   Chi-Square   Df   P-Value
     0.9            0.687                   0.53            101.41       5    0
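To show how classification functions like these are typically used, the sketch below scores a school on both functions and assigns it to the group with the larger score. The coefficients are taken from the slide. The two example profiles and their scale are hypothetical, since the slide does not give the scoring range of the attitudinal dimensions. The last lines check the reported canonical correlation and Wilks' Lambda against the eigenvalue, using the standard single-function relations.

```python
import math

# Coefficients as reported on the slide.
FUNCTIONS = {
    "Good": {"Constant": -26.27, "Cleanliness": 5.05, "Safety": -1.27,
             "Comfort": 4.93, "Reliability": 6.2, "Equal Opportunity": 2.58},
    "Poor": {"Constant": -15.57, "Cleanliness": -0.46, "Safety": -0.45,
             "Comfort": 3.76, "Reliability": 4.51, "Equal Opportunity": 2.03},
}

def classify(profile):
    """Return (predicted group, scores) for a dict of attitudinal scores."""
    scores = {}
    for group, coef in FUNCTIONS.items():
        scores[group] = coef["Constant"] + sum(
            coef[dim] * value for dim, value in profile.items())
    return max(scores, key=scores.get), scores

dims = ["Cleanliness", "Safety", "Comfort", "Reliability", "Equal Opportunity"]
low_profile = {d: 1.0 for d in dims}    # hypothetical low ratings
high_profile = {d: 3.0 for d in dims}   # hypothetical high ratings
print(classify(low_profile))
print(classify(high_profile))

# With one discriminant function: R_c = sqrt(e / (1 + e)), Lambda = 1 / (1 + e).
eigenvalue = 0.9
canonical_r = math.sqrt(eigenvalue / (1 + eigenvalue))
wilks = 1 / (1 + eigenvalue)
print(round(canonical_r, 3), round(wilks, 3))
```

The derived values agree with the 0.687 and 0.53 reported on the slide to within rounding.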
  27. 27. Classification Matrix of Good and Poor schools in terms of infrastructure availability

     Original Group       Predicted Group: Good Infrastructure   Predicted Group: Poor Infrastructure   Total
     Count       Good     75                                     10                                     85
                 Poor     18                                     60                                     78
                 Total    93                                     70                                     163
     Percentage  Good     88.2                                   11.8                                   100
                 Poor     23.1                                   76.9                                   100

     Correct Classification Percentage = (75+60)/163 x 100 = 82.8
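The arithmetic behind this classification matrix can be reproduced in a few lines of Python. The counts are those on the slide, and the dictionary layout is just one convenient way to hold them.

```python
# counts[original_group][predicted_group], as reported on the slide.
counts = {
    "Good": {"Good": 75, "Poor": 10},
    "Poor": {"Good": 18, "Poor": 60},
}

total = sum(sum(row.values()) for row in counts.values())
correct = sum(counts[group][group] for group in counts)
accuracy = 100 * correct / total

# Row percentages: share of each original group assigned to each predicted group.
row_percent = {
    group: {pred: 100 * n / sum(row.values()) for pred, n in row.items()}
    for group, row in counts.items()
}

print(f"Correct classification: {accuracy:.1f}%")
for group, row in row_percent.items():
    print(group, {pred: round(p, 1) for pred, p in row.items()})
```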
  28. 28. Box-plot Analysis of Discriminant Scores between Good and Poor Infrastructure Schools.
  29. 29. Some of my studies on MVS
     - Dutta Roy, D. (2007). Taxonomic approach in job analysis. In S. Subramony and S.B. Raj (Eds.), Psychological Assessment in Personnel Selection. Delhi: Defense Institute of Psychological Research, pp. 25-39.
     - Dutta Roy, D. (2006). Clustering academic profiles of tribal and non-tribal school students of Manipur. Journal of Psychometry, 20(2), 1-12.
     - Dutta Roy, D. (2002). Personality differences across four metropolitan cities of India. Indian Psychological Review, 58(2), 71-78.
     - Dutta Roy, D. and Bannerjee, I. (1998). Correspondence analysis between stimulus length and amount of forgetting in assessment of short term memory span. Indian Journal of Psychometry and Education, 29(1), 7-12.
  30. 30. Thank You