Multivariate statistical models in research on Women and Children

Transcript

  • 1. INTRODUCTION TO MULTIVARIATE STATISTICS Dr. Debdulal Dutta Roy, Ph.D. (Psy.), Psychology Research Unit, Indian Statistical Institute, Kolkata – 700 108. E-mail: ddroy@isical.ac.in (o). http://www.isical.ac.in/~ddroy/abstract.html
  • 2. The Institute
    • Indian Statistical Institute (ISI) is a unique institution devoted to the research, teaching and application of statistics, natural sciences and social sciences. Founded by Professor P.C. Mahalanobis in Kolkata on 17th December, 1931, the institute gained the status of an Institution of National Importance by an act of the Indian Parliament in 1959.
    • Research in Statistics and related disciplines is the primary activity of the Institute. Teaching activities are undertaken mainly in Kolkata, Delhi and Bangalore.
  • 3. MYTHS
    • Statistical treatment of more than 2 variables is multivariate statistics.
      • No. Multivariate statistics applies when more than 2 variables are interrelated with each other.
  • 4. Myth 2
    • Purpose of multivariate statistics is to establish correlation among sets of variables.
      • True, but its purpose is not limited to determining relations among sets of variables. It also controls for the effect of intervening variables on the relationship among sets of variables.
  • 5. Myth 3
    • Loss of original score
      • Acceptable, if the analysis extracts more latent properties within the variables.
  • 6. What is MVS ?
    • MVS refers to the set of statistical tools used to find the pattern of relationships among a set of variables: independent, dependent and intervening variables.
    • The definition suggests that MVS cannot be used when the variables are not correlated with each other.
    • Therefore, before applying MVS, it is necessary to compute the correlations among the variables.
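The screening step described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's procedure: the variable names and the six scores per variable are hypothetical, and the helper functions `pearson` and `correlation_matrix` are introduced here only for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return {(name_i, name_j): r} for every distinct pair of variables."""
    names = list(columns)
    return {
        (a, b): pearson(columns[a], columns[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Hypothetical scores for six respondents (illustrative only).
data = {
    "awareness":  [2, 4, 5, 7, 8, 9],
    "motivation": [1, 3, 6, 6, 9, 10],
    "shoe_size":  [5, 9, 4, 8, 3, 7],
}

r = correlation_matrix(data)
for pair, value in r.items():
    print(pair, round(value, 2))
```

In this made-up data set, awareness and motivation are strongly related, while shoe size is nearly unrelated to awareness. A variable like that would contribute little to a multivariate analysis of the other two.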
  • 7. List of Multivariate Statistical Tools Can we assess their perception, beliefs and attitudes ?
  • 8. List of Multivariate Statistical Tools
    • Determining differences among groups :
      • ANOVA with Factorial design;
      • MANOVA;
      • Discriminant Function Analysis;
    • Determining structure of relationship :
      • Multiple Regression
      • Canonical Correlation
      • Principal component analysis;
      • Correspondence analysis;
      • Cluster analysis;
  • 9. List of Multivariate Research Questions on Women and Child development
  • 10. Multivariate Research Questions for Women & Child Development (Difference Perspective)
    • Multiple Regression
      • What is the predictive strength of poverty, inequality, war, criminal networks, ruthless demand for cheap labour and commercial sexual exploitation in predicting motivation to human trafficking ?
    • Factorial Design :
      • Does eating habit (DV) of women vary with age, education and socio-economic status (IV) ?
    • MANOVA
      • Does food belief of pregnant mother vary with religion ?
    • Discriminant function analysis
      • What is the predictive capacity of food attitude questionnaire to classify students in terms of their mid-day meal taking ?
  • 11. Multivariate Research Questions for Women & Child Development (Relation Perspective)
    • Canonical correlation
      • Is there any relation between awareness of nutrition and motivation to follow good food taking habits ?
    • Principal Component analysis
      • What is the meaning of good food taking motivation ?
    • Correspondence analysis
      • Can we map different districts of one state in terms of human trafficking ?
    • Cluster analysis
      • Is it possible to classify states in terms of immunization ?
  • 12. Some studies on Application of Multivariate statistics
  • 13. Principal Component Analysis
    • Principal component analysis is a technique (1) to reduce the number of variables and (2) to detect structure in the relationships between variables, that is, to classify variables. PCA is therefore applied as a data reduction or structure detection method. In principal component analysis, we seek linear composites of the original variables that display certain desirable properties, namely, scores that exhibit maximal variance, subject to being uncorrelated with previously computed composites.
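As a rough illustration of "the linear composite with maximal variance", the sketch below finds the first principal component of a small correlation matrix by power iteration. The three correlations are those reported in the deck's correlation table for Classroom, Drinking Water and Toilet, under the assumption that the table is read as r(Classroom, Drinking Water) = 0.25, r(Classroom, Toilet) = 0.28 and r(Drinking Water, Toilet) = 0.61. Power iteration is used here only to keep the example dependency-free; it is not necessarily the method used in the study.

```python
from math import sqrt

# Correlation matrix for Classroom, Drinking Water, Toilet
# (assumed reading of the study's correlation table).
R = [
    [1.00, 0.25, 0.28],
    [0.25, 1.00, 0.61],
    [0.28, 0.61, 1.00],
]

def matvec(m, v):
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def first_component(m, iterations=200):
    """Power iteration: return (largest eigenvalue, unit-length eigenvector)."""
    v = [1.0] * len(m)
    for _ in range(iterations):
        w = matvec(m, v)
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    eigenvalue = sum(a * b for a, b in zip(v, matvec(m, v)))
    return eigenvalue, v

eigenvalue, weights = first_component(R)
loadings = [w * sqrt(eigenvalue) for w in weights]  # correlations with the component
share = eigenvalue / len(R)                         # proportion of total variance

print("eigenvalue:", round(eigenvalue, 3))
print("loadings:", [round(x, 2) for x in loadings])
print("variance explained:", round(100 * share, 1), "%")
```

The weights define the composite score, and the eigenvalue is the variance of that composite. Later components would be obtained the same way after removing the first one from the matrix.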
  • 14. Study on Principal Component Analysis
    • Purpose: To determine principal components of school infrastructure perception.
    • Assumption: School infrastructure perception encompasses a set of 13 variables, namely students' perception of the following school infrastructures: Classroom, Drinking Water, Toilet, Blackboard, Teachers, Book, Teaching learning materials, Friends, Games, Cultural Programs, Book bank, Mid-day Meal, and Health Check-up. Most of these infrastructures are available in rural schools. It is assumed that there is some latent structure in the perception of these 13 variables.
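  • 15. Significant correlations suggest possible latent structure of relationship signifying latent meaning
    | No. | Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
    |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
    | 1 | Classroom | 1 | | | | | | | | | | | | |
    | 2 | Drinking Water | 0.25** | 1 | | | | | | | | | | | |
    | 3 | Toilet | 0.28** | 0.61** | 1 | | | | | | | | | | |
    | 4 | Black board | 0.33** | 0.09* | 0.22** | 1 | | | | | | | | | |
    | 5 | Teaching | 0.38** | 0.22** | 0.27** | 0.38** | 1 | | | | | | | | |
    | 6 | Book | 0.36** | 0.11* | 0.23** | 0.41** | 0.43** | 1 | | | | | | | |
    | 7 | TLM | 0.13** | -0.06 | 0.05 | 0.20** | 0.24** | 0.23** | 1 | | | | | | |
    | 8 | Friend | 0.32** | 0.30** | 0.34** | 0.21** | 0.29** | 0.31** | 0.24** | 1 | | | | | |
    | 9 | Games | 0.08* | 0.09* | 0.11* | 0.14** | 0.24** | 0.21** | 0.33** | 0.31** | 1 | | | | |
    | 10 | Cultural programme | 0.11* | 0.05 | 0.07 | 0.14** | 0.34** | 0.21** | 0.24** | 0.27** | 0.30** | 1 | | | |
    | 11 | Book bank | 0 | 0.17** | 0.20** | -0.02 | -0.01 | -0.04 | 0.07 | 0.18** | 0.15** | 0.05 | 1 | | |
    | 12 | Mid-day meal | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | |
    | 13 | Health checkup | 0.33** | 0.21** | 0.29** | 0.20** | 0.22** | 0.11* | 0.01 | 0.25** | 0.04 | -0.03 | 0.08* | 0.19** | 1 |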
  • 16. Extraction of Factors using PCA
    | Infrastructures | Basic Infrastructure | Supportive Infrastructure | Activity based Infrastructure |
    |---|---|---|---|
    | Class room | 0.62 | 0.35 | -0.03 |
    | Drinking Water | 0.06 | 0.81 | -0.03 |
    | Toilet | 0.2 | 0.79 | 0.03 |
    | Black board | 0.7 | 0.06 | 0.1 |
    | Teaching | 0.63 | 0.19 | 0.29 |
    | Book | 0.68 | 0.07 | 0.25 |
    | TLM | 0.21 | -0.09 | 0.63 |
    | Friend | 0.24 | 0.49 | 0.44 |
    | Games | 0.07 | 0.11 | 0.74 |
    | Cultural Programme | 0.19 | -0.01 | 0.63 |
    | Book bank | -0.35 | 0.47 | 0.36 |
    | Mid-day meal | 0.49 | -0.02 | 0.26 |
    | Health checkup | 0.38 | 0.46 | -0.17 |
    | Eigen Value | 3.5 | 1.67 | 1.34 |
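The three eigenvalues on this slide can be turned into shares of explained variance. The sketch below assumes the PCA was run on the correlation matrix of the 13 variables, so that the total variance equals 13; the slide does not state this explicitly. It also applies the common "eigenvalue greater than 1" retention rule as an illustration, without claiming it was the rule used in the study.

```python
# Eigenvalues reported on the slide for the three extracted components.
eigenvalues = [3.5, 1.67, 1.34]

# Assumption: PCA on a correlation matrix, so total variance = number of variables.
n_variables = 13

# Percentage of total variance carried by each component.
explained = [100 * ev / n_variables for ev in eigenvalues]
cumulative = sum(explained)

# Kaiser criterion (illustrative): keep components with eigenvalue above 1.
retained = [ev for ev in eigenvalues if ev > 1]

for ev, pct in zip(eigenvalues, explained):
    print(f"eigenvalue {ev}: {pct:.1f}% of variance")
print(f"cumulative: {cumulative:.1f}%")
print("components retained:", len(retained))
```

Under that assumption, the three components together account for roughly half of the variance in the 13 perception variables.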
  • 17. Plot of Eigenvalues
  • 18. Limitation of PCA
    • PCA is applicable when variables are measured on interval or ratio scales.
    • When variables are measured on a nominal or categorical scale, correspondence analysis is a useful statistical tool.
  • 19. Correspondence Analysis
    • Correspondence analysis is an exploratory multivariate technique that converts frequency table data into graphical displays in which rows and columns are depicted as points. It provides a method for comparing row or column proportions in a two-way or multiway table. CA investigates the magnitude and the substantive nature of the association between the row and column categories of a cross tabulation, rather than confirming or rejecting hypotheses about the underlying process.
    • These methods were originally developed in France by Jean-Paul Benzécri in the 1960s and 1970s, and they gained importance through the classic text by Greenacre (1984).
    • Other names : correspondence mapping, perceptual mapping, social space analysis, correspondence factor analysis, principal components analysis of qualitative data, and dual scaling;
    • Types : Simple and Multiple.
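The "comparison of row proportions" at the heart of simple correspondence analysis can be sketched as follows. The 3 x 3 frequency table, its task names and its rating labels are hypothetical. The example covers only the preliminary quantities (row profiles, chi-square and total inertia); a full CA would go on to decompose the standardized residuals to obtain map coordinates, which is not shown.

```python
# Hypothetical frequency table: rows are tasks, columns are ratings of use.
ratings = ["rarely", "sometimes", "often"]
table = {
    "debugging":     [5, 10, 25],
    "testing":       [10, 20, 10],
    "documentation": [25, 10, 5],
}

n = sum(sum(row) for row in table.values())
col_totals = [sum(row[j] for row in table.values()) for j in range(len(ratings))]

# Row profiles: each row's frequencies divided by the row total.
profiles = {task: [f / sum(row) for f in row] for task, row in table.items()}

# Chi-square statistic: squared departures from independence, scaled by expectation.
chi_square = 0.0
for row in table.values():
    row_total = sum(row)
    for j, observed in enumerate(row):
        expected = row_total * col_totals[j] / n
        chi_square += (observed - expected) ** 2 / expected

# Total inertia is the chi-square divided by the grand total; it measures
# how much association the correspondence map has to display.
total_inertia = chi_square / n

for task, profile in profiles.items():
    print(task, [round(p, 3) for p in profile])
print("chi-square:", round(chi_square, 2), "total inertia:", round(total_inertia, 4))
```

Rows with similar profiles end up close together on a correspondence map, and rows with very different profiles, such as "debugging" and "documentation" here, end up far apart.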
  • 20. Study on CA
    • Purpose: To determine correspondence between computer programming tasks and relative use.
    • Assumptions : Uses of 14 computer programming tasks vary.
    • Data Sets : 14 programming tasks (row variables) and 5 ratings of use (column variables).
  • 21. INPUT TABLE FOR CA
    • Cross Tabulation of 5 Cols. (Rating categories) X 14 Rows (Computer programming tasks).
    • Assumption: Some tasks are related to each other; some are used more frequently and some less frequently.
  • 22. Correspondence Map of 14 Row and 5 Col. variables
  • 23. Cluster Analysis
    • Cluster analysis helps to identify similar entities on the basis of the characteristics they possess. It helps to classify objects or variables having functional homogeneity. The resulting object clusters should exhibit high internal homogeneity (within a cluster) and high external heterogeneity (between any two clusters). It is an inductive treatment and a purely empirical method of classification.
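A minimal sketch of agglomerative (hierarchical) clustering shows how such groups can be built bottom-up. The state names and immunization coverage figures are hypothetical, and single linkage is chosen only for brevity. The deck does not say which linkage or distance measure was used.

```python
def single_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    `points` maps a label to a single numeric value; the distance between
    two clusters is the distance between their closest members.
    """
    clusters = [[name] for name in points]

    def distance(c1, c2):
        return min(abs(points[a] - points[b]) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical immunization coverage (%) for six states.
coverage = {
    "State A": 92.0, "State B": 89.5, "State C": 61.0,
    "State D": 64.5, "State E": 90.0, "State F": 58.0,
}

groups = single_linkage(coverage, n_clusters=2)
for members in groups:
    print(members)
```

Recording the order and distance of each merge, instead of stopping at a fixed number of clusters, would give the information needed to draw a tree diagram.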
  • 24. Tree diagram based on cluster analysis
  • 25. MANOVA
    • MANOVA is a tool to determine significant differences in a set of correlated dependent variables among groups.
    | Variables | Wilks’ Lambda | Rao’s R | df | P-value |
    |---|---|---|---|---|
    | Gender | 0.97 | 1.76 | 9,558 | NS |
    | Religion | 0.92 | 5.06 | 9,532 | 0 |
    | S-E-S | 0.9 | 3.03 | 9,242 | 0 |
    | District | 0.35 | 14.44 | 452,499 | 0 |
    | Blocks | 0.9 | 6.49 | 9,562 | 0 |
    | School | 0.77 | 5.13 | 9,153 | 0 |
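Wilks' lambda, the statistic in the table above, is the ratio of the within-group generalized variance to the total generalized variance, det(W) / det(T). The sketch below computes it for two groups and two dependent variables. The data are hypothetical and the helper names are introduced for this example only; converting lambda to Rao's R and a p-value is not shown.

```python
def mean(rows):
    """Column means of a list of (y1, y2) observations."""
    return [sum(r[k] for r in rows) / len(rows) for k in range(2)]

def sscp(rows, centre):
    """2 x 2 sums-of-squares-and-cross-products matrix around `centre`."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    for row in rows:
        d = [row[0] - centre[0], row[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                m[i][j] += d[i] * d[j]
    return m

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Hypothetical scores on two dependent variables for two groups.
samples = {
    "group_1": [(1, 2), (2, 1), (3, 3)],
    "group_2": [(4, 5), (5, 4), (6, 6)],
}

# Total SSCP: deviations of every observation from the grand mean.
all_rows = [row for rows in samples.values() for row in rows]
T = sscp(all_rows, mean(all_rows))

# Within-group SSCP: deviations from each group's own mean, pooled over groups.
W = [[0.0, 0.0], [0.0, 0.0]]
for rows in samples.values():
    g = sscp(rows, mean(rows))
    W = [[W[i][j] + g[i][j] for j in range(2)] for i in range(2)]

wilks_lambda = det2(W) / det2(T)
print("Wilks' lambda:", round(wilks_lambda, 3))
```

Values near 1 mean the groups barely differ on the set of dependent variables, and values near 0 mean they differ strongly. In the table, Gender (0.97) is an example of the first case and District (0.35) of the second.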
  • 26. Fisher’s Linear Discriminant Functions for differentiating Schools with Good and Poor Infrastructure
    | Attitudinal Dimensions | Good Infrastructure | Poor Infrastructure |
    |---|---|---|
    | Cleanliness | 5.05 | -0.46 |
    | Safety | -1.27 | -0.45 |
    | Comfort | 4.93 | 3.76 |
    | Reliability | 6.2 | 4.51 |
    | Equal Opportunity | 2.58 | 2.03 |
    | Constant | -26.27 | -15.57 |

    | Eigen Values | Canonical Correlation | Wilk's Lambda | Chi-Square | Df | P-Value |
    |---|---|---|---|---|---|
    | 0.9 | 0.687 | 0.53 | 101.41 | 5 | 0 |
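Fisher's classification functions are used by computing one score per group and assigning the case to the group with the larger score. The sketch below uses the coefficients from this slide, assuming the first coefficient of each pair belongs to the Good Infrastructure function and the second to the Poor Infrastructure function. The two attitude profiles and their scale are hypothetical. The last lines check the slide's summary statistics against the standard single-function identities, canonical r = sqrt(λ / (1 + λ)) and Wilks' lambda = 1 / (1 + λ).

```python
from math import sqrt

DIMENSIONS = ["Cleanliness", "Safety", "Comfort", "Reliability", "Equal Opportunity"]

# Coefficients as read from the slide (assumed column assignment).
FUNCTIONS = {
    "Good Infrastructure": {"constant": -26.27, "weights": [5.05, -1.27, 4.93, 6.20, 2.58]},
    "Poor Infrastructure": {"constant": -15.57, "weights": [-0.46, -0.45, 3.76, 4.51, 2.03]},
}

def classification_scores(profile):
    """Score of an attitude profile on each group's classification function."""
    return {
        group: f["constant"] + sum(w * x for w, x in zip(f["weights"], profile))
        for group, f in FUNCTIONS.items()
    }

def classify(profile):
    """Assign the profile to the group with the highest score."""
    scores = classification_scores(profile)
    return max(scores, key=scores.get)

# Hypothetical profiles, one value per entry in DIMENSIONS.
high = [5, 5, 5, 5, 5]
low = [1, 1, 1, 1, 1]
print(classify(high), classification_scores(high))
print(classify(low), classification_scores(low))

# Consistency check of the reported statistics for one discriminant function.
eigenvalue = 0.9
canonical_r = sqrt(eigenvalue / (1 + eigenvalue))
wilks = 1 / (1 + eigenvalue)
print(round(canonical_r, 3), round(wilks, 3))
```

The identities give values that agree with the reported canonical correlation (0.687) and Wilks' lambda (0.53) to within rounding.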
  • 27. Classification Matrix of Good and poor schools in terms of infrastructure availability
    | | Original Group | Predicted Group Good Infrastructure | Predicted Group Poor Infrastructure | Total |
    |---|---|---|---|---|
    | Count | Good | 75 | 10 | 85 |
    | | Poor | 18 | 60 | 78 |
    | | Total | 93 | 70 | 163 |
    | Percentage | Good | 88.2 | 11.8 | 100 |
    | | Poor | 23.1 | 76.9 | 100 |
    Correct Classification Percentage = (75+60)/163 x 100 = 82.8
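The percentages on this slide follow directly from the counts. A short sketch of the arithmetic, using the four cell counts from the classification matrix (the dictionary layout is just one convenient way to hold them):

```python
# Counts from the slide: confusion[original group][predicted group].
confusion = {
    "Good": {"Good": 75, "Poor": 10},
    "Poor": {"Good": 18, "Poor": 60},
}

total = sum(sum(row.values()) for row in confusion.values())
correct = sum(confusion[group][group] for group in confusion)

# Overall percentage of schools assigned to their original group.
hit_rate = 100 * correct / total

# Row percentages: how each original group was distributed over predictions.
row_percent = {
    actual: {pred: round(100 * count / sum(row.values()), 1) for pred, count in row.items()}
    for actual, row in confusion.items()
}

print("correct classification:", round(hit_rate, 1), "%")
print(row_percent)
```

The row percentages show that schools with good infrastructure were classified correctly more often (88.2%) than schools with poor infrastructure (76.9%).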
  • 28. Box-plot Analysis of Discriminant Scores between Good and Poor Infrastructure Schools.
  • 29. Some of my studies on MVS
    • Dutta Roy, D. (2007). Taxonomic approach in job analysis. In S. Subramony and S.B. Raj (Eds.), Psychological assessment in Personnel Selection. Delhi: Defense Institute of Psychological Research, pp. 25-39.
    • Dutta Roy, D. (2006). Clustering academic profiles of tribal and non-tribal school students of Manipur. Journal of Psychometry, 20, 2, 1-12.
    • Dutta Roy, D. (2002). Personality differences across four metropolitan cities of India. Indian Psychological Review, 58, 2, 71-78.
    • Dutta Roy, D. and Bannerjee, I. (1998). Correspondence analysis between stimulus length and amount of forgetting in assessment of short term memory span. Indian Journal of Psychometry and Education, 29, 1, 7-12.
  • 30. Thank You
