Correspondence Analysis

•Download as PPT, PDF•

2 likes•5,054 views

Gaetan Lion

An introductory overview of Correspondence Analysis.

Education

Correspondence Analysis with XLStat Guy Lion Financial Modeling April 2005

Capabilities ,[object Object],[object Object]

4 Steps ,[object Object],[object Object],[object Object],[object Object]

An Example: Moviegoers You classify by Age buckets the opinions of 1357 movie viewers on a movie.

Testing Independence: Chi Square One cell (16-24/Good) accounts for 49.3% (73.1/148.3) of the Chi Square value for all 28 cells. Observed Expected Bad Average Good Very Good Total Bad Average Good Very Good Total 16-24 69 49 48 41 207 16-24 124.2 41.2 14.9 26.7 207 25-34 148 45 14 22 229 25-34 137.4 45.6 16.5 29.5 229 35-44 170 65 12 29 276 35-44 165.6 54.9 19.9 35.6 276 45-54 159 57 12 28 256 45-54 153.6 50.9 18.5 33.0 256 55-64 122 26 6 18 172 55-64 103.2 34.2 12.4 22.2 172 65-74 106 21 5 23 155 65-74 93.0 30.8 11.2 20.0 155 75+ 40 7 1 14 62 75+ 37.2 12.3 4.5 8.0 62 Total 814 270 98 175 1357 Total 814 270 98 175 1357 60% 20% 7% 13% 100% 60% 20% 7% 13% 100% Chi Square Calculations (Observed - Expected) 2 /Expected Bad Average Good Very Good Total (48 - 14.9) 2 /14.9 = 73.1 16-24 24.5 1.5 73.1 7.7 106.7 25-34 0.8 0.0 0.4 1.9 3.1 35-44 0.1 1.9 3.2 1.2 6.3 45-54 0.2 0.7 2.3 0.8 4.0 55-64 3.4 2.0 3.3 0.8 9.5 Chi Squ. 148.3 65-74 1.8 3.1 3.4 0.5 8.8 DF 18 = (7 -1)(4 - 1) 75+ 0.2 2.3 2.7 4.5 9.7 p value 1.613E-22 31.1 11.5 88.3 17.3 148.3

Eigenvalues of Dimensions Dimension F1 Eigenvalue 0.095 explains 86.6% (0.095/0.109) of the Inertia or Variance. F1 Coordinates are derived using PCA.

Singular Value Singular value = SQRT(Eigenvalue). It is the maximum Canonical Correlation between the categories of the variables in analysis for any given dimension.

Calculating Chi Square Distance for Points-rows Chi Square Distance defines the distance between a Point-row and the Centroid (Average) at the intersection of the F1 and F2 dimensions. The Point-row 16-24 is most distant from Centroid (0.72).

Calculating Inertia [or Variance] using Points-rows XLStat calculates this table. It shows what Row category generates the most Inertia (Row 16-24 accounts for 72% of it)

2 other ways to calculate Inertia ,[object Object],[object Object]

Contribution of Points-rows to Dimension F1 The contribution of points to dimensions is the proportion of Inertia of a Dimension explained by the Point. The contribution of Points-rows to dimensions help us interpret the dimensions. The sum of contributions for each dimension equals 100%.

Contribution of Dimension to Points-rows. Squared Correlation . ,[object Object],[object Object]

Squared Correlation = COS 2 If Contribution is high, the angle between the point vector and the axis is small.

Quality Quality = Sum of the Squared Correlations for dimensions shown (normally F1 and F2). Quality is different for each Point-row (or Point-column). Quality represents whether the Point on a two dimensional graph is accurately represented. Quality is interpreted as proportion of Chi Square accounted for given the respective number of dimensions. A low quality means the current number of dimensions does not represent well the respective row (or column).

Contribution of Points-column to Dimension F1 Contribution = (Col.Mass)(Coordinate 2 )/Eigenvalue

Contribution of Dimension F1 to Points-columns

Conclusion ,[object Object],[object Object],[object Object]

Conclusion (continued) We have to remember that we can’t directly compare the Distance across categories (Row vs Column). We see that the 16-24 Point-row makes a greater contribution to Inertia and overall Chi Square vs the Good Point-column. This is because the 16-24 Point-row has a greater mass (207 occurrences vs only 98 for Good).

What's hot

Correlation and regressionPadma Metta

Regression analysisShameer P Hamsa

Regression Analysis presentation by Al Arizmendez and Cathryn LottierAl Arizmendez

Principal Component Analysis and ClusteringUsha Vijay

Structural Equation Modelling (SEM) Part 1COSTARCH Analytical Consulting (P) Ltd.

Regression analysisDr.ammara khakwani

RegressionLavanyaK75

Correlation and Regression Dr. Tushar J Bhatt

Regression analysissaba khan

Multinomial Logistic RegressionDr Athar Khan

What is chi square testTalent Corner HR Services Pvt Ltd.

CorrelationJames Neill

General Linear Model | StatisticsTransweb Global Inc

Structural Equation Modelling (SEM) Part 3COSTARCH Analytical Consulting (P) Ltd.

Introduction to Generalized Linear Modelsrichardchandler

Regressionmandrewmartin

Simple linear regression and correlationShakeel Nouman

Assumptions of Linear Regression - Machine LearningKush Kulshrestha

Linear regression analysisNimrita Koul

Chi squareKaori Kubo Germano, PhD

What's hot (20)

Correlation and regression

Regression analysis

Regression Analysis presentation by Al Arizmendez and Cathryn Lottier

Principal Component Analysis and Clustering

Structural Equation Modelling (SEM) Part 1

Regression analysis

Regression

Correlation and Regression

Regression analysis

Multinomial Logistic Regression

What is chi square test

Correlation

General Linear Model | Statistics

Structural Equation Modelling (SEM) Part 3

Introduction to Generalized Linear Models

Regression

Simple linear regression and correlation

Assumptions of Linear Regression - Machine Learning

Linear regression analysis

Chi square

Similar to Correspondence Analysis

What is the KMeans Clustering Algorithm and How Does an Enterprise Use it to ...Smarten Augmented Analytics

Cmcchapter02 100613132406-phpapp02Cleophas Rwemera

Cmc chapter 02Jane Hamze

Stats chapter 1Richard Ferreria

Practice test1 solutionLong Beach City College

Statistik Chapter 2WanBK Leo

Dynamic Kohonen Network for Representing Changes in InputsJean Fecteau

measure of variability (windri). In research include examplewindri3

S5 pnInternational advisers

StatisticsSophiyaPrabin

Matrix algebra in_rRazzaqe

Day2 session i&ii - spssabir hossain

Displaying dataAbhishekDas15

Univariate, bivariate analysis, hypothesis testing, chi squarekongara

02 PSBE3_PPT.Ch01_2_Examining Distribution.pptBishoyRomani

Empirics of standard deviationAdebanji Ayeni

Research MethodologyEvanNathan3

Two Dimensional Shape and Texture Quantification - Medical Image ProcessingChamod Mune

Demand forecasting methods 1 gpPUTTU GURU PRASAD

RegressionLong Beach City College

Similar to Correspondence Analysis (20)

What is the KMeans Clustering Algorithm and How Does an Enterprise Use it to ...

Cmcchapter02 100613132406-phpapp02

Cmc chapter 02

Stats chapter 1

Practice test1 solution

Statistik Chapter 2

Dynamic Kohonen Network for Representing Changes in Inputs

measure of variability (windri). In research include example

S5 pn

Statistics

Matrix algebra in_r

Day2 session i&ii - spss

Displaying data

Univariate, bivariate analysis, hypothesis testing, chi square

02 PSBE3_PPT.Ch01_2_Examining Distribution.ppt

Empirics of standard deviation

Research Methodology

Two Dimensional Shape and Texture Quantification - Medical Image Processing

Demand forecasting methods 1 gp

Regression

Recently uploaded

Food safety_Challenges food safety laboratories_.pdfSherif Taha

TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection

SOC 101 Demonstration of Learning Presentationcamerronhm

HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC

Making communications land - Are they received and understood as intended? we...Association for Project Management

Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid

General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil

On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash

This PowerPoint helps students to consider the concept of infinity.christianmathematics

Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic

Salient Features of India constitution especially power and functionsKarakKing

Single or Multiple melodic lines structuredhanjurrannsibayan2

Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136

Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop

Holdier Curriculum Vitae (April 2024).pdfagholdier

How to Create and Manage Wizard in Odoo 17Celine George

Towards a code of practice for AI in AT.pptxJisc

REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda

Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh

FSB Advising Checklist - Orientation 2024Elizabeth Walsh

Recently uploaded (20)

Food safety_Challenges food safety laboratories_.pdf

TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...

SOC 101 Demonstration of Learning Presentation

HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx

Making communications land - Are they received and understood as intended? we...

Basic Civil Engineering first year Notes- Chapter 4 Building.pptx

General Principles of Intellectual Property: Concepts of Intellectual Proper...

On National Teacher Day, meet the 2024-25 Kenan Fellows

This PowerPoint helps students to consider the concept of infinity.

Key note speaker Neum_Admir Softic_ENG.pdf

Salient Features of India constitution especially power and functions

Single or Multiple melodic lines structure

Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...

Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...

Holdier Curriculum Vitae (April 2024).pdf

How to Create and Manage Wizard in Odoo 17

Towards a code of practice for AI in AT.pptx

REMIFENTANIL: An Ultra short acting opioid.pptx

Micro-Scholarship, What it is, How can it help me.pdf

FSB Advising Checklist - Orientation 2024

Correspondence Analysis

1. Correspondence Analysis with XLStat Guy Lion Financial Modeling April 2005

2. Statistical Methods Classification

3. The Solar (PCA) System

6. An Example: Moviegoers You classify by Age buckets the opinions of 1357 movie viewers on a movie.

7. Testing Independence: Chi Square One cell (16-24/Good) accounts for 49.3% (73.1/148.3) of the Chi Square value for all 28 cells. Observed Expected Bad Average Good Very Good Total Bad Average Good Very Good Total 16-24 69 49 48 41 207 16-24 124.2 41.2 14.9 26.7 207 25-34 148 45 14 22 229 25-34 137.4 45.6 16.5 29.5 229 35-44 170 65 12 29 276 35-44 165.6 54.9 19.9 35.6 276 45-54 159 57 12 28 256 45-54 153.6 50.9 18.5 33.0 256 55-64 122 26 6 18 172 55-64 103.2 34.2 12.4 22.2 172 65-74 106 21 5 23 155 65-74 93.0 30.8 11.2 20.0 155 75+ 40 7 1 14 62 75+ 37.2 12.3 4.5 8.0 62 Total 814 270 98 175 1357 Total 814 270 98 175 1357 60% 20% 7% 13% 100% 60% 20% 7% 13% 100% Chi Square Calculations (Observed - Expected) 2 /Expected Bad Average Good Very Good Total (48 - 14.9) 2 /14.9 = 73.1 16-24 24.5 1.5 73.1 7.7 106.7 25-34 0.8 0.0 0.4 1.9 3.1 35-44 0.1 1.9 3.2 1.2 6.3 45-54 0.2 0.7 2.3 0.8 4.0 55-64 3.4 2.0 3.3 0.8 9.5 Chi Squ. 148.3 65-74 1.8 3.1 3.4 0.5 8.8 DF 18 = (7 -1)(4 - 1) 75+ 0.2 2.3 2.7 4.5 9.7 p value 1.613E-22 31.1 11.5 88.3 17.3 148.3

8. Row Mass & Profile

9. Eigenvalues of Dimensions Dimension F1 Eigenvalue 0.095 explains 86.6% (0.095/0.109) of the Inertia or Variance. F1 Coordinates are derived using PCA.

10. Singular Value Singular value = SQRT(Eigenvalue). It is the maximum Canonical Correlation between the categories of the variables in analysis for any given dimension.

11. Calculating Chi Square Distance for Points-rows Chi Square Distance defines the distance between a Point-row and the Centroid (Average) at the intersection of the F1 and F2 dimensions. The Point-row 16-24 is most distant from Centroid (0.72).

12. Calculating Inertia [or Variance] using Points-rows XLStat calculates this table. It shows what Row category generates the most Inertia (Row 16-24 accounts for 72% of it)

13.

14. Contribution of Points-rows to Dimension F1 The contribution of points to dimensions is the proportion of Inertia of a Dimension explained by the Point. The contribution of Points-rows to dimensions help us interpret the dimensions. The sum of contributions for each dimension equals 100%.

15.

16. Squared Correlation = COS 2 If Contribution is high, the angle between the point vector and the axis is small.

17. Quality Quality = Sum of the Squared Correlations for dimensions shown (normally F1 and F2). Quality is different for each Point-row (or Point-column). Quality represents whether the Point on a two dimensional graph is accurately represented. Quality is interpreted as proportion of Chi Square accounted for given the respective number of dimensions. A low quality means the current number of dimensions does not represent well the respective row (or column).

18. Plot of Points-Rows

19. Review of Calculation Flows

20. Column Profile & Mass

21. Calculating Chi Square Distance for Points-column Distance = SQRT(Sum(Column Profile – Avg. Column Profile 2 /Avg. Column Profile)

22. Contribution of Points-column to Dimension F1 Contribution = (Col.Mass)(Coordinate 2 )/Eigenvalue

23. Contribution of Dimension F1 to Points-columns

24. Plot of Points-Columns

25. Plot of all Points

26. Observing the Correspondences

27.

28. Conclusion (continued) We have to remember that we can’t directly compare the Distance across categories (Row vs Column). We see that the 16-24 Point-row makes a greater contribution to Inertia and overall Chi Square vs the Good Point-column. This is because the 16-24 Point-row has a greater mass (207 occurrences vs only 98 for Good).

Correspondence Analysis

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Correspondence Analysis

Similar to Correspondence Analysis (20)

More from Gaetan Lion

More from Gaetan Lion (20)

Recently uploaded

Recently uploaded (20)

Correspondence Analysis