Multivariate statistics
           Overview
           Hoksan | January 10, 2013 | Rotterdam




Factual
decision
making
Why multivariate statistics?

    •   Observing correlations or common factors among variables

    •   Summarizing (redundant) variables/observations

    •   Reducing overfitting issues




Outline

    •       Factor analysis:
        •     Principal Component Analysis
        •     Exploratory Factor Analysis


    •       Multidimensional scaling:
        •     Principal Coordinates Analysis
        •     Stress Minimization


    •       Cluster Analysis

Factor Analysis – idea

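    •       Given $\mathbf{X}$ (n by m matrix)

    •       Find p factors $\mathbf{Z}$ (with loadings $u_{jk}$) from original data $\mathbf{X}$ with:

        •      $\mathbf{x}_k \approx u_{1k}\mathbf{z}_1 + u_{2k}\mathbf{z}_2 + \cdots + u_{pk}\mathbf{z}_p \;\rightarrow\; \mathbf{X} \approx \mathbf{Z}\mathbf{U}$

    •       Factors are mutually uncorrelated
        •      Variance matrix of $\mathbf{Z}$ equals diagonal matrix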



Specific matrix properties

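    -       Normalized data $\mathbf{X}_s$:
        -     Zero mean and equal standard deviation
        -     $\mathbf{X}_s^T \mathbf{X}_s$ = correlation matrix:
               -   $\operatorname{corr}(X_i, X_j) = \dfrac{\sum_k (x_{ik} - \bar{x}_i)(x_{jk} - \bar{x}_j)}{\operatorname{sd}(X_i) \times \operatorname{sd}(X_j)}$
                   (exact when sd is the root of the sum of squared deviations, i.e. each column of $\mathbf{X}_s$ has unit length)

    • Orthonormal matrix $\mathbf{U}$ (rotation matrix)
      • $\mathbf{U}^T \mathbf{U}$ = identity matrix

    • Variance of linear combination of uncorrelated variables:
      • If $x = a_1 z_1 + a_2 z_2 + a_3 z_3$ (with $z_i$ independent variables)
      • $x^T x = a_1^2 (z_1^T z_1) + a_2^2 (z_2^T z_2) + a_3^2 (z_3^T z_3)$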
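These properties can be checked numerically. The sketch below is an illustration only: it assumes NumPy, uses randomly generated data, and scales each column of `Xs` to unit length so that `Xs.T @ Xs` equals the correlation matrix. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))          # hypothetical data: 50 observations, 3 variables

# Centre each column and scale it to unit length.
centered = X - X.mean(axis=0)
Xs = centered / np.sqrt((centered ** 2).sum(axis=0))

# Xs^T Xs should match the correlation matrix of X.
R = Xs.T @ Xs
print(np.allclose(R, np.corrcoef(X, rowvar=False)))

# An orthonormal matrix (here taken from a QR decomposition) satisfies U^T U = I.
U, _ = np.linalg.qr(rng.normal(size=(3, 3)))
print(np.allclose(U.T @ U, np.eye(3)))

# Linear combination of mutually orthogonal columns z_1, z_2, z_3.
Z, _ = np.linalg.qr(rng.normal(size=(50, 3)))   # exactly orthogonal columns
Z = Z * np.array([2.0, 1.0, 0.5])               # give the columns different lengths
a = np.array([0.5, -1.0, 2.0])
x = Z @ a

lhs = x @ x                                      # x^T x
rhs = np.sum(a ** 2 * np.sum(Z ** 2, axis=0))    # sum of a_i^2 (z_i^T z_i)
print(np.isclose(lhs, rhs))
```

The cross terms vanish only because the columns of `Z` are orthogonal; with correlated columns the two sides would differ.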
Factor Analysis – Principal Component Analysis

    •   Find uncorrelated set of components $\mathbf{Z}$

    •   Such that the variances $\operatorname{var}(\mathbf{z}_i)$ are maximized

    •   Based on Singular Value Decomposition: $\mathbf{X} = \mathbf{Z}_s \mathbf{D} \mathbf{U}^T$
           –   $\mathbf{Z}_s^T \mathbf{Z}_s = \mathbf{I}$, $\mathbf{U}^T \mathbf{U} = \mathbf{I}$
           –   $\mathbf{D} = \operatorname{diag}(d_1, \dots, d_p)$ = diagonal matrix
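A minimal sketch of the decomposition, assuming NumPy and column-centred data. `np.linalg.svd` returns the three factors directly; the names `Zs`, `d` and `Ut` are chosen here to mirror the slide's notation and are not from the original deck.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical correlated data: 100 observations of 4 variables, column-centred.
X = rng.normal(size=(100, 4)) @ rng.normal(size=(4, 4))
X = X - X.mean(axis=0)

# Thin SVD: X = Zs @ diag(d) @ Ut, where Ut plays the role of U^T.
Zs, d, Ut = np.linalg.svd(X, full_matrices=False)
D = np.diag(d)

print(np.allclose(Zs.T @ Zs, np.eye(4)))   # Zs^T Zs = I
print(np.allclose(Ut @ Ut.T, np.eye(4)))   # U^T U = I
print(np.allclose(Zs @ D @ Ut, X))         # the product reproduces X
print(np.all(np.diff(d) <= 0))             # singular values are sorted, largest first

components = Zs @ D                        # unscaled components Z = Zs D
```

Because the singular values come out in decreasing order, keeping the first few columns of `components` keeps the directions with the largest variance.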


Factor Loadings

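    Given approximation of X with factors/components:

                $\mathbf{X} = \mathbf{Z}_s \mathbf{D} \mathbf{U}^T$ (with components $\mathbf{Z}_s$)

    The variances of the variables in X equal (here $u_{jk}$ denotes the $(j,k)$ entry of $\mathbf{U}^T$):

                $\mathbf{x}_k = u_{1k} d_1 \mathbf{z}_1 + u_{2k} d_2 \mathbf{z}_2 + \cdots + u_{pk} d_p \mathbf{z}_p$
                $\operatorname{var}(\mathbf{x}_k) = d_1^2 u_{1k}^2 + d_2^2 u_{2k}^2 + \cdots + d_p^2 u_{pk}^2$
                $\sum_k \operatorname{var}(\mathbf{x}_k) = d_1^2 + d_2^2 + \cdots + d_p^2$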
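The variance identities can be verified on a small example. This sketch assumes NumPy and column-centred random data, and it works with sums of squares (divide by n − 1 to get sample variances); `Ut[j, k]` stands in for $u_{jk}$.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 3)) @ rng.normal(size=(3, 3))   # hypothetical data
X = X - X.mean(axis=0)

Zs, d, Ut = np.linalg.svd(X, full_matrices=False)

# Sum of squares of each column x_k.
ss_columns = (X ** 2).sum(axis=0)

# d_1^2 u_1k^2 + ... + d_p^2 u_pk^2 for every k, with u_jk = Ut[j, k].
ss_from_loadings = ((d[:, None] * Ut) ** 2).sum(axis=0)

print(np.allclose(ss_columns, ss_from_loadings))

# Summing over all variables leaves only the squared singular values.
total = ss_columns.sum()
print(np.isclose(total, (d ** 2).sum()))
```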
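Factor Analysis – Exploratory Factor Analysis


    •       Similar to PCA: uncorrelated set of factors/components
    •       Find uncorrelated set of factors $\boldsymbol{\Xi}$:

        •      $\mathbf{X} = \boldsymbol{\Xi}\boldsymbol{\Lambda}^T + \boldsymbol{\Delta}$
        •      $\boldsymbol{\Xi}^T \boldsymbol{\Xi}$ = identity matrix

    •       Such that the unexplained part $\boldsymbol{\Delta}$ is also uncorrelated:
        •      $\boldsymbol{\Delta}^T \boldsymbol{\Delta}$ = diagonal matrix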
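To make the model concrete, the sketch below builds data that satisfies the factor model exactly and checks the implied structure of $\mathbf{X}^T\mathbf{X}$. It does not fit a factor model; it only illustrates the constraints. NumPy, the sizes, and the names `Xi`, `Lam`, `Delta` and `psi` are assumptions made for this example, and the construction additionally makes the factors orthogonal to the unexplained part.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, p = 200, 5, 2            # observations, variables, factors (hypothetical sizes)

# Orthonormal columns: the first p serve as factors, the remaining m as noise directions.
Q, _ = np.linalg.qr(rng.normal(size=(n, p + m)))
Xi = Q[:, :p]                              # Xi^T Xi = I
psi = rng.uniform(0.2, 1.0, size=m)        # hypothetical unique variances
Delta = Q[:, p:] * np.sqrt(psi)            # Delta^T Delta = diag(psi)
Lam = rng.normal(size=(m, p))              # hypothetical loadings

X = Xi @ Lam.T + Delta

print(np.allclose(Xi.T @ Xi, np.eye(p)))
print(np.allclose(Delta.T @ Delta, np.diag(psi)))
# Common part plus a diagonal unique part:
print(np.allclose(X.T @ X, Lam @ Lam.T + np.diag(psi)))
```

The last check shows the difference from PCA: the off-diagonal structure of $\mathbf{X}^T\mathbf{X}$ is explained by the loadings alone, while each variable keeps its own diagonal term.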

Factor Analysis – Examples


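    Note: Correlation matrix as input is also possible
    • Decompose correlation matrix $\mathbf{R}$ into $\mathbf{R} = \mathbf{D}^T \mathbf{D}$

    •       In case of PCA:
        •     $\mathbf{X} = \mathbf{Z}_s \mathbf{D} \mathbf{U}^T$
        •     $\mathbf{X}^T \mathbf{X} = \mathbf{U} \mathbf{D} \mathbf{Z}_s^T \mathbf{Z}_s \mathbf{D} \mathbf{U}^T = \mathbf{U} \mathbf{D}^2 \mathbf{U}^T$
        •     Equals eigen decomposition

    •       Only the component loadings can be calculated, not the
            components themselves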
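The link between the SVD of the data and the eigen decomposition of the correlation matrix can be sketched as follows, assuming NumPy and columns scaled to unit length. The variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(80, 4)) @ rng.normal(size=(4, 4))   # hypothetical data
centered = X - X.mean(axis=0)
Xs = centered / np.sqrt((centered ** 2).sum(axis=0))     # unit-length columns

R = Xs.T @ Xs                                  # correlation matrix

# Route 1: SVD of the data gives components (Zs) and loadings (d, Ut).
Zs, d, Ut = np.linalg.svd(Xs, full_matrices=False)

# Route 2: eigen decomposition of R gives only U and D^2.
eigvals, eigvecs = np.linalg.eigh(R)           # eigh returns ascending eigenvalues
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

print(np.allclose(eigvals, d ** 2))                                # eigenvalues = d_j^2
print(np.allclose(eigvecs @ np.diag(eigvals) @ eigvecs.T, R))      # R = U D^2 U^T
```

Route 2 never touches the observations, which is why it cannot recover `Zs`: with only `R` as input, the loadings are available but the component scores are not.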
Multidimensional Scaling – idea

    •       Given distance matrix $\boldsymbol{\Delta}$ (n by n matrix)

    •       Map the objects into k-dimensional space
        •     With coordinates $\mathbf{X}$ (n by k matrix)

    •       Approximating given distance matrix:
        •     $\delta_{ij} \approx d_{ij} = \left( \sum_{a=1}^{k} (x_{ia} - x_{ja})^2 \right)^{1/2}$
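The distance formula translates directly into code. A minimal sketch, assuming NumPy; the function name `pairwise_distances` and the three example points are made up for illustration.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distances d_ij between the rows of an n-by-k coordinate matrix."""
    diff = X[:, None, :] - X[None, :, :]       # shape (n, n, k): x_ia - x_ja
    return np.sqrt((diff ** 2).sum(axis=-1))   # sum over a, then square root

# Three hypothetical objects on a line through the origin in 2-D.
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0]])
D = pairwise_distances(X)
print(D)
```

For these points the distances are 5 between neighbours and 10 between the first and last object, so the result is a symmetric matrix with a zero diagonal.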




MDS – Principal Coordinates Analysis


    •       Similar to Principal Component Analysis
        •     Create full coordinates $\mathbf{X}$ (n by n-1 matrix) that reproduce the
              distance matrix
        •     Perform principal component analysis to capture most of the
              variance


    •       Main differences:
        •     MDS focuses on the differences/similarities between objects
        •     FA focuses on the underlying factors/components
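One common way to carry out these two steps is the double-centering formulation (often called classical or Torgerson scaling), which goes from the distance matrix to coordinates in a single eigen decomposition. The sketch below assumes NumPy and Euclidean input distances; the function names and the example points are hypothetical, and this is a stand-in for the procedure on the slide rather than a transcription of it.

```python
import numpy as np

def pairwise_distances(X):
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(delta, k):
    """Coordinates (n by k) whose distances approximate the n-by-n matrix delta."""
    n = delta.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (delta ** 2) @ J              # inner products of centred coordinates
    eigvals, eigvecs = np.linalg.eigh(B)         # ascending order
    order = np.argsort(eigvals)[::-1][:k]        # keep the k largest eigenvalues
    lam = np.clip(eigvals[order], 0.0, None)     # guard against tiny negative values
    return eigvecs[:, order] * np.sqrt(lam)

# Hypothetical 2-D configuration; only its distances are passed on.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0], [1.0, 1.0]])
delta = pairwise_distances(points)

Y = classical_mds(delta, k=2)
print(np.allclose(pairwise_distances(Y), delta))
```

The recovered coordinates `Y` generally differ from `points` by a rotation, reflection and shift, which is expected: distances alone do not fix the orientation of the map.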


MDS – Stress Minimization


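    Similar to Principal Coordinates Analysis:

    • Find representative coordinates $\mathbf{X}$ whose distance matrix is
      approximately equal to $\boldsymbol{\Delta}$

    But by minimizing the stress value:

                    $\min_{\mathbf{X}} \sigma(\mathbf{X}) = \sum_{i<j\le n} \left( d_{ij}(\mathbf{X}) - \delta_{ij} \right)^2$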
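The slide does not say which optimiser is used. As one possibility, the sketch below applies the SMACOF majorization step (the Guttman transform with unit weights), which never increases the stress. NumPy, the function names, the random target configuration and the iteration count are all assumptions made for this illustration.

```python
import numpy as np

def pairwise_distances(X):
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def stress(X, delta):
    """Raw stress: sum over i < j of (d_ij(X) - delta_ij)^2."""
    d = pairwise_distances(X)
    upper = np.triu_indices_from(d, k=1)
    return float(((d[upper] - delta[upper]) ** 2).sum())

def guttman_step(X, delta):
    """One SMACOF update with unit weights: X <- B(X) X / n."""
    n = X.shape[0]
    d = pairwise_distances(X)
    B = np.zeros_like(d)
    nonzero = d > 0                               # skip the diagonal and coincident points
    B[nonzero] = -delta[nonzero] / d[nonzero]
    B[np.diag_indices(n)] = -B.sum(axis=1)        # rows of B sum to zero
    return B @ X / n

rng = np.random.default_rng(5)
target = rng.normal(size=(8, 2))                  # hypothetical "true" configuration
delta = pairwise_distances(target)

X = rng.normal(size=(8, 2))                       # random starting coordinates
initial_stress = stress(X, delta)
for _ in range(200):
    X = guttman_step(X, delta)
final_stress = stress(X, delta)

print(initial_stress, final_stress)
```

Unlike the eigen decomposition used in Principal Coordinates Analysis, this iteration depends on the starting configuration and may stop in a local minimum, so several random starts are often tried.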



Cluster Analysis – idea

    •       Grouping similar objects in clusters

    •       Two kinds of clustering methods:
        •     Partitioning methods (k-means)

        •     Hierarchical methods (dendrogram)
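As an illustration of a partitioning method, here is a minimal k-means sketch (Lloyd's algorithm), assuming NumPy. The function signature, the optional `init` argument and the two synthetic blobs are hypothetical choices for this example; the starting centres are fixed explicitly so that the run is reproducible.

```python
import numpy as np

def kmeans(X, k, init=None, n_iter=100, seed=0):
    """Alternate between assigning points to the nearest centre and updating centres."""
    rng = np.random.default_rng(seed)
    if init is None:
        centers = X[rng.choice(len(X), size=k, replace=False)]
    else:
        centers = np.asarray(init, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):     # assignments have stabilised
            break
        centers = new_centers
    return labels, centers

# Two hypothetical, well-separated groups of 20 objects each.
rng = np.random.default_rng(6)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(20, 2))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(20, 2))
X = np.vstack([blob_a, blob_b])

# Start from one object of each group (an assumption made for reproducibility).
labels, centers = kmeans(X, k=2, init=X[[0, 20]])
print(labels)
```

A hierarchical method would instead merge the closest objects or clusters step by step and record the merge order, which is what a dendrogram displays.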




Questions?
