Canonical Correlation
Analysis: An overview with
application to learning
methods
By David R. Hardoon, Sandor Szedmak, John Shawe-Taylor
School of Electronics and Computer Science, University of
Southampton
Published in Neural Computation, 2004
Presented by:
Shankar Bhargav
Canonical Correlation Analysis
Measures the linear relationship between two multidimensional variables
Finds two sets of basis vectors such that the correlation between the projections of the variables onto these basis vectors is maximized
Determines the correlation coefficients
More than one canonical correlation will be found, each corresponding to a different set of basis vectors/canonical variates
The canonical correlations of successively extracted pairs of canonical variates get smaller and smaller
Correlation coefficients: proportion of correlation between the canonical variates accounted for by the particular variable
Differences from ordinary correlation
Does not depend on the coordinate system in which the variables are expressed
Finds the directions that yield maximum correlation
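The coordinate-system claim can be checked numerically. The sketch below is not the derivation used in these slides: it assumes the standard QR/SVD route, in which the canonical correlations are the singular values of Qx^T Qy for the centered data, and it assumes X and Y have full column rank. The function name `canonical_correlations`, the matrix `T` and the random data are illustrative only.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations via QR + SVD (assumes full-column-rank X and Y)."""
    # Orthonormal bases for the centered column spaces of X and Y.
    Qx, _ = np.linalg.qr(X - X.mean(axis=0))
    Qy, _ = np.linalg.qr(Y - Y.mean(axis=0))
    # Singular values of Qx^T Qy are the canonical correlations (descending).
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = X[:, :2] + rng.normal(size=(200, 2))

# Change the coordinate system of X with an invertible matrix and a shift.
T = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0]])  # determinant 5, so invertible
X_new = X @ T + np.array([5.0, -1.0, 2.0])

r_before = canonical_correlations(X, Y)
r_after = canonical_correlations(X_new, Y)
print("canonical correlations unchanged:", np.allclose(r_before, r_after))

# An ordinary pairwise correlation, by contrast, depends on the coordinates.
print("corr(x1, y1) before:", np.corrcoef(X[:, 0], Y[:, 0])[0, 1])
print("corr(x1, y1) after: ", np.corrcoef(X_new[:, 0], Y[:, 0])[0, 1])
```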
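Find basis vectors w_x, w_y for two sets of variables x, y such that the correlation between the projections of the variables onto these basis vectors is maximized
$$S_x = x^T w_x \quad \text{and} \quad S_y = y^T w_y$$
$$\rho = \frac{E[S_x S_y]}{\sqrt{E[S_x^2]\, E[S_y^2]}}$$
$$\rho = \frac{E[(x^T w_x)(y^T w_y)]}{\sqrt{E[(x^T w_x)(x^T w_x)]\, E[(y^T w_y)(y^T w_y)]}}$$
$$\rho = \max_{w_x, w_y} \frac{E[w_x^T x\, y^T w_y]}{\sqrt{E[w_x^T x\, x^T w_x]\, E[w_y^T y\, y^T w_y]}}$$
$$\rho = \max_{w_x, w_y} \frac{w_x^T C_{xy} w_y}{\sqrt{w_x^T C_{xx} w_x \; w_y^T C_{yy} w_y}}$$
Solving this with the constraints
$$w_x^T C_{xx} w_x = 1, \qquad w_y^T C_{yy} w_y = 1$$
leads to the eigenvalue equations
$$C_{xx}^{-1} C_{xy} C_{yy}^{-1} C_{yx}\, w_x = \rho^2 w_x$$
$$C_{yy}^{-1} C_{yx} C_{xx}^{-1} C_{xy}\, w_y = \rho^2 w_y$$
$$C_{xy} w_y = \rho \lambda_x C_{xx} w_x$$
$$C_{yx} w_x = \rho \lambda_y C_{yy} w_y$$
$$\lambda_x = \lambda_y^{-1} = \sqrt{\frac{w_y^T C_{yy} w_y}{w_x^T C_{xx} w_x}}$$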
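As a rough illustration of the eigenvalue equations, here is a minimal NumPy sketch. It assumes sample covariance matrices that are invertible and canonical correlations that are strictly positive (otherwise the last normalization divides by zero). The function name `cca` and the synthetic data are hypothetical and not taken from the paper.

```python
import numpy as np

def cca(X, Y):
    """Linear CCA via Cxx^-1 Cxy Cyy^-1 Cyx wx = rho^2 wx (rows = observations)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = Xc.shape[0]
    Cxx = Xc.T @ Xc / (n - 1)
    Cyy = Yc.T @ Yc / (n - 1)
    Cxy = Xc.T @ Yc / (n - 1)  # Cyx is Cxy transposed

    # M = Cxx^-1 Cxy Cyy^-1 Cyx, built with linear solves instead of inverses.
    M = np.linalg.solve(Cxx, Cxy) @ np.linalg.solve(Cyy, Cxy.T)
    eigvals, eigvecs = np.linalg.eig(M)

    # Keep the d largest eigenvalues; they are rho^2 in decreasing order.
    d = min(X.shape[1], Y.shape[1])
    order = np.argsort(eigvals.real)[::-1][:d]
    rho = np.sqrt(np.clip(eigvals.real[order], 0.0, 1.0))
    Wx = eigvecs.real[:, order]

    # From Cyx wx = rho * lambda_y * Cyy wy: wy is proportional to Cyy^-1 Cyx wx.
    Wy = np.linalg.solve(Cyy, Cxy.T @ Wx)

    # Rescale columns so that wx^T Cxx wx = 1 and wy^T Cyy wy = 1.
    Wx = Wx / np.sqrt(np.sum(Wx * (Cxx @ Wx), axis=0))
    Wy = Wy / np.sqrt(np.sum(Wy * (Cyy @ Wy), axis=0))
    return Wx, Wy, rho

# Synthetic example: one shared latent signal z hidden in both views.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = np.hstack([z + 0.5 * rng.normal(size=(500, 1)), rng.normal(size=(500, 2))])
Y = np.hstack([rng.normal(size=(500, 1)), -z + 0.5 * rng.normal(size=(500, 1))])
Wx, Wy, rho = cca(X, Y)
print("canonical correlations:", rho)
```

With both constraints satisfied, lambda_x = lambda_y^-1 = 1, so the correlation between the projections X wx and Y wy equals rho for each pair of columns.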
CCA in Matlab
[A, B, r, U, V] = canoncorr(x, y)
x, y: sets of variables in the form of matrices
• Each row is an observation
• Each column is an attribute/feature
A, B: Matrices containing the canonical coefficients (the basis vectors) for x and y
r: Vector containing the canonical correlations (successively decreasing)
U, V: Canonical variates (scores), i.e. the centered x and y projected onto A and B respectively
Interpretation of CCA
The correlation coefficients represent the unique contribution of each variable to the relation
Multicollinearity may obscure the relationships
Factor loadings: correlations between the canonical variates (the projections onto the basis vectors) and the variables in each set
The proportion of variance explained by the canonical variates can be inferred from the factor loadings
Redundancy Calculation
$$\text{Redundancy}_{\text{left}} = \left[ \frac{\sum \text{loadings}_{\text{left}}^2}{p} \right] R_c^2$$
$$\text{Redundancy}_{\text{right}} = \left[ \frac{\sum \text{loadings}_{\text{right}}^2}{q} \right] R_c^2$$
p: number of variables in the first (left) set of variables
q: number of variables in the second (right) set of variables
R_c^2: the respective squared canonical correlation
Since successively extracted roots are uncorrelated, we can sum the redundancies across all canonical roots to get a single index of redundancy.
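The redundancy formula is plain arithmetic, so a short sketch may help. The function names and the loading values below are made up for illustration; only the formula itself comes from the slide.

```python
def redundancy(loadings, rc_squared):
    """Redundancy of one canonical root: mean squared loading times Rc^2."""
    p = len(loadings)  # number of variables in this set (p or q on the slide)
    return sum(value * value for value in loadings) / p * rc_squared

def total_redundancy(loadings_per_root, rc_squared_per_root):
    """Sum the per-root redundancies into a single index."""
    return sum(
        redundancy(loadings, rc2)
        for loadings, rc2 in zip(loadings_per_root, rc_squared_per_root)
    )

# Hypothetical left-set loadings for two roots of a three-variable set.
left_loadings = [[0.8, 0.6, 0.0], [0.1, 0.2, 0.9]]
rc_squared = [0.5, 0.2]
print("root 1:", redundancy(left_loadings[0], rc_squared[0]))
print("total: ", total_redundancy(left_loadings, rc_squared))
```

The same functions apply to the right set by passing its q loadings instead.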
Application
Kernel CCA can be used to find nonlinear relationships between two sets of variables
Uses two views of the same semantic object to extract a representation of the semantics
• Speaker recognition: audio and lip movement
• Image retrieval: image features (HSV, texture) and associated text
Use of KCCA in cross-modal retrieval
• 3 classes, each with 400 records of JPEG images and associated text
• Data was split randomly into 2 parts for training and testing
• Features
Image: HSV color, Gabor texture
Text: term frequencies
• Results were averaged over 10 runs
Cross-modal retrieval
Content-based retrieval: retrieve images in the same class
Tested with image sets of size 10 and 30
• In the success measure, $count^j_k = 1$ if image k in the set has the same label as the text query present in the set, else $count^j_k = 0$.
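The counting rule above can be turned into a score in a few lines. The normalization used here (percentage of same-class images, averaged over all queries and over the set size) is an assumption, since the slide's formula did not survive extraction. The function name and the toy labels are hypothetical.

```python
def content_based_success(retrieved_labels, query_labels):
    """Percentage of retrieved images whose class matches the text query.

    retrieved_labels[j] holds the class labels of the images retrieved for
    query j; every query is assumed to return the same number of images.
    """
    set_size = len(retrieved_labels[0])
    total = 0
    for labels, query_label in zip(retrieved_labels, query_labels):
        # count_k^j = 1 when image k shares the label of query j, else 0.
        total += sum(1 for label in labels if label == query_label)
    # Assumed normalization: divide by (set size * number of queries).
    return 100.0 * total / (set_size * len(query_labels))

# Toy example: two text queries, three retrieved images each.
retrieved = [["sport", "sport", "aviation"], ["aviation", "paintball", "paintball"]]
queries = ["sport", "paintball"]
print(content_based_success(retrieved, queries))
```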
[Results, content-based retrieval: comparison of KCCA (with 5 and 30 eigenvectors) with GVSM]
Mate-based retrieval
Match the exact image among the selected retrieved images
Tested with image sets of size 10 and 30
• In the success measure, $count_j = 1$ if the exact matching image was present in the set, else $count_j = 0$
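A comparable sketch for the mate-based measure follows. As before, the normalization (percentage of queries whose exact mate appears in the retrieved set) is an assumption, and the function name and image ids are invented for illustration.

```python
def mate_based_success(retrieved_ids, mate_ids):
    """Percentage of queries whose exact matching image is in the retrieved set.

    retrieved_ids[j] holds the ids of the images retrieved for query j and
    mate_ids[j] is the id of the image that belongs to query j.
    """
    # count_j = 1 when the mate of query j is among its retrieved images.
    hits = sum(1 for ids, mate in zip(retrieved_ids, mate_ids) if mate in ids)
    return 100.0 * hits / len(mate_ids)

# Toy example: three queries, the third one misses its mate.
retrieved = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
mates = [2, 6, 10]
print(mate_based_success(retrieved, mates))
```

Because a hit counts the same for a set of 10 as for a set of 30, larger sets can only raise this score.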
[Results, mate-based retrieval: comparison of KCCA (with 30 and 150 eigenvectors) with GVSM]
Comments
The good
• Good explanation of CCA and KCCA
• Innovative use of KCCA in an image retrieval application
The bad
• The data set and the number of classes used were small
• The image set size is not taken into account when calculating accuracy in mate-based retrieval
• Cross-validation tests could have been done
Limitations and Assumptions of CCA
At least 40 to 60 times as many cases as variables are recommended to get reliable estimates for two roots (Barcikowski & Stevens, 1986)
Outliers can greatly affect the canonical correlation
The variables in the two sets should not be completely redundant
Thank you
