# Methods of Multivariate Analysis

*Methods of Multivariate Analysis*, Third Edition, by Alvin C. Rencher and William F. Christensen.
Contents

- Front matter: Cover; Half Title Page; Title Page; Copyright Page; Preface; Acknowledgments
- Chapter 1: Introduction (1.1 Why Multivariate Analysis?; 1.2 Prerequisites; 1.3 Objectives; 1.4 Basic Types of Data and Analysis)
- Chapter 2: Matrix Algebra (2.1 Introduction; 2.2 Notation and Basic Definitions; 2.3 Operations; 2.4 Partitioned Matrices; 2.5 Rank; 2.6 Inverse; 2.7 Positive Definite Matrices; 2.8 Determinants; 2.9 Trace; 2.10 Orthogonal Vectors and Matrices; 2.11 Eigenvalues and Eigenvectors; 2.12 Kronecker and Vec Notation; Problems)
- Chapter 3: Characterizing and Displaying Multivariate Data (3.1 Mean and Variance of a Univariate Random Variable; 3.2 Covariance and Correlation of Bivariate Random Variables; 3.3 Scatterplots of Bivariate Samples; 3.4 Graphical Displays for Multivariate Samples; 3.5 Dynamic Graphics; 3.6 Mean Vectors; 3.7 Covariance Matrices; 3.8 Correlation Matrices; 3.9 Mean Vectors and Covariance Matrices for Subsets of Variables; 3.10 Linear Combinations of Variables; 3.11 Measures of Overall Variability; 3.12 Estimation of Missing Values; 3.13 Distance Between Vectors; Problems)
- Chapter 4: The Multivariate Normal Distribution (4.1 Multivariate Normal Density Function; 4.2 Properties of Multivariate Normal Random Variables; 4.3 Estimation in the Multivariate Normal; 4.4 Assessing Multivariate Normality; 4.5 Transformations to Normality; 4.6 Outliers; Problems)
- Chapter 5: Tests on One or Two Mean Vectors (5.1 Multivariate Versus Univariate Tests; 5.2 Tests on μ with Σ Known; 5.3 Tests on μ When Σ Is Unknown; 5.4 Comparing Two Mean Vectors; 5.5 Tests on Individual Variables Conditional on Rejection of H0 by the T2-Test; 5.6 Computation of T2; 5.7 Paired Observations Test; 5.8 Test for Additional Information; 5.9 Profile Analysis; Problems)
- Chapter 6: Multivariate Analysis of Variance (6.1 One-Way Models; 6.2 Comparison of the Four MANOVA Test Statistics; 6.3 Contrasts; 6.4 Tests on Individual Variables Following Rejection of H0 by the Overall MANOVA Test; 6.5 Two-Way Classification; 6.6 Other Models; 6.7 Checking on the Assumptions; 6.8 Profile Analysis; 6.9 Repeated Measures Designs; 6.10 Growth Curves; 6.11 Tests on a Subvector; Problems)
- Chapter 7: Tests on Covariance Matrices (7.1 Introduction; 7.2 Testing a Specified Pattern for Σ; 7.3 Tests Comparing Covariance Matrices; 7.4 Tests of Independence; Problems)
- Chapter 8: Discriminant Analysis: Description of Group Separation (8.1 Introduction; 8.2 The Discriminant Function for Two Groups; 8.3 Relationship Between Two-Group Discriminant Analysis and Multiple Regression; 8.4 Discriminant Analysis for Several Groups; 8.5 Standardized Discriminant Functions; 8.6 Tests of Significance; 8.7 Interpretation of Discriminant Functions; 8.8 Scatterplots; 8.9 Stepwise Selection of Variables; Problems)
- Chapter 9: Classification Analysis: Allocation of Observations to Groups (9.1 Introduction; 9.2 Classification into Two Groups; 9.3 Classification into Several Groups; 9.4 Estimating Misclassification Rates; 9.5 Improved Estimates of Error Rates; 9.6 Subset Selection; 9.7 Nonparametric Procedures; Problems)
- Chapter 10: Multivariate Regression (10.1 Introduction; 10.2 Multiple Regression: Fixed x's; 10.3 Multiple Regression: Random x's; 10.4 Multivariate Multiple Regression: Estimation; 10.5 Multivariate Multiple Regression: Hypothesis Tests; 10.6 Multivariate Multiple Regression: Prediction; 10.7 Measures of Association Between the y's and the x's; 10.8 Subset Selection; 10.9 Multivariate Regression: Random x's; Problems)
- Chapter 11: Canonical Correlation (11.1 Introduction; 11.2 Canonical Correlations and Canonical Variates; 11.3 Properties of Canonical Correlations; 11.4 Tests of Significance; 11.5 Interpretation; 11.6 Relationships of Canonical Correlation Analysis to Other Multivariate Techniques; Problems)
- Chapter 12: Principal Component Analysis (12.1 Introduction; 12.2 Geometric and Algebraic Bases of Principal Components; 12.3 Principal Components and Perpendicular Regression; 12.4 Plotting of Principal Components; 12.5 Principal Components from the Correlation Matrix; 12.6 Deciding How Many Components to Retain; 12.7 Information in the Last Few Principal Components; 12.8 Interpretation of Principal Components; 12.9 Selection of Variables; Problems)
- Chapter 13: Exploratory Factor Analysis (13.1 Introduction; 13.2 Orthogonal Factor Model; 13.3 Estimation of Loadings and Communalities; 13.4 Choosing the Number of Factors, m; 13.5 Rotation; 13.6 Factor Scores; 13.7 Validity of the Factor Analysis Model; 13.8 Relationship of Factor Analysis to Principal Component Analysis; Problems)
- Chapter 14: Confirmatory Factor Analysis (14.1 Introduction; 14.2 Model Specification and Identification; 14.3 Parameter Estimation and Model Assessment; 14.4 Inference for Model Parameters; 14.5 Factor Scores; Problems)
- Chapter 15: Cluster Analysis (15.1 Introduction; 15.2 Measures of Similarity or Dissimilarity; 15.3 Hierarchical Clustering; 15.4 Nonhierarchical Methods; 15.5 Choosing the Number of Clusters; 15.6 Cluster Validity; 15.7 Clustering Variables; Problems)
- Chapter 16: Graphical Procedures (16.1 Multidimensional Scaling; 16.2 Correspondence Analysis; 16.3 Biplots; Problems)
- Back matter: Appendix A: Tables; Appendix B: Answers and Hints to Problems; Appendix C: Data Sets and SAS Files; References; Index

Preface

… who have pointed out errors or made suggestions for improvements. The book is better for their caring and their efforts.

THIRD EDITION

For the third edition, we have added a new chapter covering confirmatory factor analysis (Chapter 14). We have added new sections discussing Kronecker products and vec notation (Section 2.12), dynamic graphics (Section 3.5), transformations to normality (Section 4.5), classification trees (Section 9.7.4), seemingly unrelated regressions (Section 10.4.6), and prediction for multivariate multiple regression (Section 10.6). Additionally, we have updated and revised the graphics throughout the book and have substantially expanded the section discussing estimation for multivariate multiple regression (Section 10.4). Many other additions and changes have been made in an effort to broaden the book's scope and improve its exposition. Additional problems have been added to accompany the new material. The new ftp site for the third edition can be found at: ftp://ftp.wiley.com/public/sci_tech_med/multivariate_analysis_3e.

We thank Jonathan Christensen for the countless ways he contributed to the revision, from updated graphics to technical and formatting assistance. We are grateful to Byron Gajewski, Scott Grimshaw, and Jeremy Yorgason, who have provided useful feedback on sections of the book that are new to this edition. We are also grateful to Scott Simpson and Storm Atwood for invaluable editing assistance and to the students of William's multivariate statistics course at Brigham Young University for pointing out errors and making suggestions for improvements in this edition. We are deeply appreciative of the support provided by the BYU Department of Statistics for the writing of the book. Al dedicates this volume to his wife, LaRue, who has supplied much needed support and encouragement. William dedicates this volume to his wife, Mary Elizabeth, who has provided both encouragement and careful feedback throughout the writing process.

ALVIN C. RENCHER & WILLIAM F. CHRISTENSEN
Department of Statistics, Brigham Young University, Provo, Utah

Acknowledgments

We wish to thank the authors, editors, and owners of copyrights for permission to reproduce the following materials:

- Figure 3.8 and Table 3.2, Kleiner and Hartigan (1981), Reprinted by permission of Journal of the American Statistical Association
- Table 3.3, Colvin et al. (1997), Reprinted by permission of T. S. Colvin and D. L. Karlen
- Table 3.4, Kramer and Jensen (1969a), Reprinted by permission of Journal of Quality Technology
- Table 3.5, Reaven and Miller (1979), Reprinted by permission of Diabetologia
- Table 3.6, Timm (1975), Reprinted by permission of Elsevier North-Holland Publishing Company
- Table 3.7, Elston and Grizzle (1962), Reprinted by permission of Biometrics
- Table 3.8, Frets (1921), Reprinted by permission of Genetica
- Table 3.9, O'Sullivan and Mahan (1966), Reprinted by permission of American Journal of Clinical Nutrition
- Table 4.2, Royston (1983), Reprinted by permission of Applied Statistics
- Table 5.1, Beall (1945), Reprinted by permission of Psychometrika
- Table 5.2, Hummel and Sligo (1971), Reprinted by permission of Psychological Bulletin
- Table 5.3, Kramer and Jensen (1969b), Reprinted by permission of Journal of Quality Technology
- Table 5.5, Lubischew (1962), Reprinted by permission of Biometrics
- Table 5.6, Travers (1939), Reprinted by permission of Psychometrika
- Table 5.7, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag
- Table 5.8, Tintner (1946), Reprinted by permission of Journal of the American Statistical Association
- Table 5.9, Kramer (1972), Reprinted by permission of the author
- Table 5.10, Cameron and Pauling (1978), Reprinted by permission of National Academy of Sciences
- Table 6.2, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag
- Table 6.3, Rencher and Scott (1990), Reprinted by permission of Communications in Statistics: Simulation and Computation
- Table 6.6, Posten (1962), Reprinted by permission of the author
- Table 6.8, Crowder and Hand (1990, pp. 21–29), Reprinted by permission of Routledge Chapman and Hall
- Table 6.12, Cochran and Cox (1957), Timm (1980), Reprinted by permission of John Wiley and Sons and Elsevier North-Holland Publishing Company
- Table 6.14, Timm (1980), Reprinted by permission of Elsevier North-Holland Publishing Company
- Table 6.16, Potthoff and Roy (1964), Reprinted by permission of Biometrika Trustees
- Table 6.17, Baten, Tack, and Baeder (1958), Reprinted by permission of Quality Progress
- Table 6.18, Keuls et al. (1984), Reprinted by permission of Scientia Horticulturae
- Table 6.19, Burdick (1979), Reprinted by permission of the author
- Table 6.20, Box (1950), Reprinted by permission of Biometrics
- Table 6.21, Rao (1948), Reprinted by permission of Biometrika Trustees
- Table 6.22, Cameron and Pauling (1978), Reprinted by permission of National Academy of Sciences
- Table 6.23, Williams and Izenman (1981), Reprinted by permission of Colorado State University
- Table 6.24, Beauchamp and Hoel (1973), Reprinted by permission of Journal of Statistical Computation and Simulation
- Table 6.25, Box (1950), Reprinted by permission of Biometrics
- Table 6.26, Grizzle and Allen (1969), Reprinted by permission of Biometrics
- Table 6.27, Crepeau et al. (1985), Reprinted by permission of Biometrics
- Table 6.28, Zerbe (1979a), Reprinted by permission of Journal of the American Statistical Association
- Table 6.29, Timm (1980), Reprinted by permission of Elsevier North-Holland Publishing Company
- Table 7.1, Siotani et al. (1963), Reprinted by permission of Institute of Statistical Mathematics
- Table 7.2, Reprinted by permission of R. J. Freund
- Table 8.1, Kramer and Jensen (1969a), Reprinted by permission of Journal of Quality Technology
- Table 8.3, Reprinted by permission of G. R. Bryce and R. M. Barker
- Table 10.1, Box and Youle (1955), Reprinted by permission of Biometrics
- Tables 12.2, 12.3, and 12.4, Jeffers (1967), Reprinted by permission of Applied Statistics
- Table 13.1, Brown et al. (1984), Reprinted by permission of Journal of Pascal, Ada, and Modula
- Correlation matrix in Example 13.6, Brown, Strong, and Rencher (1973), Reprinted by permission of The Journal of the Acoustical Society of America
- Table 15.1, Hartigan (1975a), Reprinted by permission of John Wiley and Sons
- Table 15.3, Dawkins (1989), Reprinted by permission of The American Statistician
- Table 15.7, Hand et al. (1994), Reprinted by permission of D. J. Hand
- Table 15.12, Sokal and Rohlf (1981), Reprinted by permission of W. H. Freeman and Co.
- Table 15.13, Hand et al. (1994), Reprinted by permission of D. J. Hand
- Table 16.1, Kruskal and Wish (1978), Reprinted by permission of Sage Publications
- Tables 16.2 and 16.5, Hand et al. (1994), Reprinted by permission of D. J. Hand
- Table 16.13, Edwards and Kreiner (1983), Reprinted by permission of Biometrika
- Table 16.15, Hand et al. (1994), Reprinted by permission of D. J. Hand
- Table 16.16, Everitt (1987), Reprinted by permission of the author
- Table 16.17, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag
- Table 16.18, Clausen (1998), Reprinted by permission of Sage Publications
- Table 16.19, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag
- Table A.1, Mulholland (1977), Reprinted by permission of Biometrika Trustees
- Table A.2, D'Agostino and Pearson (1973), Reprinted by permission of Biometrika Trustees
- Table A.3, D'Agostino and Tietjen (1971), Reprinted by permission of Biometrika Trustees
- Table A.4, D'Agostino (1972), Reprinted by permission of Biometrika Trustees
- Table A.5, Mardia (1970, 1974), Reprinted by permission of Biometrika Trustees
- Table A.6, Barnett and Lewis (1978), Reprinted by permission of John Wiley and Sons
- Table A.7, Kramer and Jensen (1969a), Reprinted by permission of Journal of Quality Technology
- Table A.8, Bailey (1977), Reprinted by permission of Journal of the American Statistical Association
- Table A.9, Wall (1967), Reprinted by permission of the author, Albuquerque, NM
- Table A.10, Pearson and Hartley (1972) and Pillai (1964, 1965), Reprinted by permission of Biometrika Trustees
- Table A.11, Schuurmann et al. (1975), Reprinted by permission of Journal of Statistical Computation and Simulation
- Table A.12, Davis (1970a, b, 1980), Reprinted by permission of Biometrika Trustees
- Table A.13, Kleinbaum, Kupper, and Muller (1988), Reprinted by permission of PWS-KENT Publishing Company
- Table A.14, Lee et al. (1977), Reprinted by permission of Elsevier North-Holland Publishing Company
- Table A.15, Mathai and Katiyar (1979), Reprinted by permission of Biometrika Trustees
CHAPTER 1 INTRODUCTION

1.1 WHY MULTIVARIATE ANALYSIS?

Multivariate analysis consists of a collection of methods that can be used when several measurements are made on each individual or object in one or more samples. We will refer to the measurements as variables and to the individuals or objects as units (research units, sampling units, or experimental units) or observations. In practice, multivariate data sets are common, although they are not always analyzed as such. But the exclusive use of univariate procedures with such data is no longer excusable, given the availability of multivariate techniques and inexpensive computing power to carry them out.

Historically, the bulk of applications of multivariate techniques have been in the behavioral and biological sciences. However, interest in multivariate methods has now spread to numerous other fields of investigation. For example, we have collaborated on multivariate problems with researchers in education, chemistry, environmental science, physics, geology, medicine, engineering, law, business, literature, religion, public broadcasting, nursing, mining, linguistics, biology, psychology, and many other fields. Table 1.1 shows some examples of multivariate observations.

Table 1.1 Examples of Multivariate Data

| Units | Variables |
| --- | --- |
| 1. Students | Several exam scores in a single course |
| 2. Students | Grades in mathematics, history, music, art, physics |
| 3. People | Height, weight, percentage of body fat, resting heart rate |
| 4. Skulls | Length, width, cranial capacity |
| 5. Companies | Expenditures for advertising, labor, raw materials |
| 6. Manufactured items | Various measurements to check on compliance with specifications |
| 7. Applicants for bank loans | Income, education level, length of residence, savings account, current debt load |
| 8. Segments of literature | Sentence length, frequency of usage of certain words, and style characteristics |
| 9. Human hairs | Composition of various elements |
| 10. Birds | Lengths of various bones |

The reader will notice that in some cases all the variables are measured in the same scale (see 1 and 2 in Table 1.1). In other cases, measurements are in different scales (see 3 in Table 1.1). In a few techniques, such as profile analysis (Sections 5.9 and 6.8), the variables must be commensurate, that is, similar in scale of measurement; however, most multivariate methods do not require this.

Ordinarily the variables are measured simultaneously on each sampling unit. Typically, these variables are correlated. If this were not so, there would be little use for many of the techniques of multivariate analysis. We need to untangle the overlapping information provided by correlated variables and peer beneath the surface to see the underlying structure.
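To make the role of correlated variables concrete, here is a minimal Python sketch (not from the book; all data values are invented) that builds a small data matrix like unit 3 of Table 1.1, computes its correlation matrix, and forms a single linear combination of the variables:

```python
import numpy as np

# Hypothetical measurements on five people (rows): height (cm),
# weight (kg), body fat (%), resting heart rate (bpm).
# The values are invented, echoing unit 3 of Table 1.1.
Y = np.array([
    [170.0, 65.0, 18.0, 62.0],
    [182.0, 84.0, 22.0, 70.0],
    [165.0, 58.0, 15.0, 58.0],
    [175.0, 78.0, 25.0, 74.0],
    [160.0, 52.0, 12.0, 55.0],
])

# The sample correlation matrix shows the overlapping information
# carried by the four variables.
R = np.corrcoef(Y, rowvar=False)
print(np.round(R, 2))

# A linear combination z = a'y collapses the four correlated variables
# into one summary score per person (the weights here are arbitrary).
a = np.array([0.25, 0.25, 0.25, 0.25])
z = Y @ a
print(z)
```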
Thus the goal of many multivariate approaches is simplification. We seek to express "what is going on" in terms of a reduced set of dimensions. Such multivariate techniques are exploratory; they essentially generate hypotheses rather than test them. On the other hand, if our goal is a formal hypothesis test, we need a technique that will (1) allow several variables to be tested and still preserve the significance level and (2) do this for any intercorrelation structure of the variables. Many such tests are available.

As the two preceding paragraphs imply, multivariate analysis is concerned generally with two areas, descriptive and inferential statistics. In the descriptive realm, we often obtain optimal linear combinations of variables. The optimality criterion varies from one technique to another, depending on the goal in each case. Although linear combinations may seem too simple to reveal the underlying structure, we use them for two obvious reasons: (1) mathematical tractability (linear approximations are used throughout all science for the same reason) and (2) they often perform well in practice. These linear functions may also be useful as a follow-up to inferential procedures. When we have a statistically significant test result that compares several groups, for example, we can find the linear combination (or combinations) of variables that led to rejection. Then the contribution of each variable to these linear combinations is of interest.

In the inferential area, many multivariate techniques are extensions of univariate procedures. In such cases we review the univariate procedure before presenting the analogous multivariate approach. Multivariate inference is especially useful in curbing the researcher's natural tendency to read too much into the data. Total control is provided for the experimentwise error rate; that is, no matter how many variables are tested simultaneously, the value of α (the significance level) remains at the level set by the researcher.

Some authors warn against applying the common multivariate techniques to data for which the measurement scale is not interval or ratio. It has been found, however, that many multivariate techniques give reliable results when applied to ordinal data.

For many years the applications lagged behind the theory because the computations were beyond the power of the available desktop calculators. However, with modern computers, virtually any analysis one desires, no matter how many variables or observations are involved, can be quickly and easily carried out. Perhaps it is not premature to say that multivariate analysis has come of age.

1.2 PREREQUISITES

The mathematical prerequisite for reading this book is matrix algebra. Calculus is not used [with a brief exception in equation (4.29)]. But the basic tools of matrix algebra are essential, and the presentation in Chapter 2 is intended to be sufficiently complete so that the reader with no previous experience can master matrix manipulation up to the level required in this book.

The statistical prerequisites are basic familiarity with the normal distribution, t-tests, confidence intervals, multiple regression, and analysis of variance. These techniques are reviewed as each is extended to the analogous multivariate procedure.

This is a multivariate methods text. Most of the results are given without proof. In a few cases proofs are provided, but the major emphasis is on heuristic explanations. Our goal is an intuitive grasp of multivariate analysis, in the same mode as other statistical methods courses.
Some problems are algebraic in nature, but the majority involve data sets to be analyzed.

1.3 OBJECTIVES

We have formulated three objectives that we hope this book will achieve for the reader. These objectives are based on long experience teaching a course in multivariate methods, consulting on multivariate problems with researchers in many fields, and guiding statistics graduate students as they consulted with similar clients.

The first objective is to gain a thorough understanding of the details of various multivariate techniques, their purposes, their assumptions, their limitations, and so on. Many of these techniques are related, yet they differ in some essential ways. These similarities and differences are emphasized.
The second objective is to be able to select one or more appropriate techniques for a given multivariate data set. Recognizing the essential nature of a multivariate data set is the first step in a meaningful analysis. Basic types of multivariate data are introduced in Section 1.4.

The third objective is to be able to interpret the results of a computer analysis of a multivariate data set. Reading the manual for a particular program package is not enough to make an intelligent appraisal of the output. Achievement of the first objective and practice on data sets in the text should help achieve the third objective.

1.4 BASIC TYPES OF DATA AND ANALYSIS

We will list four basic types of (continuous) multivariate data and then briefly describe some possible analyses. Some writers would consider this an oversimplification and might prefer elaborate tree diagrams of data structure. However, many data sets can fit into one of these categories, and the simplicity of this structure makes it easier to remember. The four basic data types are as follows:

1. A single sample with several variables measured on each sampling unit (subject or object).
2. A single sample with two sets of variables measured on each unit.
3. Two samples with several variables measured on each unit.
4. Three or more samples with several variables measured on each unit.

Each data type has extensions, and various combinations of the four are possible. A few examples of analyses for each case will now be given:

1. A single sample with several variables measured on each sampling unit:
   a. Test the hypothesis that the means of the variables have specified values.
   b. Test the hypothesis that the variables are uncorrelated and have a common variance.
   c. Find a small set of linear combinations of the original variables that summarizes most of the variation in the data (principal components).
   d. Express the original variables as linear functions of a smaller set of underlying variables that account for the original variables and their intercorrelations (factor analysis).
2. A single sample with two sets of variables measured on each unit:
   a. Determine the number, the size, and the nature of relationships between the two sets of variables (canonical correlation). For example, we may wish to relate a set of interest variables to a set of achievement variables. How much overall correlation is there between these two sets?
   b. Find a model to predict one set of variables from the other set (multivariate multiple regression).
3. Two samples with several variables measured on each unit:
   a. Compare the means of the variables across the two samples (Hotelling's T2-test).
   b. Find a linear combination of the variables that best separates the two samples (discriminant analysis).
   c. Find a function of the variables that will accurately allocate the units into the two groups (classification analysis).
4. Three or more samples with several variables measured on each unit:
   a. Compare the means of the variables across the groups (multivariate analysis of variance).
   b. Extension of 3b to more than two groups.
   c. Extension of 3c to more than two groups.

CHAPTER 2 MATRIX ALGEBRA

2.1 INTRODUCTION

This chapter introduces the basic elements of matrix algebra used in the remainder of this book. It is essentially a review of the requisite matrix tools and is not intended to be a complete development. However, it is sufficiently self-contained so that those with no previous exposure to the subject should need no other reference. Anyone unfamiliar with matrix algebra should plan to work most of the problems entailing numerical illustrations. It would also be helpful to explore some of the problems involving general matrix manipulation. With the exception of a few derivations that seemed instructive, most of the results are given without proof. Some additional proofs are requested in the problems. For the remaining proofs, see any general text on matrix theory or one of the specialized matrix texts oriented to statistics, such as Graybill (1969), Searle (1982), or Harville (1997).

2.2 NOTATION AND BASIC DEFINITIONS

2.2.1 Matrices, Vectors, and Scalars

A matrix is a rectangular or square array of numbers or variables arranged in rows and columns. We use uppercase boldface letters to represent matrices. All entries in matrices will be real numbers or variables representing real numbers. The elements of a matrix are displayed in brackets. For example, the ACT score and GPA for three students can be conveniently listed in a 3 × 2 matrix A, with one row per student and one column each for ACT and GPA (2.1). The elements of A can also be variables, representing possible values of ACT and GPA for three students:

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \qquad (2.2)$$

In this double-subscript notation for the elements of a matrix, the first subscript indicates the row; the second identifies the column. The matrix A in (2.2) could also be expressed as

$$A = (a_{ij}), \qquad (2.3)$$

where $a_{ij}$ is a general element. With three rows and two columns, the matrix A in (2.1) or (2.2) is said to be 3 × 2. In general, if a matrix A has n rows and p columns, it is said to be n × p. Alternatively, we say the size of A is n × p.
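The size and double-subscript conventions translate directly into code. A minimal sketch, using invented ACT and GPA values since the numerical display in (2.1) is not reproduced in this transcript:

```python
import numpy as np

# Hypothetical ACT score and GPA for three students (values invented).
A = np.array([
    [23, 3.54],
    [29, 3.81],
    [22, 2.86],
])

print(A.shape)   # (3, 2): A is n x p with n = 3 rows, p = 2 columns

# Double-subscript convention: first index is the row, second the column.
# NumPy indexes from 0, so element a_32 (third row, second column) is A[2, 1].
print(A[2, 1])   # 2.86
```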
A vector is a matrix with a single column or row. For example, the test scores of a student in a course in multivariate analysis can be listed in a column vector (2.4). Variable elements in a vector can be identified by a single subscript:

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \qquad (2.5)$$

We use lowercase boldface letters for column vectors. Row vectors are expressed as

$$\mathbf{x}' = (x_1, x_2, \ldots, x_n),$$

where x′ indicates the transpose of x. The transpose operation is defined in Section 2.2.3.

Geometrically, a vector with p elements identifies a point in a p-dimensional space. The elements in the vector are the coordinates of the point. In (2.35) in Section 2.3.3, we define the distance from the origin to the point. In Section 3.13, we define the distance between two vectors. In some cases, we will be interested in a directed line segment or arrow from the origin to the point.

A single real number is called a scalar, to distinguish it from a vector or matrix. Thus 2, −4, and 125 are scalars. A variable representing a scalar will usually be denoted by a lowercase nonbolded letter, such as a = 5. A product involving vectors and matrices may reduce to a matrix of size 1 × 1, which then becomes a scalar.

2.2.2 Equality of Vectors and Matrices

Two matrices are equal if they are the same size and the elements in corresponding positions are equal. Thus if A = (a_ij) and B = (b_ij), then A = B if a_ij = b_ij for all i and j. For example, let

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 7 \end{bmatrix}$$

(the specific values are illustrative, chosen to satisfy the relationships described next). Then A = C. But even though A and B have the same elements, A ≠ B because the two matrices are not the same size. Likewise, A ≠ D because a_23 ≠ d_23. Thus two matrices of the same size are unequal if they differ in a single position.

2.2.3 Transpose and Symmetric Matrices

The transpose of a matrix A, denoted by A′, is obtained from A by interchanging rows and columns. Thus the columns of A′ are the rows of A, and the rows of A′ are the columns of A. For example, with A as above,

$$A' = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}.$$
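The transpose and symmetry definitions can be checked numerically. A brief sketch (the matrices are illustrative, not the book's):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])

# Transpose: rows of A become columns of A'.
print(A.T)                        # a 3 x 2 matrix

# Applying the transpose twice recovers A, as in (2.6) below.
print(np.array_equal(A.T.T, A))   # True

# A matrix is symmetric if A = A'.
S = np.array([[3, 1, 2],
              [1, 5, 4],
              [2, 4, 9]])
print(np.array_equal(S, S.T))     # True: S is symmetric (and square)
```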
The transpose operation does not change a scalar, since it has only one row and one column. If the transpose operator is applied twice to any matrix, the result is the original matrix:

$$(A')' = A. \qquad (2.6)$$

If the transpose of a matrix is the same as the original matrix, the matrix is said to be symmetric; that is, A is symmetric if A = A′. Clearly, all symmetric matrices are square.

2.2.4 Special Matrices

The diagonal of a p × p square matrix A consists of the elements $a_{11}, a_{22}, \ldots, a_{pp}$. For example, in the matrix

$$A = \begin{bmatrix} 5 & 2 & 0 \\ 2 & 9 & 3 \\ 0 & 3 & 1 \end{bmatrix}$$

(the off-diagonal values are illustrative), the elements 5, 9, and 1 lie on the diagonal. If a matrix contains zeros in all off-diagonal positions, it is said to be a diagonal matrix. An example of a diagonal matrix is

$$D = \begin{bmatrix} 8 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 4 \end{bmatrix}$$

(again with illustrative values). This matrix could also be denoted as

$$D = \mathrm{diag}(8, -3, 4). \qquad (2.7)$$

A diagonal matrix can be formed from any square matrix by replacing off-diagonal elements by 0's. This is denoted by diag(A). Thus for the above matrix A, we have

$$\mathrm{diag}(A) = \begin{bmatrix} 5 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \qquad (2.8)$$

A diagonal matrix with a 1 in each diagonal position is called an identity matrix and is denoted by I. For example, a 3 × 3 identity matrix is given by

$$I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \qquad (2.9)$$

An upper triangular matrix is a square matrix with zeros below the diagonal, for example,
(2.10) A lower triangular matrix is defined similarly. A vector of 1’s will be denoted by j: j = (1, 1, …, 1)′. (2.11) A square matrix of 1’s is denoted by J. For example, a 3 × 3 matrix J is given by J = [1 1 1; 1 1 1; 1 1 1]. (2.12) Finally, we denote a vector of zeros by 0 and a matrix of zeros by O, for example, 0 = (0, 0, 0)′. (2.13)

2.3 OPERATIONS

2.3.1 Summation and Product Notation

For completeness, we review the standard mathematical notation for sums and products. The sum of a sequence of numbers a1, a2, …, an is indicated by Σi ai = a1 + a2 + … + an. If the n numbers are all the same, then Σi a = a + a + … + a = na. The sum of all the numbers in an array with double subscripts, such as aij (i = 1, 2, …, n; j = 1, 2, …, p), is indicated by Σi Σj aij. This is sometimes abbreviated to Σij aij. The product of a sequence of numbers a1, a2, …, an is indicated by Πi ai = a1a2 … an. If the n numbers are all equal, the product becomes Πi a = (a)(a) … (a) = aⁿ.
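These special matrices are convenient to build in code; a small NumPy sketch with arbitrary values:

```python
import numpy as np

D = np.diag([10.0, -3.0, 7.0])  # diagonal matrix diag(10, -3, 7)
I = np.eye(3)                   # 3 x 3 identity matrix
J = np.ones((3, 3))             # 3 x 3 matrix of 1's
j = np.ones(3)                  # vector of 1's

A = np.array([[5.0, 2.0, 1.0],
              [2.0, 9.0, 4.0],
              [1.0, 4.0, 1.0]])
print(np.diag(np.diag(A)))      # diag(A): off-diagonal elements replaced by 0's
print(np.triu(A))               # upper triangular part: zeros below the diagonal
print(A.sum())                  # the double-subscript sum of all elements of A
```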
2.3.2 Addition of Matrices and Vectors

If two matrices (or two vectors) are the same size, their sum is found by adding corresponding elements; that is, if A is n × p and B is n × p, then C = A + B is also n × p and is found as (cij) = (aij + bij). Similarly, the difference between two matrices or two vectors of the same size is found by subtracting corresponding elements, so that C = A − B is found as (cij) = (aij − bij). If two matrices are identical, their difference is a zero matrix; that is, A = B implies A − B = O.

Matrix addition is commutative: A + B = B + A. (2.14) The transpose of the sum (difference) of two matrices is the sum (difference) of the transposes: (A + B)′ = A′ + B′, (2.15) (A − B)′ = A′ − B′, (2.16) (x + y)′ = x′ + y′, (2.17) (x − y)′ = x′ − y′. (2.18)

2.3.3 Multiplication of Matrices and Vectors

In order for the product AB to be defined, the number of columns in A must be the same as the number of rows in B, in which case A and B are said to be conformable. Then the (ij)th element of C = AB is cij = Σk aikbkj = ai1b1j + ai2b2j + … + aimbmj. (2.19) Thus cij is the sum of products of the ith row of A and the jth column of B. We therefore multiply each row of A by each column of B, and the size of AB consists of the number of rows of A and the number of columns of B. Thus, if A is n × m and B is m × p, then C = AB is n × p. For example, if A is 4 × 3 and B is 3 × 2, then AB is 4 × 2.
In this case, AB is of a different size than either A or B. If A and B are both n × n, then AB is also n × n. Clearly, A² is defined only if A is square. In some cases AB is defined, but BA is not defined. In the above example, BA cannot be found because B is 3 × 2 and A is 4 × 3, so a row of B cannot be multiplied by a column of A. Sometimes AB and BA are both defined but are different in size. For example, if A is 2 × 4 and B is 4 × 2, then AB is 2 × 2 and BA is 4 × 4. If A and B are square and the same size, then AB and BA are both defined. However, AB ≠ BA (2.20) except for a few special cases. Thus we must be careful to specify the order of multiplication. If we wish to multiply both sides of a matrix equation by a matrix, we must multiply “on the left” or “on the right” and be consistent on both sides of the equation.

Multiplication is distributive over addition or subtraction: A(B + C) = AB + AC, (2.21) A(B − C) = AB − AC, (2.22) (A + B)C = AC + BC, (2.23) (A − B)C = AC − BC. (2.24) Note that, in general, because of (2.20), A(B + C) ≠ BA + CA. (2.25) Using the distributive law, we can expand products such as (A − B)(C − D) to obtain (A − B)(C − D) = AC − AD − BC + BD. (2.26)

The transpose of a product is the product of the transposes in reverse order: (AB)′ = B′A′. (2.27) Note that (2.27) holds as long as A and B are conformable. They need not be square.

Multiplication involving vectors follows the same rules as for matrices. Suppose A is n × p, a is p × 1, b is p × 1, and c is n × 1. Then some possible products are Ab, c′A, a′b, b′a, and ab′.
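A short NumPy sketch with arbitrary 2 × 2 values illustrating these multiplication rules:

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [2.0, -1.0]])
B = np.array([[2.0, 0.0],
              [1.0, 5.0]])

print(A @ B)                              # generally not equal to B @ A, as in (2.20)
print(B @ A)
print(np.allclose((A @ B).T, B.T @ A.T))  # True: (AB)' = B'A', as in (2.27)

a = np.array([1.0, 2.0])   # p x 1
b = np.array([3.0, 4.0])   # p x 1
c = np.array([5.0, 6.0])   # n x 1 (here n = p = 2)

print(A @ b)           # Ab: a column vector
print(c @ A)           # c'A: a row vector
print(c @ A @ b)       # c'Ab: a scalar; (c'A)b and c'(Ab) agree
print(a @ b == b @ a)  # True: a'b = b'a
```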
Note that Ab is a column vector, c′A is a row vector, c′Ab is a scalar, and a′b = b′a. The triple product c′Ab was obtained as c′(Ab). The same result would be obtained if we multiplied in the order (c′A)b. This is true in general for a triple product: A(BC) = (AB)C. (2.28) Thus multiplication of three matrices can be defined in terms of the product of two matrices, since (fortunately) it does not matter which two are multiplied first. Note that A and B must be conformable for multiplication, and B and C must be conformable. For example, if A is n × p, B is p × q, and C is q × m, then both multiplications are possible and the product ABC is n × m.

We can sometimes factor a sum of triple products on both the right and left sides. For example, ABC + ADC = A(B + D)C. (2.29) As another illustration, let X be n × p and A be n × n. Then (2.30)

If a and b are both n × 1, then a′b = a1b1 + a2b2 + … + anbn (2.31) is a sum of products and is a scalar. On the other hand, ab′ is defined for any size a and b and is a matrix, either rectangular or square:
ab′ = (aibj). (2.32) Similarly, a′a = a1² + a2² + … + an² (2.33) and aa′ = (aiaj). (2.34) Thus a′a is a sum of squares and aa′ is a square (symmetric) matrix. The products a′a and aa′ are sometimes referred to as the dot product and matrix product, respectively. The square root of the sum of squares of the elements of a is the distance from the origin to the point a and is also referred to as the length of a: length of a = √(a′a) = √(a1² + a2² + … + an²). (2.35)

As special cases of (2.33) and (2.34), note that if j is n × 1, then j′j = n and jj′ = J, (2.36) where j and J were defined in (2.11) and (2.12). If a is n × 1 and A is n × p, then a′j = j′a = Σi ai, (2.37) j′A = (Σi ai1, Σi ai2, …, Σi aip) and Aj = (Σj a1j, Σj a2j, …, Σj anj)′. (2.38) Thus a′j is the sum of the elements in a, j′A contains the column sums of A, and Aj contains the row sums of A. In a′j, the vector j is n × 1; in j′A, the vector j is n × 1; and in Aj, the vector j is p × 1.

Since a′b is a scalar, it is equal to its transpose: a′b = (a′b)′ = b′a. (2.39) This allows us to write (a′b)² in the form (a′b)² = (a′b)(b′a) = a′(bb′)a. (2.40) From (2.18), (2.26), and (2.39) we obtain (a + b)′(a + b) = a′a + 2a′b + b′b. (2.41) Note that in analogous expressions with matrices, however, the two middle terms cannot be combined: (A + B)′(A + B) = A′A + A′B + B′A + B′B.
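The vector products above can be checked numerically; a sketch with arbitrary 3 × 1 vectors:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
j = np.ones(3)

print(a @ b)           # a'b: a sum of products (a scalar)
print(np.outer(a, a))  # aa': a square symmetric matrix
print(np.sqrt(a @ a))  # length of a, the distance from the origin (2.35)
print(a @ j)           # a'j: the sum of the elements of a (2.37)
print(np.isclose((a @ b) ** 2, a @ np.outer(b, b) @ a))  # True: (a'b)^2 = a'(bb')a
```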
If a and x1, x2, …, xn are all p × 1 and A is p × p, we obtain the following factoring results as extensions of (2.21) and (2.29): A(x1 + x2 + … + xn) = Ax1 + Ax2 + … + Axn, (2.42) a′(x1 + x2 + … + xn) = a′x1 + a′x2 + … + a′xn, (2.43) Σi (Axi)(Axi)′ = A(Σi xixi′)A′, (2.44) Σi (a′xi)² = a′(Σi xixi′)a. (2.45)

We can express matrix multiplication in terms of row vectors and column vectors. If ai′ is the ith row of A and bj is the jth column of B, then the (ij)th element of AB is ai′bj. For example, if A has three rows and B has two columns, then the product AB can be written as AB = [a1′b1 a1′b2; a2′b1 a2′b2; a3′b1 a3′b2]. (2.46) This can be expressed in terms of the rows of A: AB = [a1′B; a2′B; a3′B]. (2.47) Note that the first column of AB in (2.46) is Ab1, and likewise the second column is Ab2. Thus AB can be written in the form AB = (Ab1, Ab2). This result holds in general: AB = A(b1, b2, …, bp) = (Ab1, Ab2, …, Abp). (2.48)

To further illustrate matrix multiplication in terms of rows and columns, let A be an n × p matrix, x be a p × 1 vector, and S be a p × p matrix. Then Ax = (a1′x, a2′x, …, an′x)′. (2.49) (2.50)

Any matrix can be multiplied by its transpose. If A is n × p, then AA′ is n × n. Similarly, A′A is p × p. From (2.6) and (2.27), it is clear that both AA′ and A′A are symmetric. In the above illustration for AB in terms of row and column vectors, the rows of A were denoted by ai′ and the columns of B by bj. If both rows and columns of a matrix A are under discussion, as in AA′ and A′A, we will use the notation ai′ for rows and a(j) for columns. To illustrate, if A is 3 × 4, we have A = [a1′; a2′; a3′] = (a(1), a(2), a(3), a(4)),
where, for example, a2′ = (a21, a22, a23, a24) and a(3) = (a13, a23, a33)′. With this notation for rows and columns of A, we can express the elements of A′A or of AA′ as products of the rows of A or of the columns of A. Thus if we write A in terms of its rows as A = [a1′; a2′; …; an′], then we have A′A = a1a1′ + a2a2′ + … + anan′ = Σi aiai′, (2.51) AA′ = (ai′aj). (2.52) Similarly, if we express A in terms of columns as A = (a(1), a(2), …, a(p)), then AA′ = a(1)a(1)′ + a(2)a(2)′ + … + a(p)a(p)′ = Σj a(j)a(j)′ (2.53)
and A′A = (a(i)′a(j)). (2.54)

Let A = (aij) be an n × n matrix and D be a diagonal matrix, D = diag(d1, d2, …, dn). Then, in the product DA, the ith row of A is multiplied by di, and in AD, the jth column of A is multiplied by dj: DA = (diaij), (2.55) AD = (djaij), (2.56) DAD = (didjaij). (2.57) In the special case where the diagonal matrix is the identity, we have IA = AI = A. (2.58) If A is rectangular, (2.58) still holds, but the two identities are of different sizes.

The product of a scalar and a matrix is obtained by multiplying each element of the matrix by the scalar: cA = (caij). (2.59) For example, (2.60)
(2.61) Since caij = aijc, the product of a scalar and a matrix is commutative: cA = Ac. (2.62) Multiplication of vectors or matrices by scalars permits the use of linear combinations, such as a1x1 + a2x2 + … + akxk.

If A is a symmetric matrix and x and y are vectors, the product x′Ax = Σi aiixi² + Σi≠j aijxixj (2.63) is called a quadratic form, while x′Ay = Σij aijxiyj (2.64) is called a bilinear form. Either of these is, of course, a scalar and can be treated as such. Expressions such as √(x′Ax) are permissible (assuming A is positive definite; see Section 2.7).

2.4 PARTITIONED MATRICES

It is sometimes convenient to partition a matrix into submatrices. For example, a partitioning of a matrix A into four submatrices could be indicated symbolically as follows: A = [A11 A12; A21 A22]. For example, a 4 × 5 matrix A could be partitioned in this way. If two matrices A and B are conformable and A and B are partitioned so that the submatrices are appropriately conformable, then the product AB can be found by following the usual row-by-column pattern of multiplication on the submatrices as if they were single elements; for example, AB = [A11 A12; A21 A22][B11 B12; B21 B22] = [A11B11 + A12B21 A11B12 + A12B22; A21B11 + A22B21 A21B12 + A22B22]. (2.65)
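A numerical check of the block-multiplication rule in (2.65), using randomly generated matrices; the partitioning below is arbitrary (here B is split only by rows, so the product has a single block column):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 3))

# Partition A into four blocks and B into two conformable row blocks.
A11, A12 = A[:2, :3], A[:2, 3:]
A21, A22 = A[2:, :3], A[2:, 3:]
B1, B2 = B[:3, :], B[3:, :]

top = A11 @ B1 + A12 @ B2
bottom = A21 @ B1 + A22 @ B2
print(np.allclose(np.vstack([top, bottom]), A @ B))  # True
```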
It can be seen that this formulation is equivalent to the usual row-by-column definition of matrix multiplication. For example, the (1, 1) element of AB is the product of the first row of A and the first column of B. In the (1, 1) element of A11B11 we have the sum of products of part of the first row of A and part of the first column of B. In the (1, 1) element of A12B21 we have the sum of products of the rest of the first row of A and the remainder of the first column of B.

Multiplication of a matrix and a vector can also be carried out in partitioned form. For example, Ab = (A1, A2)[b1; b2] = A1b1 + A2b2, (2.66) where the partitioning of the columns of A corresponds to the partitioning of the elements of b. Note that the partitioning of A into two sets of columns is indicated by a comma, A = (A1, A2). The partitioned multiplication in (2.66) can be extended to individual columns of A and individual elements of b: Ab = b1a(1) + b2a(2) + … + bpa(p). (2.67) Thus Ab is expressible as a linear combination of the columns of A, the coefficients being elements of b.

We note that if A is partitioned as in (2.66), A = (A1, A2), the transpose is not equal to (A′1, A′2), but rather A′ = [A′1; A′2], (2.68) in which A′1 lies above A′2.

2.5 RANK

Before defining the rank of a matrix, we first introduce the notion of linear independence and dependence. A set of vectors a1, a2, …, an is said to be linearly dependent if constants c1, c2, …, cn (not all zero) can be found such that c1a1 + c2a2 + … + cnan = 0. (2.69) If no constants c1, c2, …, cn can be found satisfying (2.69), the set of vectors is said to be linearly independent.
If (2.69) holds, then at least one of the vectors ai can be expressed as a linear combination of the other vectors in the set. Thus linear dependence of a set of vectors implies redundancy in the set. Among linearly independent vectors there is no redundancy of this type.

The rank of any square or rectangular matrix A is defined as rank(A) = number of linearly independent rows of A = number of linearly independent columns of A. It can be shown that the number of linearly independent rows of a matrix is always equal to the number of linearly independent columns. If A is n × p, the maximum possible rank of A is the smaller of n and p, in which case A is said to be of full rank (sometimes said full row rank or full column rank). For example, A = [1 −2 3; 5 2 4] has rank 2 because the two rows are linearly independent (neither row is a multiple of the other). However, even though A is full rank, the columns are linearly dependent because rank 2 implies there are only two linearly independent columns. Thus, by (2.69), there exist constants c1, c2, and c3 such that c1[1; 5] + c2[−2; 2] + c3[3; 4] = [0; 0]. (2.70) By (2.67), we can write (2.70) in the form A[c1; c2; c3] = 0, or Ac = 0. (2.71) A solution vector to (2.70) or (2.71) is given by any multiple of c = (14, −11, −12)′. Hence we have the interesting result that a product of a matrix A and a vector c is equal to 0, even though A ≠ O and c ≠ 0. This is a direct consequence of the linear dependence of the column vectors of A.

Another consequence of the linear dependence of rows or columns of a matrix is the possibility of expressions such as AB = CB, where A ≠ C. The three matrices A, B, and C can all be of full rank; but being rectangular, they have a rank deficiency in either rows or columns, which permits us to construct AB = CB with A ≠ C. Thus in a matrix equation, we cannot, in general, cancel matrices from both sides of the equation. There are two exceptions to this rule. One involves a nonsingular matrix, to be defined in Section 2.6. The other special case occurs when the expression holds for all possible values of the matrix common to both sides of the equation; for example, if Ax = Bx for all possible values of x, then A = B. (2.72) To see this, let x = (1, 0, …, 0)′. Then the first column of A equals the first column of B. Now let x = (0, 1, 0, …, 0)′, and the second column of A equals the second column of B. Continuing in this fashion, we obtain A = B.

Suppose a rectangular matrix A is n × p of rank p, where p < n. We typically shorten this statement to “A is n × p of rank p < n.”
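Using the 2 × 3 example above, the rank and the relation Ac = 0 are easy to verify in NumPy:

```python
import numpy as np

A = np.array([[1.0, -2.0, 3.0],
              [5.0, 2.0, 4.0]])
c = np.array([14.0, -11.0, -12.0])

print(np.linalg.matrix_rank(A))  # 2: A is of full (row) rank
print(A @ c)                     # [0. 0.]: Ac = 0 although A != O and c != 0
```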
2.6 INVERSE

If a matrix A is square and of full rank, then A is said to be nonsingular, and A has a unique inverse, denoted by A−1, with the property that AA−1 = A−1A = I. (2.73) If A is square and of less than full rank, then an inverse does not exist, and A is said to be singular. Note that rectangular matrices do not have inverses as in (2.73), even if they are full rank.

If A and B are the same size and nonsingular, then the inverse of their product is the product of their inverses in reverse order: (AB)−1 = B−1A−1. (2.74) Note that (2.74) holds only for nonsingular matrices. Thus, for example, if A is n × p of rank p < n, then A′A has an inverse, but (A′A)−1 is not equal to A−1(A′)−1 because A is rectangular and does not have an inverse.

If a matrix is nonsingular, it can be canceled from both sides of an equation, provided it appears on the left (right) on both sides. For example, if B is nonsingular and AB = CB, then we can multiply on the right by B−1 to obtain ABB−1 = CBB−1, or A = C. Otherwise, if A, B, and C are rectangular or square and singular, it is easy to construct AB = CB with A ≠ C, as illustrated near the end of Section 2.5.

The inverse of the transpose of a nonsingular matrix is given by the transpose of the inverse: (A′)−1 = (A−1)′. (2.75) If the symmetric nonsingular matrix A is partitioned in the form A = [A11 a12; a12′ a22], then the inverse is given by A−1 = (1/b)[bA11−1 + A11−1a12a12′A11−1 −A11−1a12; −a12′A11−1 1], (2.76) where b = a22 − a12′A11−1a12. A nonsingular matrix of the form B + cc′, where B is nonsingular, has as its inverse (B + cc′)−1 = B−1 − (B−1cc′B−1)/(1 + c′B−1c). (2.77)
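Both (2.74) and the inverse formula (2.77) can be verified numerically; the matrices below are arbitrary nonsingular examples:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
B = np.array([[2.0, 1.0],
              [0.0, 1.0]])
c = np.array([1.0, 2.0])

# (AB)^{-1} = B^{-1} A^{-1}, as in (2.74)
print(np.allclose(np.linalg.inv(A @ B), np.linalg.inv(B) @ np.linalg.inv(A)))

# (B + cc')^{-1} computed via (2.77) matches the direct inverse
B_inv = np.linalg.inv(B)
rhs = B_inv - (B_inv @ np.outer(c, c) @ B_inv) / (1.0 + c @ B_inv @ c)
print(np.allclose(np.linalg.inv(B + np.outer(c, c)), rhs))
```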
2.7 POSITIVE DEFINITE MATRICES

The symmetric matrix A is said to be positive definite if x′Ax > 0 for all possible vectors x (except x = 0). Similarly, A is positive semidefinite if x′Ax ≥ 0 for all x ≠ 0. [A quadratic form x′Ax was defined in (2.63).]

The diagonal elements aii of a positive definite matrix are positive. To see this, let x′ = (0, …, 0, 1, 0, …, 0), with a 1 in the ith position. Then x′Ax = aii > 0. Similarly, for a positive semidefinite matrix A, aii ≥ 0 for all i.

One way to obtain a positive definite matrix is as follows: if B is n × p of full rank p < n, then A = B′B is positive definite. (2.78) This is easily shown: x′Ax = x′B′Bx = (Bx)′(Bx) = z′z, where z = Bx. Thus x′Ax = Σi zi², which is positive (Bx cannot be 0 unless x = 0, because B is full rank). If B is less than full rank, then by a similar argument, B′B is positive semidefinite. Note that A = B′B is analogous to a = b² in real numbers, where the square of any number (including negative numbers) is positive.

In another analogy to positive real numbers, a positive definite matrix can be factored into a “square root” in two ways. We give one method below in (2.79) and the other in Section 2.11.8. A positive definite matrix A can be factored into A = T′T, (2.79) where T is a nonsingular upper triangular matrix. One way to obtain T is the Cholesky decomposition, which can be carried out in the following steps. Let A = (aij) and T = (tij) be n × n. Then the elements of T are found as follows: t11 = √a11 and t1j = a1j/t11 for j > 1; tii = √(aii − Σk<i tki²) for 1 < i ≤ n; tij = (aij − Σk<i tkitkj)/tii for 1 < i < j ≤ n; and tij = 0 for i > j.
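A sketch of the factorization (2.79) in NumPy. Note that np.linalg.cholesky returns a lower triangular factor L with A = LL′, so its transpose is the upper triangular T of (2.79); the matrix A below is an arbitrary positive definite example:

```python
import numpy as np

A = np.array([[4.0, 2.0, 2.0],
              [2.0, 10.0, 7.0],
              [2.0, 7.0, 21.0]])  # arbitrary positive definite matrix

L = np.linalg.cholesky(A)  # lower triangular, A = L L'
T = L.T                    # upper triangular, so A = T'T as in (2.79)
print(T)
print(np.allclose(T.T @ T, A))  # True
```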
2.8 DETERMINANTS

The determinant of an n × n matrix A is defined as the sum of all n! possible products of n elements such that

1. Each product contains one element from every row and every column, and
2. The factors in each product are written so that the column subscripts appear in order of magnitude and each product is then preceded by a plus or minus sign according to whether the number of inversions in the row subscripts is even or odd.

An inversion occurs whenever a larger number precedes a smaller one. The symbol n! is defined as n! = n(n − 1)(n − 2) … (2)(1). (2.80)

The determinant of A is a scalar denoted by |A| or by det(A). The preceding definition is not useful in evaluating determinants, except in the case of 2 × 2 or 3 × 3 matrices. For larger matrices, other methods are available for manual computation, but determinants are typically evaluated by computer. For a 2 × 2 matrix, the determinant is found by |A| = a11a22 − a21a12. (2.81) For a 3 × 3 matrix, the determinant is given by |A| = a11a22a33 + a12a23a31 + a13a21a32 − a13a22a31 − a11a23a32 − a12a21a33. (2.82) This can be found by the following scheme: the three positive terms are products of elements along the diagonals running from upper left to lower right (with wraparound), and the three negative terms are products along the diagonals running from upper right to lower left.

The determinant of a diagonal matrix is the product of the diagonal elements; that is, if D = diag(d1, d2, …, dn), then |D| = d1d2 … dn = Πi di. (2.83) As a special case of (2.83), suppose all diagonal elements are equal, say, d1 = d2 = … = dn = c. Then |D| = |cI| = cⁿ. (2.84) The extension of (2.84) to any square matrix A is |cA| = cⁿ|A|. (2.85) Because the determinant is a scalar, we can carry out operations such as |A|1/2 and 1/|A|, provided that |A| > 0 for |A|1/2 and that |A| ≠ 0 for 1/|A|. If the square matrix A is singular, its determinant is 0: |A| = 0. (2.86)
If A is near singular, then there exists a linear combination of the columns that is close to 0, and |A| is also close to 0. If A is nonsingular, its determinant is nonzero: |A| ≠ 0. (2.87) If A is positive definite, its determinant is positive: |A| > 0. (2.88)

If A and B are square and the same size, the determinant of the product is the product of the determinants: |AB| = |A||B|. (2.89) The determinant of the transpose of a matrix is the same as the determinant of the matrix, and the determinant of the inverse of a matrix is the reciprocal of the determinant: |A′| = |A|, (2.90) |A−1| = 1/|A| = |A|−1. (2.91)

If a partitioned matrix has the form A = [A11 O; O A22], where A11 and A22 are square, but not necessarily the same size, then |A| = |A11||A22|. (2.92) For a general partitioned matrix A = [A11 A12; A21 A22], where A11 and A22 are square and nonsingular (not necessarily the same size), the determinant is given by either of the following two expressions: |A| = |A11||A22 − A21A11−1A12| (2.93) = |A22||A11 − A12A22−1A21|. (2.94) Note the analogy of (2.93) and (2.94) to the case of the determinant of a 2 × 2 matrix as given by (2.81): |A| = a11a22 − a12a21 = a11(a22 − a21a11−1a12) = a22(a11 − a12a22−1a21). If B is nonsingular and c is a vector, then |B + cc′| = |B|(1 + c′B−1c). (2.95)
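A quick numerical confirmation of (2.89)–(2.91) and (2.95), with arbitrary nonsingular 2 × 2 matrices:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[2.0, 1.0],
              [0.0, 1.0]])
c = np.array([1.0, 2.0])

print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))  # (2.89)
print(np.isclose(np.linalg.det(A.T), np.linalg.det(A)))                       # (2.90)
print(np.isclose(np.linalg.det(np.linalg.inv(A)), 1.0 / np.linalg.det(A)))    # (2.91)
# |B + cc'| = |B| (1 + c'B^{-1}c), as in (2.95)
print(np.isclose(np.linalg.det(B + np.outer(c, c)),
                 np.linalg.det(B) * (1.0 + c @ np.linalg.inv(B) @ c)))
```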
2.9 TRACE

A simple function of an n × n matrix A is the trace, denoted by tr(A) and defined as the sum of the diagonal elements of A; that is, tr(A) = Σi aii. The trace is, of course, a scalar.

The trace of the sum of two square matrices is the sum of the traces of the two matrices: tr(A + B) = tr(A) + tr(B). (2.96) An important result for the product of two matrices is tr(AB) = tr(BA). (2.97) This result holds for any matrices A and B where AB and BA are both defined. It is not necessary that A and B be square or that AB equal BA. From (2.52) and (2.54), we obtain tr(A′A) = tr(AA′) = Σi Σj aij², (2.98) where the aij’s are elements of the n × p matrix A.

2.10 ORTHOGONAL VECTORS AND MATRICES

Two vectors a and b of the same size are said to be orthogonal if a′b = a1b1 + a2b2 + … + anbn = 0. (2.99) Geometrically, orthogonal vectors are perpendicular [see (3.14) and the comments following (3.14)]. If a′a = 1, the vector a is said to be normalized. The vector a can always be normalized by dividing by its length, √(a′a). Thus c = a/√(a′a) (2.100) is normalized so that c′c = 1.

A matrix C = (c1, c2, …, cp) whose columns are normalized and mutually orthogonal is called an orthogonal matrix. Since the elements of C′C are products of columns of C [see (2.54)], which have the properties ci′ci = 1 for all i and ci′cj = 0 for all i ≠ j, we have C′C = I. (2.101) If C satisfies (2.101), it necessarily follows that CC′ = I, (2.102) from which we see that the rows of C are also normalized and mutually orthogonal. It is clear from (2.101) and (2.102) that C−1 = C′ for an orthogonal matrix C. We illustrate the creation of an orthogonal matrix by starting with a matrix
whose columns are mutually orthogonal. To normalize the three columns, we divide by their respective lengths, obtaining a matrix C whose rows, as well as columns, are normalized and mutually orthogonal, so that C satisfies both (2.101) and (2.102).

Multiplication by an orthogonal matrix has the effect of rotating axes; that is, if a point x is transformed to z = Cx, where C is orthogonal, then z′z = (Cx)′(Cx) = x′C′Cx = x′x, (2.103) and the distance from the origin to z is the same as the distance to x.

2.11 EIGENVALUES AND EIGENVECTORS

2.11.1 Definition

For every square matrix A, a scalar λ and a nonzero vector x can be found such that Ax = λx. (2.104) In (2.104), λ is called an eigenvalue of A and x is an eigenvector. To find λ and x, we write (2.104) as (A − λI)x = 0. (2.105) If |A − λI| ≠ 0, then (A − λI) has an inverse and x = 0 is the only solution. Hence, in order to obtain nontrivial solutions, we set |A − λI| = 0 to find values of λ that can be substituted into (2.105) to find corresponding values of x. Alternatively, (2.69) and (2.71) require that the columns of A − λI be linearly dependent. Thus in (A − λI)x = 0, the matrix A − λI must be singular in order to find a solution vector x that is not 0.

The equation |A − λI| = 0 is called the characteristic equation. If A is n × n, the characteristic equation will have n roots; that is, A will have n eigenvalues λ1, λ2, …, λn. The λ’s will not necessarily all be distinct or all nonzero. However, if A arises from computations on real (continuous) data and is nonsingular, the λ’s will all be distinct (with probability 1). After finding λ1, λ2, …, λn, the accompanying eigenvectors x1, x2, …, xn can be found using (2.105).

If we multiply both sides of (2.105) by a scalar k and note by (2.62) that k and A − λI commute, we obtain (A − λI)kx = k0 = 0. (2.106) Thus if x is an eigenvector of A, kx is also an eigenvector, and eigenvectors are unique only up to multiplication by a scalar. Hence we can adjust the length of x, but the direction from the origin is unique; that is, the relative values of (ratios of) the components of x = (x1, x2, …, xn)′ are unique. Typically, the eigenvector x is scaled so that x′x = 1.

To illustrate, we will find the eigenvalues and eigenvectors for the matrix A = [1 2; −1 4]. The characteristic equation is |A − λI| = (1 − λ)(4 − λ) + 2 = λ² − 5λ + 6 = (λ − 3)(λ − 2) = 0,
from which λ1 = 3 and λ2 = 2. To find the eigenvector corresponding to λ1 = 3, we use (2.105): (A − 3I)x = [−2 2; −1 1][x1; x2] = [0; 0], that is, −2x1 + 2x2 = 0 and −x1 + x2 = 0. As expected, either equation is redundant in the presence of the other, and there remains a single equation with two unknowns, x1 = x2. The solution vector can be written with an arbitrary constant c: x = c(1, 1)′. If c is set equal to 1/√2 to normalize the eigenvector, we obtain x1 = (1/√2, 1/√2)′. Similarly, corresponding to λ2 = 2, we have x2 = (2/√5, 1/√5)′.

2.11.2 I + A and I − A

If λ is an eigenvalue of A and x is the corresponding eigenvector, then 1 + λ is an eigenvalue of I + A and 1 − λ is an eigenvalue of I − A. In either case, x is the corresponding eigenvector. We demonstrate this for I + A: (I + A)x = Ix + Ax = x + λx = (1 + λ)x.

2.11.3 tr(A) and |A|

For any square matrix A with eigenvalues λ1, λ2, …, λn, we have tr(A) = Σi λi, (2.107) |A| = Πi λi. (2.108) Note that by the definition in Section 2.9, tr(A) is also equal to Σi aii, but aii ≠ λi. We illustrate (2.107) and (2.108) using the matrix from the illustration in Section 2.11.1, for which λ1 = 3 and λ2 = 2. Using (2.107), we obtain tr(A) = λ1 + λ2 = 3 + 2 = 5, and from (2.108), we have |A| = λ1λ2 = (3)(2) = 6. By definition, we obtain tr(A) = 1 + 4 = 5 and |A| = (1)(4) − (2)(−1) = 6.
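The computations in this illustration can be reproduced with NumPy:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [-1.0, 4.0]])

evals, evecs = np.linalg.eig(A)
print(np.sort(evals))                              # [2. 3.]: the two eigenvalues
print(np.isclose(np.trace(A), evals.sum()))        # tr(A) = sum of eigenvalues (2.107)
print(np.isclose(np.linalg.det(A), evals.prod()))  # |A| = product of eigenvalues (2.108)

# Each column of evecs satisfies Ax = lambda x and is scaled so that x'x = 1.
x = evecs[:, 0]
print(np.allclose(A @ x, evals[0] * x))  # True
```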
2.11.4 Positive Definite and Semidefinite Matrices

The eigenvalues and eigenvectors of positive definite and positive semidefinite matrices have the following properties:

1. The eigenvalues of a positive definite matrix are all positive.
2. The eigenvalues of a positive semidefinite matrix are positive or zero, with the number of positive eigenvalues equal to the rank of the matrix.

It is customary to list the eigenvalues of a positive definite matrix in descending order: λ1 > λ2 > … > λp. The eigenvectors x1, x2, …, xp are listed in the same order; x1 corresponds to λ1, x2 corresponds to λ2, and so on. The following result, known as the Perron–Frobenius theorem, is of interest in Chapter 12: If all elements of the positive definite matrix A are positive, then all elements of the first eigenvector are positive. (The first eigenvector is the one associated with the first eigenvalue, λ1.)

2.11.5 The Product AB

If A and B are square and the same size, the eigenvalues of AB are the same as those of BA, although the eigenvectors are usually different. This result also holds if AB and BA are both square but of different sizes, as when A is n × p and B is p × n. (In this case, the nonzero eigenvalues of AB and BA will be the same.)

2.11.6 Symmetric Matrix

The eigenvectors of an n × n symmetric matrix A are mutually orthogonal. It follows that if the n eigenvectors of A are normalized and inserted as columns of a matrix C = (x1, x2, …, xn), then C is orthogonal.

2.11.7 Spectral Decomposition

It was noted in Section 2.11.6 that if the matrix C = (x1, x2, …, xn) contains the normalized eigenvectors of an n × n symmetric matrix A, then C is orthogonal. Therefore, by (2.102), I = CC′, which we can multiply by A to obtain A = AI = ACC′ = (AC)C′. We now substitute C = (x1, x2, …, xn) and use Axi = λixi to write AC = (Ax1, Ax2, …, Axn) = (λ1x1, λ2x2, …, λnxn) = CD, so that A = CDC′, (2.109) where D = diag(λ1, λ2, …, λn). (2.110) The expression A = CDC′ in (2.109) for a symmetric matrix A in terms of its eigenvalues and eigenvectors is known as the spectral decomposition of A.
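A numerical sketch of the spectral decomposition for an arbitrary symmetric (here also positive definite) matrix; note that np.linalg.eigh returns the eigenvalues in ascending rather than descending order:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])  # arbitrary symmetric positive definite matrix

lam, C = np.linalg.eigh(A)  # eigenvalues (ascending) and orthonormal eigenvectors
D = np.diag(lam)

print(np.allclose(C @ D @ C.T, A))      # spectral decomposition A = CDC' (2.109)
print(np.allclose(C.T @ C, np.eye(2)))  # C is orthogonal (2.101)

# Square root matrix (see Section 2.11.8): replace eigenvalues by their square roots.
A_half = C @ np.diag(np.sqrt(lam)) @ C.T
print(np.allclose(A_half @ A_half, A))  # A^{1/2} A^{1/2} = A
```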
Since C is orthogonal and C′C = CC′ = I, we can multiply (2.109) on the left by C′ and on the right by C to obtain C′AC = D. (2.111) Thus a symmetric matrix A can be diagonalized by an orthogonal matrix containing normalized eigenvectors of A, and by (2.110) the resulting diagonal matrix contains eigenvalues of A.

2.11.8 Square Root Matrix

If A is positive definite, the spectral decomposition of A in (2.109) can be modified by taking the square roots of the eigenvalues to produce a square root matrix, A1/2 = CD1/2C′, (2.112) where D1/2 = diag(√λ1, √λ2, …, √λn). (2.113) The square root matrix A1/2 is symmetric and serves as the square root of A: A1/2A1/2 = A. (2.114)

2.11.9 Square and Inverse Matrices

Other functions of A have spectral decompositions analogous to (2.112). Two of these are the square and inverse of A. If the square matrix A has eigenvalues λ1, λ2, …, λn and accompanying eigenvectors x1, x2, …, xn, then A² has eigenvalues λ1², λ2², …, λn² and eigenvectors x1, x2, …, xn. If A is nonsingular, then A−1 has eigenvalues 1/λ1, 1/λ2, …, 1/λn and eigenvectors x1, x2, …, xn. If A is also symmetric, then A² = CD²C′, (2.115) A−1 = CD−1C′, (2.116) where C = (x1, x2, …, xn) has as columns the normalized eigenvectors of A (and of A² and A−1), D² = diag(λ1², λ2², …, λn²), and D−1 = diag(1/λ1, 1/λ2, …, 1/λn).

2.11.10 Singular Value Decomposition

In Section 2.11.7 we expressed a symmetric matrix in terms of its eigenvalues and eigenvectors in the spectral decomposition. In a similar manner, we can express any (real) matrix A in terms of eigenvalues and eigenvectors of A′A and AA′. Let A be an n × p matrix of rank k. Then the singular value decomposition of A can be expressed as A = UDV′, (2.117) where U is n × k, D is k × k, and V is p × k. The diagonal elements of the nonsingular diagonal matrix D = diag(λ1, λ2, …, λk) are the positive square roots of λ1², λ2², …, λk², which are the nonzero eigenvalues of A′A or of AA′. The values λ1, λ2, …, λk are called the singular values of A. The k columns of U are the normalized eigenvectors of AA′ corresponding to the eigenvalues λ1², λ2², …, λk². The k columns of V are the normalized eigenvectors of A′A corresponding to the same eigenvalues. Since the columns of U and V are (normalized) eigenvectors of symmetric matrices, they are mutually orthogonal (see Section 2.11.6), and we have U′U = V′V = I.
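A NumPy check of the singular value decomposition (2.117) for an arbitrary 2 × 3 matrix of rank 2:

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [2.0, 0.0, 1.0]])  # arbitrary n x p matrix, n = 2, p = 3, rank k = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(np.allclose(U @ np.diag(s) @ Vt, A))  # A = UDV' as in (2.117)
print(np.allclose(U.T @ U, np.eye(2)))      # U'U = I
print(np.allclose(Vt @ Vt.T, np.eye(2)))    # V'V = I
# Squared singular values are the nonzero eigenvalues of AA' (and of A'A).
print(np.allclose(np.sort(s**2), np.sort(np.linalg.eigvalsh(A @ A.T))))
```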
2.12 KRONECKER AND VEC NOTATION

When manipulating matrices with block structure, it is often convenient to use Kronecker and vec notation. The Kronecker product of an m × n matrix A with a p × q matrix B is an mp × nq matrix that is denoted A ⊗ B and is defined as A ⊗ B = (aijB); (2.118) that is, the (i, j)th block of A ⊗ B is the p × q matrix aijB. For example, a matrix whose blocks are all multiples of a common matrix B can be written compactly as A ⊗ B for a suitable matrix A.

If A is an m × n matrix with columns a1, …, an, then we can refer to the elements of A in vector (or “vec”) form using vec A = (a1′, a2′, …, an′)′, (2.119) so that vec A is an mn × 1 vector obtained by stacking the columns of A. If A is an m × m symmetric matrix, then the m² elements in vec A will include m(m − 1)/2 pairs of identical elements (since aij = aji). In such settings it is often useful to denote the vector half (or “vech”) of a symmetric matrix in order to include only the unique elements of the matrix. If we separate elements from different columns using semicolons, we can define the “vech” operator with vech A = (a11, …, am1; a22, …, am2; …; amm), (2.120) so that vech A is an m(m + 1)/2 × 1 vector. Note that vech A can be obtained by finding vec A and then eliminating the m(m − 1)/2 elements above the diagonal of A.
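Finally, a sketch of the Kronecker, vec, and vech operations with arbitrary small matrices; the vech vector is built by hand, since NumPy has no built-in vech:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])

print(np.kron(A, B))         # Kronecker product: the (i,j)th block is a_ij * B (2.118)

vecA = A.flatten(order="F")  # vec A: stack the columns of A into one long vector
print(vecA)                  # [1. 3. 2. 4.]

S = np.array([[1.0, 2.0],
              [2.0, 5.0]])   # symmetric, so vec S repeats the off-diagonal element 2
# vech: for each column j, keep only the elements on and below the diagonal.
vechS = np.concatenate([S[j:, j] for j in range(S.shape[0])])
print(vechS)                 # [1. 2. 5.]: the m(m+1)/2 unique elements
```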