Special Topics by PhD Program Concentration: Partial Least Squares
Edgardo Donovan
RES 610 – Dr. Joshua Shackman
Module 4 – Case Analysis
Monday, September 5, 2011
Partial Least Squares (PLS) is a variant of structural equation modeling best suited for the early stages of exploratory research. PLS software is reasonably user friendly and is ideal for exploring factor loadings and path coefficients for large, sophisticated models when a researcher is attempting to propose a hypothesis. Despite its limitations as a primary analytical tool, it is ideal for gathering initial measurements concerning data and hypotheses in the logistical and planning stages of research, in order to test the viability of the initial hypothetical model and research design and to estimate the impact of the logistical constraints (time, money, access) involved in obtaining larger and more precise sources of data.

Given that much of group and organization research is constrained by either limited sample sizes or nascent theoretical development, PLS can be of use to a researcher in the early stages of a project, when in-depth theory has not been formulated and available sample sizes are small due to initial logistical constraints. The partial least squares (PLS) data analytical technique was developed to help overcome these and other challenges facing researchers. PLS represents a powerful and effective means to test multivariate structural models with latent variables (Sosik 2009, p. 5).

PLS first gained popularity in chemometric research and later in industrial applications. It has since spread to research in education, marketing, and the social sciences (Statnotes 2008, p. 3). It is sometimes called "Projection to Latent Structures" because of its general strategy. The X variables (the predictors) are reduced to principal components, as are the Y variables (the
dependents). The components of X are used to predict the scores on the Y components, and the predicted Y component scores are used to predict the actual values of the Y variables. In constructing the principal components of X, the PLS algorithm iteratively maximizes the strength of the relation of successive pairs of X and Y component scores by maximizing the covariance of each X-score with the Y variables. This strategy means that while the original X variables may be multicollinear, the X components used to predict Y will be orthogonal. Also, the X variables may have missing values, but there will be a computed score for every case on every X component. Finally, PLS coefficients may be computed even when there may have been more original X variables than observations. In contrast, any of these three conditions (multicollinearity, missing values, and too few cases in relation to variables) may well render traditional OLS regression estimates unreliable.

Partial least squares (PLS) is also a method for constructing predictive models when the factors are many and highly collinear. Note that the emphasis is on predicting the responses and not necessarily on trying to understand the underlying relationship between the variables (Tobias 1997, p. 1).

In principle, PLS can be used with many factors. However, if the number of factors gets too large, you are likely to get a model that fits the sampled data perfectly but that will fail to predict new data well. This phenomenon is called over-fitting. In such cases, although there are many manifest factors, there may be only a few underlying or latent factors that account for most of the variation in the response. The general idea of PLS is to try to extract these latent factors, accounting for as much of the manifest factor variation as possible while modeling the responses well. For this reason, the acronym
and hypothetical and exploratory analysis is solid prior to designing a survey and collecting any data.

Partial least squares (PLS) regression/path analysis is thus an alternative to OLS regression, canonical correlation, or structural equation modeling (SEM) for analysis of systems of independent and response variables. In fact, PLS is sometimes called "component-based SEM," in contrast to the usual covariance-based structural equation modeling. PLS is a predictive technique which can handle many independent variables, even when predictors display multicollinearity. Like canonical correlation or multivariate GLM, it can also relate the set of independent variables to a set of multiple dependent (response) variables. However, PLS is less than satisfactory as an explanatory technique because it is low in power to filter out variables of minor causal importance (Statnotes 2008, p. 1).

Although PLS is used by researchers and practitioners in many scientific disciplines, some misunderstanding remains among group and organization researchers regarding the legitimacy and usefulness of PLS. Sosik et al. describe the advantages, limitations, and application of PLS to group and organization research using a data set collected in an experiment on the effects of leadership styles and communication format on the group potency of computer-mediated work groups (Sosik 2009, p. 5).

Partial Least Squares (PLS) is a variant of structural equation modeling best suited for the early stages of exploratory research. PLS software is reasonably user friendly and is ideal for exploring factor loadings and path coefficients for large, sophisticated models when a researcher is attempting to propose a hypothesis. Despite its limitations as a primary analytical tool, it is ideal for gathering initial measurements concerning data and hypotheses in the logistical and planning stages of research, in order to test the viability of the initial hypothetical model and research design and to estimate the impact of the logistical constraints (time, money, access) involved in obtaining larger and more precise sources of data.
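The component-extraction strategy described earlier (weights chosen to maximize the covariance of each X-score with Y, followed by deflation so that successive scores are orthogonal) can be illustrated with a small sketch. What follows is a minimal single-response (PLS1) example in plain Python, offered under stated assumptions rather than as the procedure used by any particular PLS package. The function names (dot, center, pls1_scores, residual_ss) and the toy data are hypothetical and chosen only for illustration. The data deliberately contain five predictors for only four observations, with some columns that are exact linear combinations of others.

```python
# Illustrative PLS1 sketch (one response variable), plain Python, no libraries.
# All function names and the toy data are hypothetical.

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def center(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def pls1_scores(X, y, n_components):
    """Extract X-score vectors from centered X (list of rows) and centered y."""
    X = [row[:] for row in X]          # work on a copy; X is deflated below
    scores = []
    for _ in range(n_components):
        cols = list(zip(*X))
        # Weights proportional to X'y: among unit-length weight vectors,
        # this direction gives the X-score with the largest covariance with y.
        w = [dot(col, y) for col in cols]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:               # nothing left in X that covaries with y
            break
        w = [a / norm for a in w]
        t = [dot(row, w) for row in X]             # X-scores, one per case
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in cols]     # X-loadings
        # Deflation: remove the part of X explained by t, so that the
        # next score vector is orthogonal to this one.
        X = [[x - ti * pj for x, pj in zip(row, p)] for row, ti in zip(X, t)]
        scores.append(t)
    return scores

def residual_ss(scores, y):
    """Residual sum of squares after regressing y on orthogonal score vectors."""
    resid = list(y)
    for t in scores:
        c = dot(t, y) / dot(t, t)      # valid one at a time because scores are orthogonal
        resid = [r - c * ti for r, ti in zip(resid, t)]
    return dot(resid, resid)

# Toy data: 4 observations, 5 predictors (more variables than cases).
# Column 2 is exactly 2 * column 1; column 4 is column 1 + column 3.
columns = [
    [1, 2, 3, 4],
    [2, 4, 6, 8],
    [1, -1, 1, -1],
    [2, 1, 4, 3],
    [0, 1, 0, 2],
]
Xc = [list(row) for row in zip(*[center(c) for c in columns])]
yc = center([1.0, 3.0, 2.0, 5.0])

scores = pls1_scores(Xc, yc, 2)
print("t1 . t2 =", dot(scores[0], scores[1]))
print("RSS with 0, 1, 2 components:",
      residual_ss([], yc), residual_ss(scores[:1], yc), residual_ss(scores, yc))
```

Because the extracted scores are orthogonal, each regression coefficient can be computed on its own, which is why collinearity among the original predictors does not destabilize this fit the way it can for OLS. The residual sum of squares can only shrink as components are added, so the number of components is usually chosen with held-out data rather than by in-sample fit, in line with the over-fitting concern noted above.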
Bibliography

Sosik, J., Kahai, S., & Piovoso, M. (2009). Silver bullet or voodoo statistics? A primer for using the partial least squares data analytic technique in group and organization research. Group & Organization Management, 34(1), 5-36. Retrieved August 16, 2007, from http://ft.csa.com.lb-proxy6.touro.edu/ids70/resolver.php?sessid=j39qpm158amnk5tm556bp2tab7&server=csaweb115v.csa.com&check=9a0c396ed6657394003abfa951c61028&db=sageman-set-c&key=1059-6011%2F10.1177_1059601108329198&mode=pdf

StatNotes. (2008). Partial least squares. North Carolina State University. Retrieved March 2, 2008, from http://faculty.chass.ncsu.edu/garson/PA765/pls.htm

Temme, D., Kreis, H., & Hildebrandt, L. (2006). PLS path modeling – a software review. SFB 649 Discussion Paper 2006-084. Berlin: Institute of Marketing, Humboldt-Universität zu Berlin, Germany. Available at http://edoc.hu-berlin.de/series/sfb-649-papers/2006-84/PDF/84.pdf

Tobias, R. D. (1997). An introduction to partial least squares regression. Cary, NC: SAS Institute. Available at http://ftp.sas.com/techsup/download/technote/ts509.pdf

University of Hamburg. (2009). SmartPLS. Retrieved September 30, 2009, from http://www.smartpls.de