This document provides an introduction to multivariate analysis techniques. It discusses the importance of analyzing multiple variables simultaneously rather than individually to obtain a more comprehensive understanding. Multivariate analysis techniques are classified as either dependence techniques, which involve dependent and independent variables, or interdependence techniques, which analyze relationships between multiple variables simultaneously without distinguishing dependent and independent variables. Several commonly used dependence techniques are described in detail, including multiple regression, multiple discriminant analysis, canonical correlation analysis, multivariate analysis of variance/covariance, conjoint analysis, and structural equation modeling.
Econometrics notes (Introduction, Simple Linear regression, Multiple linear r...Muhammad Ali
Econometrics notes for BS economics students
Muhammad Ali
Assistant Professor of Statistics
Higher Education Department, KPK, Pakistan.
Email:Mohammadale1979@gmail.com
Cell#+923459990370
Skyp: mohammadali_1979
Overview of Multivariate Statistical MethodsThomasUttaro1
This is an overview of advanced multivariate statistical methods which have become very relevant in many domains over the last few decades. These methods are powerful and can exploit the massive datasets implemented today in meaningful ways. Typically analytics platforms do not deploy these statistical methods, in favor of straightforward metrics and machine learning, and thus they are often overlooked. Additional references are available as documented.
Multi-dimensional time series based approach for Banking Regulatory Stress Te...Genpact Ltd
Under regulatory paradigm of banking risk management, banks are required to perform stress testing of internally computed risk parameters to ensure holding of adequate amount of capital to offset the effects of downturn events. For this purpose, most of the contemporary stress-testing practices are limited to one dimensionality of the calculation, where endogenous risk parameters are predicted by modeling and scenario based values of exogenous parameters (macroeconomic variables).
To demonstrate our approaches we will use Sudoku puzzles, which are an excellent test bed for
evolutionary algorithms. The puzzles are accessible enough for people to enjoy. However the more complex
puzzles require thousands of iterations before an evolutionary algorithm finds a solution. If we were
attempting to compare evolutionary algorithms we could count their iterations to solution as an indicator
of relative efficiency. Evolutionary algorithms however include a process of random mutation for solution
candidates. We will show that by improving the random mutation behaviours we were able to solve
problems with minimal evolutionary optimisation. Experiments demonstrated the random mutation was at
times more effective at solving the harder problems than the evolutionary algorithms. This implies that the
quality of random mutation may have a significant impact on the performance of evolutionary algorithms
with Sudoku puzzles. Additionally this random mutation may hold promise for reuse in hybrid evolutionary
algorithm behaviours.
A researcher in attempting to run a regression model noticed a neg.docxevonnehoggarth79783
A researcher attempting to run a regression model noticed a negative beta sign for an explanatory variable when s/he was expecting a positive sign based on theoretical considerations. What advice would you give to the researcher as to what is going on and what specific diagnostics would you look at? Explain conceptually and statistically the different ways you can correct for this problem.
Reason
One of the most common and important reasons for such situations is the existence of multicollinearity. Multicollinearity can happen if some of the independent variables are highly correlated to each other or to another variable that is not in the model.
Multicollinearity also has other symptoms such as
· Large variance for regression coefficients
· Non-significant individual coefficients while the general model is significant
· Change of marginal contributions depending on the variables in the model
· Large correlation coefficients in the correlation matrix of variables
It should, however, be noted that the overall model can preserve its predictive ability; it is only the explanatory power of the individual coefficients that is lost.
Before turning to the solutions and measures the researcher can take, it is wise to take a step back and look at the underlying reason for the multicollinearity. An extreme case in which two variables are identical gives the best understanding of the problem.
In this case we are trying to define $y$ as a function of $x_1$ and $x_2$ while in reality $x_1 = x_2$. Therefore any linear combination $\beta_1 x_1 + \beta_2 x_2$ is replaceable by infinitely many other linear combinations (i.e., $(\beta_1 + c)\,x_1 + (\beta_2 - c)\,x_2$ for any constant $c$).
It is easy to see that while $y$ is predicted correctly in every instance, the individual coefficients of $x_1$ and $x_2$ are meaningless.
Diagnosis
One of the most common diagnostics for multicollinearity is the variance inflation factor (VIF):
$\mathrm{VIF}_j = \dfrac{1}{1 - R_j^2}$
where $R_j^2$ is the coefficient of multiple determination from the regression of $x_j$ on the other explanatory variables.
The variance inflation factor therefore measures how much the variance of each coefficient is inflated. When $R_j^2$ equals zero, VIF equals 1, which suggests zero multicollinearity. A common heuristic is that any VIF larger than 10 is alarming and indicates strong multicollinearity.
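The VIF diagnostic described above can be sketched in plain Python. This is only a minimal illustration, not a reference implementation: each explanatory variable is regressed on the others by ordinary least squares, and the VIF is taken as 1 / (1 - R^2) of that auxiliary regression. The data (`x1`, `x2`, `x3`) and the helper names (`solve`, `r_squared`, `vif`) are hypothetical and chosen for illustration; `x2` is deliberately built as a near copy of `x1`.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back-substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(target, predictors):
    """R^2 of an OLS regression of `target` on `predictors` (intercept included)."""
    n = len(target)
    cols = [[1.0] * n] + predictors
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * t for u, t in zip(ci, target)) for ci in cols]
    beta = solve(xtx, xty)  # normal equations
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    mean = sum(target) / n
    ss_res = sum((t - f) ** 2 for t, f in zip(target, fitted))
    ss_tot = sum((t - mean) ** 2 for t in target)
    return 1.0 - ss_res / ss_tot


def vif(columns):
    """Variance inflation factor of every column, regressed on all the other columns."""
    result = []
    for j, col in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        result.append(1.0 / (1.0 - r_squared(col, others)))
    return result


# Hypothetical data: x2 is almost a copy of x1, x3 is only weakly related to both.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.9, 8.1]
x3 = [5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 7.0, 4.0]

vifs = vif([x1, x2, x3])
for name, value in zip(("x1", "x2", "x3"), vifs):
    flag = "strong multicollinearity" if value > 10 else "acceptable"
    print(f"VIF({name}) = {value:.2f} ({flag})")
```

On this toy data the two near-duplicate variables exceed the threshold of 10, while the third does not. In practice a statistics library would normally be used instead of hand-written normal equations, which become numerically fragile exactly when collinearity is severe.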
Solutions
There are a few solutions to the multicollinearity problem:
1- Ignoring the problem completely is possible in cases where we only care about the final model fit and predictive capability rather than individual coefficients and explanatory power.
2- Removing some of the correlated variables from the model. This can be justified by arguing that the effect of a removed variable is still captured by the highly correlated variables that are kept in the model.
3- Principal component analysis (or any orthogonal transformation) can reduce the number of factors to a few orthogonal factors with no collinearity; however, we should note that the variables are hard to interpret after a PC transformation.
4- For cases where we intend to keep all the variables in the model without any major transformation, the Ridge regr.
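As a hedged illustration of the ridge idea named in item 4, the sketch below applies the standard ridge estimator, which solves (X'X + lambda I) beta = X'y, to the extreme case of two identical predictors. The data, the function name `ridge_two_predictors` and the penalty value `lam = 1.0` are assumptions made for this example only. With no penalty the normal equations are singular, so the coefficients are not unique; with a positive penalty the system has a single solution that splits the effect evenly between the two copies.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Ridge coefficients for two centred predictors and no intercept.

    Solves (X'X + lam * I) beta = X'y for the 2x2 case with Cramer's rule.
    """
    a11 = sum(u * u for u in x1) + lam
    a22 = sum(v * v for v in x2) + lam
    a12 = sum(u * v for u, v in zip(x1, x2))
    b1 = sum(u * t for u, t in zip(x1, y))
    b2 = sum(v * t for v, t in zip(x2, y))
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("singular system: coefficients are not uniquely determined")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


# Hypothetical data: x2 is an exact copy of x1, and y = 3 * x1.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = list(x1)
y = [3.0 * v for v in x1]

try:
    ridge_two_predictors(x1, x2, y, lam=0.0)  # plain least squares
except ValueError as err:
    print("lam = 0:", err)

beta1, beta2 = ridge_two_predictors(x1, x2, y, lam=1.0)
print(f"lam = 1: beta1 = {beta1:.4f}, beta2 = {beta2:.4f}, sum = {beta1 + beta2:.4f}")
```

The two ridge coefficients are equal and their sum is slightly below 3, which reflects the shrinkage the penalty introduces. The choice of the penalty value is a separate question that this sketch does not address.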
7
Repeated Measures Designs
for Interval Data
Learning Objectives
After reading this chapter, you should be able to:
• Explain the advantages and drawbacks of using data from non-independent groups.
• Complete a paired-samples t-test.
• Complete a within-subjects F.
• Describe “power” as it relates to statistical testing.
Chapter Outline
7.1 Dependent Groups Designs
Reconsidering the t and F ratios
An Example
A Matched Pairs Example
Comparing the Paired-Samples t-Test to the Independent Samples t-Test
The Power of the Dependent Groups Test
The Dependent Groups t-Test on Excel
The Alternate Approaches to Dependent t-Tests
7.2 The Within-Subjects F
Managing Error Variance in the Within-Subjects F
A Within-Subjects F Example
Calculating the Within-Subjects F
Understanding the Result
Comparing the Within-Subjects F and the One-Way ANOVA
Another Within-Subjects F Example
A Within-Subjects F in Excel
Chapter Summary
Introduction
Some of the most critical questions in management relate to change over time. For example, managers are deeply interested in assessing sales growth, shifts in shopping trends,
improvements in employee attitudes, increases in employee performance, and decreases in
absenteeism or turnover. They are also often keen to find out the influence of various
managerial decisions and business strategies on these and many other change-oriented
outcomes. However, none of the analyses completed to this point address these change-
related questions, because these analyses do not accommodate repeated measures of the
same variables within the same group of subjects over time. For instance, the t-tests and
ANOVAs discussed so far compared independent groups, groups that have completely
separate subjects. Each subject was only measured once on each variable of interest. The
same group of subjects was not measured repeatedly on the same variables to assess
change over time.
Another important issue is that independent samples t-tests and ANOVAs assume that
the groups being compared are equivalent on most aspects to begin with, except for the
independent (grouping or treatment) variable being investigated. When groups are large
and individuals are randomly selected, this is usually a reasonable assumption, because
any differences between groups tend to be relatively unimportant. The logic behind random
selection is that when groups are randomly drawn from the same population they
will differ only by chance—the larger the random sample, the lower the probability of
a substantial pre-existing difference. However, when groups are relatively small it can
be difficult to determine whether a difference in the measures of the dependent variable
occurred because the independent variable had a different impact on the different groups
or because there were differences between the groups to begin with.
Advanced StatisticsUnit 5There are several r.docxnettletondevon
Advanced Statistics
Unit 5
There are several related
topics in this unit…
Types of Variables in Analysis
Univariate and Multivariate
Statistics Overview
Univariate Statistics
Multivariate Statistics
Independent Variables (IV)
This is the variable thought to influence or cause a change in the value of another variable.
For example, if you do not get enough sleep you will experience fatigue and drowsiness during work. Lack of sleep, then, is the independent variable thought to affect fatigue and drowsiness.
Dependent Variables (DV)
This is the variable that is thought to be changed or affected by another (independent) variable. Said another way, the value of the dependent variable is responsive to or determined by changes in the independent variable.
In the example above fatigue and drowsiness are the variables affected. We will experience more fatigue and drowsiness if we have less sleep.
Confounding Variables
This is a variable that confounds, or confuses, the relationship between the independent and dependent variables. Or we can think of it this way…something other than the independent variable is accounting for changes in the dependent variable.
For example, how engaging and interesting a meeting is (vs. boring) will affect whether or not you feel fatigue and drowsiness during the meeting. Thus, lack of sleep is not accounting for fatigue or drowsiness. Rather the nature of the meeting or a combination of lack of sleep and the nature of the meeting are causing fatigue and drowsiness.
Types of Variables in Analysis
Statistics
Univariate and Multivariate
Statistics Overview
Statistics
We differentiate statistics as univariate or multivariate depending on the
number of dependent variables involved in the statistical analysis.
When there is a single dependent variable we use a univariate statistic.
When there is more than one dependent variable we use a multivariate statistic.
We also need to consider how both the dependent and independent variables
were measured in order to determine what statistic is appropriate. Remember
that we can measure numerically (interval and ratio level of measurement) or
we can measure simply by differentiating between types (nominal level of
measurement).
Univariate Statistics
Statistics
There are two groups of univariate statistics we commonly use
when we have a single numerical dependent variable.
The first set are appropriate when we have a nominal/categorical
independent variable. This would include statistics that compare
categories or groups like men/women, highly
satisfied/dissatisfied employees, youth/seniors, etc.
These include…
t-test
ANOVA
ANCOVA
and Factorial Analysis of Variance
Univariate Statistics
Statistics
We use the following statistics when we have a single numerical dependent
variable and we want to make…
t-test a simple comparison between two groups
ANOVA (a one-way analysis of variance)
a comparison betwe.
ABSTRACT: This paper critically examined a broad view of the Structural Equation Model (SEM) with a view
to pointing out how researchers can employ this model in future research, with specific focus on
several traditional multivariate procedures such as factor analysis, discriminant analysis and path analysis. This study
employed a descriptive survey and historical research design. Data were computed via descriptive statistics,
correlation coefficients and reliability. The study concluded that novice researchers must take care of the assumptions
and concepts of Structural Equation Modeling while building a model to check the proposed hypothesis. SEM is
more or less an evolving technique in research, which is expanding to new fields. Moreover, it is providing
new insights to researchers for conducting longitudinal investigations.
Factor Analysis is a statistical tool that measures the impact of a few un-observed variables called factors on a large number of observed variables. It is often used to determine a linear relationship between variables before subjecting them to further analysis.
Chapter 12Choosing an Appropriate Statistical TestiStockph.docxmccormicknadine86
Chapter 12
Choosing an Appropriate Statistical Test
Learning Objectives
After reading this chapter, you will be able to. . .
· understand the importance of using the proper statistical analysis.
· identify the type of analysis based on four critical questions.
· use the decision tree to identify the correct statistical test.
Here we are in the final chapter that will pull all prior chapters together. Chapters 1 to 3 discussed descriptive statistics while the latter chapters, 4 to 11, discussed inferential statistics. Each of the inferential chapters presented a statistical concept then conducted the appropriate analysis to be able to test a hypothesis. The big question for students learning statistics is, "How do I know if I'm using the correct statistical test?" For experienced statisticians this question is easy to answer as it is based on a few criteria. However, to a student just learning statistics or to the novice researcher, this question is a legitimate one. Many statistical reference texts include a guide that asks specific questions regarding the type of research question, design, number and scales of measurement of variables, and statistical assumption of the data that allows you to use an elegant chart known as a decision tree. Based on the answers to these questions, the decision tree is used to help determine the type of analysis to be used for the research, thereby helping you answer this big question.
12.1 Considerations
To make the correct decisions based on the use of a decision tree, there are four specific questions that must be answered. These questions are as follows:
· What is your overarching research question?
· How many independent, dependent, and covariate variables are used in the study?
· What are the scales of measurement of each of your variables?
· Are there violations of statistical assumptions?
If you are able to answer these specific questions, then you will be able to determine the proper analysis for your study. These questions are critically important, and if they cannot be answered, then not enough thought has gone into the research. That said, let us discuss each of these questions so that they can be considered and answered in the use of the decision tree.
What Is Your Overarching Research Question?
Try It!
Derive your own research question for your Master's Thesis or Doctoral Dissertation. Have a colleague or professor read it. What are their thoughts or suggestions for improvements?
Answering this question seems simple enough as all research has an overarching research question that drives the study, especially since this dictates the type of quantitative methodology. There are key words in every research question that help determine the appropriate type of analysis. For instance, if the research question states, "What are the effects of job satisfaction on employee productivity?" the keyword is "effects" as in the cause and effect of job satisfaction (the independent variable) on productivity (th ...
DESCRIPTION OF THE TOPIC
Course: Data Analysis for Social Science Teachers
Topic: Introduction to Data Analysis
Module Id: 1.1
Introduction
In the recent past, quite a bit of importance has been given to data analysis in
research. One of the possible reasons is that empirical evidence establishes a firm grounding
to either accept or reject the proposed hypotheses. The choice of the statistical technique
depends on the nature of the research problem or question and also on the nature of the data
set.
The research questions posed to address a research gap or problem may be related to
identifying the degree of relationship among variables, checking the significance of
group differences, predicting group membership or structure, or they could be time-related.
In order to identify associations between two or more variables, correlation and
regression or chi-square techniques may be adopted, depending on whether the data are
parametric or non-parametric. This can be done as bivariate correlation and
regression, multiple correlation and regression, canonical correlation, multiple discriminant
analysis, or logit regression. Bivariate correlation is a good starting point to identify
the degree of relationship between two continuous variables, such as job and family
satisfaction, where either of them can be treated as the DV or the IV, as the research question
requires. Bivariate regression, however, requires one of them to be defined as the DV and the other
as the IV. Although these are not multivariate techniques, they form the basis of
Multivariate Analysis (MVA).
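The difference between bivariate correlation and bivariate regression can be seen in a short sketch. The job and family satisfaction scores below are invented for illustration, and the function names are hypothetical. The correlation is the same whichever variable comes first, whereas the regression slope changes when the roles of DV and IV are swapped.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation coefficient; symmetric in x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ols_slope_intercept(iv, dv):
    """Simple linear regression of the dependent variable (dv) on the independent variable (iv)."""
    mx, my = mean(iv), mean(dv)
    sxy = sum((a - mx) * (b - my) for a, b in zip(iv, dv))
    sxx = sum((a - mx) ** 2 for a in iv)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical satisfaction scores for six respondents (not real data).
job_sat = [3.0, 4.0, 2.0, 5.0, 4.0, 3.0]
fam_sat = [2.5, 4.5, 2.0, 4.0, 3.5, 3.5]

r = pearson_r(job_sat, fam_sat)
slope_fs_on_js, _ = ols_slope_intercept(job_sat, fam_sat)  # family satisfaction as DV
slope_js_on_fs, _ = ols_slope_intercept(fam_sat, job_sat)  # job satisfaction as DV

print(f"r(JS, FS) = {r:.3f}, r(FS, JS) = {pearson_r(fam_sat, job_sat):.3f}")
print(f"slope of FS on JS = {slope_fs_on_js:.3f}")
print(f"slope of JS on FS = {slope_js_on_fs:.3f}")
```

The product of the two slopes equals the squared correlation, which is one way to see that the regression carries the same information about strength of association plus a choice of direction.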
1. Importance of Multivariate Analysis (MVA)
If watching a movie is to be a pleasant experience, the lighting, the projected film
light and the sound effects in the theatre must be optimal. Other factors that may contribute
to a pleasant viewing experience include, but are not limited to, seating arrangements,
air-conditioning and hall odour. If one has to study or measure the pleasantness of watching a
movie in a theatre, all of the above factors must be studied together and not in isolation.
There is a possibility of an unpleasant parking experience that may negatively impact the
pleasantness of watching a movie. So, the real value of measuring the pleasantness of a
movie-watching experience lies in measuring all the influencing factors together.
This is exactly what multivariate analysis is all about. Analysing multiple
variables simultaneously gives a better picture from which to draw inferences than
multiple univariate analyses of the individual variables. Statistical techniques that
simultaneously analyse multiple measurements of the observed variables are known as
Multivariate Analysis (MVA). We may perform MVA by using multiple variables in a single
relationship or in multiple relationships.
In a truly multivariate scenario, all variables must be:
i. Random in nature,
ii. Inter-related, and
iii. Interpreted in unison.
Reading the paper by Pattusamy and Jacob (2015), which tests the Greenhaus and Allen
model, will help in understanding the forthcoming discussion and in answering a few
questions at the end. The theoretical model is shown in Figure 1.
Figure 1 - Theoretical Model
From Figure 1, it is seen that family-work conflict (FWC) will have a negative effect
on job satisfaction (JS) while family-work facilitation (FWF) will have a positive effect on
job satisfaction. Similarly, work-family facilitation (WFF) will have a positive effect on
family satisfaction (FS) while work-family conflict (WFC) will have a negative effect on
family satisfaction. Both job and family satisfaction will influence feelings of work-family
balance positively which in turn will positively influence life satisfaction (LS). All the above
statements have been hypothesized and can be stated conclusively if we have empirical data
to establish the stated hypotheses. The use of appropriate statistical methods will facilitate the
data analysis to arrive at well-grounded inferences and conclusions.
Univariate statistical tests involve one dependent variable. Examples include, but are
not limited to, t-tests of means, analysis of variance (ANOVA), analysis of covariance
(ANCOVA) and simple linear regression (with one dependent and one independent variable).
With the importance of analysing variables together established, let us take a quick look
at a few multivariate techniques that we are likely to study in detail during this course.
The next section presents the classification of MVA.
2. Classification of MVA
MVA can be classified as Dependence techniques and Interdependence techniques.
2.1 Dependence techniques (used when there are one or more dependent variables and one or
more independent variables, e.g., multiple regression analysis)
i. Multiple regression and multiple correlation
ii. Multiple Discriminant Analysis (MDA) and Logistic Regression
iii. Canonical Correlation Analysis
iv. Multivariate Analysis of Variance and Covariance
v. Conjoint Analysis
vi. Structural Equation Modelling (SEM) and Confirmatory Factor Analysis (CFA)
2.1.1 Multiple Regression
Let us presume that some previous research has established that cars with higher
engine capacity and higher unladen weight offer lower fuel efficiency (possibly validated
using a correlation analysis). If a researcher wants to predict fuel efficiency based on
engine capacity and unladen weight, then fuel efficiency is treated as the dependent variable
while engine capacity and unladen weight are treated as the independent variables. The
researcher collects data on the fuel efficiency, engine capacity and unladen weight of about
100 cars or more (that run on the same type of fuel) and would possibly use the multiple
regression (MR) method to predict fuel efficiency. To use the MR method, the dependent
variable must be metric; the independent variables (two or more) are usually metric as well,
although non-metric IVs can be included through dummy coding.
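The MR idea in the car example can be sketched as follows. This is a minimal
ordinary-least-squares fit using the normal equations, written with the standard library
only; it is one of several ways to estimate the coefficients. The eight cars and their
figures are invented for illustration, and the function names (`solve_linear_system`,
`fit_multiple_regression`, `predict`) are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes the system is non-singular."""
    n = len(b)
    # Augmented matrix [a | b] so row operations apply to both sides
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_multiple_regression(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] from (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coefs, row):
    """Predicted DV for one observation: intercept plus weighted IVs."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))

# Invented data: (engine capacity in litres, unladen weight in tonnes)
cars = [(0.8, 0.75), (1.0, 0.85), (1.2, 0.95), (1.5, 1.10),
        (1.6, 1.25), (2.0, 1.40), (2.2, 1.60), (2.4, 1.55)]
km_per_litre = [23.1, 21.7, 20.5, 18.4, 17.4, 15.1, 13.1, 12.7]

coefs = fit_multiple_regression(cars, km_per_litre)
intercept, b_engine, b_weight = coefs
print(f"intercept={intercept:.2f}, engine={b_engine:.2f}, weight={b_weight:.2f}")
print(f"predicted km/l for a 1.4 L, 1.0 t car: {predict(coefs, (1.4, 1.0)):.1f}")
```

With only eight cars and two strongly related IVs, the individual coefficients are not
stable; a real study would use the 100 or more cars suggested above.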
2.1.2 Multiple Discriminant Analysis (MDA) and Logit Analysis
If the dependent variable is of the dichotomous type (Yes/No, Men/Women), then MDA is
an appropriate technique. The independent variables need to be metric data. MDA helps to
understand group differences and to predict the likelihood that an observation or object
belongs to a specific group. Returning to the MR example from the previous section, suppose
we had data on the engine capacity and unladen weight of 100 or more cars (that run on the
same type of fuel) and wanted to classify them as big and small cars; MDA would then be a
relevant technique.
Logit analysis, also known as logistic regression, combines aspects of MR and
MDA. Although the regression principle is similar to that of MR, the DV in Logit regression
need not be metric as in the case of MR but can be a dichotomous variable as in MDA.
Another distinguishing feature of Logit regression is that it can accommodate both metric
and non-metric IVs and does not rely on the multivariate normality assumption.
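To make the Logit idea concrete, here is a small sketch that fits a binary logit model for
the big/small car classification. It uses plain batch gradient ascent on the log-likelihood,
which is only one possible estimation approach and is chosen here for brevity. The data,
the learning rate, the number of epochs, and the names `fit_logit` and `prob_big` are all
assumptions made for illustration.

```python
import math

def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(X, y, lr=0.5, epochs=2000):
    """Fit a binary logit model by batch gradient ascent on the log-likelihood."""
    n, k = len(X), len(X[0])
    w = [0.0] * k
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * k
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            err = yi - p  # observed group minus predicted probability
            grad_b += err
            for j in range(k):
                grad_w[j] += err * xi[j]
        b += lr * grad_b / n
        w = [wj + lr * gj / n for wj, gj in zip(w, grad_w)]
    return b, w

# Invented data: (engine capacity in litres, unladen weight in tonnes)
cars = [(0.8, 0.75), (1.0, 0.85), (1.2, 0.95), (1.0, 0.80),
        (2.0, 1.40), (2.2, 1.60), (2.4, 1.55), (1.8, 1.35)]
is_big = [0, 0, 0, 0, 1, 1, 1, 1]  # dichotomous DV: 1 = big, 0 = small

# Centre each IV on its mean so that gradient ascent behaves well
means = [sum(col) / len(col) for col in zip(*cars)]
centred = [[x - m for x, m in zip(row, means)] for row in cars]

intercept, weights = fit_logit(centred, is_big)

def prob_big(car):
    """Estimated probability that a car belongs to the 'big' group."""
    z = intercept + sum(w * (x - m) for w, x, m in zip(weights, car, means))
    return sigmoid(z)

print(f"P(big | 2.4 L, 1.55 t) = {prob_big((2.4, 1.55)):.3f}")
print(f"P(big | 0.8 L, 0.75 t) = {prob_big((0.8, 0.75)):.3f}")
```

Unlike MR, the output here is a probability of group membership rather than a predicted
metric value, which is what links Logit regression to MDA.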
2.1.3 Canonical Correlation Analysis
If there are multiple metric dependent variables and multiple metric independent
variables to be related, then the right tool is Canonical Correlation Analysis. Here we
try to determine the associations between two sets of variables. For example, we might study
the relationship between a number of indices of engine performance (the DVs, such as Indicated
Horse Power (IHP) and Brake Horse Power (BHP)) and the IVs (such as engine capacity,
unladen weight of the car, and age of the car).
2.1.4 Multivariate Analysis of Variance and Covariance
In order to simultaneously explore the relationship between multiple categorical
independent variables, which are also called treatments, and two or more metric dependent
variables, an ideal technique would be the Multivariate Analysis of Variance
(MANOVA). If the analysis requires the elimination of the effect of uncontrolled metric
independent variables, which are known as covariates, on the dependent variables, then the
multivariate analysis of covariance (MANCOVA) is used. Both MANOVA and MANCOVA
may be done as one-way or factorial designs. In our car example, with fuel efficiency as one
of the DVs, the age of the car can be treated as a covariate.
2.1.5 Conjoint Analysis
Conjoint Analysis is a contemporary dependence technique that helps a decision-maker
(such as a product design head) evaluate the importance of attributes (typically product
attributes) along with their levels. Let us say we have three attributes of a car, namely,
airbags (2, 4, or 6 airbags), speakers for infotainment (2, 4 or 6 speakers) and steering
wheel height adjustment (low, medium and high). If we want to know the popular combinations
preferred by car enthusiasts, we may have to ask them to rate all of the 27 possible
combinations. For example, a car enthusiast may prefer 6 airbags, 4 speakers and medium
height for the steering wheel. However, using conjoint analysis it is possible
to capture the ratings of the prospective car buyer with just 9 or more combinations.
Conjoint analysis helps a great deal in product design simulation studies.
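The reduction from 27 to 9 combinations can be sketched as below. The full factorial simply
enumerates every combination of levels. The 9-profile subset is built with a Latin-square
rule, which is one common way of obtaining an orthogonal fraction; the text above does not
say which design is intended, so treat this construction as an assumption.

```python
from itertools import combinations, product

airbags = [2, 4, 6]
speakers = [2, 4, 6]
steering = ["low", "medium", "high"]

# Full factorial: every combination of the three attributes' levels
full_factorial = list(product(airbags, speakers, steering))
print(len(full_factorial))  # 3 * 3 * 3 = 27 profiles

# Fractional design (assumed Latin-square rule): the steering level index is
# (airbag index + speaker index) mod 3, giving 9 profiles
fraction = [
    (airbags[i], speakers[j], steering[(i + j) % 3])
    for i in range(3)
    for j in range(3)
]
print(len(fraction))  # 9 profiles

# Balance check: for every pair of attributes, each pair of levels
# appears exactly once among the 9 profiles
for a, b in combinations(range(3), 2):
    level_pairs = {(profile[a], profile[b]) for profile in fraction}
    print(f"attributes {a} and {b}: {len(level_pairs)} distinct level pairs")
```

Because every level of each attribute is paired once with every level of each other
attribute, main-effect importances can still be estimated from the 9 rated profiles.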
2.1.6 Structural Equation Modelling (SEM) and Confirmatory Factor Analysis (CFA)
While multiple regression examines a single relationship between a DV and multiple
IVs, in an SEM it is possible to examine multiple relationships simultaneously. Generally, a
CFA is done prior to the SEM. The SEM consists of the structural model and the measurement
model. The structural model may have one or more DVs and one or more IVs with all
relationships defined. Each of the DVs and IVs may be either uni- or multi-dimensional and
each of the dimensions may be measured using scale items as indicators. The CFA will show
the contribution of each scale item to its dimension and the extent to which it measures
that dimension. In this way the measurement model is evaluated. After the validity and reliability of the
measurement model are established, the structural model is evaluated to establish and prove
or disprove hypotheses. Hence, SEM supports simultaneous assessment of relationships and
accommodates multi-item scales.
2.2 Interdependence techniques (no variables are designated as dependent or independent;
all variables in the set are analysed simultaneously, e.g., Factor Analysis)
a) Factor Analysis (both Principal Component Analysis and Common Factor Analysis)
b) Cluster Analysis
c) Perceptual Mapping (also called Multidimensional Scaling)
d) Correspondence Analysis
2.2.1 Factor Analysis
The objective of factor analysis is to reduce the number of measured variables to a
smaller set of meaningful factors (or variates) with minimal loss of information. This can
be done either by Principal Component Analysis (PCA) or by common factor analysis. Suppose a
prospective car buyer is considering the colour of the car, the aerodynamic design,
body-coloured bumpers, height-adjustable steering column, driver seat height adjustment,
touch screen for infotainment, ABS and airbags. If the opinion of the car buyer is captured
using a 7-point Likert scale, either PCA or common factor analysis may group these eight
variables into three groups, namely, external features (colour of the car, the aerodynamic
design, body-coloured bumpers), internal features (height-adjustable steering column, driver
seat height adjustment, touch screen for infotainment) and safety features (ABS and airbags).
So factor analysis helps us to reduce eight variables to three meaningful factors (variates).
2.2.2 Cluster Analysis
In the car example that we have been discussing so far, suppose we have data on the
engine capacities of about 130 cars, ranging from a minimum of 799 cc to a maximum of
2399 cc, and we want these 130 cars to be placed in three groups, namely, small,
medium and large cars. Cluster analysis would be the recommended technique. The cluster
analysis algorithm places the objects in homogeneous groups depending on the characteristics
specified by the researcher. In our example, the cars would be placed in groups based on
engine capacity. Clustering can be done based on multiple characteristics too. Either
hierarchical or non-hierarchical clustering procedures may be adopted. Hierarchical
methods may be either agglomerative or divisive. The algorithms followed in the
hierarchical methods are the single, complete and average linkage methods. The other methods
are the Centroid and Ward methods. Alternatively, non-hierarchical clustering commonly
follows the k-means algorithm and places objects in clusters once the number of
clusters is specified. The decision on whether to adopt the hierarchical or non-hierarchical
procedure depends on the choice of the researcher and the problem defined.
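The non-hierarchical route for the engine-capacity example can be sketched with a simple
one-dimensional k-means. The twelve engine capacities below are invented values within the
799 cc to 2399 cc range mentioned above, the deterministic choice of starting centroids is
an assumption made so that the result is repeatable, and `kmeans_1d` is a hypothetical
helper name.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Basic k-means on a single characteristic (here, engine capacity)."""
    data = sorted(values)
    # Assumed deterministic start: spread the initial centroids evenly
    # across the sorted data (minimum, middle, maximum when k = 3)
    centroids = [data[round(i * (len(data) - 1) / (k - 1))] for i in range(k)]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each object joins its nearest centroid
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Update step: each centroid moves to the mean of its cluster
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return centroids, clusters

# Invented engine capacities in cc (illustrative data only)
engine_cc = [799, 814, 998, 1086, 1197, 1248, 1373, 1498, 1591, 1997, 2179, 2399]

centroids, clusters = kmeans_1d(engine_cc, k=3)
for label, members in zip(["small", "medium", "large"], clusters):
    print(label, members)
```

The number of clusters (three) has to be supplied by the researcher, which is the main
practical difference from the hierarchical procedures described above.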
2.2.3 Perceptual Mapping
If we consider two dimensions of a car, namely, fuel efficiency and driving comfort,
and we want to know how the brands of cars currently available in the market are positioned
in the minds of car enthusiasts, the right technique is
Perceptual Mapping (PM), also known as Multi-dimensional Scaling (MDS). MDS
typically helps a researcher determine the perceived relative image of the objects (cars, in
this case) on the two dimensions. In MDS, unlike in factor or cluster analysis, a solution
can be obtained for each respondent and there is no variate. The researcher chooses
between similarity and preference data, between disaggregate and aggregate analysis, and
between compositional and decompositional methods. Although earlier MDS programs were
predominantly non-metric in output, contemporary programs provide metric output.
2.2.4 Correspondence Analysis
If we have non-metric data, such as the colours of cars and a classification of car size
(small, medium and large), and we want to position the cars in a perceptual map, then the
technique to be adopted is Correspondence Analysis (CA). It starts with a cross-tabulation
of the two attributes, namely, colour and car size; it then carries out a non-metric to
metric conversion, followed by dimension reduction, and finally the perceptual map is
prepared. CA is well suited to a multivariate representation of interdependence for
non-metric data.
3. Nature of Data
The following table gives a summary of the nature of data:
Name of the Multivariate     Nature of the Data
Technique                    DV                     IV
Canonical Correlation        Metric, Non-metric     Metric, Non-metric
MANOVA                       Metric                 Non-metric
ANOVA                        Metric                 Non-metric
MDA                          Non-metric             Metric
Multiple Regression          Metric                 Metric, Non-metric
Conjoint Analysis            Non-metric, Metric     Non-metric
SEM                          Metric                 Metric, Non-metric
4. Some Generic Tips to Perform Multivariate Analysis
While performing MVA on the research problem, it would help if the researcher observes the
following tips:
1. Ensure that both statistical and practical significance exists in the research being done.
2. The sample size should be adequate, neither undersized nor oversized.
3. Clearly understand the nature of the data.
4. Use a minimum number of variables in the model to obtain the desired results.
5. Identify and eliminate errors.
6. Ensure a foolproof validation of the results.
I hope the above content gives you a fair idea of the existing multivariate techniques
that we will be covering in our course and a snapshot of their applications. For further
learning, may I also suggest the open courseware by Rudin et al. (2011), titled “Statistical
Thinking and Data Analysis”.
Although at the beginning of this discussion, I had suggested the reading of the paper
by Pattusamy and Jacob (2015), throughout the discussion I used examples relating to cars. If
you have understood the application of the discussed MVA tests with the variables in the car
example, you should be able to answer a few fundamental questions relating to data analysis
with respect to the variables in the paper. Here are your challenges.
Self-Assessment:
Suggest appropriate statistical tests to answer the following research
questions. It helps if you also justify your choice of technique.
1. Are men more satisfied with their jobs than women?
2. Does life satisfaction vary with age?
3. Will feelings of work-life balance influence the relationship between job
satisfaction and life satisfaction?
4. Would there be a difference in the strength of the relationship between family
satisfaction and life satisfaction between men and women?
5. Would it be possible to categorize men who are highly and moderately satisfied in
their lives?
References
1. Tabachnick B.G and Fidell L.S, Using Multivariate Statistics, 6th Edition, Pearson Education
Inc, pp. 612-680.
2. Cynthia Rudin, Allison Chang, and Dimitrios Bisias. 15.075J Statistical Thinking and
Data Analysis. Fall 2011. Massachusetts Institute of Technology: MIT
OpenCourseWare, https://ocw.mit.edu. License: Creative Commons BY-NC-SA.
3. Hair J.F, Black W.C, Babin B.J and Anderson R.E, Multivariate Data Analysis, 7th
Edition, Pearson Education (South Asia), pp. 89-149.
4. Murugan Pattusamy and Jayanth Jacob, A test of Greenhaus and Allen (2011) model on
Work Family Balance, Current Psychology, Springer, 2015.
5. Zumbo B.D. (2014) Univariate Tests. In: Michalos A.C. (eds) Encyclopedia of Quality
of Life and Well-Being Research. Springer, Dordrecht