This document discusses standard and hierarchical multiple regression. It provides examples using data on academic achievement (GPA) predicted from minutes spent studying, motivation, and anxiety. Standard multiple regression is used to assess how much variance in GPA is explained collectively by the three predictors; in the example, they explain 65% of the variance in GPA. It also describes interpreting the importance of individual predictors through coefficients such as beta weights. Hierarchical regression is mentioned but not demonstrated.
Correlation — Neeraj Bhandari (Surkhet, Nepal)
The regression coefficients are 0.8 and 0.2.
The coefficient of correlation r is the geometric mean of the two regression coefficients:
√(0.8 × 0.2) = √0.16 = 0.4
Therefore, the value of the coefficient of correlation is 0.4.
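The arithmetic above can be sketched in a few lines of Python (values taken from the example; this assumes both regression coefficients carry the same sign, since r takes the common sign of the pair):

```python
import math

# Regression coefficients from the example above
b_xy = 0.8
b_yx = 0.2

# r is the geometric mean of the two regression coefficients
r = math.sqrt(b_xy * b_yx)
print(r)  # -> 0.4
```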
This presentation covered the following topics:
1. Definition of correlation and regression
2. Meaning of correlation and regression
3. Types of correlation and regression
4. Karl Pearson's method of correlation
5. Bivariate grouped data method
6. Spearman's rank correlation method
7. Scatter diagram method
8. Interpretation of the correlation coefficient
9. Lines of regression
10. Regression equations
11. Difference between correlation and regression
12. Related examples
This document provides an introduction to regression analysis and statistical methods. It explains that regression analysis estimates the linear relationship between dependent and independent variables, and that multiple linear regression allows studying the relationship between one dependent variable and two or more independent variables. The accuracy of regression models can be evaluated using measures like R-squared and tests of overall model significance. Diagnostic tests of assumptions — independence of errors, normality, homoscedasticity, and the absence of multicollinearity and influential outliers — are important.
This document discusses linear regression analysis. It defines simple and multiple linear regression, and explains that regression examines the relationship between independent and dependent variables. The document provides the equations for linear regression analysis, and discusses calculating the slope, intercept, standard error of the estimate, and coefficient of determination. It explains that regression analysis is widely used for prediction and forecasting in areas like advertising and product sales.
The document discusses two common measures of the relationship between two sets of scores: Pearson's Product-Moment Correlation and Spearman's Rho. Pearson's correlation measures the linear relationship between metric variables and involves calculating the covariance between the variables and dividing by the product of their standard deviations. Spearman's Rho measures the monotonic relationship between ordinal or ranked variables and involves calculating the difference between the ranks of each variable and finding the average squared difference. Both measures result in a correlation coefficient between -1 and 1, where values closer to 1 or -1 indicate a strong relationship and values near 0 indicate no relationship.
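The two measures described above can be computed directly from their definitions. The sketch below uses only the standard library; the data in the usage check are hypothetical:

```python
from statistics import mean, pstdev

def pearson(x, y):
    # Covariance of x and y divided by the product of their standard deviations
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (pstdev(x) * pstdev(y))

def ranks(values):
    # Average ranks (1-based), with ties sharing their mean rank
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    # Spearman's rho is Pearson's correlation applied to the ranks
    return pearson(ranks(x), ranks(y))
```

On a monotonic but non-linear pair such as x = [1..5], y = x², Spearman's rho is exactly 1 while Pearson's r falls just short of 1, illustrating the linear-versus-monotonic distinction in the summary above.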
This document provides an introduction to regression and correlation analysis. It discusses simple and multiple linear regression models, how to interpret regression coefficients, and how to check the assumptions and adequacy of regression models. Key aspects covered include computing the regression line using the least squares method, interpreting the slope and intercept, checking the normality of residuals, and examining residual plots to validate the model. The goal of regression analysis is to model the relationship between a dependent variable and one or more independent variables.
This document discusses various types and methods of measuring correlation between two variables. It describes correlation as a statistical tool to measure the degree of relationship between variables. Some key methods covered include scatter diagrams, Karl Pearson's coefficient of correlation, and Spearman's rank correlation coefficient. Positive and negative correlation examples are provided. The document also differentiates between simple, multiple, partial, and total correlation, as well as linear and non-linear correlation.
Correlation is a statistical technique used to determine the degree to which two variables are related. It describes the relationship between two quantitative variables without supporting any inference of causation.
This document provides an overview of regression models and analysis techniques. It introduces simple and multiple linear regression, as well as logistic regression. It discusses assessing regression models, cross-validation, model selection, and using regression models for prediction. Additionally, it covers the similarities and differences between linear and logistic regression, and assessing correlation without inferring causation. Scatter plots, correlation coefficients, and computing regression equations are also summarized.
- A sample is a small group selected from a population to represent that population. Sampling provides benefits like being less time-consuming, less expensive, and allowing results to be repeated.
- There are two main types of samples: probability and non-probability. Probability samples include simple random, systematic, stratified, and cluster samples. Sample size is determined based on factors like the type of study, expected results, costs, and available resources.
- Inferential statistics allow generalization from a sample to a population through hypothesis testing and significance tests. Tests include t-tests, F-tests, chi-squared tests, and correlation/regression to analyze relationships between variables. Significant results suggest that observed differences are unlikely to be due to chance.
Logistic regression vs. logistic classifier: history of the confusion and the... — Adrian Olszewski
Despite the widespread but incorrect claim that "logistic regression is not a regression", it is one of the key regression tools in experimental research, such as clinical trials. It is also used for advanced hypothesis testing.
Logistic regression is part of the GLM (Generalized Linear Model) regression framework. The author expands on this topic here: https://medium.com/@r.clin.res/is-logistic-regression-a-regression-46dcce4945dd
Regression analysis is a statistical technique for predicting a dependent variable based on one or more independent variables. Simple linear regression fits a straight line to the data to predict a continuous dependent variable (y) from a single independent variable (x). The output is an equation of the form y= b0 + b1x + ε, where b0 is the y-intercept, b1 is the slope, and ε is the error. Multiple linear regression extends this to include more than one independent variable. Regression analysis calculates the "best fit" line that minimizes the residuals, or differences between predicted and observed y values.
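The "best fit" line described above can be computed in closed form: b1 = Sxy / Sxx and b0 = ȳ − b1·x̄. A minimal sketch, using hypothetical data that lie exactly on y = 2 + 3x:

```python
from statistics import mean

def fit_line(x, y):
    # Least-squares estimates: b1 = Sxy / Sxx, b0 = ybar - b1 * xbar
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical data on the exact line y = 2 + 3x
b0, b1 = fit_line([1, 2, 3, 4], [5, 8, 11, 14])
print(b0, b1)  # -> 2.0 3.0
```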
The document provides information about performing chi-square tests and choosing appropriate statistical tests. It discusses key concepts like the null hypothesis, degrees of freedom, and expected versus observed values. Examples are provided to illustrate chi-square tests for goodness of fit and comparison of proportions. The document also compares parametric and non-parametric tests, providing examples of when each would be used.
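The observed-versus-expected comparison at the heart of the chi-square test reduces to one sum. A sketch with a hypothetical goodness-of-fit example (60 rolls of a die, expected 10 per face):

```python
def chi_square(observed, expected):
    # Goodness-of-fit statistic: sum of (O - E)^2 / E over all categories
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical die-roll counts; degrees of freedom = 6 - 1 = 5
obs = [8, 12, 9, 11, 10, 10]
exp = [10] * 6
stat = chi_square(obs, exp)
print(stat)  # -> 1.0, well below the 5-df critical value, so no evidence of bias
```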
This document discusses correlation, regression, and issues that can arise when performing regression analysis. It defines correlation and covariance, and how to interpret a scatter plot. It explains how to test for statistical significance of correlation and establish if a linear relationship exists between variables. Simple and multiple linear regression are explained, including assumptions, model construction, and importance of regression coefficients. It discusses how to assess the importance of independent variables in explaining the dependent variable using t-tests, F-tests, R-squared, and adjusted R-squared. Potential issues like heteroskedasticity and multicollinearity are also summarized.
The document defines and provides information about correlation coefficients. It discusses how correlation coefficients measure the strength and direction of linear relationships between two variables. The range of correlation coefficients is from -1 to 1, where values closer to -1 or 1 indicate stronger linear relationships and a value of 0 indicates no linear relationship. It also provides the formula to calculate correlation coefficients and an example of calculating the correlation coefficient for age and blood pressure data.
The document provides an introduction to regression analysis and performing regression using SPSS. It discusses key concepts like dependent and independent variables, assumptions of regression like linearity and homoscedasticity. It explains how to calculate regression coefficients using the method of least squares and how to perform regression analysis in SPSS, including selecting variables and interpreting the output.
This document provides an overview of regression analysis, including what regression is, how it works, assumptions of regression, and how to assess the model fit and check assumptions. Regression allows us to predict a dependent variable from one or more independent variables. Key steps discussed include checking the normality, homoscedasticity and independence of residuals, identifying influential observations, and addressing issues like multicollinearity. Graphical methods like normal probability plots and scatter plots of residuals are presented as ways to check assumptions.
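One of the residual checks mentioned above can be sketched numerically: after a least-squares fit, the residuals sum to zero (up to rounding), and any systematic pattern of residuals against x would suggest non-linearity or heteroscedasticity. The data below are hypothetical:

```python
from statistics import mean

# Hypothetical, roughly linear data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Simple linear fit by least squares
xbar, ybar = mean(x), mean(y)
b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / sum((a - xbar) ** 2 for a in x)
b0 = ybar - b1 * xbar

# Residuals: observed minus fitted values
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Least-squares residuals always sum to (numerically) zero; plotting them
# against x or the fitted values is the graphical check described above.
print(round(sum(residuals), 10))
```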
The document discusses correlation and regression analysis. It provides examples to calculate the simple correlation coefficient (r) between two quantitative variables, finding the correlation between weight and blood pressure using a sample data set. It also explains how to find the regression equation between two variables and use it to predict outcomes. For example, the regression equation is used to predict weight given age using another sample data set.
This document discusses linear correlation and linear regression. It defines linear correlation as showing the linear relationship between two continuous variables, while linear regression is a multivariate technique used when the outcome is continuous that provides slopes. Linear regression assumes a linear relationship between an independent and dependent variable, normally distributed dependent variable values, equal variances, and independence of observations. Least squares estimation is used to calculate the intercept and slope that minimize the squared differences between observed and predicted dependent variable values. The slope's significance can be tested using a t-test.
This document discusses linear correlation and linear regression. It defines linear correlation as showing the linear relationship between two continuous variables, while linear regression analyzes the relationship between a continuous outcome (dependent) variable and one or more independent (predictor) variables. Linear regression finds the line of best fit to model this relationship and estimates coefficients that can be tested for statistical significance. The assumptions of linear regression include a linear relationship between variables, normally distributed errors, homogeneity of variance, and independent observations.
This document discusses linear correlation and linear regression. It defines linear correlation as showing the linear relationship between two continuous variables, while linear regression analyzes the relationship between a continuous outcome (dependent) variable and one or more independent (predictor) variables. Linear regression finds the line of best fit to model this relationship and estimates coefficients that can be used to predict the outcome variable based on the independent variables. Key assumptions of linear regression include a linear relationship between variables, normally distributed errors, homogeneity of variance, and independence of observations. The significance of regression coefficients can be tested using t-tests and the standard error of the coefficients is also discussed.
Slideset: Simple Linear Regression models.ppt — rahulrkmgb09
This document discusses linear correlation and linear regression. It defines linear correlation as showing the linear relationship between two continuous variables, while linear regression is a multivariate technique used when the outcome is continuous that provides slopes. Linear regression assumes a linear relationship between an independent and dependent variable, normally distributed dependent variable values, equal variances, and independence of observations. It estimates a slope and intercept through least squares estimation to minimize the squared distances between observed and predicted dependent variable values. The significance of the estimated slope can be tested using a t-test.
This document discusses linear correlation and linear regression. It defines linear correlation as showing the linear relationship between two continuous variables, while linear regression is a multivariate technique used when the outcome is continuous that provides slopes. Linear regression assumes a linear relationship between an independent and dependent variable, normally distributed errors, equal variances, and independence of observations. The slope is estimated using least squares to minimize the squared differences between observed and predicted values of the dependent variable. Significance of the slope is tested using a t-test.
Lesson 27: Using statistical techniques in analyzing data — mjlobetos
The document discusses statistical techniques for analyzing data, including scatter diagrams, correlation coefficients, regression analysis, and chi-square tests. It provides examples of using scatter diagrams to visualize the relationship between two variables, calculating the Pearson correlation coefficient to determine the strength of linear relationships, and using simple linear regression to find the regression equation that best predicts a dependent variable from an independent variable. It also explains how to perform a chi-square test to analyze relationships between categorical variables by comparing observed and expected frequencies.
This document discusses linear correlation and linear regression. It defines linear correlation as showing the linear relationship between two continuous variables, while linear regression is a multivariate technique used when the outcome is continuous that provides slopes. Linear regression assumes a linear relationship between the predictor and outcome variables, normality of the outcome at each value of the predictor, equal variances of the outcome, and independence of observations. It also discusses calculating the slope and intercept via least squares estimation to find the line that best fits the data by minimizing residuals.
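The slope t-test that several of the summaries above mention has the form t = b1 / SE(b1), where SE(b1) = √(MSE / Sxx) and the statistic is compared against a t distribution with n − 2 degrees of freedom. A sketch on hypothetical data with a strong linear trend:

```python
import math
from statistics import mean

# Hypothetical data with a strong linear trend
x = [1, 2, 3, 4, 5, 6]
y = [1.2, 2.1, 2.8, 4.1, 4.9, 6.2]

n = len(x)
xbar, ybar = mean(x), mean(y)
sxx = sum((a - xbar) ** 2 for a in x)
b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

# Residual sum of squares and mean squared error with n - 2 df
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)

# t-statistic for H0: slope = 0; compare against a t table at n - 2 df
t = b1 / math.sqrt(mse / sxx)
print(round(t, 2))
```

For this data the statistic is large (well above any conventional critical value), so the null hypothesis of zero slope would be rejected.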
In Odoo, making a field required can be done through both Python code and XML views. When you set the required attribute to True in Python code, it makes the field required across all views where it's used. Conversely, when you set the required attribute in XML views, it makes the field required only in the context of that particular view.
A Visual Guide to 1 Samuel | A Tale of Two HeartsSteve Thomason
These slides walk through the story of 1 Samuel. Samuel is the last judge of Israel. The people reject God and want a king. Saul is anointed as the first king, but he is not a good king. David, the shepherd boy is anointed and Saul is envious of him. David shows honor while Saul continues to self destruct.
1. 1
psyc3010 lecture 8
standard and hierarchical multiple regression
last week: correlation and regression
Next week: moderated regression
2. 2
last week we revised correlation & regression
and took a look at some of the underlying
principles of these methods [partitioning
variance into SS regression (Ŷ - Y) and SS
residual (Y - Ŷ).]
We extended these ideas to the multiple
predictor case (multiple regression) and
touched upon indices of predictor importance
this week we go through two full examples of
multiple regression
– standard regression
– hierarchical regression
last week this week
3. 3
Indices of predictor importance:
r [Pearson or zero-order correlation] – a scale free measure of
association – the standardised covariance between two factors
r2 [the coefficient of determination] – the proportion of variability in one
factor (e.g., the DV) accounted for by another (e.g. an IV).
b [unstandardised slope or unstandardised regression coefficient] – a
scale dependent measure of association, the slope of the regression line –
the change in units of Y expected with a 1 unit increase in X
β [standardised slope or standardised regression coefficient] – a scale
free measure of association, the slope of the regression line if all variables
are standardised – the change in standard deviations in Y expected with a
1 standard deviation increase in X, controlling for all other predictors. β = r
in bivariate regression (when there is only one IV).
pr2 [partial correlation squared] – a scale free measure of association
controlling for other IVs -- the proportion of residual variance in the DV
(after other IVs are controlled for) uniquely accounted for by the IV.
sr2 [semi-partial correlation squared] – a scale free measure of
association controlling for other IVs -- the proportion of total variance in
the DV uniquely accounted for by the IV.
4. 4
Comparing the different rs
The zero-order (Pearson’s) correlation between IV
and DV ignores extent to which IV is correlated with
other IVs.
The semi-partial correlation deals with unique effect of
IV on total variance in DV – usually what we are
interested in.
– Conceptually similar to eta squared
– Confusion alert: in SPSS the semi-partial r is called the part
correlation. No one else does this though.
The partial correlation deals with unique effect of the IV
on residual variance in DV. More difficult to interpret –
most useful when other IVs = control variables.
– Conceptually similar to 'partial eta squared'
Generally r ≥ sr and pr ≥ sr
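For the two-predictor case, sr and pr have closed forms in terms of the zero-order correlations. The sketch below uses made-up correlations (not the lecture data) to illustrate why pr is generally at least as large as sr: pr divides by an extra factor that is at most 1.

```python
import math

def semi_partial(r_y1, r_y2, r_12):
    """sr for X1: correlation of Y with the part of X1 independent of X2."""
    return (r_y1 - r_y2 * r_12) / math.sqrt(1 - r_12 ** 2)

def partial_r(r_y1, r_y2, r_12):
    """pr for X1: correlation of Y with X1, with X2 removed from both."""
    return (r_y1 - r_y2 * r_12) / (
        math.sqrt(1 - r_y2 ** 2) * math.sqrt(1 - r_12 ** 2))

# hypothetical correlations, for illustration only
r_y1, r_y2, r_12 = 0.60, 0.50, 0.40
sr = semi_partial(r_y1, r_y2, r_12)
pr = partial_r(r_y1, r_y2, r_12)
assert abs(pr) >= abs(sr)   # pr's denominator has an extra factor <= 1
```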
7. 7
the linear model – one predictor
(2D space)
[figure: regression line, X-axis = predictor (X), Y-axis = criterion (Y)]
Ŷ = bX + a
8. 8
the linear model – two predictors
(3D space)
[figure: regression plane in 3D space, criterion (Y) on the vertical axis]
Ŷ = b1X1 + b2X2 + a
9. 9
the linear model – 2 predictors
criterion scores are predicted using the best
linear combination of the predictors
– similar to the line-of-best-fit idea, but it becomes the
plane-of-best-fit
– equation derived according to the least-squares
criterion – such that Σ(Y – Ŷ)2 is minimized
• b1 is the slope of the plane relative to the X1 axis,
• b2 is the slope relative to the X2 axis,
• a is the point where the plane intersects the Y axis (when X1
and X2 are equal to zero)
the idea extends to 3+ predictors but becomes
tricky to represent graphically (i.e., hyperspace)
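The plane-of-best-fit idea can be sketched numerically: ordinary least squares with two predictors is a linear solve. Below, synthetic points (not the lecture data) are generated exactly on a plane, so the fitted b1, b2 and a recover it.

```python
import numpy as np

# synthetic points generated exactly on the plane Y = 2*X1 - 0.5*X2 + 3
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0, 2.0])
y = 2 * x1 - 0.5 * x2 + 3

# design matrix with a column of ones for the intercept a
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # minimises sum of (Y - Yhat)^2
a, b1, b2 = coef
assert np.allclose([a, b1, b2], [3.0, 2.0, -0.5])
```

With real (noisy) data the fitted plane would not pass through every point; least squares then picks the b values that minimise the squared vertical deviations, exactly as described above.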
10. 10
example
new study...examine the amount of variance in
academic achievement (GPA) accounted for
by…
– Minutes spent studying per week (questionnaire measure)
– motivation (questionnaire measure)
– anxiety (questionnaire measure)
can use multiple regression to assess how much
variance the predictors explain as a set (R2)
can also assess the relative importance of each
predictor (r, b, β, pr2, sr2).
12. 12
preliminary statistics
Descriptive Statistics
            Mean    SD      N   alpha
study time  97.967   8.915  30  .88
motivation  14.533   4.392  30  .75
anxiety      4.233   1.455  30  .85
GPA         75.533  15.163  30  .82
Correlations
            ST    MOT   ANX   GPA
study time  1.00
motivation  .313  1.00
anxiety     .256  .536  1.00
GPA         .637  .653  .505  1.00
13. 13
preliminary statistics
(descriptive statistics and correlations as on slide 12)
means and standard deviations are used to obtain
regression estimates, and are reported as
preliminary stats when one conducts MR. They
are needed to interpret coefficients, although
descriptively they are not as critical in MR as they
are for t-tests and ANOVA
14. 14
preliminary statistics
(descriptive statistics and correlations as on slide 12)
Cronbach's α is an index of internal
consistency (reliability) for a continuous
scale
best to use scales with high reliability
(α > .70) if available – less error variance
15. 15
preliminary statistics
(descriptive statistics and correlations as on slide 12)
the correlation matrix tells you the extent to
which each predictor is related to the criterion
(called validities), as well as intercorrelations
among predictors (collinearities).
to maximise R2 we want predictors that have high
validities and low collinearity
19. 19
regression
solution
calculation for multiple regression requires the solution of a set of
parameters (one slope for each predictor – b values)
E.g. with 2 IVs, the 2 slopes define the plane-of-best-fit that goes
through the 3-dimensional space described by plotting the DV
against each IV
Pick bs so that deviations of dots from the plane are minimized
– these weights are derived through matrix algebra - beyond the scope
of this course
understand how with one variable, we model Y hat with a line
described by 2 parameters (bX + a);
with two, model Y hat as a plane described by 3 parameters
(b1X1 + b2X2 + a)
with p predictors, model Yhat as a p-dimensional hyperspace blob
with p + 1 parameters (constant, and a slope for each IV).
So Ŷ, the predicted value of Y, is modeled with a linear composite
formed by multiplying each predictor by its regression weight /
slope / coefficient (just like a linear contrast) and adding the
constant:
Ŷ = .79ST + 1.45MOT + 1.68ANX – 95.02
the criterion (GPA) is regressed on this linear composite
Ŷ = b1X1 + b2X2 + a
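Evaluating the linear composite is just arithmetic; a minimal sketch using the slide's rounded coefficients for a hypothetical student (the input values below are made up, not from the data set):

```python
def predict_gpa(st, mot, anx):
    """Yhat from the fitted composite, using the slide's rounded coefficients."""
    return 0.79 * st + 1.45 * mot + 1.68 * anx - 95.02

# hypothetical student: 100 minutes of study, motivation 15, anxiety 4
yhat = predict_gpa(100, 15, 4)   # 79.00 + 21.75 + 6.72 - 95.02 = 12.45
assert abs(yhat - 12.45) < 1e-6
```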
20. 20
the linear model – two predictors
(3D space)
[figure: regression plane in 3D space, criterion (Y) on the vertical axis]
Ŷ = b1X1 + b2X2 + a
22. 22
the linear composite
[figure: scatter of the criterion against Ŷ, the linear composite]
…so we end up with two overlapping variables just like
in bivariate regression (only one is blue and weird and wibbly,
graphically symbolising that underlying the linear relationship
between the DV and Ŷ, the linear composite, is a 4-dimensional
space defined by the 3 IVs and the DV)
23. 23
the model: R and R2
Despite the underlying complexity, the multiple
correlation coefficient (R) is just a bivariate correlation
between the criterion (GPA) and the best linear
combination of the predictors (Ŷ)
i.e., R2 = r2YŶ
where Ŷ = .79ST + 1.45MOT + 1.68ANX – 95.02
accordingly, we can treat the model R exactly like r, ie:
i. calculate R adjusted: R2adj = 1 – (1 – R2)(N – 1)/(N – 2)
ii. square R to obtain amount of variance accounted for in Y by
our linear composite (Ŷ)
iii. test for statistical significance
24. 24
the model: R and R2
1. In this example, R = .81
– so R adj = √(1 – (1 – .65)(30 – 1)/(30 – 2))
= .798
2. R2 = .65 (.638 adjusted)
“…therefore, 65% of the variance in participants’
GPA was explained by the combination of their
study time, motivation, and anxiety.”
25. 25
3. The overall model (R2) is tested for significance –
– H0 – the relationship between the predictors (as a group) and
the criterion is zero
– H1 – the relationship between the predictors (as a group) and
the criterion is different from zero
the model: R and R2
F = R2(N – p – 1) / [p(1 – R2)]
  = .6518(30 – 3 – 1) / [3(1 – .6518)]
  = 16.23
df = p, N – p – 1
Reminder of t-test for r:
t = r√(N – 2) / √(1 – r2)
26. 26
Test of R2
(analysis of regression)
F = [R2 / p] / [(1 – R2) / (N – p – 1)]    df = p, N – p – 1
  = (variance accounted for / df) / (variance not accounted for (error) / df)
  = MS REGRESSION / MS RESIDUAL
numerator = what we know (can account for); denominator = what we don’t know (can’t account for)
27. 27
Or perform same test via analysis of regression:
SSY = SSRegression + SSResidual
SSY = Σ(Y – Ȳ)2 = (Y1 – 75.533)2 + (Y2 – 75.533)2 + …
= 6667.46
SSRegression = Σ(Ŷ – Ȳ)2 = (Ŷ1 – 75.533)2 + …
= 4346.03
SSResidual = Σ(Y – Ŷ)2
= SSY – SSRegression = 6667.46 – 4346.03
= 2321.43
the model: R and R2
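The two routes to F (the R2 ratio on the previous slide and the MS ratio from the SS partition here) are algebraically the same thing; a quick check with the slide's sums of squares:

```python
n, p = 30, 3
ss_y, ss_reg, ss_res = 6667.46, 4346.03, 2321.43

r2 = ss_reg / ss_y                                   # .65 (R2)
f_from_r2 = (r2 / p) / ((1 - r2) / (n - p - 1))      # R2 route
f_from_ms = (ss_reg / p) / (ss_res / (n - p - 1))    # MS regression / MS residual

assert abs(f_from_r2 - f_from_ms) < 1e-9   # identical up to rounding
assert 16.2 < f_from_ms < 16.25            # the 16.23 in the summary table
```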
28. 28
Summary Table for Analysis of Regression:
the model: R and R2
df = p, N – p – 1
Model        Sums of Squares   df   Mean Square     F     sig
Regression        4346.03       3      1448.68    16.23   .000
Residual          2321.43      26        89.29
Total             6667.46      29
The model including study time, motivation, and
anxiety accounted for significant variation in
participants’ GPA, F(3, 26) = 16.23, p < .001, R2 = .65.
30. 30
individual predictors
we already have our bivariate correlations (r)
between each predictor and the criterion. In
addition, SPSS gives us:
– b – (unstandardised) partial regression coefficient
– β – standardised partial regression coefficient
– pr – partial correlation coefficient
– sr – semi-partial correlation coefficient
(as the calculations for these are all matrix algebra we
will bypass that….)
31. 31
pr – partial correlation coefficient
[Venn diagram: pr2 = overlap of predictor p with the criterion, relative to the criterion variance left over by the other predictors]
pr is the correlation between predictor p
and the criterion, with the variance shared
with the other predictors partialled out
Can write r01.2 [partial r between 0 and 1
excluding shared variance with 2]
pr2 indicates the proportion of residual
variance in the criterion (DV variance left
unexplained by the other predictors) that is
explained by predictor p
prST = .581; prST2 = 33.7%
prMOT = .562; prMOT2 = 31.5%
prANX = .293; prANX2 = 8.5%
32. 32
sr – semi-partial correlation coefficient
[Venn diagram: sr2 = unique overlap of predictor p with the criterion, relative to total criterion variance]
sr is the correlation between predictor p
and the criterion, with the variance shared
with the other predictors partialled out
of predictor p
Can write r0(1.2) [partial r between 0 and (1
excluding 2)]
sr2 indicates the unique contribution to the
total variance in the DV explained by
predictor p
srST = .469; srST2 = 21.9%
srMOT = .411; srMOT2 = 16.9%
srANX = .224; srANX2 = 5%
shared variance ≈ 21% (R2 – Σsr2, 65% – 44%)
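The Venn descriptions translate directly into residualised correlations: sr is the correlation of the intact DV with the residualised predictor, while pr residualises the DV as well. A sketch on synthetic data (any numbers would do; these are not the lecture data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)             # predictors correlated
y = 0.5 * x1 + 0.4 * x2 + rng.normal(size=n)

def residualise(target, other):
    """Residuals of target after regressing it on other (plus an intercept)."""
    X = np.column_stack([np.ones_like(other), other])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ coef

x1_res = residualise(x1, x2)
sr = np.corrcoef(y, x1_res)[0, 1]                    # semi-partial: DV left whole
pr = np.corrcoef(residualise(y, x2), x1_res)[0, 1]   # partial: DV residualised too
assert abs(pr) >= abs(sr)   # pr scales by the smaller residual SD of the DV
```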
33. 33
Ŷ = b1X1 + b2X2 + a
Ŷ = bY1.2X1 + bY2.1X2 + a
bY1.2 first-order coefficient
bY1.2 ≠ bY1 unless r12 = 0
Ŷ = b1X1 + b2X2 + b3X3 + a
Ŷ = bY1.23X1 + bY2.13X2 + bY3.12X3 + a
bY1.23 second-order coefficient
Zero-order coefficient – doesn't
take other IVs into account
First-order coefficient – takes
1 other IV into account
Second-order coefficient – takes 2 other IVs into account
All reported coefficients (e.g. in SPSS) are highest order coefficients
34. 34
tests of bs:
test importance of the predictor in the context of all
the other predictors
divide b by its standard error. df = N – p – 1
tb1 = .789247 / .208497 = 3.785* (ST)
tb2 = 1.453540 / .484508 = 3.000* (MOT)
tb3 = 1.678871 / 1.437221 = 1.168 (ANX)
ST contributes significantly to prediction of DV, after controlling for the other
predictors, and so does MOT
though a valid zero-order predictor of DV, anx does not contribute to the
prediction, given ST and MOT
35. 35
Importance of predictors
can't rely on rs (zero-order), because the
predictors are interrelated
(predictor with a significant r may contribute nothing, once others are
included; e.g., ANX)
partial regression coefficient (bs):
adjusted for correlation of the predictor with the other
predictors
but
can't use relative magnitude of bs, because scale-
bound
(importance of a given b depends on unit and variability of measure)
36. 36
Standardized regression
coefficients (βs):
rough estimate of relative contribution of
predictors, because they use the same metric
can compare βs within a regression
equation (but not necessarily across
groups & settings – in which the standard
deviations of variables change)
37. 37
Standardized regression
coefficients:
β1 = b1(s1 / sY)
when IVs are not correlated:
β = r
when IVs are correlated:
βs (magnitudes, signs) are affected by pattern of
correlations among the predictors
ZY = β1Z1 + β2Z2 + β3Z3 + ... + βpZp
ZY = .46 ZST + .42 ZMOT + .16 ZANX
a one-SD increase in ST (with all other variables held
constant) is associated with an increase of .46 SDs in DV
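Using the unstandardised bs from the coefficients output together with the SDs from the preliminary statistics (study time 8.915, motivation 4.392, anxiety 1.455, GPA 15.163), the β = b(s1/sY) formula reproduces the standardised weights:

```python
def standardise(b, s_x, s_y):
    """beta = b * (SD of predictor / SD of criterion)."""
    return b * s_x / s_y

s_gpa = 15.163
assert abs(standardise(0.789247, 8.915, s_gpa) - 0.46) < 0.01   # ST
assert abs(standardise(1.453540, 4.392, s_gpa) - 0.42) < 0.01   # MOT
assert abs(standardise(1.678871, 1.455, s_gpa) - 0.16) < 0.01   # ANX
```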
39. 39
standard
– all predictors are entered simultaneously
– each predictor is evaluated in terms of what it adds
to prediction beyond that afforded by all others
– most appropriate when IVs are not intercorrelated
hierarchical
– predictors are entered sequentially in a
pre-specified order
– each predictor is evaluated in terms of what it adds
to prediction at its point of entry
– order of prediction based upon logic/theory
standard vs hierarchical regression
40. 40
standard vs hierarchical
multiple regression
[diagram: all IVs entered together, jointly predicting the criterion]
standard multiple regression:
•Model R2 assessed in 1 step
•b for each IV based on unique contribution only
41. 41
standard vs hierarchical
multiple regression
[diagram: IV1 entered in block 1 (step 1), IV2 added in block 2 (step 2), both predicting the criterion]
hierarchical multiple regression:
• Model R2 assessed in > 1 step
• Each step (“block”) adds more IVs
• b for first IV based on total contribution; later IVs on unique contribution
42. 42
some rationales for order of entry:
1. to partial out the effect of a control variable not of interest to
the study
– exactly the same idea as ancova – your 'covariate' in this
case is the predictor entered at step 1
2. to build a sequential model according to some theory
– e.g., broad measure of personality entered at step 1, more
specific/narrow attitudinal measure entered at step 2
order of entry is crucial to outcome and interpretation
predictors can be entered singly or in blocks of >1
now we will have an R, R2, b, β, pr2, sr2 for EACH step
to report
also test increment in prediction at each block:
– R2 change
– F change
hierarchical regression
45. 45
testing hierarchical models
f = full(er) model [with more variables added]
r = reduced model
Fchange = [(R2f – R2r) / (pf – pr)] / [(1 – R2f) / (N – pf – 1)]
df = pf – pr, N – pf – 1
R2 change = R2f – R2r
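Plugging in the GPA example used later in the lecture (step 1: anxiety, R2 = .255; step 2 adds study time and motivation, R2 = .652; N = 30) gives the F change that SPSS reports for block 2:

```python
def f_change(r2_full, r2_reduced, p_full, p_reduced, n):
    """Test of the R2 increment from the reduced to the fuller model."""
    num = (r2_full - r2_reduced) / (p_full - p_reduced)
    den = (1 - r2_full) / (n - p_full - 1)
    return num / den

f = f_change(0.652, 0.255, 3, 1, 30)   # df = 2, 26
assert abs(f - 14.83) < 0.05
```

SPSS prints 14.836 because it works from unrounded R2 values; the rounded inputs here land within a few thousandths of that.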
47. 47
suppose we wanted to repeat our GPA study using
hierarchical regression…
further suppose our real interest was motivation and
study time, we just wanted to control for anxiety:
– enter anxiety at step 1
– enter motivation and study time at step 2
preliminary statistics would be same as before
model would be assessed sequentially
– step 1 – prediction by anxiety
– step 2 – prediction by motivation and study time
above and beyond that explained by anxiety
an example:
48. 48
model summary
                                   change statistics
Model   R     R2    R2adj   R2 ch    F ch    df1   df2   sig F ch
1      .505  .255   .228    .255    9.584     1    28    .004
2      .813  .652   .612    .397   14.836     2    26    .000
for model 1 – R and R2 are the same as the bivariate
r between GPA and anxiety (as anxiety is the only
variable in the model).
49. 49
model summary
(table as on slide 48)
here R2 ch is just the same as R2 because it
simply reflects the change from zero.
50. 50
model summary
(table as on slide 48)
for model 2 – R and R2 are the same as our full
standard multiple regression conducted earlier.
51. 51
model summary
(table as on slide 48)
R2 ch tells us that by including study time and
motivation we increase the amount of variance
accounted for in GPA by 40%
(this is the critical bit!)
52. 52
model summary
(table as on slide 48)
alternatively, R2 ch tells us that after controlling
for anxiety, study time and motivation explain
40% of the variance in GPA
53. 53
model summary
(table as on slide 48)
… and F ch tells us that this increment in the
variance accounted for is significant
(null hyp: R2 ch = 0)
54. 54
Summary Table for Analysis of Regression:
anova (df = p, N – p – 1)
Model            Sums of Squares   df   Mean Square     F      sig
1  Regression        1702.901       1     1702.901    9.584   .004
   Residual          4964.567      28      177.306
   Total             6667.46       29
2  Regression        4346.03        3     1448.68    16.23    .000
   Residual          2321.43       26       89.29
   Total             6667.46       29
details for model 1 are just the same as those
reported in the change statistics section on the
previous page (as the change was relative to zero)
55. 55
Summary Table for Analysis of Regression:
anova
(table as on slide 54)
details for model 2 test the overall significance of the
model (and are therefore exactly the same as we
would get if we had done a standard regression)
56. 56
coefficients
Model            B        SE      β       t      sig
1  constant   -80.233   7.595           7.009   .000
   ANX          5.268   1.700   .505    3.009   .004
2  constant   -95.02
   ANX          1.678   1.437   .16     1.168   .253
   ST            .789    .208   .46     3.785   .000
   MOT          1.453    .484   .42     3.000   .005
model 1 shows the coefficients for anxiety as the
predictor of GPA (i.e., the variables included at step 1)
57. 57
coefficients
(table as on slide 56)
model 2 is identical to the coefficients table we
would get in standard multiple regression if all
predictors were entered simultaneously
58. 58
summary of results
step           R2      F        R2 ch    F ch
1  ANX        .255    9.604*    .255     9.584*
2  ST, MOT    .651   16.23*     .397    14.836*
59. 59
some uses for hierarchical multiple
regression (HMR)
to control for nuisance variables
– as we have done now
– logic is same as for ancova
to test mediation (briefly covered next week)
to test moderated relationships (interactions)
Ŷ = b1X1 + b2X2 + b3X1X2 + a
60. 60
Difference between structure of
Standard and Hierarchical MR tests
Hierarchical Multiple Regression:
1. Tests overall model automatically
2. Tests each Block (subgrouping of
variables) separately (2 sets of Fs)
3. Tests unique effect of each IV for
variables in this block and earlier – but
βs don't exclude overlapping variance
with variables in later blocks
4. Does not test for interactions
automatically – but use HMR to test
manually (moderated MR next week)
5. Report each block R2 change with F
test, plus IVs' βs with t-tests from each
block as entered, plus final model R2
with F test, plus relevant follow-ups.
6. Depending on theory may or may not
report betas for IVs from earlier blocks
again if they change in later blocks
- usually not if early block = control
- definitely yes if mediation test
Standard Multiple Regression:
1. Tests overall model R2
automatically
2. Does not test
subgroupings of variables
(Blocks)
3. Tests unique effect of
each IV (i.e., covariation
of residual DV scores with
IV once all other IVs'
effects are controlled
(partialled out))
4. Does not test for
interactions automatically
5. Report Model R2 with F
test, plus each IV's βs
with t-tests, plus relevant
follow-ups
61. 61
multicollinearity and singularity
– this condition occurs when predictors are highly correlated
(> .80–.90)
– diagnosed with high intercorrelations of IVs (collinearities) and a
statistic called tolerance
– tolerance = 1 – R2x
– R2x is the overlap between a particular predictor and all the
other predictors
• low tolerance = multicollinearity / singularity
• high tolerance = relatively independent predictors
– multicollinearity leads to unstable estimation of regression
coefficients (b), even though R2 may be significant
Some additional info about suppressor variables,
handling missing data, and cross-validation is provided
in the “Practice Materials” section of the web site
some issues in SMR & HMR
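Tolerance can be sketched directly from the definition: regress each predictor on all the others and take 1 – R2x. Below, synthetic data (illustrative only) with one predictor built to be nearly a copy of another:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.05 * rng.normal(size=n)   # nearly a copy of x1 -> collinear

def tolerance(target, others):
    """1 - R2 of one predictor regressed on all the other predictors."""
    X = np.column_stack([np.ones(len(target))] + others)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    r2_x = 1 - resid.var() / target.var()
    return 1 - r2_x

assert tolerance(x3, [x1, x2]) < 0.1   # collinear predictor: low tolerance
assert tolerance(x2, [x1, x3]) > 0.9   # independent predictor: high tolerance
```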
62. 62
assumptions of multiple regression
distribution of residuals
– normality: the conditional distribution of Y values is normal
around Ŷ (assumption of normality in arrays)
– homoscedasticity: the variance of Y values is constant across
different values of Ŷ (assumption of “homogeneity of variance in
arrays”)
– linearity: no systematic relationship between Ŷ and the errors of
prediction
– independence of errors
scales (predictor and criterion scores)
– normality (variables are normally distributed), linearity (there is
a straight line relationship between predictors and criterion)
– predictors are not singular (extremely highly correlated)
– measured using a continuous scale (interval or ratio)
63. 63
In class next week:
Moderated multiple regression
Assignment 2
In the tutes:
This week: Multiple regression, SPSS
In 2 weeks: Moderated regression, SPSS
readings:
Howell Ch 15
Field Ch 5