This document discusses correlation, defined as the statistical relationship between two variables in which changes in one variable are associated with corresponding changes in the other. It describes the main types of correlation: positive, negative, simple, partial, and multiple. Methods for studying correlation are also outlined, including scatter diagrams and Karl Pearson's coefficient of correlation (denoted r), which quantifies the strength and direction of the linear relationship between two variables on a scale from -1 to 1. The coefficient of determination (r²) is also introduced; it expresses the proportion of variance in one variable that is predictable from the other.
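The coefficient r and the coefficient of determination r² described above can be illustrated with a short sketch (pure Python; the helper name `pearson_r` and the data are ours, not from the document):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]       # perfectly linear in x, so r is 1
r = pearson_r(x, y)
r_squared = r ** 2          # proportion of variance explained
```

Because y here is an exact linear function of x, both r and r² come out at 1; real data would fall somewhere between the extremes.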
The document discusses the concept of correlation, specifically linear correlation. It provides definitions of correlation from various sources and explains that correlation refers to the relationship between two or more variables. The degree of this relationship is measured by the correlation coefficient. Common types of correlation are discussed such as positive and negative correlation. Methods for studying correlation are also outlined, including scatter diagrams and Karl Pearson's coefficient of correlation.
The document discusses correlation and regression analysis. It defines correlation as the statistical relationship between two variables, where a change in one variable corresponds to a change in the other. The key types of correlation are positive and negative; simple, partial, and multiple; and linear and non-linear. Regression analysis establishes the average relationship between an independent and a dependent variable in order to predict or estimate values of the dependent variable from the independent variable. Methods for studying correlation include scatter diagrams and Karl Pearson's coefficient of correlation, while regression analysis uses equations to model the linear relationship between variables.
Correlation analysis examines the relationship between two or more variables. Positive correlation means the variables increase together, while negative correlation means they change in opposite directions. The Pearson correlation coefficient, r, quantifies the strength of linear correlation between -1 and 1. Multiple correlation analysis extends this to measure the correlation between one dependent variable and multiple independent variables. It is useful but assumes linear relationships and can be complex to calculate.
This document is a presentation by Dwaiti Roy on partial correlation. It begins with an acknowledgement section thanking various professors and resources that helped in preparing the presentation. It then provides definitions and explanations of key concepts related to partial correlation such as correlation, assumptions of correlation, coefficient of correlation, coefficient of determination, variates, partial correlation, assumptions and hypothesis of partial correlation, order and formula of partial correlation. Examples are provided to illustrate partial correlation. The document concludes with references and suggestions for further reading.
Correlation and regression analysis are statistical tools used to analyze relationships between variables. Correlation measures the strength and direction of association between two variables on a scale from -1 to 1. Regression analysis uses one variable to predict the value of another and draws a best-fit line to represent their relationship. There are always two lines of regression: one showing the regression of x on y and the other showing the regression of y on x. The regression coefficients determine the slope and intercept of each line and can be used to estimate unknown values of one variable from known values of the other.
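The two regression lines mentioned above can be sketched minimally: the slope of y on x is cov(x, y)/var(x), the slope of x on y is cov(x, y)/var(y), and the product of the two slopes equals r² (a classic identity). Names and data are illustrative:

```python
def regression_slopes(x, y):
    """Slopes of the two regression lines: y on x (b_yx) and x on y (b_xy)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    return cov / var_x, cov / var_y

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
b_yx, b_xy = regression_slopes(x, y)
r_squared = b_yx * b_xy     # product of the two slopes equals r**2
```

For this data both slopes equal 0.9, so r² = 0.81 and r = 0.9.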
1. The document discusses correlation and correlation coefficients, which measure the strength and direction of association between two variables.
2. A correlation coefficient ranges from -1 to 1: 0 indicates no correlation, while 1 or -1 indicates perfect positive or negative correlation. Coefficients above about 0.5 in absolute value are often taken to indicate a strong linear relationship.
3. The Pearson correlation coefficient (r) specifically measures the linear correlation between two normally distributed variables, while the Spearman correlation (rs) is nonparametric and assesses correlation between ordinal or non-normally distributed variables.
4. Correlation only indicates association, not causation. Significant correlation is also not necessarily clinically meaningful. Correlation coefficients and their statistical significance must be interpreted carefully.
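The Pearson/Spearman contrast in points 3-4 can be demonstrated on monotonic but non-linear data, where Spearman's coefficient (Pearson applied to ranks) reaches 1 while Pearson's stays below it. A small illustrative sketch, assuming no tied values:

```python
def ranks(v):
    """1-based rank positions; assumes no ties for simplicity."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) *
           sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]                    # monotonic but non-linear
pearson_r = pearson(x, y)                  # below 1: curvature weakens the linear fit
spearman_rs = pearson(ranks(x), ranks(y))  # exactly 1: the orderings agree perfectly
```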
Discriminant analysis (DA) is a statistical technique used to predict group membership when the dependent variable is categorical and the independent variables are continuous. It identifies which variables discriminate between two or more naturally occurring groups. DA develops a linear equation to predict group membership based on weighted combinations of predictor variables. It aims to maximize the distance between group means to achieve strong discriminatory power. Like regression, DA assumes variables are normally distributed, cases are randomly sampled, and groups are mutually exclusive and collectively exhaustive. It requires at least two groups with minimal overlap and similar group sizes of at least five cases. DA can classify new cases into groups based on the discriminant functions derived from existing data.
The document discusses correlation analysis and different types of correlation. It defines correlation as the linear association between two random variables. There are three main types of correlation:
1) Positive vs negative vs no correlation based on the relationship between two variables as one increases or decreases.
2) Linear vs non-linear correlation based on the shape of the relationship when plotted on a graph.
3) Simple vs multiple vs partial correlation based on the number of variables.
The document also discusses methods for studying correlation, including scatter plots, Karl Pearson's coefficient of correlation r, and Spearman's rank correlation coefficient. It provides interpretations of the correlation coefficient r and the coefficient of determination r².
Correlation and Regression Analysis using SPSS and Microsoft Excel, by Setia Pramana
This document discusses correlation and linear regression analysis. It covers correlation coefficients, linear relationships between variables, assumptions of linear regression, and using SPSS and Excel to conduct correlation and regression analyses. Pearson and Spearman correlation coefficients are introduced as measures of the linear association between two continuous variables. Simple and multiple linear regression models are explained as tools to predict an outcome variable from one or more predictor variables.
Linear regression and correlation analysis ppt @ bec doms, by Babasab Patil
This document introduces linear regression and correlation analysis. It discusses calculating and interpreting the correlation coefficient and linear regression equation to determine the relationship between two variables. It covers scatter plots, the assumptions of regression analysis, and using regression to predict and describe relationships in data. Key terms introduced include the correlation coefficient, linear regression model, explained and unexplained variation, and the coefficient of determination.
Fundamental of Statistics and Types of Correlations, by Rajesh Verma
This document provides an overview of key concepts in statistics, including parametric vs non-parametric statistics, descriptive vs inferential statistics, types of errors, significance levels, correlation, and different correlation coefficients. Parametric statistics rely on assumptions of normal distribution while non-parametric statistics do not. Descriptive statistics describe data and inferential statistics draw conclusions. Type I and II errors occur when the null hypothesis is incorrectly rejected or incorrectly not rejected. Significance levels such as 0.05 are used to determine statistical significance. Correlation measures the relationship between variables on a scale from -1 to 1. Different coefficients, such as Pearson, Spearman, and Kendall's tau, are used depending on the scale of measurement and data distribution.
Overviews non-parametric and parametric approaches to (bivariate) linear correlation. See also: http://en.wikiversity.org/wiki/Survey_research_and_design_in_psychology/Lectures/Correlation
Chapter 16: Correlation
(enhanced by VisualBee), by nunngera
Correlation is a statistical method used to measure the relationship between two variables. A relationship exists when changes in one variable are accompanied by consistent changes in the other. A correlation evaluates the direction, form, and degree of the relationship. The Pearson correlation specifically measures the direction and strength of a linear relationship between two numerical variables. Other correlational methods like Spearman and point-biserial correlations can be used for ordinal or dichotomous variable relationships.
Correlation and regression are statistical techniques used to analyze relationships between variables. Correlation determines the strength and direction of a relationship, while regression describes the linear relationship to predict changes in one variable based on changes in another. There are different types of correlation including simple, multiple, and partial correlation. Regression analysis determines the regression line that best fits the data to estimate values of one variable based on the other. The correlation coefficient measures the strength of linear correlation from -1 to 1, while regression coefficients are used to predict changes in the variables.
This document discusses correlation analysis and its various types. Correlation is the degree of relationship between two or more variables. There are three stages to solve correlation problems: determining the relationship, measuring significance, and establishing causation. Correlation can be positive, negative, simple, partial, or multiple depending on the direction and number of variables. It is used to understand relationships, reduce uncertainty in predictions, and present average relationships. Conditions like probable error and coefficient of determination help interpret correlation values.
The document discusses correlation and linear regression. It defines Pearson and Spearman correlation as statistical techniques to measure the relationship between two variables. Pearson correlation measures the linear association between interval variables, while Spearman correlation measures statistical dependence between two variables using their rank order. Linear regression finds the best fit linear relationship between a dependent and independent variable to predict changes in one based on the other. The key assumptions and interpretations of correlation coefficients and regression lines are also covered.
Pearson Correlation, Spearman Correlation & Linear Regression, by Azmi Mohd Tamil
This document discusses correlation and linear regression. It defines correlation as a statistic that measures the strength and direction of the linear relationship between two continuous variables. Positive correlation indicates that as one variable increases, so does the other. Negative correlation means the variables are inversely related. Linear regression can be used to predict a continuous outcome variable based on a continuous predictor variable using the regression equation y=a+bx. The regression line minimizes the sum of squared differences between the data points and the line. The slope coefficient b indicates the strength of the linear prediction and can be tested for significance.
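The regression equation y = a + bx and the least-squares criterion described above can be sketched as follows (illustrative data; the fitted line minimizes the sum of squared residuals and passes through the point of means):

```python
def least_squares(x, y):
    """Fit y = a + b*x by minimizing the sum of squared residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx             # the line passes through (mean x, mean y)
    return a, b

hours = [1, 2, 3, 4]
score = [3, 5, 7, 9]            # exactly score = 1 + 2*hours
a, b = least_squares(hours, score)
predicted = a + b * 5           # predict the outcome for x = 5
```

With this data the fit recovers a = 1 and b = 2 exactly, so the prediction for x = 5 is 11.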
This is about correlation analysis in statistics. It covers the types and importance of correlation, the scatter diagram method, Karl Pearson's correlation coefficient, and Spearman's rank correlation coefficient.
This document provides an overview of correlation and regression analysis concepts. It defines correlation as the strength of relationship between two variables and discusses perfect, positive, negative, and no relationships. Pearson's correlation coefficient r is described as a measure of linear correlation between -1 and +1, with values closer to these extremes indicating a stronger linear relationship. The document also explains how to calculate r using z-scores and provides examples. Finally, it introduces the concept of linear regression analysis, including the least squares regression equation and how to calculate the line of best fit, as well as the standard error of estimate.
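Calculating r from z-scores, as the document describes, amounts to averaging the products of paired standard scores (population form shown; the helper name and data are illustrative):

```python
import math

def r_from_z_scores(x, y):
    """r as the mean product of paired z-scores (population form)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    zx = [(a - mx) / sx for a in x]
    zy = [(b - my) / sy for b in y]
    return sum(p * q for p, q in zip(zx, zy)) / n

x = [2, 4, 6, 8]
y = [1, 3, 5, 7]        # y = x - 1, a perfect linear relationship
r = r_from_z_scores(x, y)
```

Since y differs from x only by a constant shift, the paired z-scores are identical and r comes out at 1.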
This document discusses correlation and regression analysis. It defines correlation as a statistical measure of how two variables are related. A correlation coefficient between -1 and 1 indicates the strength and direction of the linear relationship between variables. A scatterplot can show this graphically. Regression analysis involves using one variable to predict scores on another variable. Simple linear regression uses one independent variable to predict a dependent variable, while multiple regression uses two or more independent variables. The goal is to identify the regression line that best fits the data with the least error. The coefficient of determination, R², indicates how much variance in the dependent variable is explained by the independent variables.
This document discusses correlation analysis and how it is used to measure the linear relationship between two variables. It provides the formulas for calculating the population and sample correlation coefficients (ρ and r) and describes what each represents. An example is given of a study examining the relationship between grades in Calculus and Fortran Computer Language for 10 students. It asks whether the observed relationship is statistically significant and how the dependent variable could be predicted from the independent variable.
This document discusses rank correlation and Spearman's rank correlation coefficient. It defines correlation as a relationship between two variables where a change in one variable corresponds to a change in the other. Rank correlation involves ranking observations from highest to lowest rather than using the original values, which avoids assumptions about the population distribution. Spearman's rank correlation coefficient measures the correspondence between two rankings and is calculated based on the differences between ranks of paired items. It provides a distribution-free measure of correlation.
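Spearman's coefficient computed from rank differences, as described above, uses r_s = 1 - 6Σd²/(n(n² - 1)). A small sketch assuming no tied ranks (the judges' scores are hypothetical data):

```python
def spearman_rank(x, y):
    """Spearman's coefficient via 1 - 6*sum(d^2)/(n*(n^2 - 1)); assumes no ties."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Two judges' scores for five contestants (hypothetical)
judge_a = [86, 60, 72, 48, 90]
judge_b = [80, 55, 70, 58, 85]
rs = spearman_rank(judge_a, judge_b)
```

Here the two rankings disagree on one adjacent pair, giving r_s = 0.9; because only ranks are used, the result is unaffected by the actual score values.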
It is most useful for BBA students taking the subject "Data Analysis and Modeling". It covers the content of the chapter on the data regression model. Visit www.ramkumarshah.com.np/ for more.
This document discusses correlation analysis in agriculture. It begins by defining correlation as the relationship between two or more variables. Some key points:
- Correlation can be positive (variables move in the same direction), negative (variables move in opposite directions), linear, nonlinear, simple, multiple, partial or total.
- Common types analyzed in agriculture include the relationship between yield and rainfall, price and supply, height and weight.
- Methods for measuring correlation are discussed, including Karl Pearson's coefficient of correlation (denoted by r), Spearman's rank correlation, and scatter diagrams.
- The value of r ranges from -1 to 1, with values closer to -1 or 1 indicating a stronger linear relationship between the variables.
This document discusses correlation analysis and its various types. Correlation is a measure of the relationship between two or more variables. There are three main types of correlation based on the degree, number of variables, and linearity. Correlation can be positive, negative, simple, partial, multiple, linear, or non-linear. Correlation is important for understanding relationships between variables, making predictions, and interpreting data. However, correlation does not necessarily imply causation.
The document discusses different types and methods of measuring correlation between two variables. It describes Karl Pearson's coefficient of correlation (r) which measures the strength and direction of a linear relationship between two variables on a scale of -1 to 1. It also discusses Spearman's rank correlation coefficient (R) which is used when variables can only be ranked rather than measured quantitatively. The key methods covered are scatter diagrams, which graphically depict relationships, and calculating correlation coefficients based on deviations from the mean.
Correlation and regression analysis are statistical methods used to determine if a relationship exists between variables and describe the nature of that relationship. A scatter plot graphs the independent and dependent variables and allows visualization of any trends in the data. The correlation coefficient measures the strength and direction of the linear relationship between variables, ranging from -1 to 1. Regression finds the linear "best fit" line that minimizes the residuals and can be used to predict dependent variable values.
Correlation and regression analysis are statistical methods used to determine if a relationship exists between variables and describe the nature of that relationship. A scatter plot graphs the independent and dependent variables and allows visualization of any trends in the data. The correlation coefficient measures the strength and direction of the linear relationship between variables, ranging from -1 to 1. Regression finds the linear "best fit" line that minimizes the residuals, or differences between observed and predicted dependent variable values. The coefficient of determination measures how much variation in the dependent variable is explained by the regression model.
HOW IS IT USEFUL IN FIELD OF FORENSIC SCIENCE AND IN THIS I HAVE SHOWN THE TYPES OF CORRELATION, SIGNIFICANCE , METHODS AND KARL PEARSON'S METHOD OF CORRELATION
This document provides an overview of simple linear regression and correlation analysis. It defines regression as estimating the relationship between two variables and correlation as measuring the strength and direction of that relationship. The key points covered include:
- Regression finds an estimating equation to relate known and unknown variables. Correlation determines how well that equation fits the data.
- Pearson's correlation coefficient r measures the linear relationship between two variables on a scale from -1 to 1.
- The coefficient of determination r2 indicates what percentage of variation in the dependent variable is explained by the independent variable.
- Statistical tests can evaluate whether a correlation is statistically significant or could be due to chance.
This document discusses correlation and regression analysis techniques used in physical geography to examine relationships between variables. Correlation determines the degree of relationship between two variables and is represented by the correlation coefficient r, which ranges from -1 to 1. Regression identifies relationships between a dependent variable and one or more independent variables by calculating a best-fit line that minimizes residuals. The document provides examples of calculating the correlation coefficient r and estimating the regression equation between variables.
This document provides an overview of correlation and linear regression analysis. It defines correlation as a statistical measure of the relationship between two variables. Pearson's correlation coefficient (r) ranges from -1 to 1, with values farther from 0 indicating a stronger linear relationship. Positive values indicate an increasing relationship, while negative values indicate a decreasing relationship. The coefficient of determination (r2) represents the proportion of shared variance between variables. While correlation indicates linear association, it does not imply causation. Multiple regression allows predicting a continuous dependent variable from two or more independent variables.
Correlation Analysis for MSc in Development Finance .pdfErnestNgehTingum
• Correlation is another way of assessing the relationship between variables.
– it measures the extent of correspondence between the ordering of two random variables.
• There is a large amount of resemblance between regression and correlation but for their methods of interpretation of the relationship.
– For example, a scatter diagram is of tremendous help when trying to describe the type of relationship existing between two variables.
The document discusses different types of correlation and methods for studying correlation. It describes Karl Pearson's coefficient of correlation, which measures the strength and direction of a linear relationship between two variables. The coefficient ranges from -1 to 1, where -1 is a perfect negative correlation, 0 is no correlation, and 1 is a perfect positive correlation. The document also discusses other types of correlation coefficients like Spearman's rank correlation coefficient and methods for analyzing correlation like scatter plots.
This document discusses correlation and regression analysis. It defines correlation as a statistical measure of how strongly two variables are related. A correlation coefficient between -1 and 1 indicates the strength and direction of the linear relationship between variables. Regression analysis allows us to predict the value of a dependent variable based on the value of one or more independent variables. Simple linear regression involves one independent variable, while multiple regression involves two or more independent variables to predict the dependent variable. The document provides examples and formulas for calculating correlation, regression lines, explained and unexplained variance, and the coefficient of determination.
This document defines correlation and discusses the relationship between two variables or events. It introduces the Pearson correlation coefficient r, which ranges from -1 to 1 and measures the strength and direction of association between two variables. Strong positive correlations near 1 indicate that as one variable increases, so does the other. The document also discusses how correlation does not necessarily imply causation and provides examples of calculating r from sample data.
The document discusses different types of correlation including positive, negative, simple, partial and multiple correlation. It describes methods for studying correlation such as scatter diagrams, correlation graphs, and Karl Pearson's coefficient of correlation. The key aspects of Pearson's correlation coefficient are also summarized such as its properties, limitations, and how to test for the significance of the correlation coefficient.
Correlation analysis measures the strength and direction of association between two or more variables. It is represented by the coefficient of correlation (r), which ranges from -1 to 1. A value of 0 indicates no association, 1 indicates perfect positive association, and -1 indicates perfect negative association. The scatter diagram is a graphical method to visualize the association between variables by plotting their values. Karl Pearson's coefficient is a commonly used algebraic method to calculate the coefficient of correlation from sample data.
This presentation covered the following topics:
1. Definition of Correlation and Regression
2. Meaning of Correlation and Regression
3. Types of Correlation and Regression
4. Karl Pearson's methods of correlation
5. Bivariate Grouped data method
6. Spearman's Rank correlation Method
7. Scattered diagram method
8. Interpretation of correlation coefficient
9. Lines of Regression
10. regression Equations
11. Difference between correlation and regression
12. Related examples
The document discusses regression and correlation analysis between BMI (Kg/m2) of pregnant mothers and birth weight (kg) of their newborns using data from 15 mothers. A scatter plot showed a positive linear relationship between BMI and birth weight. Linear regression was used to calculate the regression line as y=1.775351+0.0330817x, which can be used to predict birth weight based on a mother's BMI. The correlation coefficient (R) between BMI and birth weight was 0.94, indicating a strong positive correlation.
This document discusses correlation analysis. It defines correlation as the degree of relationship between two random variables. Correlation can be positive, negative, simple, partial, or multiple depending on the direction and number of variables. Linear correlation means changes in one variable are proportionally related to changes in the other. Non-linear correlation means changes are not proportional. Correlation is measured using the correlation coefficient r, which ranges from -1 to 1. A higher absolute r value means stronger correlation. Correlation only indicates relationship and not causation. The document also covers probable error and coefficient of determination in interpreting correlation results.
1. The document discusses the concept of correlation, including the different types and methods of measuring correlation.
2. It provides background on the history of correlation, beginning with its proposal by French scientist A. Bravis, and development of graphical representation by Sir Francis Galton.
3. Key aspects covered include Karl Pearson's coefficient of correlation (r), which measures the strength and direction of linear relationships between variables ranging from -1 to 1. Examples are provided to illustrate different degrees of positive and negative correlation.
2. 2
Correlation
Introduction:
Two variables are said to be correlated if a change in one variable results in a corresponding change in the other variable.
Correlation is a statistical tool which studies the relationship between two variables.
Correlation analysis involves the various methods and techniques used for studying and measuring the extent of the relationship between two variables.
Correlation is concerned with the measurement of the strength of association between variables.
The degree of association between two or more variables is termed correlation.
3. 3
Contd…
Correlation analysis helps us to determine the strength of the linear relationship between two variables.
The word correlation is used to describe the degree of association between variables.
If two variables ‘x’ and ‘y’ are so related that variations in the magnitude of one variable tend to be accompanied by variations in the magnitude of the other variable, they are said to be correlated.
Thus, correlation is a statistical tool with the help of which we can determine whether or not two or more variables are correlated and, if they are, the degree and direction of that correlation.
4. 4
Definition
Correlation is the measure of the extent and the direction of the relationship between two variables in a bivariate distribution.
Examples:
(i) Height and weight of children.
(ii) An increase in the price of a commodity is accompanied by a decrease in the quantity demanded.
Types of Correlation: The following are the types of correlation:
(i) Positive and Negative Correlation
(ii) Simple, Partial and Multiple Correlation
(iii) Linear and Non-linear Correlation
5. Contd…
Correlation was first developed by Sir Francis Galton (1822 – 1911) and later reformulated by Karl Pearson (1857 – 1936).
Note: The degree of relationship or association between the variables is termed the degree of correlation.
6. 6
Types of Correlation
i. Positive and Negative Correlation: If both variables vary in the same direction, i.e. if one variable is increasing and the other, on average, is also increasing, or if as one variable decreases the other, on average, also decreases, the correlation is said to be positive. If, on the other hand, one variable is increasing while the other is decreasing, or vice versa, the correlation is said to be negative.
Example 1 (positive): (a) heights and weights (b) amount of rainfall and yield of crops (c) price and supply of a commodity (d) income and expenditure on luxury goods (e) blood pressure and age
Example 2 (negative): (a) price and demand of a commodity (b) sales of woolen garments and the day's temperature
7. 7
Contd…
ii. Simple, Partial and Multiple Correlation: When only two variables are studied, it is a case of simple correlation. In partial and multiple correlation, three or more variables are studied. In multiple correlation, three or more variables are studied simultaneously. In partial correlation, we have more than two variables, but consider only two variables to be influencing each other, the effect of the other variables being held constant.
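The distinction between simple and partial correlation can be made concrete with a short sketch. The slides do not give a formula for partial correlation, so the sketch below uses the standard first-order formula r12.3 = (r12 − r13·r23) / √((1 − r13²)(1 − r23²)); the function names and data are illustrative only, not taken from these slides.

```python
import math

def pearson_r(x, y):
    """Simple correlation between two equal-length sequences of values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def partial_r(x, y, z):
    """Correlation of x and y with the influence of z held constant."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
```

For example, `partial_r(crop_yield, rainfall, temperature)` would estimate the yield-rainfall correlation net of temperature, whereas `pearson_r(crop_yield, rainfall)` is the simple correlation.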
8. 8
Contd…
iii. Linear and Non-linear Correlation: If the change in one variable tends to bear a constant ratio to the change in the other variable, the correlation is said to be linear. Correlation is said to be non-linear if the amount of change in one variable does not bear a constant ratio to the amount of change in the other variable.
9. Methods of Studying Correlation
Correlation may be studied by:
Graphical Method: Scatter Diagram
Algebraic Method: Karl Pearson's Coefficient of Correlation
10. 10
Methods of Studying Correlation
The following are the methods of determining correlation:
1. Scatter diagram method
2. Karl Pearson's Coefficient of Correlation
1. Scatter Diagram:
This is a graphic method of finding out the relationship between the variables.
The given data are plotted on a graph paper in the form of dots, i.e. for each pair of x and y values we put a dot, and thus obtain as many points as the number of observations.
The greater the scatter of the points over the graph, the weaker the relationship between the variables.
11. Scatter Diagram
[Figure: scatter diagrams on X-Y axes illustrating perfect positive correlation, perfect negative correlation, high degree of positive correlation, high degree of negative correlation, low degree of positive correlation, low degree of negative correlation, and no correlation.]
12. 12
Interpretation
If all the points lie on a straight line, there is either
perfect positive or perfect negative correlation.
If all the points lie on a straight line rising from the lower
left-hand corner to the upper right-hand corner, the
correlation is perfect positive.
Perfect positive if r = +1.
If all the points lie on a straight line falling from the upper
left-hand corner to the lower right-hand corner, the
correlation is perfect negative.
Perfect negative if r = -1.
The nearer the points are to a straight line, the higher
the degree of correlation.
The farther the points are from a straight line, the lower
the degree of correlation.
If the points are widely scattered and no trend is
revealed, the variables may be uncorrelated, i.e. r = 0.
13. 13
The Coefficient of Correlation:
A scatter diagram gives an idea about the type of
relationship or association between the variables
under study, but it does not quantify the
association between the two.
In order to quantify the relationship between the
variables, a measure called the correlation coefficient
was developed by Karl Pearson.
It is defined as the measure of the degree of linear
association between two intervally scaled variables.
Thus, the coefficient of correlation is a number which
indicates to what extent two variables are related, i.e.
to what extent variations in one go with variations
in the other.
14. 14
Contd…
In this method the coefficient is denoted by the symbol 'r' (or 'rₓᵧ' or 'rᵧₓ') and is calculated by:
r = Cov(X, Y) ÷ (Sₓ Sᵧ) ………..(i)
where Cov(X, Y) is the sample covariance between X and Y. Mathematically it is defined by
Cov(X, Y) = ∑(X − X̄)(Y − Ȳ) ÷ (n − 1)
Sₓ = sample standard deviation of X, given by
Sₓ = {∑(X − X̄)² ÷ (n − 1)}½
Sᵧ = sample standard deviation of Y, given by
Sᵧ = {∑(Y − Ȳ)² ÷ (n − 1)}½
and X̄ = ∑X ÷ n, Ȳ = ∑Y ÷ n.
15. 15
Interpretation
i. If the covariance is positive, the relationship is
positive.
ii. If the covariance is negative, the relationship is
negative.
iii. If the covariance is zero, the variables are said to be
uncorrelated.
Hence the covariance indicates the direction of the linear
association between the considered numerical variables.
However, covariance is an absolute measure of linear
association: its size depends on the units of the variables.
In order to have a relative measure of the relationship, it is
necessary to compute the correlation coefficient.
The correlation coefficient is computed by the relation
developed by Karl Pearson as follows:
16. Contd…
The formula for the sample correlation coefficient (r) is:
r = ∑(X − X̄)(Y − Ȳ) ÷ √{∑(X − X̄)² · ∑(Y − Ȳ)²}
If (X − X̄) = x and (Y − Ȳ) = y, then the above formula reduces to:
r = ∑xy ÷ √(∑x² · ∑y²)
16
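As a concrete sketch of the two equivalent formulas above (the five data points are invented purely for illustration):

```python
# Karl Pearson's r computed two ways: via covariance and standard
# deviations, and via the deviation ("reduced") formula.
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Route 1: r = Cov(X, Y) / (Sx * Sy)
cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)
s_x = math.sqrt(sum((a - x_bar) ** 2 for a in x) / (n - 1))
s_y = math.sqrt(sum((b - y_bar) ** 2 for b in y) / (n - 1))
r_cov = cov / (s_x * s_y)

# Route 2: r = sum(xy) / sqrt(sum(x^2) * sum(y^2)), with x, y as deviations
sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
sxx = sum((a - x_bar) ** 2 for a in x)
syy = sum((b - y_bar) ** 2 for b in y)
r_dev = sxy / math.sqrt(sxx * syy)

print(round(r_cov, 4), round(r_dev, 4))  # both routes give the same r
```

The (n − 1) factors cancel between numerator and denominator, which is why both routes agree.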
17. 17
Properties of Karl Pearson’s Correlation Coefficient
1. The coefficient of correlation ‘r’ is always a number between -1
and +1 inclusive.
2. If r = +1 or -1, the sample points lie on a straight line.
3. If ‘r’ is near to +1 or -1, there is a strong linear association
between the variables.
4. If 'r' is small (close to zero), there is a low degree of correlation
between the variables.
5. The coefficient of correlation is the geometric mean of the two
regression coefficients.
Symbolically: r= √(bₓᵧ . bᵧₓ)
Note: It is clear that correlation coefficient is a measure of the
degree to which the association between the two variables
approaches a linear functional relationship.
18. 18
Interpretation of Correlation Coefficient
i. The coefficient of correlation, as obtained by the above formula, always
lies between +1 and -1.
ii. When r = +1, there is perfect positive correlation between the
variables.
iii. When r = -1, there is perfect negative correlation between the
variables.
iv. When r = 0, there is no correlation.
v. When r = 0.7 to 0.999, there is a high degree of correlation.
vi. When r = 0.5 to 0.699, there is a moderate degree of correlation.
vii. When r is less than 0.5, there is a low degree of correlation.
viii. The value of the correlation coefficient lies between -1 and +1, i.e.
-1 ⩽ r ⩽ +1.
ix. The correlation coefficient is independent of the choice of both
origin and scale of observation.
x. The correlation coefficient is a pure number. It is independent of the
units of measurement.
19. 19
Coefficient of Determination
The coefficient of determination (r²) is the square of the
coefficient of correlation.
It is a measure of the strength of the relationship
between two variables.
It lends itself to more precise interpretation because it
can be presented as a proportion or as a percentage.
The coefficient of determination gives the ratio of the
explained variance to the total variance.
Thus, coefficient of determination,
r² = Explained variance ÷ Total variance
Thus, the coefficient of determination shows what proportion of
the variability in the dependent variable is accounted for
by the variability of the independent variable.
20. 20
Example
Example 1: If r = 0.8 then r² = (0.8)² = 0.64 or 64%. This means that,
based on the sample, 64% of the variation in the dependent variable (Y) is
explained by the variation of the independent variable (X). The
remaining 36% of the variation in Y is unexplained by variation in X. In
other words, factors other than X could have caused the remaining
36% of the variation in Y.
Example 2: While comparing two correlation coefficients, one of which
is 0.4 and the other 0.8, it is misleading to conclude that the
correlation in the second case is twice as high as in the first
case. The coefficient of determination clarifies this point:
for r = 0.4, the coefficient of determination is r² = 0.16,
and for r = 0.8, the coefficient of determination is r² = 0.64,
from which we conclude that the correlation in the second case is four
times as high as in the first case. Likewise, if r = 0.8,
we cannot conclude that 80% of the variation in the relative series
(dependent variable) is explained. The coefficient of determination in this case
is r² = 0.64, which implies that only 64% of the variation in the relative
series has been explained by the subject series, and the remaining 36%
of the variation is due to other factors.
21. 21
Interpretation
The closeness of the relationship between two variables, as
measured by the correlation coefficient r, does not change in
proportion to r.
The following table gives the values of the coefficient of
determination (r²) for different values of 'r':
r       0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9   1.0
r²      0.01  0.04  0.09  0.16  0.25  0.36  0.49  0.64  0.81  1.00
r² in %  1%    4%    9%   16%   25%   36%   49%   64%   81%  100%
It is clear from the table that as the value of 'r'
decreases, r² decreases very rapidly, except in the two particular
cases r = 0 and r = 1, where r² = r.
The coefficient of determination (r²) is always non-negative, and as
such it does not tell us about the direction of the
relationship (whether it is positive or negative) between the two
series.
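The table can be reproduced with a short Python sketch, making the "r² falls faster than r" point explicit:

```python
# Tabulating r, r², and r² as a percentage for r = 0.1 .. 1.0.
rows = [round(0.1 * i, 1) for i in range(1, 11)]
for r in rows:
    r2 = round(r * r, 2)
    print(f"r = {r:.1f}  r² = {r2:.2f}  ({r2:.0%})")
```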
22. 22
Test of Significance of the Correlation Coefficient
In order to assess whether the computed
correlation coefficient between the considered
variables X and Y is statistically significant or
not, the t-test can be applied as the test statistic. For
this, the following steps can be performed:
Step 1. Formulating the Hypothesis: The
common way of stating the hypothesis is that
"the population correlation coefficient (ρ) is
zero, which means there is no correlation
between the X and Y variables in the population".
Null Hypothesis, H₀: ρ = 0 (No correlation
between X and Y variables in the population)
23. Contd…
Alternative Hypothesis, H₁: ρ ≠ 0 (There is
correlation between the X and Y variables in the
population) (two-tailed test)
Or H₁: ρ > 0 (There is positive correlation between
X and Y variables in the population) (one-tailed test)
Or H₁: ρ < 0 (There is negative correlation between
X and Y variables in the population) (one-tailed test)
Step 2. Computing the Test Statistic: The test
statistic t for testing the existence of correlation is:
t = r √(n − 2) ÷ √(1 − r²)
23
24. 24
Contd…
where r = the sample correlation coefficient,
ρ = the population correlation coefficient, which is
hypothesized as zero, and
n = the total number of pairs of observations under study.
Step 3. Decision:
(i) If the computed value of t is greater than the table value of t
at the given level of significance (α) with (n – 2) d.f., we reject
the null hypothesis and conclude that there is evidence of
an association between the considered variables.
(ii) If the computed value of t is less than the table value of t at
the given level of significance (α) with (n – 2) d.f., we accept
the null hypothesis and conclude that there is no evidence
of an association between the considered variables.
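A worked sketch of Steps 1 to 3 follows; the five data points and the tabulated critical value (two-tailed, α = 0.05, 3 d.f.) are illustrative, not from the deck:

```python
# t-test for a sample correlation coefficient:
# t = r * sqrt(n - 2) / sqrt(1 - r^2), compared against a t-table value.
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
sxx = sum((a - x_bar) ** 2 for a in x)
syy = sum((b - y_bar) ** 2 for b in y)
r = sxy / math.sqrt(sxx * syy)

t_cal = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
t_tab = 3.182  # from a t-table: alpha = 0.05 two-tailed, (n - 2) = 3 d.f.

reject_h0 = abs(t_cal) > t_tab  # decision rule from Step 3
print(round(t_cal, 3), reject_h0)
```

Here t falls short of the table value, so with such a tiny sample even a sizeable r is not statistically significant.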
25. 25
Simple Linear Regression
Regression is concerned with the “Prediction” of the most
likely value of one variable when the value of the other
variable is known.
The term regression literally means “stepping back towards
the average”.
It was first used by British biometrician Sir Francis Galton
(1822 – 1911).
Definition: Regression analysis is a mathematical measure of
the average relationship between two or more variables in
terms of the original units of the data.
Thus term regression is used to denote estimation or
prediction of the average value of one variable for a
specified value of the other variable.
The estimation is done by means of a suitable equation,
derived on the basis of the available bivariate data. The
geometrical representation of such an equation is called the
regression curve.
26. 26
Contd…
In regression analysis there are two types of variables
and they are:
i. Independent and ii. Dependent.
Dependent variable (Y): The variable whose value is
influenced or is to be predicted is called the dependent
variable.
Independent variable (X): The variable which influences
the values or is used for prediction is called the
independent variable.
In regression analysis independent variable is known as
regressor or predictor or explanatory variable.
In regression analysis dependent variable is known as
regressed or explained variable.
27. 27
The lines of Regression
A line fitted to a set of data points to estimate the
relationship between two variables is called a regression
line.
The regression equation of Y on X describes the
changes in the value of Y for given changes in the value
of X.
The regression equation of X on Y describes the
changes in the value of X for given changes in the value
of Y.
Hence, an equation for estimating the dependent variable
(Y or X) from the independent variable (X or Y) is called the
regression equation of Y on X or of X on Y, respectively.
The regression equations of the regression lines, also
called least squares lines, are determined by the least
squares method.
28. 28
Simple Regression Model
• The simple regression line is a straight line that describes
the dependence of the average value of one
variable on the other.
• Y = β₀ + β₁ X + Ɛ ……….(*)
• Where Y = Dependent or response or outcome variable
(Population)
• X = Independent or explanatory or predictor variable
(Population)
• β₀ = Y- intercept of the model for the population
• β₁ = population slope coefficient or population
regression coefficient. It measures the average rate of
change in dependent variable per unit change in
independent variable.
• Ɛ = population error in Y for a given observation.
29. [Figure: the straight line ŷ = β₀ + β₁x plotted in the X-Y plane; β₀ is the Y-intercept, the slope β₁ is the change in y per one-unit change in x, and the error term is the vertical distance between an observed value of y and the estimated value of y at a specific value x₀ of the independent variable x]
29
30. Estimation of the Regression Equation
Regression model: Y = β₀ + β₁X + Ɛ
Regression equation: Y = β₀ + β₁X, with unknown parameters β₀ and β₁
Sample data: pairs (x₁, y₁), (x₂, y₂), …, (xn, yn) yield the sample statistics b₀ and b₁
Estimated regression equation: ŷ = b₀ + b₁x, where b₀ and b₁ provide estimates of β₀ and β₁
30
31. 31
Model
Linear regression model is
Y = β₀ + β₁ X + Ɛ
Linear regression equation is:
Y = β₀ + β₁ X
Sample regression model is
y = b₀ + b₁x + e
Sample regression equation is
ŷ= b₀ + b₁x
Where b₀ = sample y intercept,
b₁= sample slope coefficient
x= independent variable
y= dependent variable
ŷ= estimated value of dependent variable for a given value
of independent variable.
e = error term = y - ŷ
33. 33
Least squares methods
• Let ŷ = b₀ + b₁x …..(1) be the estimated linear
regression equation of y on x for the regression
model Y = β₀ + β₁X.
• By using the principle of least squares, we
obtain the two normal equations of regression
equation (1):
• ∑y = nb₀ + b₁∑x ………(2)
• ∑xy = b₀∑x + b₁∑x² ………(3)
• By solving equations (2) and (3) we get the values
of b₀ and b₁ as:
• b₁ = {n∑xy − ∑x∑y} ÷ {n∑x² − (∑x)²}
34. Contd…
• The computational formula for the y-intercept b₀ is:
b₀ = (∑y − b₁∑x) ÷ n = y̅ − b₁x̅
• After finding the values of b₀ and b₁, we get the required fitted
regression model of y on x as ŷ = b₀ + b₁x.
34
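A minimal sketch of these computational formulas; the five data points are invented for illustration:

```python
# Fitting y^ = b0 + b1*x by the least squares computational formulas
# obtained from the normal equations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(a * b for a, b in zip(x, y))
sxx = sum(a * a for a in x)

b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # slope
b0 = (sy - b1 * sx) / n                          # intercept = y_bar - b1*x_bar

print(b0, b1)  # fitted line: y^ = b0 + b1*x
```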
35. 35
Measures of variation
• There are three measures of variations.
• They are as follows:
i. Total Sum of Squares (SST): It is a measure of the
variation in the values of the dependent variable (y)
around their mean value (y̅). That is,
• SST = ∑(y − y̅)² = ∑y² − (∑y)²/n = ∑y² − n·y̅².
• Note: The total sum of squares, or total variation,
is divided into the sum of two components. One is the
explained variation, due to the relationship between
the considered dependent variable (y) and the
independent variable (x); the other is the unexplained
variation, which arises from factors other than
the relationship between x and y.
36. 36
Contd…
ii. Regression Sum of Squares (SSR): The
regression sum of squares is the sum of the
squared differences between the predicted
values of y and the mean value of y:
• SSR = ∑(ŷ − y̅)² = b₀∑y + b₁∑xy − (∑y)²/n = b₀∑y + b₁∑xy − n·y̅²
iii. Error Sum of Squares (SSE): The error sum of
squares is computed as the sum of the
squared differences between the observed
values of y and the predicted values of y, i.e.
• SSE = ∑(y − ŷ)² = ∑y² − b₀∑y − b₁∑xy.
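The three sums of squares can be checked numerically, along with the identity SST = SSR + SSE; the data and the fitted line ŷ = 2.2 + 0.6x (its least squares fit) are illustrative:

```python
# Computing SST, SSR and SSE from their definitions and verifying
# that SST = SSR + SSE.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
b0, b1 = 2.2, 0.6  # least squares fit of this data set

y_bar = sum(y) / n
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation

print(sst, ssr, sse)  # SST equals SSR + SSE
```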
38. 38
Contd…
Relationship: The relationship of SST, SSR and SSE
is as follows:
SST = SSR + SSE………………(i)
where SST = total sum of squares,
SSR = regression sum of squares,
SSE = error sum of squares.
•The fit of the estimated regression line would
be best if every value of the dependent variable
y falls on the regression line.
39. 39
Contd….
• If SSE = 0, i.e. e = (y − ŷ) = 0 for every observation, then SST = SSR.
• For a perfect fit of the regression model, the
ratio of SSR to SST must be equal to unity, i.e.
if SSE = 0 then the model would be perfect.
• If SSE is large, the fit of the regression
line is poor.
• Note: the larger the value of SSE, the poorer the
regression line; if SSE = 0 the regression
line is perfect.
40. 40
Coefficient of Determination (r²)
• The coefficient of determination measures the
strength or extent of the association that exists
between dependent variable (y) and independent
variable (x).
• It measures the proportion of variation in the
dependent variable (y) that is explained by the
independent variable (x) of the regression line.
• The coefficient of determination measures the total
variation in the dependent variable due to the
variation in the independent variable, and it is
denoted by r².
• r² = SSR/SST, but SST = SSE + SSR,
• so SSR = SST − SSE and
• r² = 1 − (SSE/SST) = (b₀∑y + b₁∑xy − n·y̅²) ÷ (∑y² − n·y̅²).
41. Contd…
• Note:
i. The coefficient of determination is the square of the
coefficient of correlation,
so r = ±√r².
ii. If the regression coefficient (b₁) is negative, then
take the negative sign.
iii. If the regression coefficient (b₁) is positive, then
take the positive sign.
• Adjusted coefficient of determination: The
adjusted coefficient of determination is
calculated by using the following relation:
• Adjusted r² = 1 − (1 − r²)(n − 1) ÷ (n − k − 1),
where k is the number of independent variables.
41
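A sketch of r² and the adjusted r² relation with k = 1 predictor, using the illustrative sums of squares SST = 6, SSR = 3.6, SSE = 2.4 (from fitting ŷ = 2.2 + 0.6x to the small example x = 1..5, y = 2, 4, 5, 4, 5):

```python
# r² = SSR/SST = 1 - SSE/SST, and the adjusted r² correction for
# sample size n and number of predictors k.
n, k = 5, 1
sst, ssr, sse = 6.0, 3.6, 2.4

r2 = ssr / sst
r2_alt = 1 - sse / sst  # same value by SST = SSR + SSE
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(r2, adj_r2)  # adjusted r² is smaller for such a tiny sample
```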
42. The Standard Error of Estimate
The standard error of estimate of Y on X, denoted by Sᵧᵪ, measures the
average variation or scatter of the observed data points around the
regression line. It is used to measure the reliability of the regression
equation. It is calculated by the following relation:
Sᵧᵪ = √{∑(y − ŷ)² ÷ (n − 2)} = √{SSE ÷ (n − 2)}
Interpretation of the standard error of estimate:
i. If the standard error of estimate is large, there is greater scattering or
dispersion of the data points around the fitted line, so the regression
line is poor.
ii. If the standard error is small, there is less variation of the observed
data around the regression line, so the regression line will be better for
predicting the dependent variable.
iii. If the standard error is zero, the estimating equation is expected to
be a perfect estimator of the dependent variable.
42
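As a quick sketch, with the illustrative SSE = 2.4 and n = 5 used earlier on this page's running example:

```python
# Standard error of estimate: S_yx = sqrt(SSE / (n - 2)).
import math

n = 5
sse = 2.4
s_yx = math.sqrt(sse / (n - 2))
print(round(s_yx, 4))  # average scatter of the points about the line
```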
43. 43
Test of Significance of Regression Coefficient in
Simple Linear Regression Model
• To test the significance of the regression
coefficient of the simple linear regression
model Y = β₀ + β₁X + Ɛ, the following statistical
tests are applied:
i. t-test for significance in simple linear
regression.
ii. F-test for significance in simple linear
regression.
44. 44
(i) t- test for Significance in Simple Linear Regression
• The t-test is applied to check whether the regression
coefficient β₁ is statistically significant or not.
• Setting of Hypothesis:
• Null Hypothesis, H₀: β₁ = 0 (The population slope
β₁ is zero; there is no linear relationship between
X and Y in the population.)
• Alternative Hypothesis, H₁: β₁ ≠ 0 (The population
slope β₁ is not zero; there is a linear relationship between
X and Y in the population.) or H₁: β₁ > 0 or H₁: β₁ < 0
45. Contd…
• Test statistic: Under H₀ the test statistic is:
• t = b₁ ÷ S_b₁
• where S_b₁ = Sᵧᵪ ÷ √∑(x − x̅)² is the standard error of the slope b₁.
• This test statistic t follows the t-distribution with (n – 2) d.f.
• Decisions: (i) If t_cal < t_tab at the α % level of significance with (n − 2)
d.f., then we accept H₀.
• (ii) If t_cal > t_tab at the α % level of significance with (n − 2) d.f., then we
reject H₀ and accept H₁.
45
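A numerical sketch of the slope t-test; the quantities (b₁ = 0.6, SSE = 2.4, ∑(x − x̅)² = 10, n = 5) come from fitting the small illustrative data set x = 1..5, y = 2, 4, 5, 4, 5, and the critical value is read from a t-table:

```python
# Slope t-test: t = b1 / S_b1 with S_b1 = S_yx / sqrt(sum((x - x_bar)^2)).
import math

n = 5
b1 = 0.6
s_yx = math.sqrt(2.4 / (n - 2))   # standard error of estimate
sxx = 10.0                        # sum((x - x_bar)^2) for x = 1..5
s_b1 = s_yx / math.sqrt(sxx)      # standard error of the slope
t_cal = b1 / s_b1

t_tab = 3.182  # two-tailed, alpha = 0.05, (n - 2) = 3 d.f.
print(round(t_cal, 3), abs(t_cal) > t_tab)
```

Note that this t value coincides with the one from the correlation t-test on the same data, as it must in simple linear regression.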
46. 46
Confidence Interval Estimate for β₁
• Another way to assess the linear relationship between
the variables X and Y is to construct a
confidence interval (C.I.) estimate of β₁.
• With the help of the C.I. we conclude whether the
hypothesized value (β₁ = 0) is included or not.
• For this the following formula is used:
• C.I. for β₁ = b₁ ± t(n − 2) · S_b₁
• Conclusion: If this confidence interval does not
include 0 (zero), then we can conclude that there
is a significant relationship between the variables X
and Y.
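A sketch of the interval on the same illustrative numbers (b₁ = 0.6, S_b₁ = Sᵧᵪ/√10 with Sᵧᵪ = √0.8, and the t-table value 3.182 for α = 0.05, 3 d.f.):

```python
# Confidence interval b1 ± t(n-2) * S_b1, then checking whether it
# contains the hypothesized value 0.
import math

b1 = 0.6
s_b1 = math.sqrt(2.4 / 3) / math.sqrt(10.0)  # S_yx / sqrt(sum((x - x_bar)^2))
t_tab = 3.182                                # alpha = 0.05, 3 d.f.

low = b1 - t_tab * s_b1
high = b1 + t_tab * s_b1
contains_zero = low <= 0.0 <= high
print(round(low, 3), round(high, 3), contains_zero)
```

Because the interval includes zero, the slope is not significant here, agreeing with the t-test decision above.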
47. 47
(ii) F-test for Significance in Simple Linear Regression
• The F-test, based on the F probability distribution, can
also be applied in order to test for
significance in regression.
• Setting of Hypothesis:
• Null Hypothesis, H₀: β₁ = 0 (The population slope
β₁ is zero; there is no linear relationship between
X and Y in the population.)
• Alternative Hypothesis, H₁: β₁ ≠ 0 (The population
slope β₁ is not zero; there is a linear relationship between
X and Y in the population.) or H₁: β₁ > 0 or H₁: β₁ < 0
48. 48
Contd…
• Test statistic: F is defined as the ratio of the
regression mean square (MSR) to the error mean
square (MSE):
• F = MSR ÷ MSE
• where MSR = SSR/k and MSE = SSE/(n − k − 1)
• k = number of independent variables in the regression
model. The value of k = 1 for the simple linear
regression model, as it has only one predictor
variable x.
• SSR = regression sum of squares = ∑(ŷ − y̅)²
• SSE = error sum of squares = ∑(y − ŷ)²
• The test statistic F follows the F-distribution with k and
(n − k − 1), i.e. 1 and (n − 2), d.f. when k = 1.
49. 49
Contd…
• The ANOVA table for the F statistic is summarized as:
Sources of variation | Sum of squares | d.f.    | Mean square       | F-ratio
Regression           | SSR            | 1       | MSR = SSR/1       | F = MSR/MSE
Error                | SSE            | (n − 2) | MSE = SSE/(n − 2) |
Total                | SST            | (n − 1) |                   |
• Decisions:
i. If F_cal < F_tab at the α % level of significance, with 1
d.f. in the numerator and (n − 2) d.f. in the
denominator, then we accept H₀.
ii. If F_cal > F_tab at the α % level of significance, with 1
d.f. in the numerator and (n − 2) d.f. in the
denominator, then we reject H₀ and accept H₁.
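A sketch of the ANOVA computation on the illustrative sums of squares SSR = 3.6, SSE = 2.4 (n = 5, k = 1); the critical value is read from an F-table:

```python
# ANOVA F-test for simple regression: F = MSR / MSE with k = 1.
n, k = 5, 1
ssr, sse = 3.6, 2.4

msr = ssr / k            # regression mean square
mse = sse / (n - k - 1)  # error mean square
f_cal = msr / mse

f_tab = 10.13  # F(1, 3) at alpha = 0.05, from an F-table
print(round(f_cal, 2), f_cal > f_tab)
```

For one predictor F equals t² (here 4.5 = 2.1213²), so the F-test reaches the same decision as the two-tailed t-test.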
50. 50
Contd…
iii. Using the p-value, we reject H₀ if the p-value is less
than α.
• Note:
i. The F-test provides the same conclusion as the
t-test when there is only one independent variable.
ii. For simple linear regression, if the t-test indicates β₁
≠ 0, then the F-test will also show a
significant relationship.
iii. However, only the F-test can be used to test for
an overall significant relationship in a
regression with more than one independent
variable.
51. 51
Confidence Interval Estimate of the Mean Value of y
• A point estimate is a single numerical estimate of
y, produced without any indication of its
accuracy.
• A point estimate provides no sense of how far off
it may be from the population parameter.
• To convey this information, a prediction or
confidence interval is developed.
• Prediction intervals are used to predict particular y
values for a given value of x.
• Confidence intervals are used to estimate the
mean value of y for a given value of x.
• The point estimate of the mean value of y is the same
as the point estimate of an individual value of y.
52. Contd…
• The formula to compute the confidence interval estimate for the mean
value of y is:
• ŷ ± t(n − 2) · Sᵧᵪ · √h
• The formula to compute the prediction interval estimate of an individual
value of y is:
• ŷ ± t(n − 2) · Sᵧᵪ · √(1 + h)
• where ŷ = estimated or predicted value of the dependent variable
for a given value x₀ of the independent variable;
• Sᵧᵪ = standard error of estimate;
• t(n − 2) = tabulated value of t for (n − 2) d.f. and α level of significance;
• h = hat matrix element = 1/n + (x₀ − x̅)² ÷ ∑(x − x̅)²;
• n = number of pairs of observations, i.e. the sample size.
52
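A numerical sketch of both intervals at x₀ = 4, using the illustrative fit ŷ = 2.2 + 0.6x (Sᵧᵪ = √0.8, n = 5, x̅ = 3, ∑(x − x̅)² = 10) and the t-table value for α = 0.05, 3 d.f.:

```python
# Confidence interval (mean of y) vs prediction interval (individual y)
# at a given x0, using the hat element h = 1/n + (x0 - x_bar)^2 / Sxx.
import math

n = 5
x_bar, sxx = 3.0, 10.0
b0, b1 = 2.2, 0.6
s_yx = math.sqrt(0.8)
t_tab = 3.182  # two-tailed, alpha = 0.05, (n - 2) = 3 d.f.

x0 = 4.0
y_hat = b0 + b1 * x0
h = 1 / n + (x0 - x_bar) ** 2 / sxx

ci_margin = t_tab * s_yx * math.sqrt(h)      # for the mean of y at x0
pi_margin = t_tab * s_yx * math.sqrt(1 + h)  # for an individual y at x0
print(round(y_hat, 2), round(ci_margin, 3), round(pi_margin, 3))
```

The prediction interval is always the wider of the two, since an individual y carries the extra "1" of its own error term.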
53. Interval Estimates for Different Values of x
[Figure: Y plotted against X, showing the confidence interval for the mean of Y and the wider prediction interval for an individual Y at a given X, both centred on the regression line around x̅]
53