This document contains summaries and examples of key concepts in regression analysis and correlation from Chapter 12, including:
- Regression analysis is used to estimate relationships between variables and predict future values of dependent variables based on independent variables.
- Correlation analysis describes the strength and direction of the linear relationship between two variables; the correlation coefficient ranges from -1 to +1, with magnitude indicating strength.
- The least squares method is used to fit a regression line that minimizes the squared errors between observed and predicted values.
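For reference, the least squares criterion in the last bullet leads to standard closed-form estimates for the slope and intercept. The formulas below are the textbook versions, shown for orientation rather than quoted from the summarized chapter:

```latex
\min_{a,b}\ \sum_{i=1}^{n}\bigl(y_i - (a + b\,x_i)\bigr)^2,
\qquad
b = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}.
```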
This document provides a summary of key concepts from chapters on simple regression and correlation analysis. It defines regression analysis as determining the nature and strength of relationships between variables. Scatter plots are used to visualize these relationships. The regression line estimates the relationship between an independent and dependent variable. Correlation analysis describes the degree of linear relationship between variables using the coefficient of determination and coefficient of correlation. Examples are provided to demonstrate calculating the regression equation and correlation coefficient.
This document summarizes key concepts from Chapter 5 of Jamri AB on correlation and simple linear regression. It introduces correlation as a measure of the strength of the linear relationship between two variables. It discusses scatter diagrams, the coefficient of correlation (r), and Pearson's product-moment correlation coefficient and Spearman's rank correlation coefficient as methods to calculate r. It also covers the coefficient of determination (r^2), linear regression analysis to predict relationships, and calculating the regression equation coefficients a and b. Examples are provided to demonstrate calculating r and the regression equation from sets of data.
The document provides information about regression analysis and calculating the coefficient of determination. It includes:
1) Instructions on how to perform a regression analysis using a calculator to find the least squares regression line, correlation coefficient, and residual plot from sample data.
2) An explanation of the coefficient of determination as a measure of how much variability in the variable y can be explained by its linear relationship with variable x.
3) A calculation example finding the coefficient of determination to be 0.83 for a dataset relating height and shoe size, meaning approximately 83% of the variation in shoe size can be explained by height.
The document discusses the least squares regression method for determining the line of best fit for a dataset. It explains that the least squares method finds the line that minimizes the sum of the squares of the distances between the observed responses in the dataset and the responses predicted by the linear approximation. The document provides steps to calculate the line of best fit, including calculating the slope and y-intercept. It also includes an example of applying the least squares method to find the line of best fit for a dataset relating t-shirt prices and number of t-shirts sold.
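A minimal sketch of those slope and intercept calculations, using made-up t-shirt price and sales figures rather than the dataset from the summarized document:

```python
# Least squares line of best fit for (price, units sold) pairs.
# The data below are hypothetical, chosen only to illustrate the steps.
prices = [10, 12, 15, 18, 20, 25]      # x: t-shirt price
sold   = [80, 72, 60, 55, 48, 35]      # y: number of t-shirts sold

n = len(prices)
mean_x = sum(prices) / n
mean_y = sum(sold) / n

# Slope b = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2), intercept a = y_bar - b * x_bar
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(prices, sold))
sxx = sum((x - mean_x) ** 2 for x in prices)
b = sxy / sxx
a = mean_y - b * mean_x

print(f"line of best fit: y = {a:.2f} + {b:.2f}x")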
This document discusses correlation and regression. Correlation describes the strength and direction of a linear relationship between two variables, while regression allows predicting a dependent variable from an independent variable. It provides examples of calculating the correlation coefficient r to determine the strength and direction of relationships between variables like education and self-esteem or family income and number of children. The regression equation describes the linear regression line and can be used to predict values of the dependent variable from known values of the independent variable.
This document provides an overview of simple linear regression analysis. It discusses estimating regression coefficients using the least squares method, interpreting the regression equation, assessing model fit using measures like the standard error of the estimate and coefficient of determination, testing hypotheses about regression coefficients, and using the regression model to make predictions.
This document discusses linear regression and its use in modeling real-world problems. It defines linear regression as finding the linear function that best fits a set of data points. The document provides an example of using a graphing calculator to perform linear regression on price and weight data for emeralds. The calculator outputs an equation for the line of best fit as y = 5475x - 1042.9, providing a simple mathematical model to describe the relationship between an emerald's price and weight.
This document discusses correlation and regression analysis. It defines correlation as a mutual relationship between two or more variables, and identifies positive, negative, simple, partial and multiple correlation. Regression is defined as determining the statistical relationship between a dependent variable and one or more independent variables. Methods for calculating correlation coefficients like Pearson's r and Spearman's rank correlation coefficient are presented. Steps for determining the regression equation and calculating the slope and intercept are also outlined.
This document provides information about regression analysis and linear regression. It defines regression analysis as using relationships between quantitative variables to predict a dependent variable from independent variables. Linear regression finds the best fitting straight line relationship between variables. The simple linear regression equation is given as Y = a + bX, where a and b are estimated parameters calculated from sample data. An example is worked through, showing how to calculate the regression equation from data, graph the relationship, and use the equation to estimate values.
This document provides an introduction to correlation and regression analysis. It defines correlation as a measure of the association between two variables and regression as using one variable to predict another. The key aspects covered are:
- Calculating correlation using Pearson's correlation coefficient r to measure the strength and direction of association between variables.
- Performing simple linear regression to find the "line of best fit" to predict a dependent variable from an independent variable.
- Using a TI-83 calculator to graphically display scatter plots of data and calculate the regression equation and correlation coefficient.
Regression analysis is used to model relationships between variables. Simple linear regression involves modeling the relationship between a single independent variable and dependent variable. The regression equation estimates the dependent variable (y) as a linear function of the independent variable (x). The parameters β₀ and β₁ are estimated using the method of least squares. The coefficient of determination (r²) measures how well the regression line fits the data. Additional tests like the t-test, confidence intervals, and F-test are used to test if the independent variable significantly predicts the dependent variable. While these tests can indicate a statistically significant relationship, they do not prove causation.
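The t-test mentioned above for whether the independent variable significantly predicts the dependent variable usually takes the following standard form (included here for orientation, not quoted from the source):

```latex
t = \frac{b_1 - 0}{s_{b_1}},
\qquad
s_{b_1} = \frac{s_e}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}},
\qquad \text{with } n-2 \text{ degrees of freedom},
```

where s_e is the standard error of the estimate and the null hypothesis is β₁ = 0 (no linear relationship).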
This document provides an introduction to basic statistics and regression analysis. It defines regression as relating to or predicting one variable based on another. Regression analysis is useful for economics and business. The document outlines the objectives of understanding simple linear regression, regression coefficients, and merits and demerits of regression analysis. It describes types of regression including simple and multiple regression. Key concepts explained in more detail include regression lines, regression equations, regression coefficients, and the difference between correlation and regression. Examples are provided to demonstrate calculating regression equations using different methods.
- The class outline covers regression analysis, including determining the R-squared value and interpreting regression output from Excel.
- Regression models the relationship between a dependent variable (sales) and independent variables (price and other factors) using estimated coefficients.
- The R-squared value measures the explanatory power of the regression model, with higher values indicating more of the variation in the dependent variable is explained by the independent variables.
- Excel can be used to perform the regression analysis and output statistics including coefficients, F-statistics from the ANOVA table, and p-values to interpret the significance of each coefficient.
This document discusses Spearman's rank correlation coefficient, a non-parametric measure of statistical dependence between two variables. Unlike other correlation coefficients, it does not assume a normal distribution. The Spearman coefficient is calculated by ranking the values of each variable separately, taking the difference d between the paired ranks, summing the squared differences, and applying the formula r_s = 1 - 6Σd² / (n(n² - 1)), where n is the number of pairs. The document provides an example calculation of the Spearman coefficient between two variables and its interpretation.
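A small sketch of that ranking-and-difference calculation, assuming no tied values; the judge scores are hypothetical:

```python
# Spearman's rank correlation: r_s = 1 - 6*sum(d^2) / (n*(n^2 - 1)),
# valid in this simple form when there are no tied values.
def spearman(x, y):
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical scores given by two judges to the same 6 items.
judge_a = [86, 97, 99, 100, 101, 103]
judge_b = [0, 20, 28, 27, 50, 29]
print(round(spearman(judge_a, judge_b), 3))
```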
The document discusses simple linear regression and correlation. It explains how to calculate the slope and intercept of a regression line by using a scatterplot of two variables to visualize their relationship. It then shows how to compute Pearson's correlation coefficient r to quantify the strength of the linear relationship, with r closer to 1 indicating a stronger correlation. The example computes the slope, intercept, r, and tests if the correlation is statistically significant for a sample dataset about soda consumption and bathroom trips.
Ordinary least squares linear regression (Elkana Rorio)
Ordinary Least Squares Linear Regression is commonly used but often misunderstood and misapplied. It works by minimizing the sum of squared errors between predictions and actual values in the training data to determine coefficients for the linear regression equation. However, it is very sensitive to outliers in the data which can dramatically affect the determined coefficients and reduce prediction accuracy. Alternative regression techniques like least absolute deviations are more robust to outliers but less computationally efficient. Preprocessing data to remove or de-emphasize outliers can help address these issues with Ordinary Least Squares regression.
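A quick way to see the outlier sensitivity described above is to fit an ordinary least squares line with and without a single extreme point. The data and the use of numpy.polyfit are illustrative choices, not taken from the source:

```python
import numpy as np

# Clean, roughly linear data (hypothetical).
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = 2.0 * x + 1.0 + np.array([0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1])

slope, intercept = np.polyfit(x, y, 1)           # OLS fit on clean data
print(f"clean data:   y = {intercept:.2f} + {slope:.2f}x")

# Replace one response with an extreme outlier and refit.
y_out = y.copy()
y_out[-1] = 60.0                                 # a single wild value
slope_o, intercept_o = np.polyfit(x, y_out, 1)
print(f"with outlier: y = {intercept_o:.2f} + {slope_o:.2f}x")
```

The single outlier pulls both coefficients far from the clean-data fit, which is the behavior the summary warns about.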
1) The document discusses simple linear regression using a scatter diagram and data from a study of employees' years of working experience and income.
2) It presents the scatter diagram and shows how to draw a trend line to roughly estimate dependent variable (income) values from the independent variable (years experience).
3) Equations for the least squares linear regression line are provided, including how to calculate the standard error of estimate, which is interpreted as the standard deviation around the regression line.
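The standard error of estimate referred to in point 3 is conventionally written as follows (standard formula, included here for convenience):

```latex
s_e = \sqrt{\frac{\sum_{i=1}^{n}\bigl(y_i - \hat{y}_i\bigr)^2}{n - 2}},
```

where ŷᵢ is the value predicted by the regression line; it measures the typical vertical scatter of the observations around that line.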
This document provides an overview of regression analysis and two-way tables. It defines key concepts such as regression lines, correlation, residuals, and marginal and conditional distributions. Regression finds the linear relationship between two variables to make predictions. The least squares regression line minimizes the vertical distance between the data points and the line. Correlation and the coefficient of determination r2 measure how well the regression line fits the data. Two-way tables summarize the relationship between two categorical variables through marginal and conditional distributions.
Simple Linear Regression: Step-By-Step (Dan Wellisch)
This presentation was made to our meetup group (https://www.meetup.com/Chicago-Technology-For-Value-Based-Healthcare-Meetup/) on 9/26/2017. Our group is focused on technology applied to healthcare in order to create better healthcare.
The document provides an overview of regression analysis techniques including linear regression and logistic regression. It defines regression as a statistical technique to model relationships between variables, with the goal of prediction or forecasting. Linear regression finds the best fitting straight line to model relationships between a continuous dependent variable and one or more independent variables. Logistic regression is used for classification problems where the dependent variable is categorical. The document explains the key differences between linear and logistic regression techniques.
This document discusses correlation and linear regression. It defines correlation as a measure of the linear association between two variables. The strength of the correlation is quantified from 0 (no association) to 1 (perfect association). Regression analysis predicts the value of a dependent variable based on independent variables. Simple linear regression fits a linear equation to the data of the form Y = β₀ + β₁X + ε, where β₀ is the Y-intercept and β₁ is the slope of the regression line. The coefficient of determination, R-squared, indicates how much of the variation in the dependent variable is explained by the independent variable.
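The coefficient of determination mentioned above is commonly defined through the sums-of-squares decomposition; the formulas below are the standard ones rather than expressions quoted from the summarized slides:

```latex
R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST},
\qquad
SST = \sum_i (y_i-\bar{y})^2,\quad
SSR = \sum_i (\hat{y}_i-\bar{y})^2,\quad
SSE = \sum_i (y_i-\hat{y}_i)^2 .
```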
This document provides an overview of ratios, proportions, and their properties. It defines a ratio as a comparison of quantities represented using ":" and explains how to obtain a ratio by dividing the first quantity by the second. It describes the terms in a ratio and properties such as ratios being pure numbers without units. Proportions are defined as the equality of two ratios, with the four terms being the extremes and means. Direct proportion is explained as two quantities increasing or decreasing proportionally together, while inverse proportion is two quantities changing proportionally in opposite directions.
The document discusses simple linear regression. It defines key terms like regression equation, regression line, slope, intercept, residuals, and residual plot. It provides examples of using sample data to generate a regression equation and evaluating that regression model. Specifically, it shows generating a regression equation from bivariate data, checking assumptions visually through scatter plots and residual plots, and interpreting the slope as the marginal change in the response variable from a one unit change in the explanatory variable.
The document provides information on correlation and linear regression. It defines correlation as the association between two variables and discusses how the correlation coefficient r measures the strength of this linear association. It then discusses:
- Computing r from sample data
- Testing the hypothesis that r = 0 using a t-test
- Computing the linear regression equation and coefficient of determination
- Using the regression equation to make predictions when there is a significant linear correlation
Two examples are then provided to demonstrate computing r from data, testing for a significant correlation, finding the regression equation, and making a prediction.
This chapter introduces simple linear regression. Simple linear regression finds the linear relationship between a dependent variable (Y) and a single independent variable (X). It estimates the regression coefficients (intercept and slope) that best predict Y from X using the least squares method. The chapter provides an example of predicting house prices from square footage. It explains how to interpret the regression coefficients and make predictions. Key outputs like the coefficient of determination (r-squared), standard error, and assumptions of the regression model are also introduced. Residual analysis is discussed as a way to check if the assumptions are met.
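A compact sketch in the same spirit as the house-price example, using invented square-footage and price figures (the summarized chapter's actual data are not reproduced here), with numpy.polyfit standing in for the least squares calculation:

```python
import numpy as np

# Hypothetical data: square footage (x) and selling price in $1000s (y).
sqft  = np.array([1400, 1600, 1700, 1875, 2100, 2350, 2450], dtype=float)
price = np.array([245, 312, 279, 308, 405, 324, 319], dtype=float)

b1, b0 = np.polyfit(sqft, price, 1)       # slope and intercept by least squares
pred = b0 + b1 * sqft

r_squared = 1 - np.sum((price - pred) ** 2) / np.sum((price - price.mean()) ** 2)

print(f"price = {b0:.1f} + {b1:.3f} * sqft   (r^2 = {r_squared:.2f})")
print(f"predicted price for 2000 sqft: {b0 + b1 * 2000:.1f} thousand")
```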
Regression analysis is simplified in this presentation. Starting with simple linear regression and moving on to multiple regression analysis, it covers the relevant statistics and the interpretation of various diagnostic plots. It also explains how to verify regression assumptions and introduces some advanced concepts for choosing the best model, which makes the slides more useful. SAS program code for two examples is also included.
This document contains a lesson on determining congruence between figures using transformations. It provides examples of identifying congruent triangles and determining the transformations needed to map one onto the other. It also gives examples of using congruence to find missing angle measures. The lesson explains that two figures are congruent if their corresponding sides are congruent and corresponding angles are congruent.
This document announces a probability workshop covering drawing cards from a deck and probabilities related to student drink preferences, dresses on a rack being sold, and a couple having children of a certain gender. The workshop will calculate the probability of drawing an ace or diamond, the probability a student doesn't like either drink, the probability of exactly two green dresses being sold from six, and the probability a couple has at least two girls from four children.
This document provides definitions and explanations of key geometry terms related to lines, angles, triangles, quadrilaterals, circles, and other polygons. It defines points, line segments, rays, intersecting lines, perpendicular lines, parallel lines, acute angles, obtuse angles, right angles, complementary angles, supplementary angles, and more. It also explains the properties of different types of triangles, quadrilaterals, circles, and other polygons. Key terms include radius, diameter, chord, arc, sector, circumference, area, perimeter, scalene triangles, isosceles triangles, equilateral triangles, rectangles, squares, rhombuses, trapezoids, parallelograms, hexagons, octagons
This document contains examples and exercises on probability and conditional probability. It introduces concepts like independent and dependent events, and how to calculate probabilities of events occurring together or in sequence using multiplication rules. Some examples include calculating the probability of drawing certain cards from a deck, rolling dice with specific outcomes, and patients consulting different doctors. The exercises practice finding probabilities and conditional probabilities for scenarios involving balls in boxes, births at a hospital, and light bulbs in different boxes.
This document provides definitions and examples of different camera shots and angles used in filmmaking, including:
- Extreme long shot (establishing shot showing wide area with character in context)
- Long shot (captures full body with aspects of setting)
- Medium long shot (shows subject from top of legs up)
- Medium close up (shows subject from below shoulders up)
- Close up (shows subject from neck/shoulders up focusing on facial expressions)
- Extreme close up (shows extreme detail of small area like an eye)
- High angle shot (taken from above eye level to make subject seem vulnerable)
- Low angle shot (taken below eye level to make subject seem powerful)
This document provides an overview of probability concepts including:
- Probability is the chance of an event occurring and is calculated using the classical or empirical formulas
- Events can be simple, compound, mutually exclusive or complementary
- The addition rule states that for mutually exclusive events the probability of event A or B is P(A) + P(B), and for non-mutually exclusive events it is P(A) + P(B) - P(A and B)
- The multiplication rule states that if events are independent, the probability of both occurring is P(A) × P(B)
- Conditional probability is the probability of one event occurring given that another event has occurred
- Examples are provided to
1) Complementary angles are two angles whose measures sum to 90 degrees. They do not need to share a vertex or side.
2) Supplementary angles are two angles whose measures sum to 180 degrees.
3) Examples show complementary angles with measures summing to 90 degrees and supplementary angles with measures summing to 180 degrees.
This document provides definitions and descriptions of basic geometric shapes and their components. It explains lines, rays, segments and their intersections. It also defines angles, triangles according to their sides and angles, quadrilaterals, circles and their parts like radii, diameters, chords, arcs, sectors. It gives formulas for calculating perimeters and areas of rectangles, squares, parallelograms, trapezoids and circles. Finally, it briefly introduces other polygons like hexagons, octagons and regular polygons.
This document provides an overview of probability concepts including:
- Classical probability which uses equally likely outcomes and sample spaces to calculate probabilities
- Empirical probability which is based on observed frequencies
- Addition rules for calculating probabilities of independent and dependent events
- Conditional probability which considers the probability of one event given another
- Multiplication rules for independent and dependent events
- Examples of calculating probabilities for single events, combinations of events, and conditional scenarios.
1) The document discusses geometry concepts related to angles of triangles including the triangle angle sum theorem, exterior angle theorem, and finding measures of unknown angles using known information.
2) Key details include that the sum of the interior angles of any triangle is 180 degrees, and the measure of an exterior angle is equal to the sum of the remote interior angles.
3) Examples are provided to demonstrate using these theorems to find the measures of missing angles in different triangle scenarios.
This document provides definitions and descriptions of basic geometry terms including:
- Points, lines, rays, line segments, planes and their relationships
- Angles and types of angles such as acute, obtuse, right, straight
- Triangles and their properties such as sides, angles, and types
- Quadrilaterals such as parallelograms, rectangles, rhombuses, trapezoids
- Circles and their parts including chords, diameters, arcs, radii, sectors
- Polygons with 5+ sides such as pentagons and hexagons
This document introduces key concepts in probability including:
- Random events have uncertain outcomes but a regular distribution appears with large numbers of trials.
- Probability is the proportion of times an outcome would occur with many trials.
- Set theory concepts like unions, intersections, and complements are used to define sample spaces and calculate probabilities.
- The three basic probability rules are that probabilities lie between 0 and 1, the probabilities of all outcomes sum to 1, and the probability of an event's complement is 1 minus the probability of the event.
This document provides learning materials about the refraction of light, including activities and explanations. The activities guide students to observe how a pencil appears different when placed in water due to the bending of light. Students are asked to measure angles of incidence and refraction using protractors and apply Snell's law to calculate how much light bends when passing from one medium to another at different angles. The goal is for students to understand the principles behind phenomena like why objects in water appear raised and how this relates to changes in the speed and direction of light.
This document provides an overview of key concepts related to random variables and probability distributions. It discusses:
- Two types of random variables - discrete and continuous. Discrete variables can take countable values, continuous can be any value in an interval.
- Probability distributions for discrete random variables, which specify the probability of each possible outcome. Examples of common discrete distributions like binomial and Poisson are provided.
- Key properties and calculations for discrete distributions like expected value, variance, and the formulas for binomial and Poisson probabilities.
- Other discrete distributions like hypergeometric are introduced for situations where outcomes are not independent. Examples are provided to demonstrate calculating probabilities for each type of distribution.
The document discusses probability concepts and examples that appeared in SPM exam questions from 2003 to 2006. It covers topics like probability of events, mutually exclusive and independent events, and examples calculating probabilities using different rules. It provides the definition and methods to determine if events are mutually exclusive or independent. It also includes sample probability questions and solutions from past SPM exams.
Probability and probability distributions ppt @ bec doms (Babasab Patil)
This document provides an overview of key concepts in probability and probability distributions, including:
- Defining probability, experiments, sample spaces, and events
- Common probability rules such as addition and multiplication rules
- Discrete and continuous random variables and their associated probability distributions
- Key metrics for probability distributions like expected value and standard deviation
- Conditional probability and Bayes' Theorem
The document aims to explain fundamental probability concepts and prepare the reader to compute and apply common probability measures.
The document outlines topics related to probability theory including: probability, random variables, probability distributions, expected value, variance, moments, and joint distributions. It then provides definitions and examples of these concepts. The key topics covered are random variables and their probability distributions, expected values (mean and variance), and considering two random variables jointly.
This document defines key concepts related to random variables including:
- A random variable is a numerical measure of outcomes from a random phenomenon.
- Probability distributions describe the probabilities associated with random variables.
- Expected value refers to the mean or weighted average of a probability distribution.
- As the number of trials increases, the actual mean approaches the true mean due to the Law of Large Numbers.
- Binomial and geometric distributions model situations with success/failure outcomes and independence between trials.
This document provides an introduction to probability and its applications in daily life. It defines probability as a measure of how often an event will occur if an experiment is repeated. Probability is always between 0 and 1, with 1 being a certain event and 0 being an impossible event. The document discusses random experiments, sample spaces, outcomes, events, and favorable events. It provides examples of calculating probability for events like drawing cards from a deck or selecting people with certain characteristics from a population. Overall, the document outlines basic probability concepts and terminology.
Exploring Support Vector Regression - Signals and Systems Project (Surya Chandra)
Our team competed in a Kaggle competition to predict bike share usage for Washington DC's Capital Bikeshare program, using a powerful function approximation technique called support vector regression.
This document summarizes an analysis of using Support Vector Regression (SVR) to predict bike rental data from a bike sharing program in Washington D.C. It begins with an introduction to SVR and the bike rental prediction competition. It then shows that linear regression performs poorly on this non-linear problem. The document explains how SVR maps data into higher dimensions using kernel functions to allow for non-linear fits. It concludes by outlining the derivation of the SVR method using kernel functions to simplify calculations for the regression.
The document compares linear regression using gradient descent and the normal equations on two datasets. For the FRIED dataset, gradient descent without regularization gave the best results. Adding higher-degree polynomials and variable multiplications increased model complexity but led to overfitting. For the ABALONE dataset, gradient descent with lambda = 0.03 performed best. The normal equations approach was faster for the smaller ABALONE dataset but slower for the larger FRIED dataset because of its cubic runtime complexity. Increasing model complexity gave better fits to the training data but risked overfitting.
Chapter 10: Correlation and Regression
10.2: Regression
- Regression analysis is used to study the relationship between variables and predict how the value of one variable changes with the other. It is one of the most commonly used tools for business analysis.
- Simple linear regression analyzes the relationship between one independent variable and one dependent variable. The regression equation estimates the dependent variable as a linear function of the independent variable.
- Least squares regression fits a line to the data by minimizing the sum of the squared residuals, providing estimates of the slope and y-intercept coefficients in the regression equation.
The document provides additional information on correlation analysis. It discusses various examples of correlation between variables like sugar consumption and activity level. It explains the characteristics of a relationship such as the direction, form, and degree of correlation. Correlations can be used for prediction, validity, and reliability. The document also discusses the difference between correlation and causation. It then provides examples to test the reader's understanding of correlation through multiple choice questions. Finally, it covers topics like probable error, coefficient of correlation, coefficient of determination, Spearman's rank correlation method, and concurrent deviation method for calculating correlation.
This document discusses correlation and regression analysis. It begins by outlining the chapter's objectives and providing an introduction to investigating relationships between variables using statistical analysis. The document then presents examples of collecting data to study potential relationships between variables like stone dimensions, human heights and weights, and sprint and long jump performances. It introduces various statistical measures for quantifying relationships in data, including covariance, Pearson's product moment correlation coefficient, and Spearman's rank correlation coefficient. Examples are provided to demonstrate calculating and interpreting these statistics. Limitations of correlation analysis are also noted.
1. This document discusses linear regression and correlation through analyzing the relationship between two variables.
2. It introduces the concepts of scatter plots, lines of best fit, slope, and the correlation coefficient.
3. Key steps in linear regression are determining the linear equation that best models the data using least squares regression and interpreting the slope and strength of correlation.
This chapter discusses regression models, including simple and multiple linear regression. It covers developing regression equations from sample data, measuring the fit of regression models, and assumptions of regression analysis. Key aspects covered include using scatter plots to examine relationships between variables, calculating the slope, intercept, coefficient of determination, and correlation coefficient, and performing hypothesis tests to determine if regression models are statistically significant. The chapter objectives are to help students understand and appropriately apply simple, multiple, and nonlinear regression techniques.
Bba 3274 qm week 6 part 1 regression models (Stephen Ong)
This document provides an overview and outline of regression models and forecasting techniques. It discusses simple and multiple linear regression analysis, how to measure the fit of regression models, assumptions of regression models, and testing models for significance. The goals are to help students understand relationships between variables, predict variable values, develop regression equations from sample data, and properly apply and interpret regression analysis.
The document defines correlation and regression, and describes how to calculate them. Correlation measures the strength and direction of a linear relationship between two random variables on a scale from -1 to 1. Regression finds the linear relationship between a random variable and a fixed variable to make predictions. The document provides examples of calculating correlation using Pearson's r and determining the regression line and equation from sample data.
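A brief sketch of computing Pearson's r from paired sample data; the observations below are invented for illustration, and the formula is the standard product-moment form:

```python
import math

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations.
hours_studied = [2, 3, 5, 7, 9]
exam_score    = [65, 70, 74, 80, 88]
print(round(pearson_r(hours_studied, exam_score), 3))
```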
This document provides information about determinants of square matrices:
- It defines the determinant of a matrix as a scalar value associated with the matrix. Determinants are computed using minors and cofactors.
- Properties of determinants are described, such as how determinants change with row/column operations or identical rows/columns.
- Examples are provided to demonstrate computing determinants by expanding along rows or columns and using cofactors and minors.
- Applications of determinants include finding the area of triangles and solving systems of linear equations.
This chapter discusses building multiple regression models. It covers nonlinear variables in regression, qualitative variables and how to use them, and different model building techniques like stepwise regression, forward selection and backward elimination. The chapter aims to help students analyze and interpret nonlinear models, understand dummy variables, and learn how to build and evaluate multiple regression models and detect influential observations. It provides examples of solving regression problems and interpreting their results.
1. The document discusses matrices and determinants, including types of matrices like rectangular, square, diagonal, and scalar matrices.
2. It defines determinants and provides rules for computing determinants of matrices of order 2 and 3 by expanding along rows or columns.
3. Key concepts covered include minors, cofactors, properties of determinants like how row operations affect the determinant value, and examples of computing determinants.
1. The document discusses matrices and determinants. It defines different types of matrices such as rectangular, square, diagonal, scalar, row, column, identity, zero, upper triangular, and lower triangular matrices.
2. It explains how to calculate determinants of matrices. The determinant of a 1x1 matrix is the single element. The determinant of a 2x2 matrix is calculated using a formula. Determinants of higher order matrices are calculated by expanding along rows or columns.
3. It introduces concepts of minors, cofactors, and explains how the value of a determinant can be written in terms of its minors and cofactors. It also lists some properties and operations for determinants.
This document discusses various types and methods of measuring correlation between two variables. It describes correlation as a statistical tool to measure the degree of relationship between variables. Some key methods covered include scatter diagrams, Karl Pearson's coefficient of correlation, and Spearman's rank correlation coefficient. Positive and negative correlation examples are provided. The document also differentiates between simple, multiple, partial, and total correlation, as well as linear and non-linear correlation.
Regression is a statistical technique used to model relationships between variables. The key steps are to identify the variables, select a dependent variable to predict, examine the relationships visually, and find a way to predict the dependent variable from the other variables. Correlation coefficients measure the strength of a relationship, with magnitudes ranging from 0 (no relationship) to 1 (perfect relationship). In positive relationships the variables move in the same direction, while in negative relationships they move in opposite directions. Non-linear regression can model curvilinear relationships using quadratic terms. Logistic regression is used for categorical dependent variables.
This document discusses using simple linear regression to describe relationships between variables in data. It explains that regression finds the linear equation that best describes how a dependent variable (y) changes with an independent variable (x). The equation is the line that minimizes the sum of the squared residuals (deviations from the observed data points). Examples are given of regression analyses conducted to estimate the cost of computer networks based on number of computers, estimate real estate values based on house size, and forecast housing starts based on mortgage rates.
The document provides an overview of topics to be covered in Chapter 16 on time series and forecasting, including using trend equations to forecast future periods and develop seasonally adjusted forecasts, determining and interpreting seasonal indexes, and deseasonalizing data using a seasonal index. It also includes examples of calculating seasonal indices and adjusting sales data to remove seasonal variation. The document is a lecture outline and review for a class on international business taught by Dr. Ning Ding at Hanze University of Applied Sciences Groningen.
Here are the steps to solve this problem:
1) Code the year as t = 1 for 1999, t = 2 for 2000, etc.
2) Calculate the sums: Σt = 15, ΣY = 211.9, Σt² = 30, ΣtY = 332.5
3) b = (ΣtY - ΣtΣY/n) / (Σt² - (Σt)²/n) = 6.55
4) a = Ȳ - b t̄ = 29.4 - 6.55(1) = 22.85
5) Ŷ = 22.85 + 6.55t
To estimate vending sales
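The same year-coding and least squares procedure from the steps above can be sketched in a few lines; the sales figures below are placeholders, not the vending-sales data from the exercise:

```python
import numpy as np

years = [1999, 2000, 2001, 2002, 2003]
sales = [21.5, 28.1, 36.0, 41.3, 48.9]        # hypothetical yearly sales

t = np.arange(1, len(years) + 1)              # code the years as t = 1, 2, 3, ...
b, a = np.polyfit(t, sales, 1)                # slope b and intercept a of the trend line

print(f"trend equation: Y_hat = {a:.2f} + {b:.2f} t")
print(f"forecast for {years[-1] + 1}: {a + b * (len(years) + 1):.2f}")
```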
This document provides an overview of simple linear regression and correlation. It discusses key concepts such as dependent and independent variables, scatter diagrams, regression analysis, the least-squares estimating equation, and the coefficients of determination and correlation. Scatter diagrams are used to determine the nature and strength of relationships between variables. Regression analysis finds relationships of association but not necessarily of cause and effect. The least-squares estimating equation models the dependent variable as a function of the independent variable.
This document provides an overview of central tendency measures that will be covered in Chapter 3-A, including the mean, mode, and median for both ungrouped and grouped data. It also includes examples of calculating the mean, weighted mean, and mode. The document reviews key concepts such as the difference between parameters and statistics. Overall, the document previews and reviews important concepts related to measures of central tendency that will be covered in the upcoming chapter.
Lesson 06 chapter 9 two samples test and Chapter 11 chi square test (Ning Ding)
This document is a PowerPoint presentation about hypothesis testing for two samples and chi-square tests. It covers topics like independent and dependent sample tests, testing differences between proportions, one-tailed and two-tailed tests. Examples are provided to demonstrate how to perform two-sample t-tests, tests of proportions, and chi-square tests using contingency tables with 2 rows and 3 rows. Step-by-step instructions and formulas are given. Key chapters from the textbook are reviewed.
This document provides an outline and overview of topics covered in a course on inductive statistics, including probability distributions, sampling distributions, estimation, and hypothesis testing. Key topics discussed include interval estimation for means and proportions, using t-distributions when sample sizes are small and variances are unknown, and the basics of hypothesis testing such as null and alternative hypotheses. Examples are provided to illustrate concepts like confidence intervals for means, proportions, and hypothesis testing.
This document contains a PowerPoint presentation on inductive statistics covering topics like probability distributions, sampling distributions, estimation, hypothesis testing for means and proportions, and two-sample hypothesis tests. It provides an overview of the chapters that will be covered, examples of hypothesis tests for means and proportions when the population standard deviation is known and unknown, and examples of independent and dependent two-sample hypothesis tests for differences in means and proportions with both large and small sample sizes. Step-by-step explanations are given for conducting hypothesis tests.
The document summarizes key concepts from chapters 6 and 7 of a statistics textbook. Chapter 6 discusses sampling and calculating standard error for infinite and finite populations. Chapter 7 introduces estimation, including interval estimates and point estimates. It provides examples of calculating standard error and confidence intervals. The document also lists SPSS tips for t-tests.
This document provides an overview and summary of topics covered in a research methods course. It discusses reviewing concepts from prior lectures, including different types of research and variables. Today's lecture will cover instrumentation, validity and reliability, and threats to internal validity. Instrumentation discusses how to collect and measure data. Validity and reliability refer to the accuracy and consistency of measurements. Threats to internal validity could interfere with determining the true effect of independent variables on dependent variables.
This document provides an overview of content covered in Statistics 2, including a review of chapter 5 on sampling distributions. It includes examples of questions from quizzes on topics like the normal distribution and binomial approximation. The document also provides tips on using SPSS for descriptive statistics, such as inputting and defining variable data, and analyzing frequencies.
This document summarizes a course on research methods and techniques. It outlines the structure and requirements of the course, including reading a textbook and attending lectures. It discusses different types of research and variables. The document covers defining research problems, formulating hypotheses, research ethics, and instrumentation. Self-check exercises are provided to help students understand key concepts.
4. Correction of EXCEL Exercise 5: L = (8+1)*25% = 2.25, so Q1 = 133.5; L = (8+1)*75% = 6.75, so Q3 = 274.5. Interquartile Range = 274.5 - 133.5 = 141.
5. Boxplot: data set 1 2 2 4 5 7 8 9 12. Median = 5; the lower half (1 2 2 4) gives Q1 = 2 and the upper half (7 8 9 12) gives Q3 = 8.5, so the Interquartile Range = 8.5 - 2 = 6.5. The same idea extends to deciles (1st D, 9th D) and percentiles. How to interpret? See http://cnx.org/content/m11192/latest/
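A minimal Python sketch of the quartile calculation on this slide, using the median-of-halves convention (other quartile conventions give slightly different values):

# Quartiles and IQR for the boxplot data, using the median-of-halves convention
data = sorted([1, 2, 2, 4, 5, 7, 8, 9, 12])

def median(values):
    n = len(values)
    mid = n // 2
    return values[mid] if n % 2 else (values[mid - 1] + values[mid]) / 2

med = median(data)                       # 5
lower = data[:len(data) // 2]            # values below the median
upper = data[(len(data) + 1) // 2:]      # values above the median
q1, q3 = median(lower), median(upper)    # 2 and 8.5
print(f"Median = {med}, Q1 = {q1}, Q3 = {q3}, IQR = {q3 - q1}")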
6. Boxplot: The distribution is skewed to the right because the mean is larger than the median. Example boxplot: values range from €20 to €2000, with Q1 = €250, Median = €350, Mean = €450, Q3 = €850. See http://cnx.org/content/m11192/latest/
7. Data set A (0.8 1.0 1.0 1.2 1.2 1.3 1.5 1.7 2.0 2.0 2.1 2.2 4.0): Mean > Median, so it is positively skewed. Data set B (2.0 3.2 3.6 3.7 4.0 4.2 4.2 4.5 4.5 4.6 4.8 5.0 5.0): Mean < Median, so it is negatively skewed. Online calculator: http://qudata.com/online/statcalc/
8. Zero skewness: mode = median = mean. This means that the data is symmetrically distributed.
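To make the mean-versus-median check concrete, here is a small Python sketch (data sets copied from slide 7) that classifies the direction of skew:

# Classify skew direction by comparing mean and median (slide 7 data)
from statistics import mean, median

data_a = [0.8, 1.0, 1.0, 1.2, 1.2, 1.3, 1.5, 1.7, 2.0, 2.0, 2.1, 2.2, 4.0]
data_b = [2.0, 3.2, 3.6, 3.7, 4.0, 4.2, 4.2, 4.5, 4.5, 4.6, 4.8, 5.0, 5.0]

for name, data in [("A", data_a), ("B", data_b)]:
    m, med = mean(data), median(data)
    if m > med:
        skew = "positively skewed"
    elif m < med:
        skew = "negatively skewed"
    else:
        skew = "approximately symmetric"
    print(f"Data set {name}: mean = {m:.2f}, median = {med}, {skew}")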
17. Chapter 12: Sim Reg & Corr Scatter Diagrams: 2. Estimation Using the Regression Line
20. 2. Estimation Using the Regression Line (Chapter 12: Sim Reg & Corr). The best-fitting regression line is Ŷ = a + bX, where the slope is b = (ΣXY - n*X̄*Ȳ) / (ΣX² - n*X̄²) and the intercept is a = Ȳ - b*X̄.
21. 2. Estimation Using the Regression Line (Chapter 12: Sim Reg & Corr). What is the relationship between the age of a truck and its annual repair expense? With X̄ = 3, Ȳ = 6 and b = 0.75: a = Ȳ - b*X̄ = 6 - 0.75*3 = 3.75, so Ŷ = 3.75 + 0.75X. If the city has a truck that is 4 years old, the director could use the equation to predict its annual repairs: Ŷ = 3.75 + 0.75*4 = 6.75, i.e. about $675 (repair expense is measured in hundreds of dollars).
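The truck example reduces to two lines of arithmetic; a quick Python check using the slide's values looks like this:

# Truck-repair example: intercept and prediction from the slide's values
b = 0.75                  # given slope (repair expense per year of age)
x_bar, y_bar = 3, 6       # mean age and mean repair expense (hundreds of dollars)

a = y_bar - b * x_bar     # intercept: a = Y-bar - b*X-bar  -> 3.75
age = 4
predicted = a + b * age   # 6.75, i.e. about $675 per year
print(f"Y-hat = {a} + {b} X; prediction for a {age}-year-old truck: {predicted}")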
23. Exercise (Chapter 12: Sim Reg & Corr). Find ΣX, ΣY, ΣXY, ΣX²: ΣX = 311 (mean X̄ = 62.2), ΣY = 18.6 (mean Ȳ = 3.72), ΣXY = 1159.7, ΣX² = 19359. Steps 3-4: Substitute into the slope formula: b = (1159.7 - 5*62.2*3.72) / (19359 - 5*62.2*62.2) = 0.19.
24. Exercise (Chapter 12: Sim Reg & Corr). Step 5: Substitute into the intercept formula: a = Ȳ - b*X̄ = 3.72 - 0.19*62.2 = -8.098. Step 6: Substitute these values into the regression equation: Ŷ = a + bX = -8.098 + 0.19X. Suppose we want the approximate y value for X = 64; substituting into the equation gives Ŷ = -8.098 + 0.19*64 = -8.098 + 12.16 ≈ 4.06.
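The whole exercise can be reproduced from the summary statistics alone; the Python sketch below plugs the slide's sums into the slope and intercept formulas (the intercept differs slightly from the slide because the slide rounds b to 0.19 before computing a):

# Reproduce the exercise from its summary statistics
n = 5
sum_x, sum_y = 311, 18.6
sum_xy, sum_x2 = 1159.7, 19359.0
x_bar, y_bar = sum_x / n, sum_y / n                             # 62.2 and 3.72

b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)    # about 0.19
a = y_bar - b * x_bar                                           # about -8.0 (slide: -8.098)
y_hat_64 = a + b * 64                                           # about 4.06
print(f"b = {b:.3f}, a = {a:.3f}, Y-hat(64) = {y_hat_64:.2f}")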
25. 2. Estimation Using the Regression Line (Chapter 12: Sim Reg & Corr). Least Squares Method: minimize the sum of the squares of the errors (the residuals eᵢ, i.e. the differences between the observed and predicted values) to measure the goodness of fit of a line.
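A small sketch of the quantity being minimized: for a candidate line Ŷ = a + bX and a set of observed points (made up here), the sum of squared residuals is computed as follows:

# Sum of squared errors (SSE) for a candidate regression line, with made-up data
x = [1, 2, 3, 4, 5]
y = [2.1, 2.9, 4.2, 4.8, 6.1]      # hypothetical observations

a, b = 1.0, 1.0                    # candidate intercept and slope
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]   # e_i = observed - predicted
sse = sum(e ** 2 for e in residuals)
print(f"Residuals: {[round(e, 2) for e in residuals]}, SSE = {sse:.3f}")
# The least squares line is the (a, b) pair that makes this SSE as small as possible.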
26. 2. Estimation Using the Regression Line Chapter 12: Sim Reg & Corr Least Squares Method:
28. 2. Estimation Using the Regression Line Chapter 12: Sim Reg & Corr Example Solution:
29. 3. Correlation Analysis (Chapter 12: Sim Reg & Corr). Correlation analysis describes the degree to which one variable is linearly related to another. Coefficient of determination (r²): measures the extent, or strength, of the association that exists between two variables. Coefficient of correlation (r): the square root of the coefficient of determination.
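A short Python sketch of both measures, computed from scratch on hypothetical paired data (height and shoe size, in the spirit of the earlier example):

# Coefficient of correlation (r) and determination (r^2) for hypothetical paired data
import math

x = [165, 170, 175, 180, 185, 190]     # hypothetical heights (cm)
y = [38, 39, 41, 42, 44, 45]           # hypothetical shoe sizes

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

r = sxy / math.sqrt(sxx * syy)         # coefficient of correlation
print(f"r = {r:.3f}, r^2 = {r ** 2:.3f}")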
35. Review Chapter 3: Describing Data Which value of r indicates a stronger correlation than 0.40? A. -0.30 B. -0.50 C. +0.38 D. 0 If all the plots on a scatter diagram lie on a straight line, what is the standard error of estimate? A. -1 B. +1 C. 0 D. Infinity
36. Review Chapter 3: Describing Data In the least squares equation, Ŷ = 10 + 20 X the value of 20 indicates A. the Y intercept. B. for each unit increase in X , Y increases by 20. C. for each unit increase in Y , X increases by 20. D. none of these.
37. Exercise Chapter 3: Describing Data A sales manager for an advertising agency believes there is a relationship between the number of contacts and the amount of the sales. To verify this belief, the following data was collected: What is the Y-intercept of the linear equation? A. -12.201 B. 2.1946 C. -2.1946 D. 12.201
Correlation and Cause

Just because two variables are correlated does not mean that one of the variables is the cause of the other. It could be the case, but it does not necessarily follow.

There is a strong positive correlation between the number of cigarettes that one smokes a day and one's chances of contracting lung cancer (measured as the number of cases of lung cancer per hundred people who smoke a given number of cigarettes). The percentage of heavy smokers who contract lung cancer is higher than the percentage of light smokers who develop the disease, and both figures are higher than the percentage of non-smokers who get lung cancer. In this case, the cigarettes are definitely causing the cancer.

There is a strong negative correlation between the total number of skiing holidays that people book for any month of the year and the total amount of ice cream that supermarkets sell for that month. This means that the more skiing holidays that are booked, the less ice cream is sold. Is there a cause here? Are people spending so much money on ice cream that they can't afford skiing holidays? Is the fact that the ice cream is so cold putting people off skiing? Clearly not! The simple fact is that most people tend to book their skiing holidays in the winter, and they tend to buy ice cream in the summer.

Although a correlation between two variables doesn't mean that one of them causes the other, it can suggest a way of finding out what the true cause might be. There may be some underlying variable that is causing both of them. For instance, if a survey found a correlation between the time that people spend watching television and the amount of crime that people commit, it could be because unemployed people tend to sit around watching the television, and unemployed people are more likely to commit crime. If that were the case, then unemployment would be the true cause!
More explanation: http://www.ncsu.edu/labwrite/res/gt/gt-reg-home.html