Lesson04

Statistics for International Business School, Hanze University of Applied Science, Groningen, The Netherlands

IBS Statistics
Year 1
Dr. Ning DING 
n.ding@pl.hanze.nl
I.007
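Find the interquartile range:
1460, 1471, 1637, 1721, 1758, 1787, 1940, 2038, 2047, 2054, 2097, 2205, 2287, 2311, 2406
With n = 15 ordered values, Q1 is the 4th value (1721) and Q3 is the 12th value (2205).
Interquartile Range
= Q3 - Q1
= 2205 - 1721
= 484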
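As a quick check of the arithmetic above, here is a minimal Python sketch. It assumes the quartile positions follow the (n + 1)/4 rule, which is consistent with the slide's Q1 = 1721 and Q3 = 2205; other textbooks and software may use slightly different conventions. The variable names are illustrative only.

```python
# Ordered data from the slide (15 values).
data = [1460, 1471, 1637, 1721, 1758, 1787, 1940, 2038,
        2047, 2054, 2097, 2205, 2287, 2311, 2406]

n = len(data)

# Assumed convention: Q1 sits at position (n + 1) / 4 and
# Q3 at position 3 * (n + 1) / 4, counting positions from 1.
q1_position = (n + 1) // 4        # 16 // 4 = 4
q3_position = 3 * (n + 1) // 4    # 48 // 4 = 12

# Both positions are whole numbers here, so no interpolation is needed.
q1 = data[q1_position - 1]
q3 = data[q3_position - 1]
iqr = q3 - q1

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```

This shortcut only works when (n + 1) is divisible by 4; otherwise the position falls between two values and has to be interpolated.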
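Median, Quartile, Decile, Percentile
Data: 1, 2, 2, 4, 5, 7, 8, 9, 12
Q1 = 2
Q3 = 8.5
Interquartile Range = Q3 - Q1 = 8.5 - 2 = 6.5
The 1st decile (1st D) lies at the low end of the ordered data and the 9th decile (9th D) at the high end.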
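Median, quartiles, deciles and percentiles all cut the ordered data at a position. The sketch below is one possible way to compute them for the nine values above. It assumes the position rule (n + 1) * k / parts with linear interpolation between neighbours, which reproduces the slide's Q1 = 2 and Q3 = 8.5. The function name `quantile` is a hypothetical helper, not something defined in the lesson.

```python
def quantile(values, k, parts):
    """Return the k-th of `parts` cut points (e.g. k=1, parts=4 is Q1).

    Assumed rule: 1-based position = (n + 1) * k / parts, with linear
    interpolation when the position falls between two ordered values.
    """
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) * k / parts
    lower = int(position)            # whole part of the position
    fraction = position - lower      # how far to move toward the next value
    if lower < 1:                    # position before the first value
        return ordered[0]
    if lower >= n:                   # position at or beyond the last value
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


data = [1, 2, 2, 4, 5, 7, 8, 9, 12]

q1 = quantile(data, 1, 4)        # first quartile
med = quantile(data, 2, 4)       # median = second quartile
q3 = quantile(data, 3, 4)        # third quartile
d1 = quantile(data, 1, 10)       # first decile
d9 = quantile(data, 9, 10)       # ninth decile

print(q1, med, q3, q3 - q1, d1, d9)
```

The same helper covers percentiles by calling `quantile(data, k, 100)`.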
Boxplot
How to interpret?
http://cnx.org/content/m11192/latest/
Review
Positively skewed: Mean > Median
0.8, 1.0, 1.0, 1.2, 1.2, 1.3, 1.5, 1.7, 2.0, 2.0, 2.1, 2.2, 4.0
Negatively skewed: Mean < Median
2.0, 3.2, 3.6, 3.7, 4.0, 4.2, 4.2, 4.5, 4.5, 4.6, 4.8, 5.0, 5.0
Zero skewness: mode = median = mean
This means that the data is symmetrically distributed.
http://qudata.com/online/statcalc/
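The mean-versus-median rule can be checked directly on the two data sets from the review slides. This is a minimal sketch using Python's standard `statistics` module; the function name `skew_direction` and the list names are illustrative assumptions.

```python
from statistics import mean, median

# The two data sets from the review slides.
long_right_tail = [0.8, 1.0, 1.0, 1.2, 1.2, 1.3, 1.5, 1.7, 2.0, 2.0, 2.1, 2.2, 4.0]
long_left_tail = [2.0, 3.2, 3.6, 3.7, 4.0, 4.2, 4.2, 4.5, 4.5, 4.6, 4.8, 5.0, 5.0]


def skew_direction(values):
    """Classify skewness by comparing the mean with the median."""
    m = mean(values)
    med = median(values)
    if m > med:
        return "positively skewed"
    if m < med:
        return "negatively skewed"
    # Exact equality is rare with real measurements; treat it as symmetric.
    return "zero skewness"


for name, values in [("first set", long_right_tail), ("second set", long_left_tail)]:
    print(name, round(mean(values), 2), median(values), skew_direction(values))
```

In the first set the single large value 4.0 pulls the mean above the median; in the second set the small value 2.0 pulls it below.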
Scatter Diagram:
Positive correlation
Scatter Diagram:
Negative correlation
Scatter Diagram:
No correlation
Scatter Diagrams:
Patterns indicating that the variables are related
If related, we can describe the relationship
Weak & Positive correlation
Strong & Positive correlation
No correlation
Weak & Negative correlation
Strong & Negative correlation
Variables:
Independent variables: known
Dependent variables: to predict
Correlation & Cause and Effect?
The relationships found by regression are relationships of association,
not necessarily of cause and effect.
Chapter 12: Simple Regression and Correlation
scatter diagrams
dependent / independent variables
regression analysis
Least-squares estimating equation
the coefficient of determination
the coefficient of correlation
Least-squares estimating equation:
The dependent variable Y is determined by the independent variable X
Dependent variable: Y
Independent variable: X
Ŷ = a + bX
a = Ȳ - bX̄
where Ȳ and X̄ are the means of Y and X, a is the Y-intercept and b is the slope.
Least-squares estimating equation:
What is the relationship between the age of a truck and the annual repair expense?
Step 1: Ŷ = a + bX
Step 2: a = Ȳ - bX̄
Step 4: X̄ = 3, Ȳ = 6
Step 5: a = 6 - 0.75 * 3 = 3.75 (with slope b = 0.75)
Step 6: Ŷ = 3.75 + 0.75X
Step 7: 6.75 = 3.75 + 0.75 * 4
Step 8: If the city has a truck that is 4 years old, the director could use the equation to predict $675 annually in repairs (Ŷ = 6.75, with Y in hundreds of dollars).
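The truck steps can be replayed in a few lines of Python. This sketch takes the means (X̄ = 3, Ȳ = 6) and the slope b = 0.75 from the slides as given; it assumes Y is recorded in hundreds of dollars, which is how 6.75 becomes $675. The names `predict_repair_expense` and `truck_age` are hypothetical.

```python
# Values taken from the slides.
x_mean = 3.0    # mean truck age in years
y_mean = 6.0    # mean annual repair expense (assumed: hundreds of dollars)
b = 0.75        # slope, as used on the slide

# Intercept from a = Y-bar - b * X-bar.
a = y_mean - b * x_mean


def predict_repair_expense(truck_age):
    """Estimate annual repair expense (hundreds of dollars) from truck age."""
    return a + b * truck_age


estimate = predict_repair_expense(4)
print(f"a = {a}, estimate = {estimate} (about ${estimate * 100:.0f} per year)")
```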
Least-squares estimating equation:
Example:
To find the simple linear regression of Personal Income (X) and Auto Sales (Y)
If X = 64, what about Y?
Step 1: Count the number of values.
N = 5
Step 2: Find XY and X² for each observation. See the table below.
Step 3: Find ΣX, ΣY, ΣXY, ΣX².
ΣX = 311, mean X̄ = 62.2
ΣY = 18.6, mean Ȳ = 3.72
ΣXY = 1159.7
ΣX² = 19359
Step 4: Substitute into the slope formula.
Slope (b) = (ΣXY - N * X̄ * Ȳ) / (ΣX² - N * X̄ * X̄)
= (1159.7 - 5 * 62.2 * 3.72) / (19359 - 5 * 62.2 * 62.2)
≈ 0.19
Step 5: Substitute into the intercept formula.
Intercept (a) = Ȳ - bX̄ = 3.72 - 0.19 * 62.2 = -8.098
Step 6: Substitute these values into the regression equation.
Regression Equation: Ŷ = a + bX
Ŷ = -8.098 + 0.19X
Suppose we want to know the approximate Y value for X = 64. Substituting into the equation above:
Ŷ = -8.098 + 0.19(64)
= -8.098 + 12.16
≈ 4.06
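The six steps can be reproduced from the column sums alone. The following Python sketch uses only the totals quoted in the example (N, ΣX, ΣY, ΣXY, ΣX²); it rounds the slope to two decimals because the slides do so. The variable names are illustrative.

```python
# Totals quoted in the example.
n = 5
sum_x = 311        # sum of Personal Income (X)
sum_y = 18.6       # sum of Auto Sales (Y)
sum_xy = 1159.7    # sum of X * Y
sum_x2 = 19359     # sum of X squared

x_mean = sum_x / n     # about 62.2
y_mean = sum_y / n     # about 3.72

# Slope: (sum of XY - N * X-bar * Y-bar) / (sum of X^2 - N * X-bar^2).
b_unrounded = (sum_xy - n * x_mean * y_mean) / (sum_x2 - n * x_mean ** 2)
b = round(b_unrounded, 2)      # the slides continue with the rounded slope

# Intercept and prediction for X = 64.
a = y_mean - b * x_mean
y_hat = a + b * 64

print(f"b = {b}, a = {a:.3f}, predicted Y at X = 64: {y_hat:.2f}")
```

Carrying the unrounded slope through instead changes the intercept noticeably (it depends on b multiplied by 62.2), but the prediction at X = 64 still comes out at about 4.06.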
Least-squares estimating equation:
To measure the goodness of fit of a line, we minimize the sum of the squares of the errors.
SE
ei = residual i (the vertical distance between the observed Y and the estimated Ŷ for point i)
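To make the "sum of the squares of the errors" concrete, the sketch below fits a line to a small data set and compares its total squared error with that of another candidate line. The four data points are hypothetical, invented purely for illustration, and the function names are assumptions rather than anything from the lesson.

```python
def least_squares_line(xs, ys):
    """Return intercept a and slope b of the least-squares line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - n * x_mean * y_mean) / (sum_x2 - n * x_mean ** 2)
    a = y_mean - b * x_mean
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Add up the squared residuals e_i = Y_i - (a + b * X_i)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, not from the lesson.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

a, b = least_squares_line(xs, ys)
best = sum_squared_errors(xs, ys, a, b)

# Any other line, for example Y-hat = 0 + 2X, has a larger total.
other = sum_squared_errors(xs, ys, 0.0, 2.0)

print(f"least-squares line: a = {a:.2f}, b = {b:.2f}, SSE = {best:.2f}")
print(f"alternative line:   a = 0.00, b = 2.00, SSE = {other:.2f}")
```

Squaring the residuals stops positive and negative errors from cancelling out and penalizes large misses more heavily than small ones.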
Strong correlation
Weak correlation
Correlation Analysis:
describes the degree to which one variable is linearly related to another.
Coefficient of Determination: r²
Measures the extent, or strength, of the association that exists between two variables.
0 ≤ r² ≤ 1.
The larger r², the stronger the linear relationship.
Coefficient of Correlation: r
Square root of the coefficient of determination
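Both coefficients can be computed from the deviations around the means. The sketch below uses the standard deviation form r² = Sxy² / (Sxx * Syy), which may not be the exact formula shown in the lecture, and gives r the sign of the slope, a common convention. The data points are hypothetical and the function name is an assumption. It also assumes neither variable is constant, since that would make the denominator zero.

```python
import math


def correlation_coefficients(xs, ys):
    """Return (r_squared, r) for paired observations xs and ys."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_yy = sum((y - y_mean) ** 2 for y in ys)
    r_squared = s_xy ** 2 / (s_xx * s_yy)
    # r is the square root of r squared, carrying the sign of the slope.
    r = math.copysign(math.sqrt(r_squared), s_xy)
    return r_squared, r


# Hypothetical data, not from the lesson.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

r_squared, r = correlation_coefficients(xs, ys)
print(f"r squared = {r_squared:.4f}, r = {r:.4f}")

# Reversing the direction of Y flips the sign of r but leaves r squared unchanged.
r_squared_neg, r_neg = correlation_coefficients(xs, [-y for y in ys])
print(f"r squared = {r_squared_neg:.4f}, r = {r_neg:.4f}")
```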
Review
In the least-squares equation Ŷ = 10 + 20X, the value of 20 indicates
A. the Y intercept.
B. for each unit increase in X, Y increases by 20.
C. for each unit increase in Y, X increases by 20.
D. none of these.
Review
A sales manager for an advertising agency believes there is a relationship between the number of contacts and the amount of the sales. To verify this belief, the following data was collected: 
What is the Y-intercept of the linear equation?
A. -12.201
B. 2.1946
C. -2.1946
D. 12.201
What have we learnt?
scatter diagrams
dependent / independent variables
regression analysis
Least-squares estimating equation
the coefficient of determination
the coefficient of correlation
