# Lesson04

Statistics for International Business School, Hanze University of Applied Science, Groningen, The Netherlands

Published in: Technology

• Correlation and Cause: Just because two variables are correlated does not mean that one of them is the cause of the other. It could be the case, but it does not necessarily follow. There is a strong positive correlation between the number of cigarettes that one smokes a day and one's chances of contracting lung cancer (measured as the number of cases of lung cancer per hundred people who smoke a given number of cigarettes). The percentage of heavy smokers who contract lung cancer is higher than the percentage of light smokers who develop the disease, and both figures are higher than the percentage of non-smokers who get lung cancer. In this case, the cigarettes are definitely causing the cancer. There is a strong negative correlation between the total number of skiing holidays that people book for any month of the year and the total amount of ice cream that supermarkets sell for that month. This means that the more skiing holidays are booked, the less ice cream is sold. Is there a cause here? Are people spending so much money on ice cream that they can't afford skiing holidays? Is the fact that the ice cream is so cold putting people off skiing? Clearly not! The simple fact is that most people tend to book their skiing holidays in the winter, and they tend to buy ice cream in the summer. Although a correlation between two variables doesn't mean that one of them causes the other, it can suggest a way of finding out what the true cause might be. There may be some underlying variable that is causing both of them. For instance, if a survey found a correlation between the time that people spend watching television and the amount of crime that people commit, it could be because unemployed people tend to sit around watching television, and unemployed people are more likely to commit crime. If that were the case, then unemployment would be the true cause!
### Lesson04

1. IBS Statistics, Year 1. Dr. Ning DING (n.ding@pl.hanze.nl), I.007
2. What are we going to learn?
   - Review
   - Chapter 12: Simple Regression and Correlation
   - dependent / independent variables
   - scatter diagrams
   - regression analysis
   - least-squares estimating equation
   - the coefficient of determination
   - the coefficient of correlation
3. Review: Exercises. Find the interquartile range:
   1460, 1471, 1637, 1721, 1758, 1787, 1940, 2038, 2047, 2054, 2097, 2205, 2287, 2311, 2406
   Interquartile range = Q3 - Q1 = 2205 - 1721 = 484
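The interquartile-range exercise above can be checked with a few lines of Python. This is a minimal sketch, assuming the quartile positions follow the (n + 1) × p rule used in this lesson, which is what the standard library's `statistics.quantiles` does with its default `method="exclusive"`; the variable names are illustrative.

```python
import statistics

# The 15 ordered values from the exercise.
values = [1460, 1471, 1637, 1721, 1758, 1787, 1940, 2038,
          2047, 2054, 2097, 2205, 2287, 2311, 2406]

# n=4 asks for the three quartile cut points; the default "exclusive"
# method places them at positions (n + 1) * p in the ordered data.
q1, median, q3 = statistics.quantiles(values, n=4)

iqr = q3 - q1
print(q1, q3, iqr)
```

With 15 values the positions are 4 and 12, both whole numbers, so no interpolation is needed and the result matches the slide.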
4. Review: EXCEL Lesson. Exercises:
   - Q1: L = (8 + 1) × 25% = 2.25, Q1 = 133.5
   - Q3: L = (8 + 1) × 75% = 6.75, Q3 = 274.5
   - Interquartile range = 274.5 - 133.5 = 141
5. Review: median, quartile, decile, percentile.
   Data: 1, 2, 2, 4, 5, 7, 8, 9, 12
   Q1 = 2, Q3 = 8.5; the interquartile range is Q3 - Q1. The figure also marks the 1st and 9th deciles.
   Boxplot: how to interpret? http://cnx.org/content/m11192/latest/
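When the position L = (n + 1) × p is not a whole number, as in L = 2.25 above, the quartile lies between two ordered values. The sketch below assumes linear interpolation between the neighbouring values, a common textbook convention (the slides do not show which rule the lesson uses), and `value_at_position` is a hypothetical helper name.

```python
def value_at_position(ordered, p):
    """Return the value at 1-based position L = (n + 1) * p, interpolating linearly."""
    n = len(ordered)
    position = (n + 1) * p
    lower = int(position)          # whole part: 1-based index of the lower neighbour
    fraction = position - lower    # how far to move towards the upper neighbour
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    low_value = ordered[lower - 1]
    high_value = ordered[lower]
    return low_value + fraction * (high_value - low_value)

data = [1, 2, 2, 4, 5, 7, 8, 9, 12]   # the nine values from the review slide

q1 = value_at_position(data, 0.25)    # position 2.5, between the 2nd and 3rd values
q3 = value_at_position(data, 0.75)    # position 7.5, between the 7th and 8th values
print(q1, q3, q3 - q1)
```

For these nine values the sketch reproduces Q1 = 2 and Q3 = 8.5 from the slide.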
6. Review: Exercises. Boxplot (http://cnx.org/content/m11192/latest/) with lowest value € 20, Q1 = € 250, median = € 350, mean = € 450, Q3 = € 850 and highest value € 2000; the figure also marks points a and b.
   The distribution is skewed to __________ because the mean is __________ the median.
   Answer: the right; larger than.
7. Review: skewness (http://qudata.com/online/statcalc/)
   - Positively skewed (mean > median): 0.8, 1.0, 1.0, 1.2, 1.2, 1.3, 1.5, 1.7, 2.0, 2.0, 2.1, 2.2, 4.0
   - Negatively skewed (mean < median): 2.0, 3.2, 3.6, 3.7, 4.0, 4.2, 4.2, 4.5, 4.5, 4.6, 4.8, 5.0, 5.0
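The two data sets above can be used to check the mean-versus-median rule directly. This is a small sketch; comparing the mean with the median is a rule of thumb for the direction of skew rather than a formal skewness statistic, and `describe_skew` is a hypothetical helper name.

```python
import statistics

positively_skewed = [0.8, 1.0, 1.0, 1.2, 1.2, 1.3, 1.5, 1.7, 2.0, 2.0, 2.1, 2.2, 4.0]
negatively_skewed = [2.0, 3.2, 3.6, 3.7, 4.0, 4.2, 4.2, 4.5, 4.5, 4.6, 4.8, 5.0, 5.0]

def describe_skew(data):
    """Compare mean and median and return (mean, median, label)."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean > median:
        label = "positively skewed"
    elif mean < median:
        label = "negatively skewed"
    else:
        label = "no skew indicated"
    return mean, median, label

for name, data in [("first", positively_skewed), ("second", negatively_skewed)]:
    mean, median, label = describe_skew(data)
    print(name, round(mean, 2), median, label)
```

In the first set the single large value 4.0 pulls the mean above the median; in the second set the small value 2.0 pulls it below.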
8. Review: zero skewness. Mode = median = mean. This means that the data are symmetrically distributed.
9. Chapter 12
   - scatter diagrams
   - dependent / independent variables
   - regression analysis
   - least-squares estimating equation
   - the coefficient of determination
   - the coefficient of correlation
10. Regression and Correlation Analyses: how to determine both the nature and the strength of a relationship between variables.
11. Regression and Correlation Analyses. Scatter diagram: positive correlation.
12. Regression and Correlation Analyses. Scatter diagram: negative correlation.
13. Regression and Correlation Analyses. Scatter diagram: no correlation.
14. Regression and Correlation Analyses. Scatter diagrams:
   - Patterns indicating that the variables are related
   - If related, we can describe the relationship
   Figure panels: weak & positive correlation; strong & positive correlation; no correlation; weak & negative correlation; strong & negative correlation.
15. Regression and Correlation Analyses. Variables:
   - Independent variables: known
   - Dependent variables: to predict
   Figure: dependent variable plotted against independent variable.
16. Regression and Correlation Analyses. Correlation and cause and effect? The relationships found by regression are relationships of association, not necessarily of cause and effect.
17. Least-squares estimating equation: the dependent variable Y is determined by the independent variable X.
   Figure: dependent variable (Y) against independent variable (X), with the fitted line Ŷ = a + bX.
18. Least-squares estimating equation: Ŷ = a + bX
19. Least-squares estimating equation: the fitted line passes through the point of means, Ȳ = a + bX̄, so a = Ȳ - bX̄.
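For reference, the usual least-squares formulas for the slope and intercept are written out below. The slope formula is not in the slide text; it is stated here as the standard result, in the same form as the numerical substitution used in the Personal Income and Auto Sales example of this lesson, where n is the number of (X, Y) pairs.

```latex
b = \frac{\sum XY - n\,\bar{X}\,\bar{Y}}{\sum X^{2} - n\,\bar{X}^{2}},
\qquad
a = \bar{Y} - b\,\bar{X},
\qquad
\hat{Y} = a + bX
```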
20. Least-squares estimating equation: the relationship between the age of a truck and the annual repair expense?
   - Step 1: Ȳ = a + bX̄
   - Step 2: a = Ȳ - bX̄
   - Step 4: X̄ = 3, Ȳ = 6
   - Step 5: a = 6 - 0.75 × 3 = 3.75
   - Step 6: Ŷ = 3.75 + 0.75X
   - Step 7: 6.75 = 3.75 + 0.75 × 4
   - Step 8: If the city has a truck that is 4 years old, the director could use the equation to predict \$675 annually in repairs (Ŷ = 6.75, with Y in hundreds of dollars).
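The truck example can be replayed in code. This is a minimal sketch using only the numbers on the slide (means of 3 and 6, slope 0.75); the unit "hundreds of dollars" is an assumption inferred from the step from 6.75 to \$675, and `predict_repairs` is a hypothetical name.

```python
# Values from the truck example: mean age, mean repair expense, and slope.
x_bar = 3.0     # mean truck age in years
y_bar = 6.0     # mean annual repair expense, in hundreds of dollars (assumed unit)
b = 0.75        # slope used on the slide

# The fitted line passes through the point of means, which fixes the intercept.
a = y_bar - b * x_bar

def predict_repairs(age_years):
    """Estimated annual repair expense (hundreds of dollars) for a truck of this age."""
    return a + b * age_years

estimate = predict_repairs(4)
print(a, estimate, estimate * 100)
```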
21. Least-squares estimating equation. Example: to find the simple linear regression of Personal Income (X) and Auto Sales (Y). If X = 64, what about Y?
   - Step 1: Count the number of values. N = 5
   - Step 2: Find XY and X² (see the table below).
22. Least-squares estimating equation:
   - Step 3: Find ΣX, ΣY, ΣXY, ΣX². ΣX = 311 (mean = 62.2), ΣY = 18.6 (mean = 3.72), ΣXY = 1159.7, ΣX² = 19359
   - Step 4: Substitute in the slope formula: Slope (b) = (1159.7 - 5 × 62.2 × 3.72) / (19359 - 5 × 62.2 × 62.2) = 0.19
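The slope calculation above can be reproduced from the four sums. This is a sketch under the assumption that the slope formula is the standard one matching the substitution on the slide, b = (ΣXY - n·X̄·Ȳ) / (ΣX² - n·X̄²); the variable names are illustrative.

```python
# Sums taken from the slide (N = 5 pairs of Personal Income X and Auto Sales Y).
n = 5
sum_x = 311
sum_y = 18.6
sum_xy = 1159.7
sum_x2 = 19359

x_bar = sum_x / n    # mean of X
y_bar = sum_y / n    # mean of Y

# Slope: numerator and denominator as substituted on the slide.
b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
print(round(x_bar, 2), round(y_bar, 2), round(b, 2))
```

The unrounded slope is slightly below 0.19; the slide rounds it to two decimals before using it in the next step.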
23. Least-squares estimating equation: Slope (b) = 0.19
   - Step 5: Substitute in the intercept formula: Intercept (a) = Ȳ - bX̄ = 3.72 - 0.19 × 62.2 = -8.098
   - Step 6: Substitute these values in the regression equation Ŷ = a + bX, which gives Ŷ = -8.098 + 0.19X
   Suppose we want to know the approximate Y value for X = 64. Then we can substitute the value in the regression equation:
   Ŷ = a + bX = -8.098 + 0.19(64) = -8.098 + 12.16 = 4.06
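The intercept and the prediction for X = 64 follow in the same way. This sketch deliberately uses the slope rounded to 0.19, as the slide does, so that the intermediate values match; `predict_auto_sales` is a hypothetical name.

```python
x_bar = 62.2              # mean Personal Income (X)
y_bar = 3.72              # mean Auto Sales (Y)
b_rounded = 0.19          # slope as rounded on the slide

# Intercept from the point of means.
a = y_bar - b_rounded * x_bar

def predict_auto_sales(personal_income):
    """Estimated Auto Sales for a given Personal Income, using the slide's rounded line."""
    return a + b_rounded * personal_income

y_hat = predict_auto_sales(64)
print(round(a, 3), round(y_hat, 2))
```

Because the intercept is computed from the rounded slope, it is sensitive to that rounding: with the unrounded slope the intercept comes out near -7.96, while the prediction at X = 64 still rounds to 4.06.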
24. Least-squares estimating equation: minimize the sum of the squares of the errors, which measures the goodness of fit of a line.
   Figure panels: strong correlation and weak correlation, each labelled SE; eᵢ = residualᵢ.
25. Least-squares estimating equation: minimize the sum of the squares of the errors, which measures the goodness of fit of a line. eᵢ = residualᵢ.
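The idea of minimizing the sum of squared residuals can be seen on a tiny example. The data below are hypothetical, chosen only for illustration and not taken from the lesson; the slope formula is the standard least-squares one, and the function names are illustrative.

```python
# Hypothetical data, chosen only for illustration (not from the lesson).
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

def sum_squared_errors(a, b):
    """Sum of squared residuals e_i = y_i - (a + b * x_i) for the line a + bX."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def least_squares_line():
    """Return (a, b) from the standard least-squares formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    numerator = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    denominator = sum(x * x for x in xs) - n * x_bar ** 2
    b = numerator / denominator
    a = y_bar - b * x_bar
    return a, b

a, b = least_squares_line()
best = sum_squared_errors(a, b)

# Nudging the intercept or slope away from the fitted values never lowers the total.
others = [sum_squared_errors(a + da, b + db)
          for da in (-0.5, 0.0, 0.5) for db in (-0.2, 0.0, 0.2)]
print(round(b, 3), round(best, 3), all(best <= other for other in others))
```

A small total means the points sit close to the line (the "strong correlation" panel); a large total means they are widely scattered around it.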
26. Correlation Analysis: describes the degree to which one variable is linearly related to another.
   - Coefficient of determination (r²): measures the extent, or strength, of the association that exists between two variables.
   - Coefficient of correlation (r): square root of the coefficient of determination.
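Both coefficients can be computed from the residuals of the fitted line. The sketch below uses hypothetical data (not from the lesson) and assumes the common definition r² = 1 - (unexplained variation / total variation), which the slides do not spell out; r is then the square root of r², given the sign of the slope.

```python
import math

# Hypothetical data, chosen only for illustration (not from the lesson).
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)   # total variation of Y about its mean

# Least-squares line through the point of means.
b = s_xy / s_xx
a = y_bar - b * x_bar

# Variation left unexplained by the line: the sum of squared residuals.
unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
r_squared = 1 - unexplained / s_yy

# r is the square root of r squared, carrying the sign of the slope b.
r = math.copysign(math.sqrt(r_squared), b)
print(round(r_squared, 4), round(r, 4))
```

Taking the sign from the slope matters because a plain square root is never negative, while a downward-sloping relationship has a negative r.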
27. Coefficient of determination (r²): measures the extent, or strength, of the association that exists between two variables.
   - 0 ≤ r² ≤ 1.
   - The larger r², the stronger the linear relationship.
   - The closer r² is to 1, the more confident we are in our prediction.
28. Coefficient of determination (r²)
29. Coefficient of correlation (r): square root of the coefficient of determination
30. Review
   - Which value of r indicates a stronger correlation than 0.40? A. -0.30 B. -0.50 C. +0.38 D. 0
   - If all the points on a scatter diagram lie on a straight line, what is the standard error of estimate? A. -1 B. +1 C. 0 D. Infinity
31. Review
   - In the least-squares equation Ŷ = 10 + 20X, the value of 20 indicates: A. the Y intercept. B. for each unit increase in X, Y increases by 20. C. for each unit increase in Y, X increases by 20. D. none of these.
32. Review
   - A sales manager for an advertising agency believes there is a relationship between the number of contacts and the amount of the sales. To verify this belief, the following data were collected:
   - What is the Y-intercept of the linear equation? A. -12.201 B. 2.1946 C. -2.1946 D. 12.201
33. What have we learnt?
   - scatter diagrams
   - dependent / independent variables
   - regression analysis
   - least-squares estimating equation
   - the coefficient of determination
   - the coefficient of correlation