2. Descriptive statistics are used to describe the main
features of a collection of data in quantitative terms.
Inferential statistics comprises the use of statistics
and random sampling to draw conclusions
concerning some unknown aspect of a population.
[Diagram: a random sample is drawn from the population; the sample mean is calculated and used to estimate the population mean.]
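The idea of estimating a population mean from a random sample can be sketched in a few lines of Python. The population here is simulated and purely hypothetical (10,000 values around 100), and the sample size of 200 is an arbitrary choice for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated values centred near 100.
population = [random.gauss(100, 15) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw a simple random sample without replacement and use its mean
# as an estimate of the (normally unknown) population mean.
sample = random.sample(population, 200)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

In practice the population mean is unknown; it is computed here only to show how close the sample estimate lands.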
3. Measures of central tendency
(Mean, Median, Mode)
Measures of dispersion
(variance, standard deviation)
Measures of shape
(skewness)
4. Mean
Arithmetic Mean
Geometric Mean
Median
Mode
Quartiles
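As a rough illustration, the measures listed on this slide can be computed with Python's standard `statistics` module (Python 3.8 or later). The data set is made up for the example, and quartile values depend on the interpolation convention; the module's default ("exclusive") method is assumed here.

```python
import statistics

data = [1, 2, 4, 4, 8]  # hypothetical data set

arithmetic_mean = statistics.mean(data)
geometric_mean = statistics.geometric_mean(data)  # n-th root of the product
median = statistics.median(data)                  # middle value of the sorted data
mode = statistics.mode(data)                      # most frequent value
q1, q2, q3 = statistics.quantiles(data, n=4)      # three cut points = quartiles

print(arithmetic_mean, geometric_mean, median, mode)
print(q1, q2, q3)
```

The second quartile coincides with the median, which is a quick sanity check on the output.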
5. Range: the difference between the largest value of a data
set and the smallest value
Interquartile range: the range of values between the
first and the third quartile
Mean absolute deviation: $MAD = \frac{\sum |x - \bar{x}|}{n}$
Variance (for sample variance): $S^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$
Standard deviation: $S = \sqrt{S^2}$
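The dispersion measures above can be worked through on a small data set. This is a minimal sketch with made-up data; the variance uses the computational (shortcut) form for the sample variance, and the quartiles use the default convention of `statistics.quantiles`, which is only one of several in common use.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set
n = len(data)
mean = sum(data) / n

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Interquartile range: third quartile minus first quartile.
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Mean absolute deviation: average distance from the mean.
mad = sum(abs(x - mean) for x in data) / n

# Sample variance, shortcut form: (sum X^2 - (sum X)^2 / n) / (n - 1).
s2 = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

# Standard deviation: square root of the variance.
s = math.sqrt(s2)

print(data_range, iqr, mad, round(s2, 3), round(s, 3))
```

The shortcut form gives the same result as `statistics.variance(data)`, which computes the sample variance from deviations about the mean.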
6. Interpretation of Standard Deviation
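Eg. µ = 100, σ = 15
• µ ± 1σ = 85 to 115 (about 68% of values)
• µ ± 2σ = 70 to 130 (about 95% of values)
• µ ± 3σ = 55 to 145 (about 99.7% of values)
[Figure: frequency curve with the 68%, 95%, and 99.7% bands marked; axis labels: Frequency, Value Changes.]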
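The 68% / 95% / 99.7% figures can be checked numerically. The sketch below assumes the data follow a normal distribution (the slide does not state this explicitly, but the percentages match the normal case) and uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

mu, sigma = 100, 15
dist = NormalDist(mu=mu, sigma=sigma)  # assumption: normally distributed values

coverage = {}
for k in (1, 2, 3):
    low, high = mu - k * sigma, mu + k * sigma
    # Share of values falling within k standard deviations of the mean.
    coverage[k] = dist.cdf(high) - dist.cdf(low)
    print(f"within {k} sigma ({low} to {high}): {coverage[k]:.1%}")
```

For distributions that are far from normal, these percentages do not hold.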
7. Skewness
is a measure of the asymmetry of the probability
distribution of a real-valued random variable
[Figure: two distribution curves, one negatively skewed (skewed to the left) and one positively skewed (skewed to the right), each marking the mean, median, and mode.]
Sk = 3 (mean − median) / standard deviation
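The coefficient above is straightforward to compute. In this sketch the function name `pearson_skewness` and the data sets are hypothetical, and the sample standard deviation is assumed in the denominator.

```python
import statistics

def pearson_skewness(data):
    """Sk = 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    sd = statistics.stdev(data)  # assumption: sample standard deviation
    return 3 * (mean - median) / sd

right_tail = [1, 2, 2, 3, 3, 3, 4, 10]  # long tail to the right
left_tail = [1, 7, 8, 8, 9, 9, 9, 10]   # long tail to the left
symmetric = [1, 2, 3, 4, 5]

for name, values in [("right", right_tail), ("left", left_tail), ("symmetric", symmetric)]:
    print(name, round(pearson_skewness(values), 3))
```

A positive result indicates a positively skewed (right-skewed) data set, a negative result a negatively skewed one, and zero means the mean and median coincide.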
8. Class Interval   Frequency   Mid Point   Relative Frequency   Cumulative Frequency
   20 ≤ x < 30          6          25             .12                    6
   30 ≤ x < 40         18          35             .36                   24
   40 ≤ x < 50         11          45             .22                   35
   50 ≤ x < 60         11          55             .22                   46
   60 ≤ x < 70          3          65             .06                   49
   70 ≤ x < 80          1          75             .02                   50
   Totals              50                        1.00
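The derived columns of the table (mid point, relative frequency, cumulative frequency) follow directly from the class intervals and frequencies. A minimal sketch, using the frequencies from this slide; the variable names are illustrative only.

```python
from itertools import accumulate

lower_bounds = [20, 30, 40, 50, 60, 70]
width = 10
frequencies = [6, 18, 11, 11, 3, 1]

total = sum(frequencies)
midpoints = [low + width / 2 for low in lower_bounds]  # centre of each class
relative = [f / total for f in frequencies]            # share of all observations
cumulative = list(accumulate(frequencies))             # running total of frequencies

for low, f, mid, rel, cum in zip(lower_bounds, frequencies, midpoints, relative, cumulative):
    print(f"{low} <= x < {low + width}  {f:>3}  {mid:>4.0f}  {rel:.2f}  {cum:>3}")
print(f"Totals  {total}  {sum(relative):.2f}")
```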
11. Method of assigning probabilities:
Classical (a priori) probability
Relative frequency of occurrence
Subjective probability
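The first two methods can be contrasted with a small example: the probability of rolling a six with a fair die. The die, the seed, and the number of simulated rolls are assumptions made for illustration; subjective probability is a matter of judgment and is not something code can compute.

```python
import random
from fractions import Fraction

random.seed(1)  # fixed seed so the sketch is repeatable

# Classical (a priori): favourable outcomes / possible outcomes, known in advance.
classical = Fraction(1, 6)

# Relative frequency of occurrence: observed occurrences / number of trials.
rolls = [random.randint(1, 6) for _ in range(60_000)]
relative = rolls.count(6) / len(rolls)

print(f"classical:          {float(classical):.4f}")
print(f"relative frequency: {relative:.4f}")
```

With many trials the relative frequency tends to settle close to the classical value.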
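12. General law of addition
$P(X \cup Y) = P(X) + P(Y) - P(X \cap Y)$
Special law of addition (X and Y mutually exclusive)
$P(X \cup Y) = P(X) + P(Y)$
General law of multiplication
$P(X \cap Y) = P(X) \cdot P(Y \mid X) = P(Y) \cdot P(X \mid Y)$
Special law of multiplication (X and Y independent)
$P(X \cap Y) = P(X) \cdot P(Y)$
Law of conditional probability
$P(X \mid Y) = \frac{P(X \cap Y)}{P(Y)} = \frac{P(X) \cdot P(Y \mid X)}{P(Y)}$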
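These laws can be verified on a small sample space. The sketch below uses one roll of a fair die with two hypothetical events, X = "even number" and Y = "greater than 3", and exact fractions to avoid rounding.

```python
from fractions import Fraction

outcomes = set(range(1, 7))  # one roll of a fair die
X = {2, 4, 6}                # event: even number
Y = {4, 5, 6}                # event: greater than 3

def p(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

# General law of addition.
p_union = p(X | Y)
addition_rhs = p(X) + p(Y) - p(X & Y)

# Law of conditional probability.
p_x_given_y = p(X & Y) / p(Y)
p_y_given_x = p(X & Y) / p(X)

# General law of multiplication.
multiplication_rhs = p(X) * p_y_given_x

print(p_union, addition_rhs)       # both sides of the addition law
print(p(X & Y), multiplication_rhs)  # both sides of the multiplication law
print(p_x_given_y)
```

X and Y here are neither mutually exclusive nor independent, so the special laws do not apply to this example; only the general forms do.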
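14. Find the equation of the regression line
$\hat{Y} = b_0 + b_1 X$
where the sample Y intercept is $b_0 = \bar{Y} - b_1 \bar{X}$
and the sample slope is
$b_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}}$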
15. Hospital   Number of beds (X)   Full Time Employees (Y)
     1               23                    69
     2               29                    95
     3               29                   102
     4               35                   118
     5               42                   126
     6               46                   125
     7               50                   138
     8               54                   178
     9               64                   156
    10               66                   184
    11               76                   176
    12               78                   225
Regression line: $\hat{Y} = b_0 + b_1 X$
$\hat{Y} = 30.9125 + 2.232 X$
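The fitted line for the hospital data can be reproduced from the sums of squares, with slope SSxy / SSxx and intercept equal to mean(Y) minus slope times mean(X). This is a sketch of that calculation; the variable names are illustrative.

```python
beds = [23, 29, 29, 35, 42, 46, 50, 54, 64, 66, 76, 78]                  # X
employees = [69, 95, 102, 118, 126, 125, 138, 178, 156, 184, 176, 225]  # Y
n = len(beds)

sum_x = sum(beds)
sum_y = sum(employees)
sum_xy = sum(x * y for x, y in zip(beds, employees))
sum_x2 = sum(x * x for x in beds)

# SSxy = sum(XY) - (sum X)(sum Y) / n ;  SSxx = sum(X^2) - (sum X)^2 / n
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n

b1 = ss_xy / ss_xx                   # slope
b0 = sum_y / n - b1 * sum_x / n      # intercept: mean(Y) - b1 * mean(X)

print(f"Y-hat = {b0:.4f} + {b1:.3f} X")
```

The coefficients should round to the values shown on the slide (30.9125 and 2.232).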
16. Coefficient of determination (R²)
A measure of how well the regression line
approximates the real data points:
the proportion of variability of the dependent
variable (Y) explained by the independent variable (X)
R² = 0 ---> no regression prediction of Y by X
R² = 1 ---> perfect regression prediction of Y by X
(100% of the variability of Y is accounted for by X)
17. r² = Explained Variation / Total Variation
Total Variation = Explained Variation + Unexplained Variation
(the total variation of the dependent variable, Y, is measured by the sum of squares of Y, $SS_{yy}$)
Explained Variation = sum of squares of regression (SSR)
$SSR = \sum_i (\hat{Y}_i - \bar{Y})^2$
Unexplained Variation = sum of squares of error (SSE)
$SSE = \sum_i (Y_i - \hat{Y}_i)^2$
18. r² = Explained Variation / Total Variation
$r^2 = 1 - \frac{SSE}{SS_{yy}} = 1 - \frac{\sum_i (Y_i - \hat{Y}_i)^2}{\sum Y^2 - \frac{(\sum Y)^2}{n}}$
19. Hospital   Number of beds (X)   Full Time Employees (Y)
     1               23                    69
     2               29                    95
     3               29                   102
     4               35                   118
     5               42                   126
     6               46                   125
     7               50                   138
     8               54                   178
     9               64                   156
    10               66                   184
    11               76                   176
    12               78                   225
Regression line: $\hat{Y} = b_0 + b_1 X$
$\hat{Y} = 30.9125 + 2.232 X$
SSE = 2448.6
r² = 0.886
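The SSE and r² for the hospital data can be recomputed from the residuals of the fitted line. This is a sketch, not the author's own calculation: the line is refitted from the data rather than taken from the rounded coefficients, so the computed SSE may differ from the slide's figure in the last digits, presumably because of rounding in the original.

```python
beds = [23, 29, 29, 35, 42, 46, 50, 54, 64, 66, 76, 78]                  # X
employees = [69, 95, 102, 118, 126, 125, 138, 178, 156, 184, 176, 225]  # Y
n = len(beds)

mean_x = sum(beds) / n
mean_y = sum(employees) / n

# Least-squares fit: slope = SSxy / SSxx, intercept = mean(Y) - slope * mean(X).
ss_xy = sum(x * y for x, y in zip(beds, employees)) - sum(beds) * sum(employees) / n
ss_xx = sum(x * x for x in beds) - sum(beds) ** 2 / n
b1 = ss_xy / ss_xx
b0 = mean_y - b1 * mean_x

# Total variation: SSyy = sum(Y^2) - (sum Y)^2 / n.
ss_yy = sum(y * y for y in employees) - sum(employees) ** 2 / n

# Unexplained variation: SSE = sum of squared residuals (Y - Y-hat)^2.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(beds, employees))

# Explained variation and coefficient of determination.
ssr = ss_yy - sse
r2 = 1 - sse / ss_yy

print(f"SSyy = {ss_yy:.1f}, SSR = {ssr:.1f}, SSE = {sse:.1f}, r^2 = {r2:.3f}")
```

An r² of roughly 0.886 means that about 88.6% of the variability in full-time employees is accounted for by the number of beds.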
20. Diane Christina | 2009
diane.christina@apb-group.com | me@dianechristina.com
http://dianechristina.wordpress.com