5/3/2023 Summary Statistics 1
Summary Statistics
Last week we used stemplots and histograms to
describe the shape, location, and spread of a
distribution. This week we use numerical summaries of
location and spread.
5/3/2023 Summary Statistics 2
Main Summary Statistics by Type
Central location
• Mean
• Median
• Mode
Spread
• Variance and standard deviation
• Quartiles and interquartile range (IQR)
Shape
• Statistical measures of shape (e.g., skewness and kurtosis) are available but are seldom used in practice (not covered)
5/3/2023 Summary Statistics 3
Notation
n ≡ sample size
X ≡ variable
x_i ≡ value of individual i
Σ ≡ sum all values (capital sigma)
Illustrative example (sample.sav), data:
21 42 5 11 30 50 28 27 24 52
• n = 10
• X = age
• x_1 = 21, x_2 = 42, …, x_10 = 52
• Σx = 21 + 42 + … + 52 = 290
5/3/2023 Summary Statistics 4
Sample Mean
$$\bar{x} = \frac{\sum x_i}{n} = \frac{1}{n}\sum x_i$$
Illustrative example: n = 10 (data and intermediate calculations on prior slide)
$$\bar{x} = \frac{1}{n}\sum x_i = \frac{1}{10}(290) = 29.0$$
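The sample-mean calculation for the age data can be checked with a few lines of Python. This is only a sketch; the variable names (`ages`, `total`, `xbar`) are chosen here for illustration and do not come from the slides.

```python
# Ages from the illustrative example (sample.sav)
ages = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]

n = len(ages)        # sample size
total = sum(ages)    # sum of all values
xbar = total / n     # sample mean = (1/n) * sum of x_i

print(f"n = {n}, sum = {total}, mean = {xbar:.1f}")
```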
5/3/2023 Summary Statistics 5
Population Mean
Same operation as sample mean, but
based on entire population (N =
population size)
Not available in practice, but important conceptually
$$\mu = \frac{\sum x_i}{N} = \frac{1}{N}\sum x_i$$
5/3/2023 Summary Statistics 6
Interpretation of xbar
The sample mean is used to predict:
• an observation drawn at random from the sample
• an observation drawn at random from the population
• the population mean
Gravitational center (balance point)
[Figure: number line from 0 to 60 with the balance point at mean = 29]
5/3/2023 Summary Statistics 7
Median – a different kind of average
“Middle value”
Covered last week
• Order the data
• Depth of median is (n+1) / 2
• When n is odd ⇒ middle value
• When n is even ⇒ average the two middle values
Illustrative example, n = 10 ⇒ median has depth (10+1) / 2 = 5.5
05 11 21 24 27 28 30 42 50 52
              ↑
median = average of 27 and 28 = 27.5
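The depth rule can be written as a small function. The sketch below is one possible way to code it; the function name `median_by_depth` is hypothetical, and Python's built-in `statistics.median` gives the same result.

```python
def median_by_depth(values):
    """Median via the depth rule (n + 1) / 2 applied to the ordered data."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    depth = (n + 1) / 2              # 1-based position of the median
    if n % 2 == 1:
        # n odd: depth is a whole number, so take the middle value
        return ordered[int(depth) - 1]
    # n even: depth ends in .5, so average the two values around it
    lower = ordered[int(depth) - 1]
    upper = ordered[int(depth)]
    return (lower + upper) / 2


ages = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]
print(median_by_depth(ages))  # depth 5.5, average of 27 and 28
```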
5/3/2023 Summary Statistics 8
Median is “robust”
Robust ≡ resistant to skews and outliers
This data set has a mean (xbar) of 1600:
1362 1439 1460 1614 1666 1792 1867
This data set has an outlier (9867) and a mean of 2743:
1362 1439 1460 1614 1666 1792 9867
The median is 1614 in both instances.
The median was not influenced by the outlier.
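A short check of this comparison, using Python's standard `statistics` module. The list names are illustrative, not from the slides.

```python
from statistics import mean, median

original = [1362, 1439, 1460, 1614, 1666, 1792, 1867]
with_outlier = [1362, 1439, 1460, 1614, 1666, 1792, 9867]

mean_original = mean(original)
mean_outlier = mean(with_outlier)
median_original = median(original)
median_outlier = median(with_outlier)

# The mean shifts by more than 1000; the median does not move at all.
print(f"mean:   {mean_original:.0f} -> {mean_outlier:.0f}")
print(f"median: {median_original} -> {median_outlier}")
```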
5/3/2023 Summary Statistics 9
Mode
Mode ≡ value with greatest frequency
e.g., {4, 7, 7, 7, 8, 8, 9} has mode = 7
Used only in very large data sets
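Finding the mode amounts to counting frequencies. The sketch below uses `collections.Counter`; returning every value tied for the top count is a choice made here, since the slides do not say how ties should be handled.

```python
from collections import Counter

data = [4, 7, 7, 7, 8, 8, 9]

counts = Counter(data)              # value -> frequency
top = max(counts.values())          # greatest frequency
modes = [value for value, count in counts.items() if count == top]

print(modes, "occurs", top, "times")
```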
5/3/2023 Summary Statistics 10
Mean, Median, Mode
(A) Symmetrical data: mean = median
(B) Positive skew: mean > median [mean gets “pulled” by tail]
(C) Negative skew: mean < median
[Figure: three distribution curves marking the positions of the mean, median, and mode: (A) symmetrical, (B) positive skew, (C) negative skew]
5/3/2023 Summary Statistics 11
Spread = Variability
Variability ≡ the amount by which values spread above and below the average
Measures of spread
• Range and interquartile range
• Standard deviation and variance (this week)
5/3/2023 Summary Statistics 12
Range = max – min
The range is rarely used in practice because it
tends to underestimate the population range
and is not robust
5/3/2023 Summary Statistics 13
Standard deviation
Deviation = $x_i - \bar{x}$
Sum of squared deviations = $SS = \sum (x_i - \bar{x})^2$
Sample variance = $s^2 = \frac{SS}{n-1}$
Sample standard deviation = $s = \sqrt{s^2}$
Most common descriptive measure of spread
5/3/2023 Summary Statistics 14
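Standard deviation (formula)
$$s = \sqrt{\frac{1}{n-1}\sum (x_i - \bar{x})^2}$$
The sample variance s² is an unbiased estimator of the population variance σ²; its square root, the sample standard deviation s, is the usual estimator of the population standard deviation σ.
The population standard deviation σ is rarely known in practice.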
5/3/2023 Summary Statistics 15
New data set (“Metabolic Rates”)
This example is not in your lecture notes
Metabolic rates (cal/day), n = 7
1792 1666 1362 1614 1460 1867 1439
$$\bar{x} = \frac{1792 + 1666 + 1362 + 1614 + 1460 + 1867 + 1439}{7} = \frac{11{,}200}{7} = 1600$$
5/3/2023 Summary Statistics 16
Metabolic rates showing mean (*) and
deviations of first two observations
5/3/2023 Summary Statistics 17
Standard Deviation Calculation
metabolic.sav – introduced slide 15
Observations (x_i)   Deviations (x_i − x̄)     Squared deviations (x_i − x̄)²
1792                 1792 − 1600 = 192        (192)² = 36,864
1666                 1666 − 1600 = 66         (66)² = 4,356
1362                 1362 − 1600 = −238       (−238)² = 56,644
1614                 1614 − 1600 = 14         (14)² = 196
1460                 1460 − 1600 = −140       (−140)² = 19,600
1867                 1867 − 1600 = 267        (267)² = 71,289
1439                 1439 − 1600 = −161       (−161)² = 25,921
SUMS →               0*                       SS = 214,870
* Sum of deviations will always equal zero
5/3/2023 Summary Statistics 18
Standard Deviation Metabolic data (cont.)
Variance (s²)
$$s^2 = \frac{SS}{n-1} = \frac{214{,}870}{7-1} = 35{,}811.67 \text{ calories}^2$$
Standard deviation (s)
$$s = \sqrt{s^2} = \sqrt{35{,}811.67} = 189.24 \text{ calories}$$
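The hand calculation for the metabolic data (deviations, sum of squares, variance, standard deviation) can be reproduced step by step. This is a sketch with illustrative variable names; `statistics.variance` and `statistics.stdev` in the Python standard library use the same n − 1 divisor.

```python
import math

rates = [1792, 1666, 1362, 1614, 1460, 1867, 1439]  # cal/day, n = 7

n = len(rates)
xbar = sum(rates) / n                        # sample mean
deviations = [x - xbar for x in rates]       # x_i - xbar
ss = sum(d ** 2 for d in deviations)         # sum of squared deviations
variance = ss / (n - 1)                      # sample variance s^2
sd = math.sqrt(variance)                     # sample standard deviation s

print(f"mean = {xbar:.0f}")
print(f"sum of deviations = {sum(deviations):.0f}")
print(f"SS = {ss:,.0f}")
print(f"s^2 = {variance:,.2f}")
print(f"s = {sd:.2f}")
```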
5/3/2023 Summary Statistics 19
General rule for rounding means
and standard deviations
Report the mean to one more decimal place than the data
To achieve accuracy, intermediate calculations should carry at least one further decimal place
Illustrative example
• Suppose data are recorded with one-decimal accuracy (i.e., xx.x)
• Report the mean with two-decimal accuracy (i.e., xx.xx)
• Carry all intermediate calculations with at least three-decimal accuracy (i.e., xx.xxx)
Even more important: Always use common sense and judgment.
5/3/2023 Summary Statistics 20
TI-30XIIS – about $12
In practice, we often use software
or a calculator to check our
standard deviation
5/3/2023 Summary Statistics 21
Interpretation of Standard Deviation
Larger standard deviation ⇒ greater variability
• s_1 = 15 and s_2 = 10 ⇒ group 1 has more variability
68-95-99.7 rule – Normal data only
• 68% of data within 1 SD of mean, 95% within 2 SD of mean, and 99.7% within 3 SD of mean
• e.g., if mean = 30 and SD = 10, then 95% of individuals are in the range 30 ± (2)(10) = 30 ± 20 = (10 to 50)
Chebyshev’s rule – All data
• at least 75% of data within 2 SD of mean
• e.g., if mean = 30 and SD = 10, then at least 75% of individuals are in the range 30 ± (2)(10) = (10 to 50)
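The "at least 75% within 2 SD" figure is the k = 2 case of Chebyshev's general bound, 1 − 1/k², which the slides do not state explicitly. The sketch below computes that bound and the mean ± k·SD interval, then checks the age data from the earlier slides against it. All function names are hypothetical.

```python
from statistics import mean, stdev


def chebyshev_lower_bound(k):
    """Minimum fraction of any data set within k SDs of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


def k_sd_interval(center, sd, k):
    """Interval center +/- k * sd."""
    return (center - k * sd, center + k * sd)


def fraction_within(values, k):
    """Observed fraction of values within k sample SDs of the sample mean."""
    low, high = k_sd_interval(mean(values), stdev(values), k)
    return sum(low <= x <= high for x in values) / len(values)


ages = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]

print(k_sd_interval(30, 10, 2))    # the slide's example: (10, 50)
print(chebyshev_lower_bound(2))    # 0.75
print(fraction_within(ages, 2))    # observed fraction for the age data
```

For the age data the observed fraction is well above the guaranteed minimum, as expected: Chebyshev's rule is a floor that holds for any distribution, not an estimate.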
5/3/2023 Summary Statistics 22
Quartiles and IQR
Quartiles divide the ordered data into
four equally-sized groups
Q0 = minimum
Q1 = 25th %ile
Q2 = 50th %ile (Median)
Q3 = 75th %ile
Q4 = maximum
5/3/2023 Summary Statistics 23
Rule for quartiles
Find the median ⇒ Q2
Middle of lower half of the data set ⇒ Q1
Middle of upper half of the data set ⇒ Q3
Bottom half    | Top half
05 11 21 24 27 | 28 30 42 50 52
      ↑        ↑       ↑
      Q1       Q2      Q3
IQR = Q3 – Q1 = 42 – 21 = 21
gives spread of middle 50% of the data
5/3/2023 Summary Statistics 24
5-Point Summary (sample.sav)
Q0 = 5 (minimum)
Q1 = 21 (lower hinge)
Q2 = 27.5 (median)
Q3 = 42 (upper hinge)
Q4 = 52 (maximum)
Best descriptive statistics for skewed data
5/3/2023 Summary Statistics 25
Illustrative example (metabolic.sav)
1362 1439 1460 1614 1666 1792 1867
Median = 1614 (the middle value; with n odd it is included in both halves)
Bottom half: 1362 1439 1460 1614
Q1 = (1439 + 1460) / 2 = 1449.5
Top half: 1614 1666 1792 1867
Q3 = (1666 + 1792) / 2 = 1729
5-point summary: 1362, 1449.5, 1614, 1729, 1867
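The quartile rule used in these slides (median of each half, with the median included in both halves when n is odd) can be sketched as follows. The function name `five_point_summary` is hypothetical. Statistical software often uses other quartile conventions, so library routines may return slightly different Q1 and Q3 values for the same data.

```python
from statistics import median


def five_point_summary(values):
    """Return (min, Q1, median, Q3, max) using the hinge rule from the slides."""
    ordered = sorted(values)
    n = len(ordered)
    half = (n + 1) // 2            # size of each half; the median is shared when n is odd
    lower = ordered[:half]         # bottom half
    upper = ordered[n - half:]     # top half
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


ages = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]
rates = [1792, 1666, 1362, 1614, 1460, 1867, 1439]

print(five_point_summary(ages))
print(five_point_summary(rates))
```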
5/3/2023 Summary Statistics 26
Box-and-whiskers plot (boxplot)
5-point summary + “outside values”
Procedure
• Determine the 5-point summary
• Draw a box from Q1 to Q3
• Draw a line at Q2
• Calculate IQR = Q3 – Q1
• Calculate fences
  • F_Lower = Q1 – 1.5(IQR)
  • F_Upper = Q3 + 1.5(IQR)
• Determine whether there are any outside values (values beyond the fences); if so, plot them separately
• Determine the inside values (the most extreme values still within the fences) and draw whiskers from the box to the inside values
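The fence and whisker steps of this procedure can be sketched in code. The function names are hypothetical, the hinges follow the quartile rule from the earlier slides, and values falling exactly on a fence are treated as inside, which is an assumption since the slides do not cover that case.

```python
from statistics import median


def hinges(ordered):
    """Q1 and Q3 as medians of the lower and upper halves of sorted data."""
    half = (len(ordered) + 1) // 2
    return median(ordered[:half]), median(ordered[len(ordered) - half:])


def fences_and_outside(values):
    """Fences, whisker ends (inside values), and outside values for a boxplot."""
    ordered = sorted(values)
    q1, q3 = hinges(ordered)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Assumption: a value exactly on a fence counts as inside.
    inside = [x for x in ordered if lower_fence <= x <= upper_fence]
    outside = [x for x in ordered if x < lower_fence or x > upper_fence]
    return {
        "fences": (lower_fence, upper_fence),
        "whiskers": (inside[0], inside[-1]),   # most extreme values within the fences
        "outside": outside,                    # plotted as separate points
    }


result = fences_and_outside([3, 21, 22, 24, 25, 26, 28, 29, 31, 51])
print(result)
```

Applied to the age data (5 through 52), the same function reports no outside values, so the whiskers run to the minimum and maximum.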
5/3/2023 Summary Statistics 27
Boxplot example
5-point: 5, 21, 27.5, 42, 52
IQR = 42 – 21 = 21
FU = 42 + (1.5)(21) = 73.5
• No values above (outside)
• Upper inside value = 52
FL = 21 – (1.5)(21) = –10.5
• No values below (outside)
• Lower inside value = 5
05 11 21 24 27 28 30 42 50 52
[Figure: boxplot on a 0 to 60 axis with lower inside value = 5, Q1 = 21, Q2 = 27.5, Q3 = 42, upper inside value = 52]
5/3/2023 Summary Statistics 28
Boxplot example 2
5-point: 3, 22, 25.5, 29, 51
IQR = 29 – 22 = 7
FU = 29 + (1.5)(7) = 39.5
• One outside (51)
• Inside value = 31
FL = 22 – (1.5)(7) = 11.5
• One outside (3)
• Inside value = 21
3 21 22 24 25 26 28 29 31 51
[Figure: boxplot on a 0 to 60 axis with outside values 3 and 51, inside values 21 and 31, lower hinge 22, median 25.5, upper hinge 29]
5/3/2023 Summary Statistics 29
Boxplot example 3 (metabolic.sav)
5-point: 1362, 1449.5, 1614, 1729, 1867 (slide 25)
IQR = 1729 – 1449.5 = 279.5
FU = 1729 + (1.5)(279.5) = 2148.25
• None outside
• Upper inside = 1867
FL = 1449.5 – (1.5)(279.5) = 1030.25
• None outside
• Lower inside = 1362
1362 1439 1460 1614 1666 1792 1867
[Figure: boxplot of the metabolic rates, N = 7. Data source: Moore, 2000]
5/3/2023 Summary Statistics 30
Interpretation of boxplots
Location
• Position of median
• Position of box
Spread
• Hinge-spread (box length) = IQR
• Whisker-to-whisker spread (range, or range minus the outside values)
Shape
• Symmetry of box
• Size of whiskers
• Outside values (potential outliers)
5/3/2023 Summary Statistics 31
Side-by-side boxplots
Boxplots are especially useful for comparing groups:
