Part II
Each slide has its own narration in an audio file.
For the explanation of any slide click on the audio icon to start it.
Professor Friedman's Statistics Course by H & L Friedman is licensed under a
Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
• A third important property of data, after location and dispersion, is its shape.
• Shape can be described by degree of asymmetry (i.e., skewness).
◦ mean > median: positive or right-skewness
◦ mean = median: symmetric or zero-skewness
◦ mean < median: negative or left-skewness
• Positive skewness can arise when the mean is increased by some unusually high values.
• Negative skewness can arise when the mean is decreased by some unusually low values.
Descriptive Statistics II 2
[Figure: examples of left-skewed, right-skewed, and symmetric distributions.]
Source: Levine et al., Business Statistics, Pearson, 2013.
Data (for n=12 employees):
2 3 8 ┋ 8 9 10 ┋ 10 12 15 ┋ 18 22 63
$\bar{X}$ = 180/12 = 15 hours
Median = 10 hours
The (extremely slow) employee who took 63 hours to complete the task skewed the entire distribution to the right.
$s^2$ = 2868/11 = 260.73
s = 16.15 hours
CV = s/$\bar{X}$ = 107.6%
This guy took a VERY long time!
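The summary measures for the 12 employees can be checked with a few lines of Python. This is a minimal sketch using only the standard `statistics` module; the variable names are illustrative and not part of the lecture.

```python
import statistics

# Completion times (hours) for the n = 12 employees in the example
hours = [2, 3, 8, 8, 9, 10, 10, 12, 15, 18, 22, 63]

mean = statistics.mean(hours)          # 180 / 12
median = statistics.median(hours)      # average of the two middle values
variance = statistics.variance(hours)  # sample variance, divides by n - 1
std_dev = statistics.stdev(hours)      # sample standard deviation
cv = std_dev / mean * 100              # coefficient of variation, in percent

# Rule of thumb from the slides: compare the mean with the median
if mean > median:
    shape = "right-skewed"
elif mean < median:
    shape = "left-skewed"
else:
    shape = "symmetric"

print(f"mean={mean}, median={median}, s2={variance:.2f}, s={std_dev:.2f}, CV={cv:.1f}%")
print(shape)
```

Removing the value 63 from the list and rerunning is a quick way to see how much a single extreme observation moves the mean while barely moving the median.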
• Scores of 17 students on a national calculus exam. Data:
0, 0, 10, 12, 15, 18, 20, 25, 30, 33, 34, 41, 56, 87, 92, 94, 95
• Open MS Excel.
• On the Data tab, go to Data Analysis and choose Descriptive Statistics.
• If you do not see Data Analysis, use the Add-ins feature to add the Analysis ToolPak to MS Excel.
• Make sure to check the Summary Statistics box once you are in Descriptive Statistics.
• See the MS Excel output on the next slide.
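MS Excel uses a formula to calculate a coefficient of skewness (the adjusted Fisher-Pearson standardized moment coefficient, the same quantity its SKEW function returns). You do not have to know the formula. If the coefficient is 0 or very close to it, you have a symmetric distribution; a positive coefficient indicates right-skewness and a negative coefficient indicates left-skewness.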
Column1
Mean 38.94117647
Standard Error 8.111117365
Median 30
Mode 0
Standard Deviation 33.44299364
Sample Variance 1118.433824
Kurtosis -0.82259021
Skewness 0.782252352
Range 95
Minimum 0
Maximum 95
Sum 662
Count 17
From the output:
• mean is 38.94
• median is 30
• mode is 0
• standard deviation is 33.44
• variance is 1118.43
• skewness is .78 (positive)
• range is 95
• n is 17
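For readers without Excel, the skewness figure can be reproduced in Python. The sketch below assumes Excel's coefficient is the adjusted Fisher-Pearson formula, n/((n-1)(n-2)) times the sum of cubed Z-scores; the function name `sample_skewness` is made up for this illustration.

```python
import statistics

scores = [0, 0, 10, 12, 15, 18, 20, 25, 30, 33, 34, 41, 56, 87, 92, 94, 95]

def sample_skewness(data):
    """Adjusted Fisher-Pearson skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)**3)."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

skew = sample_skewness(scores)
print(round(statistics.mean(scores), 2), statistics.median(scores), round(skew, 2))
```

A positive result agrees with the mean (38.94) lying above the median (30).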
• We can convert the original scores to new scores with $\bar{X}$ = 0 and s = 1.
• This will give us a pure number with no units of measurement.
• Any score below the mean will now be negative.
• Any score at the mean will be 0.
• Any score above the mean will be positive.
To compute the Z-scores:
$Z = \dfrac{X - \bar{X}}{s}$
Example.
Data: 0, 2, 4, 6, 8, 10
$\bar{X}$ = 30/6 = 5; s = 3.74
 X    (X − 5)/3.74       Z
 0    (0 − 5)/3.74     -1.34
 2    (2 − 5)/3.74     -0.80
 4    (4 − 5)/3.74     -0.27
 6    (6 − 5)/3.74      0.27
 8    (8 − 5)/3.74      0.80
10    (10 − 5)/3.74     1.34
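The same standardization can be sketched in Python. The helper `z_scores` is a hypothetical name; it uses the sample standard deviation (dividing by n − 1), which is what gives s = 3.74 for this data set.

```python
import statistics

def z_scores(data):
    """Standardize each value: subtract the mean, divide by the sample std dev."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    return [(x - mean) / s for x in data]

data = [0, 2, 4, 6, 8, 10]
z = z_scores(data)
print([round(v, 2) for v in z])
```

The standardized values themselves have mean 0 and standard deviation 1, which is the point of the transformation.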
• Data: Exam Scores
Original data      Change 7 to 97      Change 23 to 93
 X       Z          X       Z           X       Z
65    -0.45        65    -0.81         65    -1.40
73    -0.11        73    -0.38         73    -0.79
78     0.10        78    -0.10         78    -0.40
69    -0.28        69    -0.60         69    -1.09
78     0.10        78    -0.10         78    -0.40
 7    -2.89   <=   97     0.94         97     1.07
23    -2.21        23    -3.12   <=    93     0.76
98     0.94        98     0.99         98     1.14
99     0.99        99     1.05         99     1.22
99     0.99        99     1.05         99     1.22
97     0.90        97     0.94         97     1.07
99     0.99        99     1.05         99     1.22
75    -0.02        75    -0.27         75    -0.63
79     0.14        79    -0.05         79    -0.32
85     0.40        85     0.28         85     0.14
63    -0.53        63    -0.92         63    -1.56
67    -0.36        67    -0.70         67    -1.25
72    -0.15        72    -0.43         72    -0.86
73    -0.11        73    -0.38         73    -0.79
93     0.73        93     0.72         93     0.76
95     0.82        95     0.83         95     0.91
Mean  75.57        Mean  79.86         Mean  83.19
s     23.75        s     18.24         s     12.96
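The effect of the two low scores on every Z-score in the table can be explored with a short Python sketch. It assumes, as the table's third column suggests, that the second change (23 to 93) is applied on top of the first (7 to 97); `summarize` is an illustrative helper name.

```python
import statistics

original = [65, 73, 78, 69, 78, 7, 23, 98, 99, 99, 97, 99,
            75, 79, 85, 63, 67, 72, 73, 93, 95]

def summarize(scores):
    """Return the mean, sample standard deviation, and list of Z-scores."""
    mean = statistics.mean(scores)
    s = statistics.stdev(scores)
    z = [(x - mean) / s for x in scores]
    return mean, s, z

# Replace the low scores one at a time, as in the table
changed_once = [97 if x == 7 else x for x in original]
changed_twice = [93 if x == 23 else x for x in changed_once]

for label, scores in [("original", original),
                      ("7 -> 97", changed_once),
                      ("23 -> 93", changed_twice)]:
    mean, s, z = summarize(scores)
    print(f"{label}: mean={mean:.2f}, s={s:.2f}, lowest Z={min(z):.2f}")
```

Because the mean and s are recomputed after each change, the Z-score of an unchanged score such as 65 shifts even though the score itself stays the same.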
• No matter what you are measuring, a Z-score of more than +5 or less than −5 would indicate a very, very unusual score.
• For standardized data, if it is normally distributed, about 95% of the data will be within ±2 standard deviations of the mean.
• If the data follow a normal distribution,
◦ 95% of the Z-scores will be between -1.96 and +1.96.
◦ 99.7% of the Z-scores will fall between -3 and +3.
◦ 99.99% of the Z-scores will fall between -4 and +4.
• Worst-case scenario, whatever the shape of the distribution: at least 75% of the data are within 2 standard deviations of the mean (Chebyshev's inequality).
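These percentages can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library for the normal case and the Chebyshev bound 1 − 1/k² for the worst case; the function names are illustrative.

```python
from statistics import NormalDist

def normal_coverage(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)

def chebyshev_bound(k):
    """Minimum share within k standard deviations for any distribution (k > 1)."""
    return 1 - 1 / k ** 2

for k in (1.96, 2, 3, 4):
    print(f"k={k}: normal {normal_coverage(k):.4%}, "
          f"Chebyshev at least {chebyshev_bound(k):.2%}")
```

The gap between the two columns shows how much the normality assumption buys: Chebyshev guarantees far less, but it holds for data of any shape.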
• When examining a distribution for shape, sometimes the five-number summary is useful:
Smallest | Q1 | Median | Q3 | Largest
• Example:
$\bar{X}$ = 15
5-number summary: 2 | 8 | 10 | 16.5 | 63
These data are right-skewed.
In right-skewed distributions, the distance from Q3 to Xlargest (16.5 to 63) is much greater than the distance from Xsmallest to Q1 (2 to 8).
[Figure: the sorted data 2 3 8 8 9 10 10 12 15 18 22 63, with Smallest, Q1, Median, Q3, and Largest marked.]
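A five-number summary can be computed in Python as sketched below. Quartile conventions differ between textbooks and software; this sketch assumes Q1 and Q3 are the medians of the lower and upper halves of the sorted data, which is the convention that reproduces the values in this example. The function name is hypothetical.

```python
import statistics

def five_number_summary(data):
    """Smallest, Q1, median, Q3, largest, with quartiles as medians of each half."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]            # values below the median position
    upper = values[half + n % 2:]    # skip the middle value when n is odd
    return (values[0], statistics.median(lower), statistics.median(values),
            statistics.median(upper), values[-1])

hours = [2, 3, 8, 8, 9, 10, 10, 12, 15, 18, 22, 63]
smallest, q1, median, q3, largest = five_number_summary(hours)
print(smallest, q1, median, q3, largest)
print("upper tail:", largest - q3, "lower tail:", q1 - smallest)
```

Comparing the two tail lengths printed on the last line is the numeric version of the right-skew argument made above.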
• The boxplot is a way to graphically portray a distribution of data by means of its five-number summary.
• A boxplot can be drawn horizontally or vertically.
In a horizontal boxplot:
◦ The vertical line drawn within the box is the median.
◦ The vertical line at the left side of the box is Q1.
◦ The vertical line at the right side of the box is Q3.
◦ The line on the left connects the left side of the box with Xsmallest (lower 25% of data).
◦ The line on the right connects the right side of the box with Xlargest (upper 25% of data).
• A “bell-shaped” symmetric data distribution would look like this:
• We summarize categorical data using frequencies and graphical methods.
• A frequency distribution records data grouped into classes and the number of observations that fell into each class.
• A frequency distribution can be used for:
◦ categorical data
◦ numerical data that can be grouped into intervals
◦ numerical data with repeated observations
• A percentage distribution records the percent of the observations that fell into each class.
Example. A sample was taken of 200 professors at a (fictitious) local college. Each was asked for his or her (take-home) weekly salary. The responses ranged from about $520 to $590. If we wanted to display the data in, say, 7 equal intervals, we would use an interval width of $10.
$\text{Width of interval} = \dfrac{\text{Range}}{\text{Number of classes}} = \dfrac{\$70}{7} = \$10 \text{ per class}$
The Frequency / Percentage Distribution:
Take-home pay            Frequency   Percentage
$520 and under $530           6          3%
$530 and under $540          30         15%
$540 and under $550          38         19%
$550 and under $560          52         26%
$560 and under $570          42         21%
$570 and under $580          24         12%
$580 to $590                  8          4%
Total                       200        100%
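The class width and the percentage column follow directly from the range and the class counts. A minimal Python sketch, using the frequencies from the table (the raw salaries are not given in the lecture, so the counts are taken as input):

```python
# Class limits and frequencies from the take-home pay example
low, high, num_classes = 520, 590, 7
frequencies = [6, 30, 38, 52, 42, 24, 8]

width = (high - low) / num_classes          # range / number of classes
total = sum(frequencies)
percentages = [100 * f / total for f in frequencies]

for i, (f, pct) in enumerate(zip(frequencies, percentages)):
    start = low + i * width
    print(f"{start:.0f} and under {start + width:.0f}: {f:3d}  {pct:.0f}%")
```

With raw data, the counts themselves would come from tallying how many observations fall in each half-open interval.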
A Cumulative Distribution focuses on the
number or percentage of cases that lie below
or above specified values rather than within
intervals.
Take-home pay        Cumulative frequency   Cumulative percentage
less than $520                 0                     0%
less than $530                 6                     3%
less than $540                36                    18%
less than $550                74                    37%
less than $560               126                    63%
less than $570               168                    84%
less than $580               192                    96%
less than $590               200                   100%
The Frequency Histogram:
The Frequency Polygon
The Cumulative Frequency Distribution
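The cumulative frequencies and percentages for the take-home pay example are running totals of the class frequencies. A short Python sketch of that calculation, with illustrative variable names:

```python
from itertools import accumulate

upper_limits = [530, 540, 550, 560, 570, 580, 590]
frequencies = [6, 30, 38, 52, 42, 24, 8]

cumulative = list(accumulate(frequencies))       # running total of the class counts
total = cumulative[-1]
cumulative_pct = [100 * c / total for c in cumulative]

for limit, c, pct in zip(upper_limits, cumulative, cumulative_pct):
    print(f"less than {limit}: {c:3d}  {pct:.0f}%")
```

Plotting `cumulative_pct` against `upper_limits` would give the cumulative frequency curve.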
• Categorical Data – graphical representation
◦ Contingency Table
◦ Side-by-Side Bar Chart
• Numerical Data – looking for relationships in bivariate data
◦ Scatter Plot
◦ Correlation
◦ The Regression Line
Two categorical variables are most easily displayed in a contingency table. This is a table of two-way frequencies.
Example: “Who would you vote for in the next election?”
                        Male   Female   Total
Republican Candidate     250      250     500
Democrat Candidate       150      350     500
Total                    400      600    1000
This also works for two-way percentages.
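Two-way percentages can be derived from the same counts. The sketch below is one way to do it in Python with plain dictionaries; it shows both percent-of-total and percent-within-column, since the lecture does not say which version it has in mind.

```python
counts = {
    "Republican Candidate": {"Male": 250, "Female": 250},
    "Democrat Candidate": {"Male": 150, "Female": 350},
}

grand_total = sum(sum(row.values()) for row in counts.values())
column_totals = {col: sum(row[col] for row in counts.values())
                 for col in ("Male", "Female")}

# Percent of all respondents in each cell
total_pct = {cand: {col: 100 * n / grand_total for col, n in row.items()}
             for cand, row in counts.items()}

# Percent within each column (e.g., share of women choosing each candidate)
column_pct = {cand: {col: 100 * n / column_totals[col] for col, n in row.items()}
              for cand, row in counts.items()}

for cand in counts:
    print(cand, total_pct[cand], column_pct[cand])
```

Column percentages make the comparison between men and women easier, because the two groups have different sizes (400 and 600).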
Chart: Relative Performance (Source: Microsoft.com)
What can we do with 2 numerical variables? We
can graph them.
Example – Grade and Height (in inches)
Y (Grade) 100 95 90 80 70 65 60 40 30 20
X (Height) 73 79 62 69 74 77 81 63 68 74
• Correlation coefficient is r = .12
• Coefficient of determination is $r^2$ = .01
We will learn about the above measures, as well as more about scatter plots, in the topic on CORRELATION.
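As a preview, the two values for the grade and height data can be reproduced with the usual Pearson formula. This is a minimal sketch with an illustrative helper name, not the method the lecture develops later.

```python
import math

grade = [100, 95, 90, 80, 70, 65, 60, 40, 30, 20]   # Y
height = [73, 79, 62, 69, 74, 77, 81, 63, 68, 74]   # X, in inches

def pearson_r(x, y):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(height, grade)
print(f"r = {r:.2f}, r squared = {r * r:.2f}")
```

A value of r this close to 0 says that height tells us almost nothing about grade in this sample.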
• Practice, practice, practice.
◦ As always, do lots and lots of problems. You can find these in the online lecture notes and homework assignments.

Lecture 2 Descriptive statistics.pptx
