Measures of Variability
(DATA SCIENCE BASIC)
Variance and
Standard Deviation
Measures of Variability
Another measure of the variability in a
data set uses the deviations from the
mean (x – x̄).
Remember the sample of 6 fish that we caught from
the lake . . .
They were the following lengths:
3”, 4”, 5”, 6”, 8”, 10”
The mean length was 6 inches. Recall that we
calculated the deviations from the mean. What was the
sum of these deviations?
Can we find an average deviation?
What can we do to the deviations so that
we could find an average?
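The questions above can be checked numerically. The following is a minimal Python sketch using the six fish lengths from the slide; the variable names are illustrative and do not come from the deck. It shows that the raw deviations cancel out, while the squared deviations do not.

```python
# Fish lengths (inches) from the sample in the slides.
lengths = [3, 4, 5, 6, 8, 10]

mean = sum(lengths) / len(lengths)           # 6.0
deviations = [x - mean for x in lengths]     # [-3.0, -2.0, -1.0, 0.0, 2.0, 4.0]

# Positive and negative deviations cancel, so their sum is zero.
sum_dev = sum(deviations)

# Squaring removes the signs, so the total no longer cancels.
sum_sq_dev = sum(d ** 2 for d in deviations)

print(sum_dev, sum_sq_dev)   # 0.0 34.0
```

Because the plain deviations always sum to zero, their average carries no information about spread; squaring them first is one way around this.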
The estimated average of the deviations squared
is called the variance.
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
Standard Deviation
- is the square root of the variance.
- is, roughly, the typical distance of the values from the
center (mean).
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
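The population formulas can be sketched in Python as follows. This is only an illustration: it assumes, for the sake of the example, that the six fish lengths form a complete population, and the variable names are not from the slides. The result is compared with the standard library's `statistics.pvariance` and `statistics.pstdev`.

```python
import math
import statistics

# Treat these six values as a complete population (an assumption for illustration).
population = [3, 4, 5, 6, 8, 10]

N = len(population)
mu = sum(population) / N

# Population variance: average squared deviation from mu, dividing by N.
var_pop = sum((x - mu) ** 2 for x in population) / N

# Population standard deviation: square root of the variance.
sd_pop = math.sqrt(var_pop)

# The standard library should agree with the hand computation.
print(round(var_pop, 3), round(statistics.pvariance(population), 3))
print(round(sd_pop, 3), round(statistics.pstdev(population), 3))
```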
Notations
σ for population standard deviation
s for sample standard deviation
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
Degrees of freedom: the denominator, n – 1
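When calculating the sample variance, we divide by the degrees of freedom (n – 1)
instead of n. Dividing by n tends to underestimate the population variance, because
the deviations are measured from the sample mean rather than the true mean;
dividing by n – 1 corrects for this.
Degrees of freedom will be revisited in Chapter 8.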
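One way to see why n – 1 is used is to list every possible sample from a small population and average the resulting variance estimates. The sketch below makes several assumptions that are not in the slides: the six fish lengths are treated as an entire population, samples of size 2 are drawn with replacement, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical setup: treat the six fish lengths as an entire population.
population = [3, 4, 5, 6, 8, 10]
N = len(population)
mu = Fraction(sum(population), N)
pop_var = sum((x - mu) ** 2 for x in population) / N

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean, as an exact fraction."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample)

n = 2
# Every possible ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Average estimate when dividing by n versus by n - 1.
avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print(pop_var, avg_div_n, avg_div_n_minus_1)   # 17/3 17/6 17/3
```

In this setup the n – 1 version averages out to exactly the population variance, while the n version comes out too small.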
Remember the sample of 6 fish that we caught from the lake . . .
Find the variance of the length of fish.

x      (x – x̄)    (x – x̄)²
3      -3          9
4      -2          4
5      -1          1
6       0          0
8       2          4
10      4          16
Sum     0          34

First square the deviations.
What is the sum of the deviations squared? 34
Divide this by 5.
s² = 6.8
A typical deviation from the mean is the
standard deviation.
s² = 6.8 inches², so s = √6.8 ≈ 2.608 inches
The fish in our sample typically deviate from the mean of
6 inches by about 2.608 inches.
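The hand calculation can be checked with Python's standard `statistics` module, whose `variance` and `stdev` functions use the n – 1 denominator. This is a small sketch; the variable names are illustrative.

```python
import statistics

lengths = [3, 4, 5, 6, 8, 10]   # fish lengths in inches

s2 = statistics.variance(lengths)   # sample variance, divides by n - 1
s = statistics.stdev(lengths)       # square root of the sample variance

print(s2)            # 6.8
print(round(s, 3))   # 2.608
```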
The most commonly used measures of
center and variability are the mean
and standard deviation, respectively.
Choosing Measures of Center and Spread
- Mean and Standard Deviation
- Median and Interquartile Range
• The median and IQR are usually better than
the mean and standard deviation for
describing a skewed distribution or a
distribution with outliers.
• Use mean and standard deviation only for
reasonably symmetric distributions that don’t
have outliers.
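The effect of an outlier on each pair of measures can be illustrated with a short Python sketch. The 40-inch fish is a made-up value added only for this example, and `summarize` is a hypothetical helper. Quartiles come from `statistics.quantiles` with its default method, so other quartile conventions may give slightly different IQR values.

```python
import statistics

def summarize(data):
    """Return (mean, sd, median, IQR) for a list of numbers."""
    q1, _, q3 = statistics.quantiles(data, n=4)   # default 'exclusive' method
    return (statistics.mean(data), statistics.stdev(data),
            statistics.median(data), q3 - q1)

lengths = [3, 4, 5, 6, 8, 10]     # fish sample from the slides
with_outlier = lengths + [40]     # hypothetical 40-inch fish, for illustration only

mean_a, sd_a, med_a, iqr_a = summarize(lengths)
mean_b, sd_b, med_b, iqr_b = summarize(with_outlier)

print(f"mean   {mean_a:.2f} -> {mean_b:.2f}")
print(f"median {med_a:.2f} -> {med_b:.2f}")
print(f"sd     {sd_a:.2f} -> {sd_b:.2f}")
print(f"IQR    {iqr_a:.2f} -> {iqr_b:.2f}")
```

The mean and standard deviation move much more than the median and IQR when the single extreme value is added.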
Rule of Thumb
For roughly bell-shaped data, the range is approximately 4 times the
standard deviation.
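As a rough check of this rule of thumb, the sketch below applies it to the 16-value symmetric data set used elsewhere in this deck. The choice of data set and the variable names are assumptions made for illustration.

```python
import statistics

# Symmetric data set that appears in this deck, used here for illustration.
data = [4, 5, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 9, 10]

data_range = max(data) - min(data)   # 10 - 4 = 6
s = statistics.stdev(data)           # sample standard deviation

ratio = data_range / s               # expected to be in the neighbourhood of 4
rough_sd = data_range / 4            # quick estimate of s from the range

print(round(s, 2), round(ratio, 2), rough_sd)   # 1.46 4.11 1.5
```

The rule is only a quick approximation; for small or skewed samples the ratio can be noticeably different from 4.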
Symmetrical distribution of data
Consider the following data set:
4 5 6 6 6 7 7 7 7 7 7 8 8 8 9 10
This data set produces the histogram shown below. Each interval has width one and each
value is located in the middle of an interval. The histogram displays
a symmetrical distribution of data.
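A text version of such a histogram can be sketched in Python by counting how often each value occurs. The symmetry check around 7 is an illustrative addition and is not part of the slides.

```python
from collections import Counter

data = [4, 5, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 9, 10]
counts = Counter(data)

# Text histogram: one '#' per observation in each width-one interval.
for value in range(min(data), max(data) + 1):
    print(f"{value:>2} | {'#' * counts[value]}")

# Symmetry check: the count at (center - k) equals the count at (center + k).
center = 7
is_symmetric = all(counts[center - k] == counts[center + k] for k in range(1, 4))
print(is_symmetric)   # True
```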
Skewness
• Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A
distribution, or data set, is symmetric if it looks the same to the left and right of the
center point.
• The skewness for a normal distribution is zero, and any symmetric data should have
a skewness near zero. Negative values for the skewness indicate data that are
skewed left and positive values for the skewness indicate data that are skewed right.
By skewed left, we mean that the left tail is long relative to the right tail. Similarly,
skewed right means that the right tail is long relative to the left tail.
• [Ref: https://en.wikipedia.org/wiki/Skewness] Skewness in a data series may
sometimes be observed not only graphically but by simple inspection of the
values. For instance, consider the numeric sequence (49, 50, 51), whose values are
evenly distributed around a central value of 50. We can transform this sequence
into a negatively skewed distribution by adding a value far below the mean, e.g.
(40, 49, 50, 51). Similarly, we can make the sequence positively skewed by adding
a value far above the mean, e.g. (49, 50, 51, 60).
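The sign of the skewness for these three sequences can be checked with a small Python sketch. It uses one common moment-based definition (third central moment divided by the second central moment raised to the power 1.5); the slides do not say which formula they intend, so this choice is an assumption, and `skewness` is a hypothetical helper.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (one common definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    return m3 / m2 ** 1.5

symmetric = [49, 50, 51]
left_skewed = [40, 49, 50, 51]     # extra value far below the mean
right_skewed = [49, 50, 51, 60]    # extra value far above the mean

for name, seq in [("symmetric", symmetric),
                  ("left-skewed", left_skewed),
                  ("right-skewed", right_skewed)]:
    print(f"{name:>12}: {skewness(seq):+.3f}")
```

The symmetric sequence gives zero, the sequence with the low value gives a negative result, and the sequence with the high value gives a positive result of the same size.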
Skewness
Denoted by Sk
Sk = 0 Symmetric
Sk > 0 Positively skewed
Sk < 0 Negatively skewed
Negative skew
Positive skew
Symmetrical distribution (STD – 0), where the mean, median and
mode lie on the same line.
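For a concrete case of the mean, median and mode coinciding, the sketch below uses the symmetric 16-value data set from this deck. The use of Python's `statistics` module is an illustrative choice.

```python
import statistics

data = [4, 5, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 9, 10]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# For this symmetric data set, all three measures of center equal 7.
print(mean, median, mode)
```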
Example:
