SlideShare a Scribd company logo
1 of 23
Assessing Normality and
Data Transformations
Role of Normality
• Many statistical methods require that the
numeric variables we are working with have an
approximate normal distribution.
• For example, t-tests, F-tests, and
regression analyses all require in some
sense that the numeric variables are
approximately normally distributed.
Standardized normal
distribution with
empirical rule
percentages.
Tools for Assessing Normality
• Histogram and Boxplot
• Normal Quantile Plot
(also called Normal Probability Plot)
• Goodness of Fit Tests
Shapiro-Wilk Test (JMP)
Kolmogorov-Smirnov Test (SPSS)
Anderson-Darling Test (MINITAB)
Histograms and Boxplots
The cholesterol levels of
the patients appear to be
approximately normal,
although there is some
evidence of right skewness
as the mean is larger than
the median.
The red curve represents a
normal distribution fit to
these data and the blue
curve the density estimate
for these data, these curves
should agree if our data is
normally distributed.
Histograms and Boxplots
The systolic volumes of
the male heart patients in
this study suggest that they
come from a right skewed
population distribution.
The red curve represents a
normal distribution fit to
these data and the blue is
the estimated density from
the data which does not
agree with the imposed
normal.
Outliers are not
consistent with
normality.
Normal Quantile Plot
• Basically compares the spacing of our data to
what we would expect to see in terms of spacing
if our data were approximately normal.
If our data is approximately normally
distributed we should spacing similar to
what I attempted to show on the normal
curve on the right. Very few
observations in both tails and
increasingly more observations as we
move towards the mean from either
side. Also remember the spacing must
be symmetric about the mean.
Normal Quantile Plot
THE IDEAL PLOT:
Here is an example where
the data is perfectly normal.
The plot on right is a
normal quantile plot with
the data on the vertical axis
and the expected z-scores if
our data was normal on the
horizontal axis.
When our data is
approximately normal the
spacing of the two will
agree resulting in a plot
with observations lying on
the reference line in the
normal quantile plot. The
points should lie within the
dashed lines.
Normal Quantile Plot
THE IDEAL PLOT:
Here is an example where
the data is perfectly normal.
The plot on right is a
normal quantile plot with
the data on the vertical axis
and the expected z-scores if
our data was normal on the
horizontal axis.
When our data is
approximately normal the
spacing of the two will
agree resulting in a plot
with observations lying on
the reference line in the
normal quantile plot. The
points should lie within the
dashed lines.
Normal Quantile Plot
(right skewness)
The systolic volumes of
the male heart patients are
clearly right skewed.
When the data is plotted
vs. the expected z-scores
the normal quantile plot
shows right skewness by
a upward bending curve
Normal Quantile Plot
(left skewness)
The distribution of
birthweights from this
study of very low
birthweight infants is
skewed left.
When the data is plotted
vs. the expected z-
scores the normal
quantile plot shows left
skewness by a
downward bending
curve.
Normal Quantile Plot
(leptokurtosis)
The distribution of
sodium levels of
patients in this right
heart catheterization
study has heavier tails
than a normal
distribution (i.e,
leptokurtosis).
When the data is plotted
vs. the expected z-
scores the normal
quantile plot there is an
“S-shape” which
indicates kurtosis.
Normal Quantile Plot
(discrete data)
Although the distribution
of the gestational age data
of infants in the very low
birthweight study is
approx. normal there is a
“staircase” appearance in
normal quantile plot.
This is due to the discrete
coding of the gestational
age which was recorded to
the nearest week or half
week.
Normal Quantile Plots
IMPORTANT NOTE:
• If you plot DATA vs. NORMAL as on the
previous slides then:
downward bend = left skew
upward bend = right skew
• If you plot NORMAL vs. DATA then:
downward bend = right skew
upward bend = left skew
Tests of Normality
There are several different tests that can be used to
test the following hypotheses:
Ho: The distribution is normal
HA: The distribution is NOT normal
Common tests of normality include:
Shapiro-Wilk Kolmogorov-Smirnov
Anderson-Darling Lillefor’s
Problem: THEY DON’T ALWAYS AGREE!!
Tests of Normality
Ho: The distribution of systolic volume
is normal
HA: The distribution of systolic volume
is NOT normal Because p < .0001 we have
strong evidence against
normality for the systolic
volume population distribution
using the Shapiro-Wilk test.
Tests of Normality
Ho: The distribution of systolic volume
is normal
HA: The distribution of systolic volume
is NOT normal
We do not have evidence at
the a = .05 level against the
normality of the population
systolic volume distribution
when using the Kolmogorov-
Smirnov test from SPSS.
Tests of Normality
Ho: The distribution of cholesterol level
is normal
HA: The distribution of cholesterol level
is NOT normal We have no evidence
against the normality of
the population
distribution of
cholesterol levels for male
heart patients (p = .2184).
Transformations to Improve
Normality (removing skewness)
Many statistical methods require that the
numeric variables you are working with have
an approximately normal distribution.
Reality is that this is often times not the case.
One of the most common departures from
normality is skewness, in particular, right
skewness.
2
V
UP
)
as
this
of
think
(
log 0
10 V
V
V
1

2
1
V

3
V
4
V
Bigger
Impact
Bigger
Impact
3
V
2
V
V Middle rung:
No transformation
(l = 1)
DOWN
Here V represents our variable of
interest. We are going to consider this
variable raised to a power l, i.e. Vl
We go up the ladder to remove left
skewness and down the ladder to
remove right skewness.
Right skewed
Left skewed
Tukey’s Ladder of Powers
Tukey’s Ladder of Powers
• To remove right skewness we typically take the
square root, cube root, logarithm, or reciprocal of a
the variable etc., i.e. V .5, V .333, log10(V) (think of V0) ,
V -1, etc.
• To remove left skewness we raise the variable to a
power greater than 1, such as squaring or cubing
the values, i.e. V 2, V 3, etc.
Removing Right Skewness
Example 1: PDP-LI levels for cancer patients
In the log base 10 scale the PDP-LI values are approximately
normally distributed.
Removing Right Skewness
Example 2: Systolic Volume for Male Heart Patients
sysvol sysvol .5 sysvol .333
log10(sysvol) 1/sysvol
Removing Right Skewness
Example 2: Systolic Volume for Male Heart Patients
1/sysvol
The reciprocal of systolic volume is
approximately normally distributed
and the Shapiro-Wilk test provides no
evidence against normality (p = .5340).
CAUTION: The use of the reciprocal transformation reorders the data in the sense
that the largest value becomes the smallest and the smallest becomes the largest
after transformation. The units after transformation may or may not make sense,
e.g. if the original units are mg/ml then after transformation they would be ml/mg.

More Related Content

Similar to AssessingNormalityandDataTransformations.ppt

STAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docx
STAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docxSTAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docx
STAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docx
whitneyleman54422
 
Graphical presentation of data
Graphical presentation of dataGraphical presentation of data
Graphical presentation of data
drasifk
 

Similar to AssessingNormalityandDataTransformations.ppt (20)

Statistics basics for oncologist kiran
Statistics basics for oncologist kiranStatistics basics for oncologist kiran
Statistics basics for oncologist kiran
 
COM 201_Inferential Statistics_18032022.pptx
COM 201_Inferential Statistics_18032022.pptxCOM 201_Inferential Statistics_18032022.pptx
COM 201_Inferential Statistics_18032022.pptx
 
Normal distribution
Normal distributionNormal distribution
Normal distribution
 
Chapter34
Chapter34Chapter34
Chapter34
 
Normal Probability Curve by Dr. Neha Deo
Normal Probability Curve by Dr. Neha DeoNormal Probability Curve by Dr. Neha Deo
Normal Probability Curve by Dr. Neha Deo
 
Basic Probability Distribution
Basic Probability Distribution Basic Probability Distribution
Basic Probability Distribution
 
STAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docx
STAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docxSTAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docx
STAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docx
 
Measures of relationship
Measures of relationshipMeasures of relationship
Measures of relationship
 
Normal distribution
Normal distributionNormal distribution
Normal distribution
 
statistics
statisticsstatistics
statistics
 
Graphical presentation of data
Graphical presentation of dataGraphical presentation of data
Graphical presentation of data
 
3. parametric assumptions
3. parametric assumptions3. parametric assumptions
3. parametric assumptions
 
Normal Distribution
Normal DistributionNormal Distribution
Normal Distribution
 
2.statistical DEcision makig.pptx
2.statistical DEcision makig.pptx2.statistical DEcision makig.pptx
2.statistical DEcision makig.pptx
 
Moments, Kurtosis N Skewness
Moments, Kurtosis N SkewnessMoments, Kurtosis N Skewness
Moments, Kurtosis N Skewness
 
Statistik dan Probabilitas Yuni Yamasari 2.pptx
Statistik dan Probabilitas Yuni Yamasari 2.pptxStatistik dan Probabilitas Yuni Yamasari 2.pptx
Statistik dan Probabilitas Yuni Yamasari 2.pptx
 
Descriptive Statistics
Descriptive StatisticsDescriptive Statistics
Descriptive Statistics
 
logistic regression analysis
logistic regression analysislogistic regression analysis
logistic regression analysis
 
template.pptx
template.pptxtemplate.pptx
template.pptx
 
Normality evaluation in a data
Normality evaluation in a dataNormality evaluation in a data
Normality evaluation in a data
 

More from BAGARAGAZAROMUALD2 (13)

water-13-00495-v3.pdf
water-13-00495-v3.pdfwater-13-00495-v3.pdf
water-13-00495-v3.pdf
 
5116427.ppt
5116427.ppt5116427.ppt
5116427.ppt
 
240-design.ppt
240-design.ppt240-design.ppt
240-design.ppt
 
Remote Sensing_2020-21 (1).pdf
Remote Sensing_2020-21  (1).pdfRemote Sensing_2020-21  (1).pdf
Remote Sensing_2020-21 (1).pdf
 
Szeliski_NLS1.ppt
Szeliski_NLS1.pptSzeliski_NLS1.ppt
Szeliski_NLS1.ppt
 
AssessingNormalityandDataTransformations.ppt
AssessingNormalityandDataTransformations.pptAssessingNormalityandDataTransformations.ppt
AssessingNormalityandDataTransformations.ppt
 
18-21 Principles of Least Squares.ppt
18-21 Principles of Least Squares.ppt18-21 Principles of Least Squares.ppt
18-21 Principles of Least Squares.ppt
 
Ch 11.2 Chi Squared Test for Independence.pptx
Ch 11.2 Chi Squared Test for Independence.pptxCh 11.2 Chi Squared Test for Independence.pptx
Ch 11.2 Chi Squared Test for Independence.pptx
 
lecture12.ppt
lecture12.pptlecture12.ppt
lecture12.ppt
 
Chi-Square Presentation - Nikki.ppt
Chi-Square Presentation - Nikki.pptChi-Square Presentation - Nikki.ppt
Chi-Square Presentation - Nikki.ppt
 
StatWRLecture6.ppt
StatWRLecture6.pptStatWRLecture6.ppt
StatWRLecture6.ppt
 
chapter18.ppt
chapter18.pptchapter18.ppt
chapter18.ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 

Recently uploaded

Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
dharasingh5698
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
dharasingh5698
 
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
dollysharma2066
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Recently uploaded (20)

Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects
 
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineering
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 

AssessingNormalityandDataTransformations.ppt

  • 2. Role of Normality • Many statistical methods require that the numeric variables we are working with have an approximate normal distribution. • For example, t-tests, F-tests, and regression analyses all require in some sense that the numeric variables are approximately normally distributed. Standardized normal distribution with empirical rule percentages.
  • 3. Tools for Assessing Normality • Histogram and Boxplot • Normal Quantile Plot (also called Normal Probability Plot) • Goodness of Fit Tests Shapiro-Wilk Test (JMP) Kolmogorov-Smirnov Test (SPSS) Anderson-Darling Test (MINITAB)
  • 4. Histograms and Boxplots The cholesterol levels of the patients appear to be approximately normal, although there is some evidence of right skewness as the mean is larger than the median. The red curve represents a normal distribution fit to these data and the blue curve the density estimate for these data, these curves should agree if our data is normally distributed.
  • 5. Histograms and Boxplots The systolic volumes of the male heart patients in this study suggest that they come from a right skewed population distribution. The red curve represents a normal distribution fit to these data and the blue is the estimated density from the data which does not agree with the imposed normal. Outliers are not consistent with normality.
  • 6. Normal Quantile Plot • Basically compares the spacing of our data to what we would expect to see in terms of spacing if our data were approximately normal. If our data is approximately normally distributed we should spacing similar to what I attempted to show on the normal curve on the right. Very few observations in both tails and increasingly more observations as we move towards the mean from either side. Also remember the spacing must be symmetric about the mean.
  • 7. Normal Quantile Plot THE IDEAL PLOT: Here is an example where the data is perfectly normal. The plot on right is a normal quantile plot with the data on the vertical axis and the expected z-scores if our data was normal on the horizontal axis. When our data is approximately normal the spacing of the two will agree resulting in a plot with observations lying on the reference line in the normal quantile plot. The points should lie within the dashed lines.
  • 8. Normal Quantile Plot THE IDEAL PLOT: Here is an example where the data is perfectly normal. The plot on right is a normal quantile plot with the data on the vertical axis and the expected z-scores if our data was normal on the horizontal axis. When our data is approximately normal the spacing of the two will agree resulting in a plot with observations lying on the reference line in the normal quantile plot. The points should lie within the dashed lines.
  • 9. Normal Quantile Plot (right skewness) The systolic volumes of the male heart patients are clearly right skewed. When the data is plotted vs. the expected z-scores the normal quantile plot shows right skewness by a upward bending curve
  • 10. Normal Quantile Plot (left skewness) The distribution of birthweights from this study of very low birthweight infants is skewed left. When the data is plotted vs. the expected z- scores the normal quantile plot shows left skewness by a downward bending curve.
  • 11. Normal Quantile Plot (leptokurtosis) The distribution of sodium levels of patients in this right heart catheterization study has heavier tails than a normal distribution (i.e, leptokurtosis). When the data is plotted vs. the expected z- scores the normal quantile plot there is an “S-shape” which indicates kurtosis.
  • 12. Normal Quantile Plot (discrete data) Although the distribution of the gestational age data of infants in the very low birthweight study is approx. normal there is a “staircase” appearance in normal quantile plot. This is due to the discrete coding of the gestational age which was recorded to the nearest week or half week.
  • 13. Normal Quantile Plots IMPORTANT NOTE: • If you plot DATA vs. NORMAL as on the previous slides then: downward bend = left skew upward bend = right skew • If you plot NORMAL vs. DATA then: downward bend = right skew upward bend = left skew
  • 14. Tests of Normality There are several different tests that can be used to test the following hypotheses: Ho: The distribution is normal HA: The distribution is NOT normal Common tests of normality include: Shapiro-Wilk Kolmogorov-Smirnov Anderson-Darling Lillefor’s Problem: THEY DON’T ALWAYS AGREE!!
  • 15. Tests of Normality Ho: The distribution of systolic volume is normal HA: The distribution of systolic volume is NOT normal Because p < .0001 we have strong evidence against normality for the systolic volume population distribution using the Shapiro-Wilk test.
  • 16. Tests of Normality Ho: The distribution of systolic volume is normal HA: The distribution of systolic volume is NOT normal We do not have evidence at the a = .05 level against the normality of the population systolic volume distribution when using the Kolmogorov- Smirnov test from SPSS.
  • 17. Tests of Normality Ho: The distribution of cholesterol level is normal HA: The distribution of cholesterol level is NOT normal We have no evidence against the normality of the population distribution of cholesterol levels for male heart patients (p = .2184).
  • 18. Transformations to Improve Normality (removing skewness) Many statistical methods require that the numeric variables you are working with have an approximately normal distribution. Reality is that this is often times not the case. One of the most common departures from normality is skewness, in particular, right skewness.
  • 19. 2 V UP ) as this of think ( log 0 10 V V V 1  2 1 V  3 V 4 V Bigger Impact Bigger Impact 3 V 2 V V Middle rung: No transformation (l = 1) DOWN Here V represents our variable of interest. We are going to consider this variable raised to a power l, i.e. Vl We go up the ladder to remove left skewness and down the ladder to remove right skewness. Right skewed Left skewed Tukey’s Ladder of Powers
  • 20. Tukey’s Ladder of Powers • To remove right skewness we typically take the square root, cube root, logarithm, or reciprocal of a the variable etc., i.e. V .5, V .333, log10(V) (think of V0) , V -1, etc. • To remove left skewness we raise the variable to a power greater than 1, such as squaring or cubing the values, i.e. V 2, V 3, etc.
  • 21. Removing Right Skewness Example 1: PDP-LI levels for cancer patients In the log base 10 scale the PDP-LI values are approximately normally distributed.
  • 22. Removing Right Skewness Example 2: Systolic Volume for Male Heart Patients sysvol sysvol .5 sysvol .333 log10(sysvol) 1/sysvol
  • 23. Removing Right Skewness Example 2: Systolic Volume for Male Heart Patients 1/sysvol The reciprocal of systolic volume is approximately normally distributed and the Shapiro-Wilk test provides no evidence against normality (p = .5340). CAUTION: The use of the reciprocal transformation reorders the data in the sense that the largest value becomes the smallest and the smallest becomes the largest after transformation. The units after transformation may or may not make sense, e.g. if the original units are mg/ml then after transformation they would be ml/mg.