This document discusses data transformations which can be used to meet parametric assumptions when data is not normally distributed. It describes common transformations like square root and log transformations that change the shape of the distribution. These transformations allow use of more powerful statistical tests by getting the data closer to a normal distribution. The document cautions that transformations may not completely remove skew and provides examples of applying square root and log transformations to positively skewed data.
2. Transformation
• We’ve talked about parametric assumptions
before
• A key one is the assumption that the data is
normally distributed.
• But often it isn’t
• What if we don’t want to sacrifice power?
3. Transformation
• We can transform the data
• This changes every score that we collect in
the same way. It changes the shape of the
distribution, rather than altering only some
scores (which would be cheating)
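A minimal sketch of the "same change for every score" idea. The scores and the `ranking` helper are made up for illustration. Because the same function is applied to every data point, the order of participants from lowest to highest stays the same; only the spacing between them changes.

```python
import math

raw = [3, 12, 7, 40, 1]  # hypothetical scores, one per participant

# The same function is applied to every score, not just the awkward ones
transformed = [math.sqrt(x) for x in raw]

def ranking(values):
    """Return participant indices ordered from lowest to highest score."""
    return sorted(range(len(values)), key=lambda i: values[i])

# The ordering of participants is identical before and after
print(ranking(raw) == ranking(transformed))
```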
4. Transformation
• The most common ones are:
• Square Root Transformation
• Log Transformation
• They do slightly different things and are useful
in different situations
5. Transformation
• We’ve already talked about one
transformation quite extensively
• When we calculate z scores, we are actually
applying a transformation
• We are changing the scale on which
participants are measured – not the actual
score they got, but how many standard
deviations they are from the mean.
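As a rough illustration of the z score as a transformation, the sketch below rescales some made-up scores. The function name `z_scores` and the data are assumptions for the example, and the sample standard deviation (n - 1) is used.

```python
import statistics

def z_scores(scores):
    """Convert raw scores to z scores: (score - mean) / standard deviation."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
    return [(x - mean) / sd for x in scores]

raw = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
z = z_scores(raw)

# Each value now says how many standard deviations a score is from the mean
print([round(v, 2) for v in z])
```

After the transformation the scores have a mean of 0 and a standard deviation of 1, but the shape of the distribution is unchanged, which is why z scores do not fix skew.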
6. • The easiest to explain is the Square Root
transformation.
• Say we had data that looked like this:
Square Root
7. • It is positively skewed
• If we square root every data point, it should
help
• Why? Because square rooting changes big
numbers more than it does small numbers
Square Root
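To see why square rooting pulls in a long right tail, here is a small sketch using made-up scores (perfect squares, chosen only to keep the arithmetic easy to follow). The variable names are assumptions for the example.

```python
import math

raw = [1, 4, 9, 25, 100]  # hypothetical scores

for x in raw:
    root = math.sqrt(x)
    print(f"{x:>4} -> {root:>4.1f}  (reduced by {x - root:.1f})")

# Gap between the two largest scores, before and after the transformation
gap_raw = raw[-1] - raw[-2]                           # 100 - 25
gap_rooted = math.sqrt(raw[-1]) - math.sqrt(raw[-2])  # 10 - 5
print(gap_raw, gap_rooted)
```

The largest score drops from 100 to 10, while the smallest stays at 1, so the extreme scores are dragged back towards the rest of the data.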
8. • The square root transformation can have a
moderate effect on the skew of the data
• Here’s the plot of the data from before, after
transformation
• It’s better, but it’s not
right
Square Root
9. • The log transformation is a bit stronger
than the square root transformation, but
works on a similar principle
• There are a few types of log transformation
that might be used but we will use the “base
10” version here
Log transformation
10. • A log (full word is logarithm) is the power to
which a base number must be raised in order to
make another number.
What’s a log?
11. • Well not really.
• Here’s an example using log base 10.
Log 100 = 2
• Why? We start with the base number (10).
We want to turn 10 into 100. To do so, we
have to raise 10 to the power of 2 (square it)
• 10^2 = 100
What’s a log?
12. • Try this one
Log 1000 = 3
• Why? We start with the base number (10).
We want to turn 10 into 1000. To do so, we
have to raise 10 to the power of 3 (cube it)
• 10^3 = 1000
What’s a log?
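The two examples above can be checked in Python with `math.log10` from the standard library. The variable names are just for illustration.

```python
import math

log_100 = math.log10(100)    # power that 10 must be raised to in order to make 100
log_1000 = math.log10(1000)  # power that 10 must be raised to in order to make 1000

print(log_100)   # 2.0
print(log_1000)  # 3.0

# Going the other way: raising the base to the log gives back the original number
print(10 ** 2, 10 ** 3)  # 100 1000
```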
13. • The log transformation can have a hefty effect
on the skew of the data
• Here’s the plot of the data from before, after
log transformation
• It’s better again
Log transformation
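A rough sketch comparing the two transformations on the same made-up, positively skewed scores. The `skewness` helper is an assumption for this example (the mean of the cubed z scores, using the population standard deviation); statistics packages may use a slightly different formula.

```python
import math
import statistics

def skewness(values):
    """Mean of cubed z scores: 0 for symmetric data, positive for a long right tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((x - mean) / sd) ** 3 for x in values) / len(values)

raw = [1, 1, 2, 2, 3, 4, 4, 9, 16, 25, 36, 100]  # hypothetical scores, all above zero

skew_raw = skewness(raw)
skew_sqrt = skewness([math.sqrt(x) for x in raw])
skew_log = skewness([math.log10(x) for x in raw])

print(round(skew_raw, 2), round(skew_sqrt, 2), round(skew_log, 2))
```

For these scores the skew shrinks with the square root and shrinks further with the log, without reaching zero.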
14. Transformation
• Transformations can often reduce skew, but they
are unlikely to remove it completely
• We’re just trying to get it to an acceptable
level so we can meet parametric assumptions
• If we succeed, we do the stats test on the
transformed data rather than the raw scores.
15. Descriptives
• The usefulness of a transformation is that it
allows us to use a more powerful inferential
statistic rather than resorting to non-
parametric tests.
• However, reporting means of transformed
variables isn’t very useful cos nobody knows
what they mean…
16. Descriptives
• Before reporting your descriptive statistics,
you need to undo whatever transformation
you did.
• E.g. mean from the analysis = 2
• Analysis was performed on square root
transformed data
• So 2 is the mean on the square root scale, not the raw score scale
• Report 4 as the mean score (which is 2^2)
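A small sketch of undoing a square root transformation before reporting, using made-up scores chosen so that the transformed mean comes out as 2. The variable names are assumptions for the example.

```python
import math
import statistics

raw = [1, 4, 4, 9]  # hypothetical scores

rooted = [math.sqrt(x) for x in raw]    # analysis is run on these values
mean_rooted = statistics.fmean(rooted)  # mean from the analysis

# Undo the transformation: squaring reverses the square root
back_transformed = mean_rooted ** 2

# For a log base 10 transformation the reverse step would be 10 ** mean_logged

raw_mean = statistics.fmean(raw)
print(mean_rooted, back_transformed, raw_mean)
```

Note that the back-transformed value (4.0 here) is on the original scale but is generally not identical to the plain mean of the raw scores (4.5 here), so it is worth stating in the write-up that the mean was back-transformed.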
17. Cautionary notes
• You can’t log or square root transform data
that contains negative numbers (and the log of
zero is undefined too)
• The methods in these slides will only work for
positively skewed data
• If your data is negatively skewed, you have to
reverse the scores before applying the
transformation (I’ll show you how in the
video)
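The video's exact method isn't reproduced here. As a sketch of one common way to reverse scores, each score can be subtracted from the highest score plus one, so the highest score becomes 1 and the long tail moves to the right. The `reflect` helper and the data are assumptions for illustration.

```python
import math

def reflect(scores):
    """Reverse scores: the highest score becomes 1, low scores become large."""
    top = max(scores)
    return [top + 1 - x for x in scores]

raw = [1, 6, 8, 9, 9, 10, 10, 10]  # hypothetical negatively skewed scores

reflected = reflect(raw)  # tail now points to the right, minimum is 1
logged = [math.log10(x) for x in reflected]  # safe: no zeros or negatives

print(reflected)
```

After reversing, a high transformed value means a low original score, so the direction of any effect has to be flipped back when interpreting the results.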