Further Maths
Data Transformation



             K McMullen 2012
Data Transformations
When a scatterplot is non-linear we can linearise the
relationship by transforming the data using one of
the following transformations:
   Squared transformation
   Log transformation
   Reciprocal transformation

Always use the table of transformations to help you
choose which transformations to try
Once you have tried your chosen transformations, the one
with the highest r² value is the best transformation,
provided that its residual plot is randomly scattered
(which should be the case if r² is high)


Squared transformations
   Has the effect of decreasing values between 0 and 1
   and increasing values greater than 1
   The effect of the squared transformation is to
   stretch the values, with large values stretched the most
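
A quick numeric check of the stretching effect, as a minimal sketch. The values below are made up for illustration and are not taken from the slides.

```python
# Illustrative values only (not from the slides).
values = [0.5, 1, 2, 4, 8]

# Squared transformation: each value is replaced by its square.
squared = [v ** 2 for v in values]

# Gaps between neighbouring values, before and after squaring.
gaps_before = [b - a for a, b in zip(values, values[1:])]
gaps_after = [b - a for a, b in zip(squared, squared[1:])]

print("original:", values)
print("squared: ", squared)
print("gaps before:", gaps_before)
print("gaps after: ", gaps_after)
```

Here 0.5 shrinks to 0.25, 1 stays at 1, and the gap between the two largest values grows from 4 to 48, so the large values are stretched the most.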

[Figure: example of an x² transformation]

[Figure: example of a y² transformation]




Log transformation: reduces all values, and
values between 0 and 1 become negative

Large values are reduced much more than small
values

The effect of the log transformation is to
compress the values
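
The compressing effect can be checked numerically. This is a minimal sketch that assumes "log" means log base 10 (the slides do not state the base) and uses made-up values.

```python
import math

# Illustrative values only (not from the slides).
values = [0.1, 1, 10, 100, 1000]

# Assumption: "log" on the slides means log base 10.
logs = [math.log10(v) for v in values]

for v, lg in zip(values, logs):
    print(f"{v:>7} -> {lg:.2f}")

# Range of the data before and after the transformation.
spread_before = max(values) - min(values)
spread_after = max(logs) - min(logs)
print("spread before:", spread_before)
print("spread after: ", round(spread_after, 2))
```

The value between 0 and 1 becomes negative, and values that originally ranged from 0.1 to 1000 end up ranging only from -1 to 3.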
[Figure: example of a log(x) transformation]
[Figure: example of a log(y) transformation]



Reciprocal transformation: reduces all values
greater than 1 and increases values between 0 and 1

Large values are reduced much more than small
values

The effect of the reciprocal transformation is to
compress the large values to an even greater
extent than the log transformation
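
A side-by-side comparison with the log transformation, as a minimal sketch. The values are made up for illustration, and log base 10 is an assumption.

```python
import math

# Illustrative values only (not from the slides).
values = [0.5, 1, 2, 10, 100]

reciprocals = [1 / v for v in values]
logs = [math.log10(v) for v in values]  # assumption: log base 10

for v, rec, lg in zip(values, reciprocals, logs):
    print(f"{v:>5} -> 1/x = {rec:.2f}, log10(x) = {lg:.2f}")

# Gap between the two largest values (10 and 100) after each transformation.
gap_reciprocal = abs(reciprocals[-1] - reciprocals[-2])
gap_log = abs(logs[-1] - logs[-2])
print("gap after 1/x:     ", round(gap_reciprocal, 2))
print("gap after log10(x):", round(gap_log, 2))
```

The gap between 10 and 100 shrinks to 0.09 under the reciprocal but only to 1 under the log. Note also that the reciprocal reverses the order of the values: the largest original value becomes the smallest transformed value.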

[Figure: example of a 1/x transformation]

[Figure: example of a 1/y transformation]
To do a transformation:
   Draw your scatterplot
   Look at the table of transformations to decide which
   transformations to try
   For each transformation:
       Transform your x or y values depending on the
       transformation
       Calculate the r² value
       Draw the residual plot against your x values (if you
       transformed x, use the transformed values) and
       comment on it (remember that residual values
       are always on the vertical axis)
   Compare the transformations and decide which is best
   When writing your equation, make sure you use the
   transformed variable in your answer and calculations
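
The steps above can be sketched in code. This is a rough illustration under stated assumptions, not a prescribed method: the data are made up, the candidate transformations are an arbitrary selection rather than a reading of the table of transformations, log means log base 10, and fit_line is a hypothetical helper written for this example. Residuals are printed rather than plotted; in practice you would plot them against the (transformed) x values.

```python
import math

def fit_line(xs, ys):
    """Hypothetical helper: least-squares line ys = a + b*xs.

    Returns (a, b, r_squared, residuals).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope
    a = mean_y - b * mean_x            # intercept
    r_squared = sxy ** 2 / (sxx * syy)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    return a, b, r_squared, residuals

# Made-up data (not from the slides): y grows roughly like x squared.
x = [1, 2, 3, 4, 5, 6]
y = [1.2, 3.9, 9.1, 15.8, 25.3, 35.9]

# Candidate transformations to try: name -> (transformed x, transformed y).
candidates = {
    "none": (x, y),
    "x squared": ([v ** 2 for v in x], y),
    "log10(y)": (x, [math.log10(v) for v in y]),
    "1/y": (x, [1 / v for v in y]),
}

# For each transformation: fit the line, record r^2 and the residuals.
results = {}
for name, (tx, ty) in candidates.items():
    a, b, r2, residuals = fit_line(tx, ty)
    results[name] = (a, b, r2, residuals)
    rounded = [round(e, 2) for e in residuals]
    print(f"{name:10s} r^2 = {r2:.4f}  residuals = {rounded}")

# Compare: pick the transformation with the highest r^2.
best = max(results, key=lambda name: results[name][2])
a, b, _, _ = results[best]
print(f"highest r^2: {best} (intercept {a:.2f}, slope {b:.2f})")
```

For this data the x squared transformation gives the highest r², so the final equation would be written in terms of x², not x. The residuals should still be checked for random scatter before accepting it.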
