SlideShare a Scribd company logo
1 of 31
Computing Transformations BY
      TARUNGEHLOT


      Transforming variables

   Transformations for normality

   Transformations for linearity
Transforming variables to satisfy assumptions

   When a metric variable fails to satisfy the
    assumption of normality, homogeneity of variance,
    or linearity, we may be able to correct the
    deficiency by using a transformation.

   We will consider three transformations for normality,
    homogeneity of variance, and linearity:
       the logarithmic transformation
       the square root transformation, and
       the inverse transformation


   plus a fourth that is useful for problems of linearity:
       the square transformation
Computing transformations in SPSS

   In SPSS, transformations are obtained by computing a
    new variable. SPSS functions are available for the
    logarithmic (LG10) and square root (SQRT)
    transformations. The inverse transformation uses a
    formula which divides one by the original value for
    each case.

   For each of these calculations, there may be data
    values which are not mathematically permissible.
    For example, the log of zero is not defined
    mathematically, division by zero is not permitted,
    and the square root of a negative number results in
    an “imaginary” value. We will usually adjust the
    values passed to the function to make certain that
    these illegal operations do not occur.
Two forms for computing transformations

   There are two forms for each of the transformations
    to induce normality, depending on whether the
    distribution is skewed negatively to the left or
    skewed positively to the right.

   Both forms use the same SPSS functions and formula
    to calculate the transformations.

   The two forms differ in the value or argument passed
    to the functions and formula. The argument to the
    functions is an adjustment to the original value of
    the variable to make certain that all of the
    calculations are mathematically correct.
Functions and formulas for transformations

   Symbolically, if we let x stand for the argument
    passes to the function or formula, the calculations
    for the transformations are:
       Logarithmic transformation: compute log =
        LG10(x)
       Square root transformation: compute sqrt =
        SQRT(x)
       Inverse transformation: compute inv = 1 / (x)
       Square transformation: compute s2 = x * x
   For all transformations, the argument must be
    greater than zero to guarantee that the calculations
    are mathematically legitimate.
Transformation of positively skewed variables

   For positively skewed variables, the argument is an
    adjustment to the original value based on the
    minimum value for the variable.

   If the minimum value for a variable is zero, the
    adjustment requires that we add one to each value,
    e.g. x + 1.

   If the minimum value for a variable is a negative
    number (e.g., –6), the adjustment requires that we
    add the absolute value of the minimum value (e.g. 6)
    plus one (e.g. x + 6 + 1, which equals x +7).
Example of positively skewed variable


   Suppose our dataset contains the number of books
    read (books) for 5 subjects: 1, 3, 0, 5, and 2, and the
    distribution is positively skewed.

   The minimum value for the variable books is 0. The
    adjustment for each case is books + 1.

   The transformations would be calculated as follows:
      Compute logBooks = LG10(books + 1)

      Compute sqrBooks = SQRT(books + 1)

      Compute invBooks = 1 / (books + 1)
Transformation of negatively skewed variables


   If the distribution of a variable is negatively skewed,
    the adjustment of the values reverses, or reflects,
    the distribution so that it becomes positively
    skewed. The transformations are then computed on
    the values in the positively skewed distribution.

   Reflection is computed by subtracting all of the
    values for a variable from one plus the absolute
    value of maximum value for the variable. This results
    in a positively skewed distribution with all values
    larger than zero.
Example of negatively skewed variable


   Suppose our dataset contains the number of books
    read (books) for 5 subjects: 1, 3, 0, 5, and 2, and the
    distribution is negatively skewed.

   The maximum value for the variable books is 5. The
    adjustment for each case is 6 - books.

   The transformations would be calculated as follows:
      Compute logBooks = LG10(6 - books)

      Compute sqrBooks = SQRT(6 - books)

      Compute invBooks = 1 / (6 - books)
The Square Transformation for Linearity


   The square transformation is computed by
    multiplying the value for the variable by itself.

   It does not matter whether the distribution is
    positively or negatively skewed.

   It does matter if the variable has negative values,
    since we would not be able to distinguish their
    squares from the square of a comparable positive
    value (e.g. the square of -4 is equal to the square of
    +4). If the variable has negative values, we add the
    absolute value of the minimum value to each score
    before squaring it.
Example of the square transformation


   Suppose our dataset contains change scores (chg) for
    5 subjects that indicate the difference between test
    scores at the end of a semester and test scores at
    mid-term: -10, 0, 10, 20, and 30.

   The minimum score is -10. The absolute value of the
    minimum score is 10.

   The transformation would be calculated as follows:
      Compute squarChg = (chg + 10) * (chg + 10)
Transformations for normality


                                                      Both the histogram and the normality plot for Total
                                                      Time Spent on the Internet (netime) indicate that the
                                                      variable is not normally distributed.




                 Histogram
            50                                                                                                        Normal Q-Q Plot of TOTAL TIME SPENT ON THE IN
                                                                                                                 3


            40                                                                                                   2


                                                                                                                 1
            30

                                                                                                                 0

            20
                                                                                              Expected Normal




                                                                                                                 -1
Frequency




            10
                                                                                              Std. Dev = 15.35   -2
                                                                                              Mean = 10.7
                                                                                              N = 93.00          -3
            0
                                                                                                                   -40      -20        0   20   40   60   80   100   120
                 0.0          20.0          40.0          60.0          80.0          100.0
                       10.0          30.0          50.0          70.0          90.0                                   Observed Value

                 TOTAL TIME SPENT ON THE INTERNET
Determine whether reflection is required
                                     Descriptives

                                                         Stat istic   Std. Error
 TOTAL TIME SPENT   Mean                                     10.73          1.59
 ON THE INTERNET    95% Confidence         Lower Bound         7.57
                    Interval for Mean      Upper Bound
                                                             13.89

                    5% Trimmed Mean                          8.29
                    Median                                   5.50
                    Variance                              235. 655
                    Std. Deviation                          15.35
                    Minimum                                      0
                    Maximum                                    102
                    Range                                      102
                    Interquartile Range                     10.20
                    Skewness                                3.532          .250
                    Kurt osis                              15.614          .495


         Skewness, in the table of Descriptive Statistics,
         indicates whether or not reflection (reversing the
         values) is required in the transformation.

         If Skewness is positive, as it is in this problem,
         reflection is not required. If Skewness is negative,
         reflection is required.
Compute the adjustment to the argument
                                       Descriptives

                                                           Stat istic   Std. Error
   TOTAL TIME SPENT   Mean                                     10.73          1.59
   ON THE INTERNET    95% Confidence         Lower Bound         7.57
                      Interval for Mean      Upper Bound
                                                               13.89

                      5% Trimmed Mean                          8.29
                      Median                                   5.50
                      Variance                              235. 655
                      Std. Deviation                          15.35
                      Minimum                                      0
                      Maximum                                    102
                      Range                                      102
                      Interquartile Range                     10.20
                      Skewness                                3.532          .250
                      Kurt osis                              15.614          .495


          In this problem, the minimum value is 0, so 1 will be
          added to each value in the formula, i.e. the argument
          to the SPSS functions and formula for the inverse will
          be:

                                     netime + 1.
Computing the logarithmic transformation




                     To compute the
                     transformation, select the
                     Compute… command from the
                     Transform menu.
Specifying the transform variable name and
                  function

                 First, in the Target Variable text box, type a
                 name for the log transformation variable, e.g.
                 “lgnetime“.




                                                                  Third, click
                                                                  on the up
                                                                  arrow button
                                                                  to move the
                                                                  highlighted
                                                                  function to
    Second, scroll down the list of functions to                  the Numeric
    find LG10, which calculates logarithmic                       Expression
    values use a base of 10. (The logarithmic                     text box.
    values are the power to which 10 is raised
    to produce the original number.)
Adding the variable name to the function




                                 Second, click on the right arrow
                                 button. SPSS will replace the
                                 highlighted text in the function
                                 (?) with the name of the variable.

First, scroll down the list of
variables to locate the
variable we want to
transform. Click on its name
so that it is highlighted.
Adding the constant to the function

Following the rules stated for determining the constant
that needs to be included in the function either to
prevent mathematical errors, or to do reflection, we
include the constant in the function argument. In this
case, we add 1 to the netime variable.




                                                      Click on the OK
                                                      button to complete
                                                      the compute
                                                      request.
The transformed variable




The transformed variable which we
requested SPSS compute is shown in the
data editor in a column to the right of the
other variables in the dataset.
Computing the square root transformation




                     To compute the transformation,
                     select the Compute… command
                     from the Transform menu.
Specifying the transform variable name and
                  function

                 First, in the Target Variable text box, type a
                 name for the square root transformation
                 variable, e.g. “sqnetime“.




                                                                  Third, click
                                                                  on the up
                                                                  arrow button
                                                                  to move the
                                                                  highlighted
                                                                  function to
                                                                  the Numeric
    Second, scroll down the list of functions to                  Expression
    find SQRT, which calculates the square root                   text box.
    of a variable.
Adding the variable name to the function




                                 Second, click on the right arrow
                                 button. SPSS will replace the
                                 highlighted text in the function
                                 (?) with the name of the variable.
First, scroll down the list of
variables to locate the
variable we want to
transform. Click on its name
so that it is highlighted.
Adding the constant to the function

Following the rules stated for determining the constant
that needs to be included in the function either to
prevent mathematical errors, or to do reflection, we
include the constant in the function argument. In this
case, we add 1 to the netime variable.




                                                      Click on the OK
                                                      button to complete
                                                      the compute
                                                      request.
The transformed variable




The transformed variable which we
requested SPSS compute is shown in the
data editor in a column to the right of the
other variables in the dataset.
Computing the inverse transformation




                   To compute the transformation,
                   select the Compute… command
                   from the Transform menu.
Specifying the transform variable name and
                  formula

    First, in the Target
    Variable text box, type a            Second, there is not a function for
    name for the inverse                 computing the inverse, so we type
    transformation variable,             the formula directly into the
    e.g. “innetime“.                     Numeric Expression text box.




                                Third, click on the
                                OK button to
                                complete the
                                compute request.
The transformed variable




The transformed variable which we
requested SPSS compute is shown in the
data editor in a column to the right of the
other variables in the dataset.
Adjustment to the argument for the square
             transformation

    It is mathematically correct to square a value of zero, so the
    adjustment to the argument for the square transformation is
    different. What we need to avoid are negative
    numbers, since the square of a negative number produces
    the same value as the square of a positive number.
                                      Descriptives

                                                           Stat istic   Std. Error
   TOTAL TIME SPENT   Mean                                     10.73          1.59
   ON THE INTERNET    95% Confidence        Lower Bound          7.57
                      Interval for Mean     Upper Bound
                                                               13.89

                      5% Trimmed Mean                        8.29
                      Median                                 5.50
                      Variance                           235. 655
                      Std. Deviation                        15.35
                      Minimum                                   0
                      Maximum                                 102
                      Range                                   102
                      Interquartile Range                   10.20
    In this problem, the minimum value is 0, no adjustment
                      Skewness                              3.532            .250
    is needed for computing the square. If the minimum
                      Kurt osis                            15.614            .495
    was a number less than zero, we would add the
    absolute value of the minimum (dropping the sign) as
    an adjustment to the variable.
Computing the square transformation




                  To compute the
                  transformation, select the
                  Compute… command from the
                  Transform menu.
Specifying the transform variable name and
                  formula

    First, in the Target
    Variable text box, type a
                                         Second, there is not a function for
    name for the inverse
                                         computing the square, so we type
    transformation
                                         the formula directly into the
    variable, e.g.
                                         Numeric Expression text box.
    “s2netime“.




                                Third, click on the
                                OK button to
                                complete the
                                compute request.
The transformed variable




The transformed variable which we
requested SPSS compute is shown in the
data editor in a column to the right of the
other variables in the dataset.

More Related Content

What's hot

書籍「計算社会科学入門」第9章 統計モデリング
書籍「計算社会科学入門」第9章 統計モデリング書籍「計算社会科学入門」第9章 統計モデリング
書籍「計算社会科学入門」第9章 統計モデリングMasanori Takano
 
東京都市大学 データ解析入門 7 回帰分析とモデル選択 2
東京都市大学 データ解析入門 7 回帰分析とモデル選択 2東京都市大学 データ解析入門 7 回帰分析とモデル選択 2
東京都市大学 データ解析入門 7 回帰分析とモデル選択 2hirokazutanaka
 
Negative binomial distribution
Negative binomial distributionNegative binomial distribution
Negative binomial distributionNadeem Uddin
 
Regression and corelation (Biostatistics)
Regression and corelation (Biostatistics)Regression and corelation (Biostatistics)
Regression and corelation (Biostatistics)Muhammadasif909
 
T12 non-parametric tests
T12 non-parametric testsT12 non-parametric tests
T12 non-parametric testskompellark
 
{tidygraph}と{ggraph}によるモダンなネットワーク分析
{tidygraph}と{ggraph}によるモダンなネットワーク分析{tidygraph}と{ggraph}によるモダンなネットワーク分析
{tidygraph}と{ggraph}によるモダンなネットワーク分析Takashi Kitano
 
How to use in R model-agnostic data explanation with DALEX & iml
How to use in R model-agnostic data explanation with DALEX & imlHow to use in R model-agnostic data explanation with DALEX & iml
How to use in R model-agnostic data explanation with DALEX & imlSatoshi Kato
 
Chi square test final
Chi square test finalChi square test final
Chi square test finalHar Jindal
 
みどりぼん3章前半
みどりぼん3章前半みどりぼん3章前半
みどりぼん3章前半Akifumi Eguchi
 
臨床家が知っておくべき臨床疫学・統計
臨床家が知っておくべき臨床疫学・統計臨床家が知っておくべき臨床疫学・統計
臨床家が知っておくべき臨床疫学・統計Yasuaki Sagara
 
Correlation analysis
Correlation analysisCorrelation analysis
Correlation analysisVan Martija
 
混合モデルを使って反復測定分散分析をする
混合モデルを使って反復測定分散分析をする混合モデルを使って反復測定分散分析をする
混合モデルを使って反復測定分散分析をするMasaru Tokuoka
 
状態空間モデルの実行方法と実行環境の比較
状態空間モデルの実行方法と実行環境の比較状態空間モデルの実行方法と実行環境の比較
状態空間モデルの実行方法と実行環境の比較Hiroki Itô
 
Pearson Correlation, Spearman Correlation &Linear Regression
Pearson Correlation, Spearman Correlation &Linear RegressionPearson Correlation, Spearman Correlation &Linear Regression
Pearson Correlation, Spearman Correlation &Linear RegressionAzmi Mohd Tamil
 
PLS偏最小平方法免費軟體-SmartPLS論壇申請帳號流程-三星統計謝章升
PLS偏最小平方法免費軟體-SmartPLS論壇申請帳號流程-三星統計謝章升PLS偏最小平方法免費軟體-SmartPLS論壇申請帳號流程-三星統計謝章升
PLS偏最小平方法免費軟體-SmartPLS論壇申請帳號流程-三星統計謝章升Beckett Hsieh
 
Normal Probabilty Distribution and its Problems
Normal Probabilty Distribution and its ProblemsNormal Probabilty Distribution and its Problems
Normal Probabilty Distribution and its ProblemsAli Raza
 
2.4 Scatterplots, correlation, and regression
2.4 Scatterplots, correlation, and regression2.4 Scatterplots, correlation, and regression
2.4 Scatterplots, correlation, and regressionLong Beach City College
 

What's hot (20)

書籍「計算社会科学入門」第9章 統計モデリング
書籍「計算社会科学入門」第9章 統計モデリング書籍「計算社会科学入門」第9章 統計モデリング
書籍「計算社会科学入門」第9章 統計モデリング
 
東京都市大学 データ解析入門 7 回帰分析とモデル選択 2
東京都市大学 データ解析入門 7 回帰分析とモデル選択 2東京都市大学 データ解析入門 7 回帰分析とモデル選択 2
東京都市大学 データ解析入門 7 回帰分析とモデル選択 2
 
Negative binomial distribution
Negative binomial distributionNegative binomial distribution
Negative binomial distribution
 
Regression and corelation (Biostatistics)
Regression and corelation (Biostatistics)Regression and corelation (Biostatistics)
Regression and corelation (Biostatistics)
 
T12 non-parametric tests
T12 non-parametric testsT12 non-parametric tests
T12 non-parametric tests
 
{tidygraph}と{ggraph}によるモダンなネットワーク分析
{tidygraph}と{ggraph}によるモダンなネットワーク分析{tidygraph}と{ggraph}によるモダンなネットワーク分析
{tidygraph}と{ggraph}によるモダンなネットワーク分析
 
How to use in R model-agnostic data explanation with DALEX & iml
How to use in R model-agnostic data explanation with DALEX & imlHow to use in R model-agnostic data explanation with DALEX & iml
How to use in R model-agnostic data explanation with DALEX & iml
 
Chi square test final
Chi square test finalChi square test final
Chi square test final
 
Multiple regression
Multiple regressionMultiple regression
Multiple regression
 
みどりぼん3章前半
みどりぼん3章前半みどりぼん3章前半
みどりぼん3章前半
 
臨床家が知っておくべき臨床疫学・統計
臨床家が知っておくべき臨床疫学・統計臨床家が知っておくべき臨床疫学・統計
臨床家が知っておくべき臨床疫学・統計
 
2. adf tests
2. adf tests2. adf tests
2. adf tests
 
Correlation analysis
Correlation analysisCorrelation analysis
Correlation analysis
 
混合モデルを使って反復測定分散分析をする
混合モデルを使って反復測定分散分析をする混合モデルを使って反復測定分散分析をする
混合モデルを使って反復測定分散分析をする
 
状態空間モデルの実行方法と実行環境の比較
状態空間モデルの実行方法と実行環境の比較状態空間モデルの実行方法と実行環境の比較
状態空間モデルの実行方法と実行環境の比較
 
Pearson Correlation, Spearman Correlation &Linear Regression
Pearson Correlation, Spearman Correlation &Linear RegressionPearson Correlation, Spearman Correlation &Linear Regression
Pearson Correlation, Spearman Correlation &Linear Regression
 
PLS偏最小平方法免費軟體-SmartPLS論壇申請帳號流程-三星統計謝章升
PLS偏最小平方法免費軟體-SmartPLS論壇申請帳號流程-三星統計謝章升PLS偏最小平方法免費軟體-SmartPLS論壇申請帳號流程-三星統計謝章升
PLS偏最小平方法免費軟體-SmartPLS論壇申請帳號流程-三星統計謝章升
 
Normal Probabilty Distribution and its Problems
Normal Probabilty Distribution and its ProblemsNormal Probabilty Distribution and its Problems
Normal Probabilty Distribution and its Problems
 
P value part 1
P value part 1P value part 1
P value part 1
 
2.4 Scatterplots, correlation, and regression
2.4 Scatterplots, correlation, and regression2.4 Scatterplots, correlation, and regression
2.4 Scatterplots, correlation, and regression
 

Viewers also liked

Dependent v. independent variables
Dependent v. independent variablesDependent v. independent variables
Dependent v. independent variablesTarun Gehlot
 
Using Spss Compute (Another Method)
Using Spss   Compute (Another Method)Using Spss   Compute (Another Method)
Using Spss Compute (Another Method)Dr Ali Yusob Md Zain
 
Graphing inverse functions
Graphing inverse functionsGraphing inverse functions
Graphing inverse functionsTarun Gehlot
 
Interpolation functions
Interpolation functionsInterpolation functions
Interpolation functionsTarun Gehlot
 
Finite elements for 2‐d problems
Finite elements  for 2‐d problemsFinite elements  for 2‐d problems
Finite elements for 2‐d problemsTarun Gehlot
 
Recode Age Variable in SPSS
Recode  Age  Variable in SPSSRecode  Age  Variable in SPSS
Recode Age Variable in SPSSRuth M Tappin
 
Mapping the sphere
Mapping the sphereMapping the sphere
Mapping the sphereTarun Gehlot
 
basic concepts of Functions
basic concepts of Functionsbasic concepts of Functions
basic concepts of FunctionsTarun Gehlot
 
Theories of change and logic models
Theories of change and logic modelsTheories of change and logic models
Theories of change and logic modelsTarun Gehlot
 
Numerical conformal mapping of an irregular area
Numerical conformal mapping of an irregular areaNumerical conformal mapping of an irregular area
Numerical conformal mapping of an irregular areaTarun Gehlot
 
Get pampered retreat in goa - ajit patel sanda wellbeing and Wemet
Get pampered retreat in goa  - ajit patel sanda wellbeing and WemetGet pampered retreat in goa  - ajit patel sanda wellbeing and Wemet
Get pampered retreat in goa - ajit patel sanda wellbeing and WemetAjitPatelsandawellness
 
Introduction to matlab
Introduction to matlabIntroduction to matlab
Introduction to matlabTarun Gehlot
 
Android起動周りのノウハウ
Android起動周りのノウハウAndroid起動周りのノウハウ
Android起動周りのノウハウchancelab
 
The beauty of mathematics
The beauty of mathematicsThe beauty of mathematics
The beauty of mathematicsTarun Gehlot
 
Modeling Transformations
Modeling TransformationsModeling Transformations
Modeling TransformationsTarun Gehlot
 
Numerical differentiation integration
Numerical differentiation integrationNumerical differentiation integration
Numerical differentiation integrationTarun Gehlot
 
Local linear approximation
Local linear approximationLocal linear approximation
Local linear approximationTarun Gehlot
 
Finite elements : basis functions
Finite elements : basis functionsFinite elements : basis functions
Finite elements : basis functionsTarun Gehlot
 
Numerical integration
Numerical integrationNumerical integration
Numerical integrationTarun Gehlot
 
The New Era of (Non-) Discoverability
The New Era of (Non-) DiscoverabilityThe New Era of (Non-) Discoverability
The New Era of (Non-) DiscoverabilityDan Saffer
 

Viewers also liked (20)

Dependent v. independent variables
Dependent v. independent variablesDependent v. independent variables
Dependent v. independent variables
 
Using Spss Compute (Another Method)
Using Spss   Compute (Another Method)Using Spss   Compute (Another Method)
Using Spss Compute (Another Method)
 
Graphing inverse functions
Graphing inverse functionsGraphing inverse functions
Graphing inverse functions
 
Interpolation functions
Interpolation functionsInterpolation functions
Interpolation functions
 
Finite elements for 2‐d problems
Finite elements  for 2‐d problemsFinite elements  for 2‐d problems
Finite elements for 2‐d problems
 
Recode Age Variable in SPSS
Recode  Age  Variable in SPSSRecode  Age  Variable in SPSS
Recode Age Variable in SPSS
 
Mapping the sphere
Mapping the sphereMapping the sphere
Mapping the sphere
 
basic concepts of Functions
basic concepts of Functionsbasic concepts of Functions
basic concepts of Functions
 
Theories of change and logic models
Theories of change and logic modelsTheories of change and logic models
Theories of change and logic models
 
Numerical conformal mapping of an irregular area
Numerical conformal mapping of an irregular areaNumerical conformal mapping of an irregular area
Numerical conformal mapping of an irregular area
 
Get pampered retreat in goa - ajit patel sanda wellbeing and Wemet
Get pampered retreat in goa  - ajit patel sanda wellbeing and WemetGet pampered retreat in goa  - ajit patel sanda wellbeing and Wemet
Get pampered retreat in goa - ajit patel sanda wellbeing and Wemet
 
Introduction to matlab
Introduction to matlabIntroduction to matlab
Introduction to matlab
 
Android起動周りのノウハウ
Android起動周りのノウハウAndroid起動周りのノウハウ
Android起動周りのノウハウ
 
The beauty of mathematics
The beauty of mathematicsThe beauty of mathematics
The beauty of mathematics
 
Modeling Transformations
Modeling TransformationsModeling Transformations
Modeling Transformations
 
Numerical differentiation integration
Numerical differentiation integrationNumerical differentiation integration
Numerical differentiation integration
 
Local linear approximation
Local linear approximationLocal linear approximation
Local linear approximation
 
Finite elements : basis functions
Finite elements : basis functionsFinite elements : basis functions
Finite elements : basis functions
 
Numerical integration
Numerical integrationNumerical integration
Numerical integration
 
The New Era of (Non-) Discoverability
The New Era of (Non-) DiscoverabilityThe New Era of (Non-) Discoverability
The New Era of (Non-) Discoverability
 

Similar to Computing transformations

Computing Transformations Spring2005
Computing Transformations Spring2005Computing Transformations Spring2005
Computing Transformations Spring2005guest5989655
 
Computingtransformations Spring2005
Computingtransformations Spring2005Computingtransformations Spring2005
Computingtransformations Spring20051Leu
 
Ali, Redescending M-estimator
Ali, Redescending M-estimator Ali, Redescending M-estimator
Ali, Redescending M-estimator Muhammad Ali
 
Linear Regression
Linear Regression Linear Regression
Linear Regression Rupak Roy
 
Monte carlo-simulation
Monte carlo-simulationMonte carlo-simulation
Monte carlo-simulationjaimarbustos
 
Topic17 regression spss
Topic17 regression spssTopic17 regression spss
Topic17 regression spssSizwan Ahammed
 
Theory of uncertainty of measurement.pdf
Theory of uncertainty of measurement.pdfTheory of uncertainty of measurement.pdf
Theory of uncertainty of measurement.pdfHnCngnh1
 
Adjusting PageRank parameters and comparing results : REPORT
Adjusting PageRank parameters and comparing results : REPORTAdjusting PageRank parameters and comparing results : REPORT
Adjusting PageRank parameters and comparing results : REPORTSubhajit Sahu
 
Multimedia lossy compression algorithms
Multimedia lossy compression algorithmsMultimedia lossy compression algorithms
Multimedia lossy compression algorithmsMazin Alwaaly
 
Regression with Time Series Data
Regression with Time Series DataRegression with Time Series Data
Regression with Time Series DataRizano Ahdiat R
 
Temporal disaggregation methods
Temporal disaggregation methodsTemporal disaggregation methods
Temporal disaggregation methodsStephen Bradley
 
Why Study Statistics Arunesh Chand Mankotia 2004
Why Study Statistics   Arunesh Chand Mankotia 2004Why Study Statistics   Arunesh Chand Mankotia 2004
Why Study Statistics Arunesh Chand Mankotia 2004Consultonmic
 
SAMPLING MEAN DEFINITION The term sampling mean .docx
SAMPLING MEAN DEFINITION The term sampling mean .docxSAMPLING MEAN DEFINITION The term sampling mean .docx
SAMPLING MEAN DEFINITION The term sampling mean .docxanhlodge
 
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920Karl Rudeen
 
Numerical approximation
Numerical approximationNumerical approximation
Numerical approximationMileacre
 

Similar to Computing transformations (20)

Computing Transformations Spring2005
Computing Transformations Spring2005Computing Transformations Spring2005
Computing Transformations Spring2005
 
Computingtransformations Spring2005
Computingtransformations Spring2005Computingtransformations Spring2005
Computingtransformations Spring2005
 
Ali, Redescending M-estimator
Ali, Redescending M-estimator Ali, Redescending M-estimator
Ali, Redescending M-estimator
 
Transformation of variables
Transformation of variablesTransformation of variables
Transformation of variables
 
Linear Regression
Linear Regression Linear Regression
Linear Regression
 
Monte carlo-simulation
Monte carlo-simulationMonte carlo-simulation
Monte carlo-simulation
 
Topic17 regression spss
Topic17 regression spssTopic17 regression spss
Topic17 regression spss
 
Theory of uncertainty of measurement.pdf
Theory of uncertainty of measurement.pdfTheory of uncertainty of measurement.pdf
Theory of uncertainty of measurement.pdf
 
Adjusting PageRank parameters and comparing results : REPORT
Adjusting PageRank parameters and comparing results : REPORTAdjusting PageRank parameters and comparing results : REPORT
Adjusting PageRank parameters and comparing results : REPORT
 
Multimedia lossy compression algorithms
Multimedia lossy compression algorithmsMultimedia lossy compression algorithms
Multimedia lossy compression algorithms
 
Regression with Time Series Data
Regression with Time Series DataRegression with Time Series Data
Regression with Time Series Data
 
Data Analysis Assignment Help
Data Analysis Assignment HelpData Analysis Assignment Help
Data Analysis Assignment Help
 
Temporal disaggregation methods
Temporal disaggregation methodsTemporal disaggregation methods
Temporal disaggregation methods
 
Why Study Statistics Arunesh Chand Mankotia 2004
Why Study Statistics   Arunesh Chand Mankotia 2004Why Study Statistics   Arunesh Chand Mankotia 2004
Why Study Statistics Arunesh Chand Mankotia 2004
 
Inorganic CHEMISTRY
Inorganic CHEMISTRYInorganic CHEMISTRY
Inorganic CHEMISTRY
 
Central tendency
Central tendencyCentral tendency
Central tendency
 
SAMPLING MEAN DEFINITION The term sampling mean .docx
SAMPLING MEAN DEFINITION The term sampling mean .docxSAMPLING MEAN DEFINITION The term sampling mean .docx
SAMPLING MEAN DEFINITION The term sampling mean .docx
 
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
 
Project Paper
Project PaperProject Paper
Project Paper
 
Numerical approximation
Numerical approximationNumerical approximation
Numerical approximation
 

More from Tarun Gehlot

Materials 11-01228
Materials 11-01228Materials 11-01228
Materials 11-01228Tarun Gehlot
 
Continuity and end_behavior
Continuity and  end_behaviorContinuity and  end_behavior
Continuity and end_behaviorTarun Gehlot
 
Continuity of functions by graph (exercises with detailed solutions)
Continuity of functions by graph   (exercises with detailed solutions)Continuity of functions by graph   (exercises with detailed solutions)
Continuity of functions by graph (exercises with detailed solutions)Tarun Gehlot
 
Factoring by the trial and-error method
Factoring by the trial and-error methodFactoring by the trial and-error method
Factoring by the trial and-error methodTarun Gehlot
 
Introduction to finite element analysis
Introduction to finite element analysisIntroduction to finite element analysis
Introduction to finite element analysisTarun Gehlot
 
Error analysis statistics
Error analysis   statisticsError analysis   statistics
Error analysis statisticsTarun Gehlot
 
Linear approximations and_differentials
Linear approximations and_differentialsLinear approximations and_differentials
Linear approximations and_differentialsTarun Gehlot
 
Propeties of-triangles
Propeties of-trianglesPropeties of-triangles
Propeties of-trianglesTarun Gehlot
 
Gaussian quadratures
Gaussian quadraturesGaussian quadratures
Gaussian quadraturesTarun Gehlot
 
Basics of set theory
Basics of set theoryBasics of set theory
Basics of set theoryTarun Gehlot
 
Applications of set theory
Applications of  set theoryApplications of  set theory
Applications of set theoryTarun Gehlot
 
Miscellneous functions
Miscellneous  functionsMiscellneous  functions
Miscellneous functionsTarun Gehlot
 
Intervals of validity
Intervals of validityIntervals of validity
Intervals of validityTarun Gehlot
 
Modelling with first order differential equations
Modelling with first order differential equationsModelling with first order differential equations
Modelling with first order differential equationsTarun Gehlot
 
Review taylor series
Review taylor seriesReview taylor series
Review taylor seriesTarun Gehlot
 
Review power series
Review power seriesReview power series
Review power seriesTarun Gehlot
 
Convergence Criteria
Convergence CriteriaConvergence Criteria
Convergence CriteriaTarun Gehlot
 
Applications of numerical methods
Applications of numerical methodsApplications of numerical methods
Applications of numerical methodsTarun Gehlot
 

More from Tarun Gehlot (20)

Materials 11-01228
Materials 11-01228Materials 11-01228
Materials 11-01228
 
Binary relations
Binary relationsBinary relations
Binary relations
 
Continuity and end_behavior
Continuity and  end_behaviorContinuity and  end_behavior
Continuity and end_behavior
 
Continuity of functions by graph (exercises with detailed solutions)
Continuity of functions by graph   (exercises with detailed solutions)Continuity of functions by graph   (exercises with detailed solutions)
Continuity of functions by graph (exercises with detailed solutions)
 
Factoring by the trial and-error method
Factoring by the trial and-error methodFactoring by the trial and-error method
Factoring by the trial and-error method
 
Introduction to finite element analysis
Introduction to finite element analysisIntroduction to finite element analysis
Introduction to finite element analysis
 
Error analysis statistics
Error analysis   statisticsError analysis   statistics
Error analysis statistics
 
Matlab commands
Matlab commandsMatlab commands
Matlab commands
 
Linear approximations and_differentials
Linear approximations and_differentialsLinear approximations and_differentials
Linear approximations and_differentials
 
Propeties of-triangles
Propeties of-trianglesPropeties of-triangles
Propeties of-triangles
 
Gaussian quadratures
Gaussian quadraturesGaussian quadratures
Gaussian quadratures
 
Basics of set theory
Basics of set theoryBasics of set theory
Basics of set theory
 
Applications of set theory
Applications of  set theoryApplications of  set theory
Applications of set theory
 
Miscellneous functions
Miscellneous  functionsMiscellneous  functions
Miscellneous functions
 
Intervals of validity
Intervals of validityIntervals of validity
Intervals of validity
 
Modelling with first order differential equations
Modelling with first order differential equationsModelling with first order differential equations
Modelling with first order differential equations
 
Review taylor series
Review taylor seriesReview taylor series
Review taylor series
 
Review power series
Review power seriesReview power series
Review power series
 
Convergence Criteria
Convergence CriteriaConvergence Criteria
Convergence Criteria
 
Applications of numerical methods
Applications of numerical methodsApplications of numerical methods
Applications of numerical methods
 

Computing transformations

  • 1. Computing Transformations BY TARUNGEHLOT Transforming variables Transformations for normality Transformations for linearity
  • 2. Transforming variables to satisfy assumptions  When a metric variable fails to satisfy the assumption of normality, homogeneity of variance, or linearity, we may be able to correct the deficiency by using a transformation.  We will consider three transformations for normality, homogeneity of variance, and linearity:  the logarithmic transformation  the square root transformation, and  the inverse transformation  plus a fourth that is useful for problems of linearity:  the square transformation
  • 3. Computing transformations in SPSS  In SPSS, transformations are obtained by computing a new variable. SPSS functions are available for the logarithmic (LG10) and square root (SQRT) transformations. The inverse transformation uses a formula which divides one by the original value for each case.  For each of these calculations, there may be data values which are not mathematically permissible. For example, the log of zero is not defined mathematically, division by zero is not permitted, and the square root of a negative number results in an “imaginary” value. We will usually adjust the values passed to the function to make certain that these illegal operations do not occur.
  • 4. Two forms for computing transformations  There are two forms for each of the transformations to induce normality, depending on whether the distribution is skewed negatively to the left or skewed positively to the right.  Both forms use the same SPSS functions and formula to calculate the transformations.  The two forms differ in the value or argument passed to the functions and formula. The argument to the functions is an adjustment to the original value of the variable to make certain that all of the calculations are mathematically correct.
  • 5. Functions and formulas for transformations  Symbolically, if we let x stand for the argument passes to the function or formula, the calculations for the transformations are:  Logarithmic transformation: compute log = LG10(x)  Square root transformation: compute sqrt = SQRT(x)  Inverse transformation: compute inv = 1 / (x)  Square transformation: compute s2 = x * x  For all transformations, the argument must be greater than zero to guarantee that the calculations are mathematically legitimate.
  • 6. Transformation of positively skewed variables  For positively skewed variables, the argument is an adjustment to the original value based on the minimum value for the variable.  If the minimum value for a variable is zero, the adjustment requires that we add one to each value, e.g. x + 1.  If the minimum value for a variable is a negative number (e.g., –6), the adjustment requires that we add the absolute value of the minimum value (e.g. 6) plus one (e.g. x + 6 + 1, which equals x +7).
  • 7. Example of positively skewed variable  Suppose our dataset contains the number of books read (books) for 5 subjects: 1, 3, 0, 5, and 2, and the distribution is positively skewed.  The minimum value for the variable books is 0. The adjustment for each case is books + 1.  The transformations would be calculated as follows:  Compute logBooks = LG10(books + 1)  Compute sqrBooks = SQRT(books + 1)  Compute invBooks = 1 / (books + 1)
  • 8. Transformation of negatively skewed variables  If the distribution of a variable is negatively skewed, the adjustment of the values reverses, or reflects, the distribution so that it becomes positively skewed. The transformations are then computed on the values in the positively skewed distribution.  Reflection is computed by subtracting all of the values for a variable from one plus the absolute value of maximum value for the variable. This results in a positively skewed distribution with all values larger than zero.
  • 9. Example of negatively skewed variable  Suppose our dataset contains the number of books read (books) for 5 subjects: 1, 3, 0, 5, and 2, and the distribution is negatively skewed.  The maximum value for the variable books is 5. The adjustment for each case is 6 - books.  The transformations would be calculated as follows:  Compute logBooks = LG10(6 - books)  Compute sqrBooks = SQRT(6 - books)  Compute invBooks = 1 / (6 - books)
  • 10. The Square Transformation for Linearity  The square transformation is computed by multiplying the value for the variable by itself.  It does not matter whether the distribution is positively or negatively skewed.  It does matter if the variable has negative values, since we would not be able to distinguish their squares from the square of a comparable positive value (e.g. the square of -4 is equal to the square of +4). If the variable has negative values, we add the absolute value of the minimum value to each score before squaring it.
  • 11. Example of the square transformation  Suppose our dataset contains change scores (chg) for 5 subjects that indicate the difference between test scores at the end of a semester and test scores at mid-term: -10, 0, 10, 20, and 30.  The minimum score is -10. The absolute value of the minimum score is 10.  The transformation would be calculated as follows:  Compute squarChg = (chg + 10) * (chg + 10)
  • 12. Transformations for normality Both the histogram and the normality plot for Total Time Spent on the Internet (netime) indicate that the variable is not normally distributed. Histogram 50 Normal Q-Q Plot of TOTAL TIME SPENT ON THE IN 3 40 2 1 30 0 20 Expected Normal -1 Frequency 10 Std. Dev = 15.35 -2 Mean = 10.7 N = 93.00 -3 0 -40 -20 0 20 40 60 80 100 120 0.0 20.0 40.0 60.0 80.0 100.0 10.0 30.0 50.0 70.0 90.0 Observed Value TOTAL TIME SPENT ON THE INTERNET
  • 13. Determine whether reflection is required Descriptives Stat istic Std. Error TOTAL TIME SPENT Mean 10.73 1.59 ON THE INTERNET 95% Confidence Lower Bound 7.57 Interval for Mean Upper Bound 13.89 5% Trimmed Mean 8.29 Median 5.50 Variance 235. 655 Std. Deviation 15.35 Minimum 0 Maximum 102 Range 102 Interquartile Range 10.20 Skewness 3.532 .250 Kurt osis 15.614 .495 Skewness, in the table of Descriptive Statistics, indicates whether or not reflection (reversing the values) is required in the transformation. If Skewness is positive, as it is in this problem, reflection is not required. If Skewness is negative, reflection is required.
  • 14. Compute the adjustment to the argument Descriptives Stat istic Std. Error TOTAL TIME SPENT Mean 10.73 1.59 ON THE INTERNET 95% Confidence Lower Bound 7.57 Interval for Mean Upper Bound 13.89 5% Trimmed Mean 8.29 Median 5.50 Variance 235. 655 Std. Deviation 15.35 Minimum 0 Maximum 102 Range 102 Interquartile Range 10.20 Skewness 3.532 .250 Kurt osis 15.614 .495 In this problem, the minimum value is 0, so 1 will be added to each value in the formula, i.e. the argument to the SPSS functions and formula for the inverse will be: netime + 1.
  • 15. Computing the logarithmic transformation To compute the transformation, select the Compute… command from the Transform menu.
  • 16. Specifying the transform variable name and function First, in the Target Variable text box, type a name for the log transformation variable, e.g. “lgnetime“. Third, click on the up arrow button to move the highlighted function to Second, scroll down the list of functions to the Numeric find LG10, which calculates logarithmic Expression values use a base of 10. (The logarithmic text box. values are the power to which 10 is raised to produce the original number.)
  • 17. Adding the variable name to the function Second, click on the right arrow button. SPSS will replace the highlighted text in the function (?) with the name of the variable. First, scroll down the list of variables to locate the variable we want to transform. Click on its name so that it is highlighted.
  • 18. Adding the constant to the function Following the rules stated for determining the constant that needs to be included in the function either to prevent mathematical errors, or to do reflection, we include the constant in the function argument. In this case, we add 1 to the netime variable. Click on the OK button to complete the compute request.
  • 19. The transformed variable The transformed variable which we requested SPSS compute is shown in the data editor in a column to the right of the other variables in the dataset.
  • 20. Computing the square root transformation To compute the transformation, select the Compute… command from the Transform menu.
  • 21. Specifying the transform variable name and function First, in the Target Variable text box, type a name for the square root transformation variable, e.g. “sqnetime“. Third, click on the up arrow button to move the highlighted function to the Numeric Second, scroll down the list of functions to Expression find SQRT, which calculates the square root text box. of a variable.
  • 22. Adding the variable name to the function Second, click on the right arrow button. SPSS will replace the highlighted text in the function (?) with the name of the variable. First, scroll down the list of variables to locate the variable we want to transform. Click on its name so that it is highlighted.
  • 23. Adding the constant to the function Following the rules stated for determining the constant that needs to be included in the function either to prevent mathematical errors, or to do reflection, we include the constant in the function argument. In this case, we add 1 to the netime variable. Click on the OK button to complete the compute request.
  • 24. The transformed variable The transformed variable which we requested SPSS compute is shown in the data editor in a column to the right of the other variables in the dataset.
  • 25. Computing the inverse transformation To compute the transformation, select the Compute… command from the Transform menu.
  • 26. Specifying the transform variable name and formula First, in the Target Variable text box, type a Second, there is not a function for name for the inverse computing the inverse, so we type transformation variable, the formula directly into the e.g. “innetime“. Numeric Expression text box. Third, click on the OK button to complete the compute request.
  • 27. The transformed variable The transformed variable which we requested SPSS compute is shown in the data editor in a column to the right of the other variables in the dataset.
  • 28. Adjustment to the argument for the square transformation It is mathematically correct to square a value of zero, so the adjustment to the argument for the square transformation is different. What we need to avoid are negative numbers, since the square of a negative number produces the same value as the square of a positive number. Descriptives Stat istic Std. Error TOTAL TIME SPENT Mean 10.73 1.59 ON THE INTERNET 95% Confidence Lower Bound 7.57 Interval for Mean Upper Bound 13.89 5% Trimmed Mean 8.29 Median 5.50 Variance 235. 655 Std. Deviation 15.35 Minimum 0 Maximum 102 Range 102 Interquartile Range 10.20 In this problem, the minimum value is 0, no adjustment Skewness 3.532 .250 is needed for computing the square. If the minimum Kurt osis 15.614 .495 was a number less than zero, we would add the absolute value of the minimum (dropping the sign) as an adjustment to the variable.
  • 29. Computing the square transformation To compute the transformation, select the Compute… command from the Transform menu.
  • 30. Specifying the transform variable name and formula First, in the Target Variable text box, type a Second, there is not a function for name for the inverse computing the square, so we type transformation the formula directly into the variable, e.g. Numeric Expression text box. “s2netime“. Third, click on the OK button to complete the compute request.
  • 31. The transformed variable The transformed variable which we requested SPSS compute is shown in the data editor in a column to the right of the other variables in the dataset.