MANOVA

Multivariate Analysis of Variance
One way Analysis of Variance
        (ANOVA)
     Comparing k Populations
The F test – for comparing k means
Situation
• We have k normal populations
• Let µi and σ denote the mean and standard
  deviation of population i.
• i = 1, 2, 3, … k.
• Note: we assume that the standard deviation
  for each population is the same.
  σ1 = σ2 = … = σk = σ
We want to test
$$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$$
against
$$H_A: \mu_i \neq \mu_j \text{ for at least one pair } i, j$$
The F statistic
$$F = \frac{\dfrac{1}{k-1}\displaystyle\sum_{i=1}^{k} n_i\left(\bar{x}_i - \bar{x}\right)^2}{\dfrac{1}{N-k}\displaystyle\sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(x_{ij} - \bar{x}_i\right)^2}$$
where $x_{ij}$ = the $j$th observation in the $i$th sample ($i = 1, 2, \ldots, k$ and $j = 1, 2, \ldots, n_i$),
$$\bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij} = \text{mean for the } i\text{th sample } (i = 1, 2, \ldots, k),$$
$$N = \sum_{i=1}^{k} n_i = \text{Total sample size}, \qquad \bar{x} = \frac{1}{N}\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij} = \text{Overall mean}$$
The ANOVA table

Source     S.S.     d.f.     M.S.                      F
Between    SS_B     k - 1    MS_B = SS_B / (k - 1)     F = MS_B / MS_W
Within     SS_W     N - k    MS_W = SS_W / (N - k)

where
$$SS_B = \sum_{i=1}^{k} n_i\left(\bar{x}_i - \bar{x}\right)^2, \qquad SS_W = \sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(x_{ij} - \bar{x}_i\right)^2$$

 The ANOVA table is a tool for displaying the
 computations for the F test. It is very important when
 the Between Sample variability is due to two or more
 factors.
Computing Formulae:
Compute
$$1) \quad T_i = \sum_{j=1}^{n_i} x_{ij} = \text{Total for sample } i$$
$$2) \quad G = \sum_{i=1}^{k} T_i = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij} = \text{Grand Total}$$
$$3) \quad N = \sum_{i=1}^{k} n_i = \text{Total sample size}$$
$$4) \quad \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2$$
$$5) \quad \sum_{i=1}^{k} \frac{T_i^2}{n_i}$$
The data
• Assume we have collected data from each of
  k populations
• Let xi1, xi2 , xi3 , … denote the ni observations
  from population i.
• i = 1, 2, 3, … k.
Then
$$1) \quad SS_{Between} = \sum_{i=1}^{k} \frac{T_i^2}{n_i} - \frac{G^2}{N}$$
$$2) \quad SS_{Within} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \sum_{i=1}^{k} \frac{T_i^2}{n_i}$$
$$3) \quad F = \frac{SS_{Between}/(k-1)}{SS_{Within}/(N-k)}$$
Anova Table
 Source   d.f.    Sum of      Mean         F-ratio
                  Squares     Square
Between   k-1     SSBetween   MSBetween   MSB /MSW
Within    N-k     SSWithin    MSWithin
Total     N-1      SSTotal

MS = SS / df
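
These computing formulae translate directly into code. Below is a minimal Python/NumPy sketch (the function name `one_way_anova` is mine, not from the slides) that returns F and its p-value from a list of samples; applied to the six diet samples in the example that follows, it should reproduce F ≈ 4.3.

```python
import numpy as np
from scipy import stats

def one_way_anova(groups):
    """One-way ANOVA via the totals-based computing formulae.

    groups : list of 1-D arrays, one per population sample.
    Returns (F, df_between, df_within, p_value).
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)
    n = np.array([g.size for g in groups])           # n_i
    T = np.array([g.sum() for g in groups])          # T_i = totals per sample
    G = T.sum()                                       # grand total
    N = n.sum()                                       # total sample size
    sum_sq = sum((g ** 2).sum() for g in groups)      # sum of all x_ij^2

    ss_between = (T ** 2 / n).sum() - G ** 2 / N
    ss_within = sum_sq - (T ** 2 / n).sum()
    F = (ss_between / (k - 1)) / (ss_within / (N - k))
    p = stats.f.sf(F, k - 1, N - k)                   # upper-tail p-value
    return F, k - 1, N - k, p
```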
Example
In the following example we are comparing weight
    gains resulting from the following six diets
1. Diet 1 - High Protein , Beef
2. Diet 2 - High Protein , Cereal
3. Diet 3 - High Protein , Pork
4. Diet 4 - Low protein , Beef
5. Diet 5 - Low protein , Cereal
6. Diet 6 - Low protein , Pork
Gains in weight (grams) for rats under six diets
                  differing in level of protein (High or Low)
                 and source of protein (Beef, Cereal, or Pork)
Diet           1        2         3          4        5          6
              73       98        94         90      107         49
             102       74        79         76       95         82
             118       56        96         90       97         73
             104      111        98         64       80         86
              81       95       102         86       98         81
             107       88       102         51       74         97
             100       82       108         72       74        106
              87       77        91         90       67         70
             117       86       120         95       89         61
             111       92       105         78       58         82
Mean         100.0     85.9      99.5       79.2     83.9       78.7
Std. Dev.     15.14    15.02     10.92      13.89    15.71      16.55
Σx           1000       859       995       792      839        787
Σx2        102062     75819    100075     64462    72613      64401
Thus
$$SS_{Between} = \sum_{i=1}^{k}\frac{T_i^2}{n_i} - \frac{G^2}{N} = 467846 - \frac{5272^2}{60} = 4612.933$$
$$SS_{Within} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \sum_{i=1}^{k}\frac{T_i^2}{n_i} = 479432 - 467846 = 11586$$
$$F = \frac{SS_{Between}/(k-1)}{SS_{Within}/(N-k)} = \frac{4612.933/5}{11586/54} = \frac{922.6}{214.56} = 4.3$$
$$F_{0.05} = 2.386 \text{ with } \nu_1 = 5 \text{ and } \nu_2 = 54$$
Thus since F > 2.386 we reject H0.
Anova Table
 Source   d.f.    Sum of         Mean          F-ratio
                  Squares        Square
Between    5      4612.933        922.587       4.3**
                                               (p = 0.0023)
Within    54     11586.000        214.556
Total     59     16198.933

          *  - Significant at 0.05 (not 0.01)
          ** - Significant at 0.01
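
The critical value and p-value quoted in the table can be checked against SciPy's F distribution. This is only a quick verification sketch; 2.386 and 0.0023 come from the slides, so expect rounding-level differences.

```python
from scipy import stats

F = 922.587 / 214.556               # observed F-ratio from the ANOVA table
crit = stats.f.ppf(0.95, 5, 54)     # F_0.05 with nu1 = 5, nu2 = 54  (about 2.386)
p = stats.f.sf(F, 5, 54)            # upper-tail p-value             (about 0.0023)
print(F, crit, p)
```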
Equivalence of the F-test and the t-test when k = 2

the t-test
$$t = \frac{\bar{x} - \bar{y}}{s_{Pooled}\sqrt{\dfrac{1}{n} + \dfrac{1}{m}}}, \qquad s_{Pooled}^2 = \frac{(n-1)s_x^2 + (m-1)s_y^2}{n + m - 2}$$

the F-test
$$F = \frac{s_{Between}^2}{s_{Pooled}^2} = \frac{\displaystyle\sum_{i=1}^{k} n_i\left(\bar{x}_i - \bar{x}\right)^2 \Big/ (k-1)}{\displaystyle\sum_{i=1}^{k} (n_i - 1)s_i^2 \Big/ \left(\displaystyle\sum_{i=1}^{k} n_i - k\right)}$$
$$= \frac{n_1\left(\bar{x}_1 - \bar{x}\right)^2 + n_2\left(\bar{x}_2 - \bar{x}\right)^2}{\left[(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2\right] \Big/ (n_1 + n_2 - 2)}$$
Here the denominator is $s_{Pooled}^2$ and the numerator is $n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2$.
$$n_1\left(\bar{x}_1 - \bar{x}\right)^2 = n_1\left(\bar{x}_1 - \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}\right)^2 = \frac{n_1 n_2^2}{(n_1 + n_2)^2}\left(\bar{x}_1 - \bar{x}_2\right)^2$$
$$n_2\left(\bar{x}_2 - \bar{x}\right)^2 = n_2\left(\bar{x}_2 - \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}\right)^2 = \frac{n_1^2 n_2}{(n_1 + n_2)^2}\left(\bar{x}_1 - \bar{x}_2\right)^2$$
$$n_1\left(\bar{x}_1 - \bar{x}\right)^2 + n_2\left(\bar{x}_2 - \bar{x}\right)^2 = \frac{n_1 n_2^2 + n_1^2 n_2}{(n_1 + n_2)^2}\left(\bar{x}_1 - \bar{x}_2\right)^2 = \frac{n_1 n_2}{n_1 + n_2}\left(\bar{x}_1 - \bar{x}_2\right)^2 = \frac{\left(\bar{x}_1 - \bar{x}_2\right)^2}{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$$
Hence
$$F = \frac{\left(\bar{x}_1 - \bar{x}_2\right)^2}{\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right) s_{Pooled}^2} = t^2$$
Factorial Experiments

   Analysis of Variance
• Dependent variable Y
• k Categorical independent variables A, B, C,
  … (the Factors)
• Let
  –   a = the number of categories of A
  –   b = the number of categories of B
  –   c = the number of categories of C
  –   etc.
The Completely Randomized Design

• We form the set of all treatment combinations
  – the set of all combinations of the k factors
• Total number of treatment combinations
  – t = abc….
• In the completely randomized design n
  experimental units (test animals, test plots,
  etc.) are randomly assigned to each treatment
  combination.
  – Total number of experimental units N = nt = nabc…
The treatment combinations can be thought of as being
arranged in a k-dimensional rectangular block.

[Diagram: for two factors, the treatment combinations form an a × b table, with the levels 1, 2, …, a of factor A as rows and the levels 1, 2, …, b of factor B as columns; adding a third factor C extends the table to an a × b × c rectangular block.]
• The Completely Randomized Design is called
  balanced.
• If the number of observations per treatment
  combination is unequal the design is called
  unbalanced (resulting in a mathematically more
  complex analysis and computations).
• If for some of the treatment combinations
  there are no observations the design is called
  incomplete. (In this case it may happen that
  some of the parameters - main effects and
  interactions - cannot be estimated.)
Example
In this example we are examining the effect of

The level of protein A (High or Low) and
the source of protein B (Beef, Cereal, or
Pork) on weight gains (grams) in rats.

We have n = 10 test animals randomly
assigned to k = 6 diets
The k = 6 diets are the 6 = 3×2 Level-Source
combinations
      1. High - Beef
      2. High - Cereal
      3. High - Pork
      4. Low - Beef
      5. Low - Cereal
       6. Low - Pork
Table
       Gains in weight (grams) for rats under six diets
       differing in level of protein (High or Low) and
       source of protein (Beef, Cereal, or Pork)

  Level
 of Protein High Protein       Low protein
Source
of Protein Beef Cereal Pork Beef Cereal Pork
    Diet     1    2     3     4    5       6
             73   98    94   90 107       49
            102   74    79   76    95     82
            118   56    96   90    97     73
            104 111     98   64    80     86
             81   95 102     86    98     81
            107   88 102     51    74     97
            100   82 108     72    74 106
             87   77    91   90    67     70
            117   86 120     95    89     61
            111   92 105     78    58     82
Mean       100.0 85.9 99.5 79.2 83.9 78.7
Std. Dev. 15.14 15.02 10.92 13.89 15.71 16.55
Treatment combinations

                          Source of Protein
                 Beef         Cereal          Pork

Level     High   Diet 1        Diet 2         Diet 3
of
Protein
          Low    Diet 4       Diet 5          Diet 6
Summary Table of Means

                   Source of Protein

Level of Protein Beef     Cereal   Pork Overall
      High       100.00   85.90    99.50 95.13
      Low         79.20   83.90    78.70 80.60
      Overall     89.60   84.90    89.10 87.87
Profiles of the response relative to a factor
    A graphical representation of the effect of a
    factor on a response variable (dependent variable)
Profile of Y for A
[Plot: Y plotted against the levels 1, 2, 3, …, a of factor A. The profile could be for an individual case or averaged over a group of cases, and for a specific level of another factor or averaged over the levels of another factor.]
Profiles of Weight Gain for Source and Level of Protein
[Plot: weight gain (70–110 g) against source of protein (Beef, Cereal, Pork), with one profile each for High Protein, Low Protein, and Overall.]
Profiles of Weight Gain for Source and Level of Protein
[Plot: weight gain (70–110 g) against level of protein (High Protein, Low Protein), with one profile each for Beef, Cereal, Pork, and Overall.]
Example – Four factor experiment
Four factors are studied for their effect on Y (luster
of paint film). The four factors are:

1)     Film Thickness - (1 or 2 mils)
2)     Drying conditions (Regular or Special)
3)     Length of wash (20, 30, 40 or 60 minutes), and
4)     Temperature of wash (92 ˚C or 100 ˚C)

Two observations of film luster (Y) are taken
for each treatment combination
The data is tabulated below:
                Regular Dry         Special Dry
   Minutes 92 °C       100 °C     92°C       100 °C
1-mil Thickness
       20     3.4 3.4 19.6 14.5   2.1   3.8   17.2    13.4
       30     4.1 4.1 17.5 17.0   4.0   4.6   13.5    14.3
       40     4.9 4.2 17.6 15.2   5.1   3.3   16.0    17.8
       60     5.0 4.9 20.9 17.1   8.3   4.3   17.5    13.9
2-mil Thickness
       20     5.5 3.7 26.6 29.5   4.5   4.5   25.6    22.5
       30     5.7 6.1 31.6 30.2   5.9   5.9   29.2    29.8
       40     5.5 5.6 30.5 30.2   5.5   5.8   32.6    27.4
       60     7.2 6.0 31.4 29.6   8.0   9.9   33.5    29.5
Definition:
A factor is said to not affect the response if
the profile of the factor is horizontal for all
combinations of levels of the other factors:
there is no change in the response when you
change the levels of the factor (and this holds
for all combinations of levels of the other factors).
Otherwise the factor is said to affect the
response:
Profile of Y for A – A affects the response
[Plot: Y against the levels 1, 2, 3, …, a of factor A, with one non-horizontal profile for each level of B.]
Profile of Y for A – A has no effect on the response
[Plot: Y against the levels 1, 2, 3, …, a of factor A, with one horizontal profile for each level of B.]
Definition:
• Two (or more) factors are said to interact if
  changes in the response when you change
  the level of one factor depend on the
  level(s) of the other factor(s).
• Profiles of the factor for different levels of
  the other factor(s) are not parallel
• Otherwise the factors are said to be
  additive.
• Profiles of the factor for different levels of
  the other factor(s) are parallel.
Interacting factors A and B
[Plot: Y against the levels 1, 2, 3, …, a of factor A, with non-parallel profiles for the different levels of B.]
Additive factors A and B
[Plot: Y against the levels 1, 2, 3, …, a of factor A, with parallel profiles for the different levels of B.]
• If two (or more) factors interact, each factor
  affects the response.
• If two (or more) factors are additive, it still
  remains to be determined whether the factors
  affect the response.
• In factorial experiments we are interested in
  determining
    – which factors affect the response, and
    – which groups of factors interact.
The testing in factorial experiments
1. Test first the higher order interactions.
2. If an interaction is present there is no need
   to test lower order interactions or main
   effects involving those factors. All factors
   in the interaction affect the response and
   they interact
3. The testing then continues with lower order
   interactions and main effects for factors
   which have not yet been determined to
   affect the response.
Models for factorial
  Experiments
The Single Factor Experiment
Situation
• We have t = a treatment combinations
• Let µi and σ denote the mean and standard
  deviation of observations from treatment i.
• i = 1, 2, 3, … a.
• Note: we assume that the standard deviation
  for each population is the same.
  σ1 = σ2 = … = σa = σ
The data
• Assume we have collected data for each of
  the a treatments
• Let yi1, yi2 , yi3 , … , yin denote the n observations
  for treatment i.
• i = 1, 2, 3, … a.
The model
Note:
$$y_{ij} = \mu_i + (y_{ij} - \mu_i) = \mu_i + \varepsilon_{ij} = \mu + (\mu_i - \mu) + \varepsilon_{ij} = \mu + \alpha_i + \varepsilon_{ij}$$
where $\varepsilon_{ij} = y_{ij} - \mu_i$ has a N(0, σ²) distribution,
$$\mu = \frac{1}{a}\sum_{i=1}^{a}\mu_i \quad \text{(overall mean effect)}, \qquad \alpha_i = \mu_i - \mu \quad \text{(effect of factor A)}$$
Note: $\sum_{i=1}^{a}\alpha_i = 0$ by their definition.
Model 1: $y_{ij}$ (i = 1, …, a; j = 1, …, n) are independent
  Normal with mean $\mu_i$ and variance $\sigma^2$.

Model 2: $y_{ij} = \mu_i + \varepsilon_{ij}$
 where $\varepsilon_{ij}$ (i = 1, …, a; j = 1, …, n) are independent
 Normal with mean 0 and variance $\sigma^2$.

Model 3: $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
 where $\varepsilon_{ij}$ (i = 1, …, a; j = 1, …, n) are independent
 Normal with mean 0 and variance $\sigma^2$ and
$$\sum_{i=1}^{a}\alpha_i = 0$$
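
A minimal simulation sketch of Model 3 (the parameter values below are made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

mu = 50.0                                   # overall mean
alpha = np.array([5.0, -2.0, -3.0])         # treatment effects, summing to zero
a, n, sigma = alpha.size, 10, 4.0

# y_ij = mu + alpha_i + eps_ij with eps_ij ~ N(0, sigma^2); shape (a, n)
y = mu + alpha[:, None] + rng.normal(0.0, sigma, size=(a, n))
print(y.mean(axis=1))                       # sample means estimate mu + alpha_i
```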
The Two Factor Experiment
Situation
• We have t = ab treatment combinations
• Let µij and σ denote the mean and standard
  deviation of observations from the
  treatment combination when A = i and B =
  j.
• i = 1, 2, 3, … a, j = 1, 2, 3, … b.
The data
• Assume we have collected data (n observations)
  for each of the t = ab treatment combinations.
• Let yij1, yij2 , yij3 , … , yijn denote the n observations for
  treatment combination - A = i, B = j.
• i = 1, 2, 3, … a, j = 1, 2, 3, … b.
The model
Note:
$$y_{ijk} = \mu_{ij} + (y_{ijk} - \mu_{ij}) = \mu_{ij} + \varepsilon_{ijk}$$
$$= \mu + (\mu_{i\bullet} - \mu) + (\mu_{\bullet j} - \mu) + (\mu_{ij} - \mu_{i\bullet} - \mu_{\bullet j} + \mu) + \varepsilon_{ijk}$$
$$= \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$
where $\varepsilon_{ijk} = y_{ijk} - \mu_{ij}$ has a N(0, σ²) distribution,
$$\mu = \frac{1}{ab}\sum_{i=1}^{a}\sum_{j=1}^{b}\mu_{ij}, \quad \mu_{i\bullet} = \frac{1}{b}\sum_{j=1}^{b}\mu_{ij}, \quad \mu_{\bullet j} = \frac{1}{a}\sum_{i=1}^{a}\mu_{ij},$$
$$\alpha_i = \mu_{i\bullet} - \mu, \quad \beta_j = \mu_{\bullet j} - \mu, \quad (\alpha\beta)_{ij} = \mu_{ij} - \mu_{i\bullet} - \mu_{\bullet j} + \mu$$
Note: $\sum_{i=1}^{a}\alpha_i = 0$ by their definition.
Model:
$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$
(mean + main effects + interaction effect + error)
where $\varepsilon_{ijk}$ (i = 1, …, a; j = 1, …, b; k = 1, …, n) are
independent Normal with mean 0 and variance σ², and
$$\sum_{i=1}^{a}\alpha_i = 0, \qquad \sum_{j=1}^{b}\beta_j = 0, \qquad \sum_{i=1}^{a}(\alpha\beta)_{ij} = \sum_{j=1}^{b}(\alpha\beta)_{ij} = 0$$
Maximum Likelihood Estimates

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$
where $\varepsilon_{ijk}$ (i = 1, …, a; j = 1, …, b; k = 1, …, n) are
independent Normal with mean 0 and variance σ².

$$\hat{\mu} = \bar{y}_{\bullet\bullet\bullet} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk} \Big/ abn$$
$$\hat{\alpha}_i = \bar{y}_{i\bullet\bullet} - \bar{y}_{\bullet\bullet\bullet} = \sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk} \Big/ bn - \bar{y}_{\bullet\bullet\bullet}$$
$$\hat{\beta}_j = \bar{y}_{\bullet j\bullet} - \bar{y}_{\bullet\bullet\bullet} = \sum_{i=1}^{a}\sum_{k=1}^{n} y_{ijk} \Big/ an - \bar{y}_{\bullet\bullet\bullet}$$
$$\widehat{(\alpha\beta)}_{ij} = \bar{y}_{ij\bullet} - \bar{y}_{i\bullet\bullet} - \bar{y}_{\bullet j\bullet} + \bar{y}_{\bullet\bullet\bullet} = \sum_{k=1}^{n} y_{ijk} \Big/ n - \bar{y}_{i\bullet\bullet} - \bar{y}_{\bullet j\bullet} + \bar{y}_{\bullet\bullet\bullet}$$
$$\hat{\sigma}^2 = \frac{1}{nab}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}\left(y_{ijk} - \bar{y}_{ij\bullet}\right)^2 = \frac{1}{nab}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}\left(y_{ijk} - \left[\hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \widehat{(\alpha\beta)}_{ij}\right]\right)^2$$

 This is not an unbiased estimator of σ² (usually the
 case when estimating variance). The unbiased estimator
 results when we divide by ab(n - 1) instead of abn.
The unbiased estimator of σ² is
$$s^2 = \frac{1}{ab(n-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}\left(y_{ijk} - \bar{y}_{ij\bullet}\right)^2 = \frac{1}{ab(n-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}\left(y_{ijk} - \left[\hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \widehat{(\alpha\beta)}_{ij}\right]\right)^2$$
$$= \frac{1}{ab(n-1)} SS_{Error} = MS_{Error}$$
where
$$SS_{Error} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}\left(y_{ijk} - \bar{y}_{ij\bullet}\right)^2$$
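
Since all of these estimates are just cell, row, column and overall means, they are easy to compute directly. A sketch assuming the observations are held in a NumPy array `y` of shape (a, b, n); the function name `two_factor_estimates` is mine:

```python
import numpy as np

def two_factor_estimates(y):
    """Maximum likelihood / least squares estimates for the two-factor model.

    y : array of shape (a, b, n) with n replicates per (i, j) cell.
    """
    a, b, n = y.shape
    mu_hat = y.mean()                                    # grand mean
    alpha_hat = y.mean(axis=(1, 2)) - mu_hat             # factor A effects
    beta_hat = y.mean(axis=(0, 2)) - mu_hat              # factor B effects
    cell_means = y.mean(axis=2)                          # y-bar_ij.
    ab_hat = (cell_means - y.mean(axis=(1, 2))[:, None]
              - y.mean(axis=(0, 2))[None, :] + mu_hat)   # interaction effects
    ss_error = ((y - cell_means[:, :, None]) ** 2).sum()
    s2 = ss_error / (a * b * (n - 1))                    # unbiased MS_Error
    return mu_hat, alpha_hat, beta_hat, ab_hat, s2
```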
Testing for Interaction:

We want to test:
        H0: (αβ)ij = 0 for all i and j, against
        HA: (αβ)ij ≠ 0 for at least one i and j.

The test statistic
$$F = \frac{MS_{AB}}{MS_{Error}} = \frac{\frac{1}{(a-1)(b-1)} SS_{AB}}{MS_{Error}}$$
where
$$SS_{AB} = n\sum_{i=1}^{a}\sum_{j=1}^{b}\widehat{(\alpha\beta)}_{ij}^2 = n\sum_{i=1}^{a}\sum_{j=1}^{b}\left(\bar{y}_{ij\bullet} - \bar{y}_{i\bullet\bullet} - \bar{y}_{\bullet j\bullet} + \bar{y}_{\bullet\bullet\bullet}\right)^2$$
We reject
      H0: (αβ)ij = 0 for all i and j,
if
$$F = \frac{MS_{AB}}{MS_{Error}} > F_\alpha\left((a-1)(b-1),\ ab(n-1)\right)$$
Testing for the Main Effect of A:

We want to test:
        H0: αi = 0 for all i, against
        HA: αi ≠ 0 for at least one i.

The test statistic
$$F = \frac{MS_A}{MS_{Error}} = \frac{\frac{1}{a-1} SS_A}{MS_{Error}}$$
where
$$SS_A = nb\sum_{i=1}^{a}\hat{\alpha}_i^2 = nb\sum_{i=1}^{a}\left(\bar{y}_{i\bullet\bullet} - \bar{y}_{\bullet\bullet\bullet}\right)^2$$
We reject
      H0: αi = 0 for all i,
if
$$F = \frac{MS_A}{MS_{Error}} > F_\alpha\left(a-1,\ ab(n-1)\right)$$
Testing for the Main Effect of B:

We want to test:
        H0: βj = 0 for all j, against
        HA: βj ≠ 0 for at least one j.

The test statistic
$$F = \frac{MS_B}{MS_{Error}} = \frac{\frac{1}{b-1} SS_B}{MS_{Error}}$$
where
$$SS_B = na\sum_{j=1}^{b}\hat{\beta}_j^2 = na\sum_{j=1}^{b}\left(\bar{y}_{\bullet j\bullet} - \bar{y}_{\bullet\bullet\bullet}\right)^2$$
We reject
      H0: βj = 0 for all j,
if
$$F = \frac{MS_B}{MS_{Error}} > F_\alpha\left(b-1,\ ab(n-1)\right)$$
The ANOVA Table
Source    S.S.          d.f.        MS =SS/df        F
  A       SSA          a-1            MSA       MSA / MSError

  B       SSB          b-1            MSB       MSB / MSError

 AB      SSAB      (a - 1)(b - 1)     MSAB      MSAB/ MSError

Error    SSError     ab(n - 1)       MSError

Total    SSTotal      abn - 1
Computing Formulae
Let
$$T_{\bullet\bullet\bullet} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}, \quad T_{i\bullet\bullet} = \sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}, \quad T_{\bullet j\bullet} = \sum_{i=1}^{a}\sum_{k=1}^{n} y_{ijk}, \quad T_{ij\bullet} = \sum_{k=1}^{n} y_{ijk}$$
Then
$$SS_{Total} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}^2 - \frac{T_{\bullet\bullet\bullet}^2}{nab}$$
$$SS_A = \sum_{i=1}^{a}\frac{T_{i\bullet\bullet}^2}{nb} - \frac{T_{\bullet\bullet\bullet}^2}{nab}, \qquad SS_B = \sum_{j=1}^{b}\frac{T_{\bullet j\bullet}^2}{na} - \frac{T_{\bullet\bullet\bullet}^2}{nab}$$
$$SS_{AB} = \sum_{i=1}^{a}\sum_{j=1}^{b}\frac{T_{ij\bullet}^2}{n} - \sum_{i=1}^{a}\frac{T_{i\bullet\bullet}^2}{nb} - \sum_{j=1}^{b}\frac{T_{\bullet j\bullet}^2}{na} + \frac{T_{\bullet\bullet\bullet}^2}{nab},$$
and
$$SS_{Error} = SS_{Total} - SS_A - SS_B - SS_{AB}$$
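
As a concrete check, the sketch below applies these computing formulae to the rat weight-gain data from the earlier table, arranged as an array of shape (a, b, n) = (2, 3, 10); the layout and variable names are mine, and only the F-ratios are computed (not their p-values).

```python
import numpy as np

# Weight gains: each inner list holds the n = 10 rats for one diet.
high = [[73, 102, 118, 104, 81, 107, 100, 87, 117, 111],   # Beef
        [98, 74, 56, 111, 95, 88, 82, 77, 86, 92],         # Cereal
        [94, 79, 96, 98, 102, 102, 108, 91, 120, 105]]     # Pork
low = [[90, 76, 90, 64, 86, 51, 72, 90, 95, 78],           # Beef
       [107, 95, 97, 80, 98, 74, 74, 67, 89, 58],          # Cereal
       [49, 82, 73, 86, 81, 97, 106, 70, 61, 82]]          # Pork
y = np.array([high, low], dtype=float)                      # shape (2, 3, 10)

a, b, n = y.shape
T_all = y.sum()                                             # T...
T_i = y.sum(axis=(1, 2))                                    # T_i..
T_j = y.sum(axis=(0, 2))                                    # T_.j.
T_ij = y.sum(axis=2)                                        # T_ij.
cf = T_all ** 2 / (n * a * b)                               # correction factor

ss_total = (y ** 2).sum() - cf
ss_a = (T_i ** 2).sum() / (n * b) - cf
ss_b = (T_j ** 2).sum() / (n * a) - cf
ss_ab = (T_ij ** 2).sum() / n - (T_i ** 2).sum() / (n * b) \
        - (T_j ** 2).sum() / (n * a) + cf
ss_error = ss_total - ss_a - ss_b - ss_ab

ms_error = ss_error / (a * b * (n - 1))
print(ss_a / (a - 1) / ms_error,                            # F for level of protein
      ss_b / (b - 1) / ms_error,                            # F for source of protein
      ss_ab / ((a - 1) * (b - 1)) / ms_error)               # F for interaction
```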
MANOVA

Multivariate Analysis of Variance
One way Multivariate Analysis
  of Variance (MANOVA)
    Comparing k p-variate Normal
            Populations
Tests for comparing k mean vectors
Situation
• We have k normal populations
• Let $\vec{\mu}_i$ and $\Sigma$ denote the mean vector and
  covariance matrix of population i.
• i = 1, 2, 3, … k.
• Note: we assume that the covariance matrix
  for each population is the same.
  $\Sigma_1 = \Sigma_2 = \cdots = \Sigma_k = \Sigma$
We want to test
$$H_0: \vec{\mu}_1 = \vec{\mu}_2 = \vec{\mu}_3 = \cdots = \vec{\mu}_k$$
against
$$H_A: \vec{\mu}_i \neq \vec{\mu}_j \text{ for at least one pair } i, j$$
The data
• Assume we have collected data from each of
  k populations
• Let $\vec{x}_{i1}, \vec{x}_{i2}, \ldots, \vec{x}_{in}$ denote the n observations
  from population i.
• i = 1, 2, 3, … k.
Computing Formulae:
Compute
$$1) \quad \vec{T}_i = \sum_{j=1}^{n}\vec{x}_{ij} = \text{Total vector for sample } i = \begin{pmatrix} \sum_{j=1}^{n} x_{1ij} \\ \vdots \\ \sum_{j=1}^{n} x_{pij} \end{pmatrix} = \begin{pmatrix} T_{1i} \\ \vdots \\ T_{pi} \end{pmatrix}$$
$$2) \quad \vec{G} = \sum_{i=1}^{k}\vec{T}_i = \sum_{i=1}^{k}\sum_{j=1}^{n}\vec{x}_{ij} = \begin{pmatrix} G_1 \\ \vdots \\ G_p \end{pmatrix} = \text{Grand Total vector}$$
$$3) \quad N = kn = \text{Total sample size}$$
$$4) \quad \sum_{i=1}^{k}\sum_{j=1}^{n}\vec{x}_{ij}\vec{x}_{ij}' = \begin{pmatrix} \sum_i\sum_j x_{1ij}^2 & \cdots & \sum_i\sum_j x_{1ij}x_{pij} \\ \vdots & \ddots & \vdots \\ \sum_i\sum_j x_{1ij}x_{pij} & \cdots & \sum_i\sum_j x_{pij}^2 \end{pmatrix}$$
$$5) \quad \frac{1}{n}\sum_{i=1}^{k}\vec{T}_i\vec{T}_i' = \begin{pmatrix} \frac{1}{n}\sum_i T_{1i}^2 & \cdots & \frac{1}{n}\sum_i T_{1i}T_{pi} \\ \vdots & \ddots & \vdots \\ \frac{1}{n}\sum_i T_{1i}T_{pi} & \cdots & \frac{1}{n}\sum_i T_{pi}^2 \end{pmatrix}$$
Let
$$H = \frac{1}{n}\sum_{i=1}^{k}\vec{T}_i\vec{T}_i' - \frac{1}{N}\vec{G}\vec{G}'$$
$$= \begin{pmatrix} \frac{1}{n}\sum_i T_{1i}^2 - \frac{G_1^2}{N} & \cdots & \frac{1}{n}\sum_i T_{1i}T_{pi} - \frac{G_1 G_p}{N} \\ \vdots & \ddots & \vdots \\ \frac{1}{n}\sum_i T_{1i}T_{pi} - \frac{G_1 G_p}{N} & \cdots & \frac{1}{n}\sum_i T_{pi}^2 - \frac{G_p^2}{N} \end{pmatrix}$$
$$= \begin{pmatrix} n\sum_i \left(\bar{x}_{1i\bullet} - \bar{x}_{1\bullet\bullet}\right)^2 & \cdots & n\sum_i \left(\bar{x}_{1i\bullet} - \bar{x}_{1\bullet\bullet}\right)\left(\bar{x}_{pi\bullet} - \bar{x}_{p\bullet\bullet}\right) \\ \vdots & \ddots & \vdots \\ n\sum_i \left(\bar{x}_{1i\bullet} - \bar{x}_{1\bullet\bullet}\right)\left(\bar{x}_{pi\bullet} - \bar{x}_{p\bullet\bullet}\right) & \cdots & n\sum_i \left(\bar{x}_{pi\bullet} - \bar{x}_{p\bullet\bullet}\right)^2 \end{pmatrix}$$
= the Between SS and SP matrix
Let
$$E = \sum_{i=1}^{k}\sum_{j=1}^{n}\vec{x}_{ij}\vec{x}_{ij}' - \frac{1}{n}\sum_{i=1}^{k}\vec{T}_i\vec{T}_i'$$
$$= \begin{pmatrix} \sum_i\sum_j x_{1ij}^2 - \frac{1}{n}\sum_i T_{1i}^2 & \cdots & \sum_i\sum_j x_{1ij}x_{pij} - \frac{1}{n}\sum_i T_{1i}T_{pi} \\ \vdots & \ddots & \vdots \\ \sum_i\sum_j x_{1ij}x_{pij} - \frac{1}{n}\sum_i T_{1i}T_{pi} & \cdots & \sum_i\sum_j x_{pij}^2 - \frac{1}{n}\sum_i T_{pi}^2 \end{pmatrix}$$
$$= \begin{pmatrix} \sum_i\sum_j \left(x_{1ij} - \bar{x}_{1i\bullet}\right)^2 & \cdots & \sum_i\sum_j \left(x_{1ij} - \bar{x}_{1i\bullet}\right)\left(x_{pij} - \bar{x}_{pi\bullet}\right) \\ \vdots & \ddots & \vdots \\ \sum_i\sum_j \left(x_{1ij} - \bar{x}_{1i\bullet}\right)\left(x_{pij} - \bar{x}_{pi\bullet}\right) & \cdots & \sum_i\sum_j \left(x_{pij} - \bar{x}_{pi\bullet}\right)^2 \end{pmatrix}$$
= the Within SS and SP matrix
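
A small NumPy sketch of the H and E computations (the function name `manova_matrices` is mine; it assumes equal sample sizes n per group, with the observations for group i stored as an n × p array in `samples[i]`):

```python
import numpy as np

def manova_matrices(samples):
    """Between (H) and Within (E) SS and SP matrices for one-way MANOVA."""
    samples = [np.asarray(s, dtype=float) for s in samples]   # each n x p
    k, n = len(samples), samples[0].shape[0]
    N = k * n
    T = np.array([s.sum(axis=0) for s in samples])            # k x p matrix of T_i
    G = T.sum(axis=0)                                         # grand total vector

    cross = sum(s.T @ s for s in samples)                     # sum of x x' over all obs
    TT = (T.T @ T) / n                                        # (1/n) * sum of T_i T_i'

    H = TT - np.outer(G, G) / N                               # Between SS and SP matrix
    E = cross - TT                                            # Within SS and SP matrix
    return H, E
```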
The Manova Table
 Source          SS and SP matrix

Between          $H = \begin{pmatrix} h_{11} & \cdots & h_{1p} \\ \vdots & \ddots & \vdots \\ h_{1p} & \cdots & h_{pp} \end{pmatrix}$

Within           $E = \begin{pmatrix} e_{11} & \cdots & e_{1p} \\ \vdots & \ddots & \vdots \\ e_{1p} & \cdots & e_{pp} \end{pmatrix}$
There are several test statistics for testing
$$H_0: \vec{\mu}_1 = \vec{\mu}_2 = \vec{\mu}_3 = \cdots = \vec{\mu}_k$$
against
$$H_A: \vec{\mu}_i \neq \vec{\mu}_j \text{ for at least one pair } i, j$$

1. Roy's largest root
$$\lambda_1 = \text{largest eigenvalue of } HE^{-1}$$
This test statistic is derived using Roy's union-intersection principle.

2. Wilks' lambda (Λ)
$$\Lambda = \frac{|E|}{|H + E|} = \frac{1}{|HE^{-1} + I|}$$
This test statistic is derived using the generalized likelihood ratio principle.

3. Lawley-Hotelling trace statistic
$$T_0^2 = \operatorname{tr}\left(HE^{-1}\right) = \text{sum of the eigenvalues of } HE^{-1}$$

4. Pillai trace statistic (V)
$$V = \operatorname{tr}\left[H(H + E)^{-1}\right]$$
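
Given H and E (for example from the sketch above), all four statistics follow directly from their definitions. This sketch computes the statistics only, not their significance levels:

```python
import numpy as np

def manova_statistics(H, E):
    """Roy, Wilks, Lawley-Hotelling and Pillai statistics from H and E."""
    HEinv = H @ np.linalg.inv(E)
    eigvals = np.linalg.eigvals(HEinv).real

    roy = eigvals.max()                                       # Roy's largest root
    wilks = np.linalg.det(E) / np.linalg.det(H + E)           # Wilks' lambda
    lawley_hotelling = np.trace(HEinv)                        # T_0^2
    pillai = np.trace(H @ np.linalg.inv(H + E))               # V
    return roy, wilks, lawley_hotelling, pillai
```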
Example
In the following study, n = 15 first year
university students from three different school
regions (A, B and C) who were each taking the
following four courses (Math, Biology, English
and Sociology) were observed. The marks in
these courses are tabulated on the following slide:
The data
                                                            Educational Region
                     A                                              B                                          C
Student   Math   Biology   English   Sociology Student   Math   Biology English Sociology Student   Math   Biology   English Sociology
   1       62      65        67         76        1       65       55        35    43        1       47      47        98       78
   2       54      61        75         70        2       87       81        59    64        2       57      69        68       45
   3       53      53        53         59        3       75       67        56    68        3       65      71        77       62
   4       48      56        73         81        4       74       70        55    66        4       41      64        68       58
   5       60      55        49         60        5       83       71        40    52        5       56      54        86       64
   6       55      52        34         41        6       59       48        48    57        6       63      73        88       76
   7       76      71        35         40        7       61       47        46    54        7       43      62        84       78
   8       58      52        58         46        8       81       77        51    45        8       28      47        65       58
   9       75      71        60         59        9       77       68        42    49        9       47      54        90       78
  10       55      51        69         75       10       82       84        63    70       10       42      44        79       73
  11       72      74        64         59       11       68       64        35    44       11       50      53        89       89
  12       72      75        51         47       12       60       53        60    65       12       46      61        91       82
  13       76      69        69         57       13       94       88        51    63       13       74      78        99       86
  14       44      48        65         65       14       96       88        67    81       14       63      66        94       86
  15       89      71        59         67       15       84       75        46    67       15       69      82        78       73
Summary Statistics

$$\bar{x}_A' = (63.267,\ 61.600,\ 58.733,\ 60.133)$$
$$S_A = \begin{pmatrix} 160.638 & 104.829 & -32.638 & -47.110 \\ 104.829 & 92.543 & -4.900 & -22.229 \\ -32.638 & -4.900 & 155.638 & 128.967 \\ -47.110 & -22.229 & 128.967 & 159.552 \end{pmatrix}$$

$$\bar{x}_B' = (76.400,\ 69.067,\ 50.267,\ 59.200)$$
$$S_B = \begin{pmatrix} 141.257 & 155.829 & 45.100 & 60.914 \\ 155.829 & 185.924 & 61.767 & 71.057 \\ 45.100 & 61.767 & 96.495 & 93.371 \\ 60.914 & 71.057 & 93.371 & 123.600 \end{pmatrix}$$

$$\bar{x}_C' = (52.733,\ 61.667,\ 83.600,\ 72.400)$$
$$S_C = \begin{pmatrix} 156.067 & 116.976 & 53.814 & 35.257 \\ 116.976 & 136.381 & 3.143 & -0.429 \\ 53.814 & 3.143 & 116.543 & 114.886 \\ 35.257 & -0.429 & 114.886 & 156.400 \end{pmatrix}$$

$$\bar{x}_{\bullet}' = \tfrac{15}{45}\bar{x}_A' + \tfrac{15}{45}\bar{x}_B' + \tfrac{15}{45}\bar{x}_C' = (64.133,\ 64.111,\ 64.200,\ 63.911)$$
$$S_{Pooled} = \tfrac{14}{42}S_A + \tfrac{14}{42}S_B + \tfrac{14}{42}S_C = \begin{pmatrix} 152.654 & 125.878 & 22.092 & 16.354 \\ 125.878 & 138.283 & 20.003 & 16.133 \\ 22.092 & 20.003 & 122.892 & 112.408 \\ 16.354 & 16.133 & 112.408 & 146.517 \end{pmatrix}$$
Computations:
$$1) \quad \vec{T}_i = \sum_{j=1}^{n}\vec{x}_{ij} = \text{Total vector for sample } i$$
$$2) \quad \vec{G} = \sum_{i=1}^{k}\vec{T}_i = \sum_{i=1}^{k}\sum_{j=1}^{n}\vec{x}_{ij} = \begin{pmatrix} G_1 \\ \vdots \\ G_p \end{pmatrix} = \text{Grand Total vector}$$

                              Math   Biology   English   Sociology
             Totals     A      949       924       881         902
                        B     1146      1036       754         888
                        C      791       925      1254        1086
       Grand Totals     G     2886      2885      2889        2876

$$3) \quad N = kn = \text{Total sample size} = 45$$
$$4) \quad \sum_{i=1}^{k}\sum_{j=1}^{n}\vec{x}_{ij}\vec{x}_{ij}' = \begin{pmatrix} 195718 & 191674 & 180399 & 182865 \\ 191674 & 191321 & 184516 & 184542 \\ 180399 & 184516 & 199641 & 193125 \\ 182865 & 184542 & 193125 & 191590 \end{pmatrix}$$

$$5) \quad \frac{1}{n}\sum_{i=1}^{k}\vec{T}_i\vec{T}_i' = \begin{pmatrix} 189306.53 & 186387.13 & 179471.13 & 182178.13 \\ 186387.13 & 185513.13 & 183675.87 & 183864.40 \\ 179471.13 & 183675.87 & 194479.53 & 188403.87 \\ 182178.13 & 183864.40 & 188403.87 & 185436.27 \end{pmatrix}$$
Now
$$H = \frac{1}{n}\sum_{i=1}^{k}\vec{T}_i\vec{T}_i' - \frac{1}{N}\vec{G}\vec{G}' = \begin{pmatrix} 4217.733333 & 1362.466667 & -5810.066667 & -2269.333333 \\ 1362.466667 & 552.577778 & -1541.133333 & -519.155556 \\ -5810.066667 & -1541.133333 & 9005.733333 & 3764.666667 \\ -2269.333333 & -519.155556 & 3764.666667 & 1627.911111 \end{pmatrix}$$
= the Between SS and SP matrix
Let
$$E = \sum_{i=1}^{k}\sum_{j=1}^{n}\vec{x}_{ij}\vec{x}_{ij}' - \frac{1}{n}\sum_{i=1}^{k}\vec{T}_i\vec{T}_i' = \begin{pmatrix} 6411.467 & 5286.867 & 927.867 & 686.867 \\ 5286.867 & 5807.867 & 840.133 & 677.600 \\ 927.867 & 840.133 & 5161.467 & 4721.133 \\ 686.867 & 677.600 & 4721.133 & 6153.733 \end{pmatrix}$$
= the Within SS and SP matrix
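
Plugging these H and E matrices (rounded to three decimals) into the statistics defined earlier should approximately reproduce the SPSS output shown below for the High_School effect: Wilks' Lambda ≈ 0.161, Pillai's Trace ≈ 0.883, Hotelling's Trace ≈ 4.947 and Roy's Largest Root ≈ 4.891. A small verification sketch:

```python
import numpy as np

H = np.array([[ 4217.733,  1362.467, -5810.067, -2269.333],
              [ 1362.467,   552.578, -1541.133,  -519.156],
              [-5810.067, -1541.133,  9005.733,  3764.667],
              [-2269.333,  -519.156,  3764.667,  1627.911]])
E = np.array([[6411.467, 5286.867,  927.867,  686.867],
              [5286.867, 5807.867,  840.133,  677.600],
              [ 927.867,  840.133, 5161.467, 4721.133],
              [ 686.867,  677.600, 4721.133, 6153.733]])

HEinv = H @ np.linalg.inv(E)
eig = np.linalg.eigvals(HEinv).real
print("Roy's largest root  :", eig.max())
print("Wilks' lambda       :", np.linalg.det(E) / np.linalg.det(H + E))
print("Hotelling's trace   :", np.trace(HEinv))
print("Pillai's trace      :", np.trace(H @ np.linalg.inv(H + E)))
```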
Using SPSS to perform MANOVA
Selecting the variables and the Factors
The output

                                          Multivariate Tests (c)

Effect                                  Value          F      Hypothesis df        Error df      Sig.
Intercept       Pillai's Trace             .984      586.890a        4.000          39.000          .000
                Wilks' Lambda              .016      586.890a        4.000          39.000          .000
                Hotelling's Trace        60.194      586.890a        4.000          39.000          .000
                Roy's Largest Root       60.194      586.890a        4.000          39.000          .000
High_School     Pillai's Trace             .883        7.913         8.000          80.000          .000
                Wilks' Lambda              .161       14.571a        8.000          78.000          .000
                Hotelling's Trace         4.947       23.501         8.000          76.000          .000
                Roy's Largest Root        4.891       48.913b        4.000          40.000          .000
  a. Exact statistic
  b. The statistic is an upper bound on F that yields a lower bound on the significance level.
  c. Design: Intercept+High_School
Univariate Tests
                                  Tests of Between-Subjects Effects

                                        Type III Sum
Source            Dependent Variable     of Squares     df        Mean Square      F       Sig.
Corrected Model   Math                      4217.733a         2      2108.867     13.815      .000
                  Biology                    552.578b         2       276.289      1.998      .148
                  English                   9005.733c         2      4502.867     36.641      .000
                  Sociology                 1627.911d         2       813.956      5.555      .007
Intercept         Math                   185088.800           1    185088.800   1212.473      .000
                  Biology                184960.556           1    184960.556   1337.555      .000
                  English                185473.800           1    185473.800   1509.241      .000
                  Sociology              183808.356           1    183808.356   1254.515      .000
High_School       Math                      4217.733          2      2108.867     13.815      .000
                  Biology                    552.578          2       276.289      1.998      .148
                  English                   9005.733          2      4502.867     36.641      .000
                  Sociology                 1627.911          2       813.956      5.555      .007
Error             Math                      6411.467         42       152.654
                  Biology                   5807.867         42       138.283
                  English                   5161.467         42       122.892
                  Sociology                 6153.733         42       146.517
Total             Math                   195718.000          45
                  Biology                191321.000          45
                  English                199641.000          45
                  Sociology              191590.000          45
Corrected Total   Math                    10629.200          44
                  Biology                   6360.444         44
                  English                 14167.200          44
                  Sociology                 7781.644         44
  a. R Squared = .397 (Adjusted R Squared = .368)
  b. R Squared = .087 (Adjusted R Squared = .043)
  c. R Squared = .636 (Adjusted R Squared = .618)
  d. R Squared = .209 (Adjusted R Squared = .172)
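
Note that the univariate F-ratios in this table can be read directly off the diagonals of H and E: for each course, F = (h_ii / 2) / (e_ii / 42). A quick check using the matrices computed earlier:

```python
import numpy as np

h_diag = np.array([4217.733, 552.578, 9005.733, 1627.911])   # diagonal of H
e_diag = np.array([6411.467, 5807.867, 5161.467, 6153.733])  # diagonal of E
F = (h_diag / 2) / (e_diag / 42)
print(F)   # about 13.815, 1.998, 36.641, 5.555 (Math, Biology, English, Sociology)
```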

 

Similar to Section 07 manova (20)

集合知プログラミングゼミ第1回
集合知プログラミングゼミ第1回集合知プログラミングゼミ第1回
集合知プログラミングゼミ第1回
 
11.1 anova1
11.1 anova111.1 anova1
11.1 anova1
 
Symmetrical2
Symmetrical2Symmetrical2
Symmetrical2
 
Chapter 04
Chapter 04Chapter 04
Chapter 04
 
Additional notes EC220
Additional notes EC220Additional notes EC220
Additional notes EC220
 
Advance statistics 2
Advance statistics 2Advance statistics 2
Advance statistics 2
 
Anov af03
Anov af03Anov af03
Anov af03
 
Two way example
Two way exampleTwo way example
Two way example
 
Dsp lecture vol 2 dft & fft
Dsp lecture vol 2 dft & fftDsp lecture vol 2 dft & fft
Dsp lecture vol 2 dft & fft
 
Basics of probability in statistical simulation and stochastic programming
Basics of probability in statistical simulation and stochastic programmingBasics of probability in statistical simulation and stochastic programming
Basics of probability in statistical simulation and stochastic programming
 
Gaussian Integration
Gaussian IntegrationGaussian Integration
Gaussian Integration
 
Unbiased Markov chain Monte Carlo methods
Unbiased Markov chain Monte Carlo methods Unbiased Markov chain Monte Carlo methods
Unbiased Markov chain Monte Carlo methods
 
Physics of Algorithms Talk
Physics of Algorithms TalkPhysics of Algorithms Talk
Physics of Algorithms Talk
 
Jackknife algorithm for the estimation of logistic regression parameters
Jackknife algorithm for the estimation of logistic regression parametersJackknife algorithm for the estimation of logistic regression parameters
Jackknife algorithm for the estimation of logistic regression parameters
 
Math report
Math reportMath report
Math report
 
2 senarai rumus add maths k2 trial spm sbp 2010
2 senarai rumus add maths k2 trial spm sbp 20102 senarai rumus add maths k2 trial spm sbp 2010
2 senarai rumus add maths k2 trial spm sbp 2010
 
2 senarai rumus add maths k1 trial spm sbp 2010
2 senarai rumus add maths k1 trial spm sbp 20102 senarai rumus add maths k1 trial spm sbp 2010
2 senarai rumus add maths k1 trial spm sbp 2010
 
Semi-Magic Squares From Snake-Shaped Matrices
Semi-Magic Squares From Snake-Shaped MatricesSemi-Magic Squares From Snake-Shaped Matrices
Semi-Magic Squares From Snake-Shaped Matrices
 
2 senarai rumus add maths k1 trial spm sbp 2010
2 senarai rumus add maths k1 trial spm sbp 20102 senarai rumus add maths k1 trial spm sbp 2010
2 senarai rumus add maths k1 trial spm sbp 2010
 
2 senarai rumus add maths k2 trial spm sbp 2010
2 senarai rumus add maths k2 trial spm sbp 20102 senarai rumus add maths k2 trial spm sbp 2010
2 senarai rumus add maths k2 trial spm sbp 2010
 

Section 07 manova

  • 2. One way Analysis of Variance (ANOVA) Comparing k Populations
  • 3. The F test – for comparing k means Situation • We have k normal populations • Let µi and σ denote the mean and standard deviation of population i. • i = 1, 2, 3, … k. • Note: we assume that the standard deviation for each population is the same. σ1 = σ2 = … = σk = σ
  • 4. We want to test $H_0 : \mu_1 = \mu_2 = \cdots = \mu_k$ against $H_A : \mu_i \neq \mu_j$ for at least one pair $i, j$.
  • 5. The F statistic: $F = \dfrac{\frac{1}{k-1}\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2}{\frac{1}{N-k}\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2}$, where $x_{ij}$ = the jth observation in the ith sample ($i = 1, 2, \dots, k$ and $j = 1, 2, \dots, n_i$), $\bar{x}_i = \sum_{j=1}^{n_i} x_{ij}/n_i$ = mean for the ith sample, $N = \sum_{i=1}^{k} n_i$ = total sample size, and $\bar{x} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}/N$ = overall mean.
  • 6. The ANOVA table: Between — $SS_B = \sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2$, d.f. $k-1$, $MS_B = SS_B/(k-1)$, $F = MS_B/MS_W$; Within — $SS_W = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2$, d.f. $N-k$, $MS_W = SS_W/(N-k)$. The ANOVA table is a tool for displaying the computations for the F test. It is very important when the between-sample variability is due to two or more factors.
  • 7. Computing formulae — compute: 1) $T_i = \sum_{j=1}^{n_i} x_{ij}$ = total for sample i; 2) $G = \sum_{i=1}^{k} T_i = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}$ = grand total; 3) $N = \sum_{i=1}^{k} n_i$ = total sample size; 4) $\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2$; 5) $\sum_{i=1}^{k} T_i^2/n_i$.
  • 8. The data • Assume we have collected data from each of k populations • Let xi1, xi2 , xi3 , … denote the ni observations from population i. • i = 1, 2, 3, … k.
  • 9. Then 1) $SS_{Between} = \sum_{i=1}^{k} \dfrac{T_i^2}{n_i} - \dfrac{G^2}{N}$; 2) $SS_{Within} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \sum_{i=1}^{k} \dfrac{T_i^2}{n_i}$; 3) $F = \dfrac{SS_{Between}/(k-1)}{SS_{Within}/(N-k)}$.
  • 10. ANOVA table:
Source    d.f.    Sum of Squares   Mean Square   F-ratio
Between   k − 1   SS_Between       MS_Between    MS_B / MS_W
Within    N − k   SS_Within        MS_Within
Total     N − 1   SS_Total
(in each row, MS = SS / d.f.)
  • 11. Example In the following example we are comparing weight gains resulting from the following six diets 1. Diet 1 - High Protein , Beef 2. Diet 2 - High Protein , Cereal 3. Diet 3 - High Protein , Pork 4. Diet 4 - Low protein , Beef 5. Diet 5 - Low protein , Cereal 6. Diet 6 - Low protein , Pork
  • 12. Gains in weight (grams) for rats under six diets differing in level of protein (High or Low) and source of protein (Beef, Cereal, or Pork):
Diet          1       2       3       4       5       6
             73      98      94      90     107      49
            102      74      79      76      95      82
            118      56      96      90      97      73
            104     111      98      64      80      86
             81      95     102      86      98      81
            107      88     102      51      74      97
            100      82     108      72      74     106
             87      77      91      90      67      70
            117      86     120      95      89      61
            111      92     105      78      58      82
Mean      100.0    85.9    99.5    79.2    83.9    78.7
Std. Dev. 15.14   15.02   10.92   13.89   15.71   16.55
Σx         1000     859     995     792     839     787
Σx²      102062   75819  100075   64462   72613   64401
  • 13. Thus $SS_{Between} = \sum_{i=1}^{k}\frac{T_i^2}{n_i} - \frac{G^2}{N} = 467846 - \frac{5272^2}{60} = 4612.933$ and $SS_{Within} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \sum_{i=1}^{k}\frac{T_i^2}{n_i} = 479432 - 467846 = 11586$, so $F = \dfrac{SS_{Between}/(k-1)}{SS_{Within}/(N-k)} = \dfrac{4612.933/5}{11586/54} = \dfrac{922.6}{214.56} = 4.3$. $F_{0.05} = 2.386$ with $\nu_1 = 5$ and $\nu_2 = 54$. Thus, since $F > 2.386$, we reject $H_0$.
  • 14. ANOVA table:
Source    d.f.   Sum of Squares   Mean Square   F-ratio
Between    5     4612.933         922.587       4.3** (p = 0.0023)
Within    54     11586.000        214.556
Total     59     16198.933
(* significant at 0.05 but not at 0.01; ** significant at 0.01)
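The one-way calculation above is easy to reproduce in software. The following sketch (assuming numpy and scipy are available) applies the computing formulae of slides 9–13 to the diet data and cross-checks the result against scipy.stats.f_oneway; it should reproduce SS_Between ≈ 4612.93, SS_Within = 11586 and F ≈ 4.3.

import numpy as np
from scipy import stats

diets = [
    [73, 102, 118, 104, 81, 107, 100, 87, 117, 111],  # Diet 1: High protein, Beef
    [98, 74, 56, 111, 95, 88, 82, 77, 86, 92],         # Diet 2: High protein, Cereal
    [94, 79, 96, 98, 102, 102, 108, 91, 120, 105],     # Diet 3: High protein, Pork
    [90, 76, 90, 64, 86, 51, 72, 90, 95, 78],          # Diet 4: Low protein, Beef
    [107, 95, 97, 80, 98, 74, 74, 67, 89, 58],         # Diet 5: Low protein, Cereal
    [49, 82, 73, 86, 81, 97, 106, 70, 61, 82],         # Diet 6: Low protein, Pork
]

T = np.array([sum(d) for d in diets], dtype=float)   # sample totals T_i
n = np.array([len(d) for d in diets], dtype=float)   # sample sizes n_i
G, N, k = T.sum(), n.sum(), len(diets)               # grand total, total size, number of samples

ss_between = np.sum(T**2 / n) - G**2 / N             # sum of T_i^2/n_i minus G^2/N
ss_within = sum(float(np.sum(np.square(d))) for d in diets) - np.sum(T**2 / n)
F = (ss_between / (k - 1)) / (ss_within / (N - k))
print(ss_between, ss_within, F)        # expect about 4612.93, 11586, 4.3

F_scipy, p = stats.f_oneway(*diets)    # cross-check with scipy's one-way ANOVA
print(F_scipy, p)                      # F should agree; p should be near 0.0023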
  • 15. Equivalence of the F-test and the t-test when k = 2 — the t-test: $t = \dfrac{\bar{x} - \bar{y}}{s_{Pooled}\sqrt{\frac{1}{n} + \frac{1}{m}}}$, where $s_{Pooled}^2 = \dfrac{(n-1)s_x^2 + (m-1)s_y^2}{n + m - 2}$.
  • 16. The F-test: $F = \dfrac{s^2_{Between}}{s^2_{Pooled}} = \dfrac{\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2/(k-1)}{\sum_{i=1}^{k}(n_i - 1)s_i^2 \big/ \big(\sum_{i=1}^{k} n_i - k\big)}$. For $k = 2$ this becomes $\dfrac{n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2}{\big[(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2\big]/(n_1 + n_2 - 2)}$: the denominator is $s^2_{Pooled}$ and the numerator is $n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2$.
  • 17. $n_1(\bar{x}_1 - \bar{x})^2 = n_1\Big(\bar{x}_1 - \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}\Big)^2 = \frac{n_1 n_2^2}{(n_1 + n_2)^2}(\bar{x}_1 - \bar{x}_2)^2$, and $n_2(\bar{x}_2 - \bar{x})^2 = n_2\Big(\bar{x}_2 - \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}\Big)^2 = \frac{n_1^2 n_2}{(n_1 + n_2)^2}(\bar{x}_1 - \bar{x}_2)^2$.
  • 18. Hence $n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 = \frac{n_1 n_2^2 + n_1^2 n_2}{(n_1 + n_2)^2}(\bar{x}_1 - \bar{x}_2)^2 = \frac{n_1 n_2}{n_1 + n_2}(\bar{x}_1 - \bar{x}_2)^2 = \frac{(\bar{x}_1 - \bar{x}_2)^2}{\frac{1}{n_1} + \frac{1}{n_2}}$, and therefore $F = \dfrac{(\bar{x}_1 - \bar{x}_2)^2}{s^2_{Pooled}\big(\frac{1}{n_1} + \frac{1}{n_2}\big)} = t^2$.
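The F = t² identity is easy to confirm numerically. A minimal sketch (assuming scipy), using two of the diet samples from the example above as the two groups:

from scipy import stats

x = [73, 102, 118, 104, 81, 107, 100, 87, 117, 111]   # Diet 1 (High protein, Beef)
y = [90, 76, 90, 64, 86, 51, 72, 90, 95, 78]          # Diet 4 (Low protein, Beef)

t, _ = stats.ttest_ind(x, y)    # pooled-variance two-sample t-test (equal_var=True by default)
F, _ = stats.f_oneway(x, y)     # one-way ANOVA on the same two groups
print(t**2, F)                  # the two values should coincide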
  • 19. Factorial Experiments Analysis of Variance
  • 20. • Dependent variable Y • k Categorical independent variables A, B, C, … (the Factors) • Let – a = the number of categories of A – b = the number of categories of B – c = the number of categories of C – etc.
  • 21. The Completely Randomized Design • We form the set of all treatment combinations – the set of all combinations of the k factors • Total number of treatment combinations – t = abc… • In the completely randomized design, n experimental units (test animals, test plots, etc.) are randomly assigned to each treatment combination. – Total number of experimental units N = nt = nabc…
  • 22. The treatment combinations can be thought of as arranged in a k-dimensional rectangular block, with the a levels of A along one dimension and the b levels of B along another.
  • 23. (Diagram) The same arrangement extended to a third factor C, giving a three-dimensional block.
  • 24. • The completely randomized design just described, with the same number n of observations for every treatment combination, is called balanced. • If the number of observations per treatment combination is unequal, the design is called unbalanced (resulting in a mathematically more complex analysis and computations). • If for some of the treatment combinations there are no observations, the design is called incomplete. (In this case it may happen that some of the parameters – main effects and interactions – cannot be estimated.)
  • 25. Example: in this example we are examining the effect of the level of protein A (High or Low) and the source of protein B (Beef, Cereal, or Pork) on weight gains (grams) in rats. We have n = 10 test animals randomly assigned to each of the k = 6 diets.
  • 26. The k = 6 diets are the 6 = 3×2 Level-Source combinations 1. High - Beef 2. High - Cereal 3. High - Pork 4. Low - Beef 5. Low - Cereal 6. Low - Pork
  • 27. Table: gains in weight (grams) for rats under six diets differing in level of protein (High or Low) and source of protein (Beef, Cereal, or Pork). Diets 1–3 are the High-protein diets (Beef, Cereal, Pork) and Diets 4–6 the Low-protein diets (Beef, Cereal, Pork); the observations, means and standard deviations are those shown in the table on slide 12.
  • 28. Treatment combinations:
Level of Protein    Beef     Cereal    Pork
High                Diet 1   Diet 2    Diet 3
Low                 Diet 4   Diet 5    Diet 6
  • 29. Summary table of means:
Level of Protein    Beef     Cereal    Pork     Overall
High                100.00   85.90     99.50    95.13
Low                  79.20   83.90     78.70    80.60
Overall              89.60   84.90     89.10    87.87
  • 30. Profiles of the response relative to a factor: a graphical representation of the effect of a factor on a response variable (dependent variable).
  • 31. Profile of Y for A: a plot of Y against the levels 1, 2, 3, …, a of A. The profile could be for an individual case or averaged over a group of cases, and for a specific level of another factor or averaged over the levels of another factor.
  • 32. (Plot) Profiles of weight gain for source and level of protein: weight gain (roughly 70 to 110 grams) plotted against source of protein (Beef, Cereal, Pork), with one profile for High Protein, one for Low Protein, and one Overall.
  • 33. (Plot) The same profiles drawn against level of protein (High Protein, Low Protein), with one line each for Beef, Cereal, Pork, and Overall.
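Profile plots like the two described above can be drawn directly from the summary table of means on slide 29. A sketch, assuming matplotlib is available:

import matplotlib.pyplot as plt

sources = ["Beef", "Cereal", "Pork"]
means = {"High Protein": [100.0, 85.9, 99.5],
         "Low Protein": [79.2, 83.9, 78.7]}

for level, y in means.items():
    plt.plot(sources, y, marker="o", label=level)
plt.plot(sources, [89.6, 84.9, 89.1], "k--", marker="s", label="Overall")
plt.xlabel("Source of protein")
plt.ylabel("Weight gain (grams)")
plt.legend()
plt.show()    # non-parallel profiles would suggest a Level x Source interaction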
  • 34. Example – four-factor experiment. Four factors are studied for their effect on Y (luster of paint film). The four factors are: 1) film thickness (1 or 2 mils); 2) drying conditions (Regular or Special); 3) length of wash (20, 30, 40 or 60 minutes); and 4) temperature of wash (92 °C or 100 °C). Two observations of film luster (Y) are taken for each treatment combination.
  • 35. The data are tabulated below. Within each row the eight values are the two observations per treatment combination across the drying/temperature columns (Regular Dry 92 °C, Regular Dry 100 °C, Special Dry 92 °C, Special Dry 100 °C):
1-mil thickness
  20 min: 3.4 3.4 19.6 14.5 2.1 3.8 17.2 13.4
  30 min: 4.1 4.1 17.5 17.0 4.0 4.6 13.5 14.3
  40 min: 4.9 4.2 17.6 15.2 5.1 3.3 16.0 17.8
  60 min: 5.0 4.9 20.9 17.1 8.3 4.3 17.5 13.9
2-mil thickness
  20 min: 5.5 3.7 26.6 29.5 4.5 4.5 25.6 22.5
  30 min: 5.7 6.1 31.6 30.2 5.9 5.9 29.2 29.8
  40 min: 5.5 5.6 30.5 30.2 5.5 5.8 32.6 27.4
  60 min: 7.2 6.0 31.4 29.6 8.0 9.9 33.5 29.5
  • 36. Definition: A factor is said to not affect the response if the profile of the factor is horizontal for all combinations of levels of the other factors: No change in the response when you change the levels of the factor (true for all combinations of levels of the other factors) Otherwise the factor is said to affect the response:
  • 37. (Plot) Profile of Y for A when A affects the response: the profiles of Y over the levels 1, 2, 3, …, a of A, one per level of B, are not horizontal.
  • 38. (Plot) Profile of Y for A when A has no effect on the response: the profiles are horizontal across the levels of A.
  • 39. Definition: • Two (or more) factors are said to interact if changes in the response when you change the level of one factor depend on the level(s) of the other factor(s). • Profiles of the factor for different levels of the other factor(s) are not parallel • Otherwise the factors are said to be additive . • Profiles of the factor for different levels of the other factor(s) are parallel.
  • 40. (Plot) Interacting factors A and B: the profiles of Y over the levels 1, 2, 3, …, a of A, one per level of B, are not parallel.
  • 41. (Plot) Additive factors A and B: the profiles of Y over the levels of A, one per level of B, are parallel.
  • 42. • If two (or more) factors interact, each factor affects the response. • If two (or more) factors are additive, it still remains to be determined whether the factors affect the response. • In factorial experiments we are interested in determining – which factors affect the response and – which groups of factors interact.
  • 43. The testing in factorial experiments: 1. Test the higher-order interactions first. 2. If an interaction is present there is no need to test lower-order interactions or main effects involving those factors: all factors in the interaction affect the response and they interact. 3. The testing continues with lower-order interactions and main effects for factors which have not yet been determined to affect the response.
  • 44. Models for factorial Experiments
  • 45. The Single Factor Experiment Situation • We have t = a treatment combinations • Let µi and σ denote the mean and standard deviation of observations from treatment i. • i = 1, 2, 3, … a. • Note: we assume that the standard deviation for each population is the same. σ1 = σ2 = … = σa = σ
  • 46. The data • Assume we have collected data for each of the a treatments • Let yi1, yi2 , yi3 , … , yin denote the n observations for treatment i. • i = 1, 2, 3, … a.
  • 47. The model: $y_{ij} = \mu_i + (y_{ij} - \mu_i) = \mu_i + \varepsilon_{ij} = \mu + (\mu_i - \mu) + \varepsilon_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where $\varepsilon_{ij} = y_{ij} - \mu_i$ has a $N(0, \sigma^2)$ distribution, $\mu = \frac{1}{a}\sum_{i=1}^{a}\mu_i$ (overall mean effect) and $\alpha_i = \mu_i - \mu$ (effect of factor A). Note: $\sum_{i=1}^{a}\alpha_i = 0$ by definition.
  • 48. Model 1: $y_{ij}$ ($i = 1, \dots, a$; $j = 1, \dots, n$) are independent Normal with mean $\mu_i$ and variance $\sigma^2$. Model 2: $y_{ij} = \mu_i + \varepsilon_{ij}$, where the $\varepsilon_{ij}$ are independent Normal with mean 0 and variance $\sigma^2$. Model 3: $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where the $\varepsilon_{ij}$ are independent Normal with mean 0 and variance $\sigma^2$ and $\sum_{i=1}^{a}\alpha_i = 0$.
  • 49. The Two Factor Experiment Situation • We have t = ab treatment combinations • Let µij and σ denote the mean and standard deviation of observations from the treatment combination when A = i and B = j. • i = 1, 2, 3, … a, j = 1, 2, 3, … b.
  • 50. The data • Assume we have collected data (n observations) for each of the t = ab treatment combinations. • Let yij1, yij2 , yij3 , … , yijn denote the n observations for treatment combination - A = i, B = j. • i = 1, 2, 3, … a, j = 1, 2, 3, … b.
  • 51. The model: $y_{ijk} = \mu_{ij} + (y_{ijk} - \mu_{ij}) = \mu_{ij} + \varepsilon_{ijk} = \mu + (\mu_{i\bullet} - \mu) + (\mu_{\bullet j} - \mu) + (\mu_{ij} - \mu_{i\bullet} - \mu_{\bullet j} + \mu) + \varepsilon_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$, where $\varepsilon_{ijk} = y_{ijk} - \mu_{ij}$ has a $N(0, \sigma^2)$ distribution, $\mu = \frac{1}{ab}\sum_{i=1}^{a}\sum_{j=1}^{b}\mu_{ij}$, $\mu_{i\bullet} = \frac{1}{b}\sum_{j=1}^{b}\mu_{ij}$, $\mu_{\bullet j} = \frac{1}{a}\sum_{i=1}^{a}\mu_{ij}$, $\alpha_i = \mu_{i\bullet} - \mu$, $\beta_j = \mu_{\bullet j} - \mu$, and $(\alpha\beta)_{ij} = \mu_{ij} - \mu_{i\bullet} - \mu_{\bullet j} + \mu$.
  • 52. (The same decomposition as on the previous slide, with the note that $\sum_{i=1}^{a}\alpha_i = 0$ by definition.)
  • 53. The model – mean effect, main effects, interaction, error: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$, where the $\varepsilon_{ijk}$ ($i = 1, \dots, a$; $j = 1, \dots, b$; $k = 1, \dots, n$) are independent Normal with mean 0 and variance $\sigma^2$, and $\sum_{i=1}^{a}\alpha_i = 0$, $\sum_{j=1}^{b}\beta_j = 0$, and $\sum_{i=1}^{a}(\alpha\beta)_{ij} = \sum_{j=1}^{b}(\alpha\beta)_{ij} = 0$.
  • 54. Maximum likelihood estimates for $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$ under the assumptions above: $\hat{\mu} = \bar{y}_{\bullet\bullet\bullet} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}/(abn)$, $\hat{\alpha}_i = \bar{y}_{i\bullet\bullet} - \bar{y}_{\bullet\bullet\bullet} = \sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}/(bn) - \bar{y}_{\bullet\bullet\bullet}$, and $\hat{\beta}_j = \bar{y}_{\bullet j\bullet} - \bar{y}_{\bullet\bullet\bullet} = \sum_{i=1}^{a}\sum_{k=1}^{n} y_{ijk}/(an) - \bar{y}_{\bullet\bullet\bullet}$.
  • 55. $\widehat{(\alpha\beta)}_{ij} = \bar{y}_{ij\bullet} - \bar{y}_{i\bullet\bullet} - \bar{y}_{\bullet j\bullet} + \bar{y}_{\bullet\bullet\bullet} = \sum_{k=1}^{n} y_{ijk}/n - \bar{y}_{i\bullet\bullet} - \bar{y}_{\bullet j\bullet} + \bar{y}_{\bullet\bullet\bullet}$, and $\hat{\sigma}^2 = \frac{1}{nab}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \bar{y}_{ij\bullet})^2 = \frac{1}{nab}\sum_{i,j,k}\big(y_{ijk} - [\hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \widehat{(\alpha\beta)}_{ij}]\big)^2$. This is not an unbiased estimator of $\sigma^2$ (usually the case when estimating a variance); the unbiased estimator results when we divide by $ab(n-1)$ instead of $abn$.
  • 56. The unbiased estimator of $\sigma^2$ is $s^2 = \frac{1}{ab(n-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \bar{y}_{ij\bullet})^2 = \frac{1}{ab(n-1)}\sum_{i,j,k}\big(y_{ijk} - [\hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \widehat{(\alpha\beta)}_{ij}]\big)^2 = \frac{SS_{Error}}{ab(n-1)} = MS_{Error}$, where $SS_{Error} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \bar{y}_{ij\bullet})^2$.
  • 57. Testing for interaction: we want to test $H_0: (\alpha\beta)_{ij} = 0$ for all $i$ and $j$, against $H_A: (\alpha\beta)_{ij} \neq 0$ for at least one $i, j$. The test statistic is $F = \frac{MS_{AB}}{MS_{Error}}$ with $MS_{AB} = \frac{SS_{AB}}{(a-1)(b-1)}$, where $SS_{AB} = n\sum_{i=1}^{a}\sum_{j=1}^{b}\widehat{(\alpha\beta)}_{ij}^{\,2} = n\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{y}_{ij\bullet} - \bar{y}_{i\bullet\bullet} - \bar{y}_{\bullet j\bullet} + \bar{y}_{\bullet\bullet\bullet})^2$.
  • 58. We reject $H_0: (\alpha\beta)_{ij} = 0$ for all $i$ and $j$ if $F = \frac{MS_{AB}}{MS_{Error}} > F_\alpha\big((a-1)(b-1),\ ab(n-1)\big)$.
  • 59. Testing for the main effect of A: we want to test $H_0: \alpha_i = 0$ for all $i$, against $H_A: \alpha_i \neq 0$ for at least one $i$. The test statistic is $F = \frac{MS_A}{MS_{Error}}$ with $MS_A = \frac{SS_A}{a-1}$, where $SS_A = nb\sum_{i=1}^{a}\hat{\alpha}_i^2 = nb\sum_{i=1}^{a}(\bar{y}_{i\bullet\bullet} - \bar{y}_{\bullet\bullet\bullet})^2$.
  • 60. We reject $H_0: \alpha_i = 0$ for all $i$ if $F = \frac{MS_A}{MS_{Error}} > F_\alpha\big(a-1,\ ab(n-1)\big)$.
  • 61. Testing for the main effect of B: we want to test $H_0: \beta_j = 0$ for all $j$, against $H_A: \beta_j \neq 0$ for at least one $j$. The test statistic is $F = \frac{MS_B}{MS_{Error}}$ with $MS_B = \frac{SS_B}{b-1}$, where $SS_B = na\sum_{j=1}^{b}\hat{\beta}_j^2 = na\sum_{j=1}^{b}(\bar{y}_{\bullet j\bullet} - \bar{y}_{\bullet\bullet\bullet})^2$.
  • 62. We reject $H_0: \beta_j = 0$ for all $j$ if $F = \frac{MS_B}{MS_{Error}} > F_\alpha\big(b-1,\ ab(n-1)\big)$.
  • 63. The ANOVA table:
Source   S.S.       d.f.             MS = SS/df   F
A        SS_A       a − 1            MS_A         MS_A / MS_Error
B        SS_B       b − 1            MS_B         MS_B / MS_Error
AB       SS_AB      (a − 1)(b − 1)   MS_AB        MS_AB / MS_Error
Error    SS_Error   ab(n − 1)        MS_Error
Total    SS_Total   abn − 1
  • 64. Computing formulae: let $T_{\bullet\bullet\bullet} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}$, $T_{i\bullet\bullet} = \sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}$, $T_{\bullet j\bullet} = \sum_{i=1}^{a}\sum_{k=1}^{n} y_{ijk}$, and $T_{ij\bullet} = \sum_{k=1}^{n} y_{ijk}$. Then $SS_{Total} = \sum_{i,j,k} y_{ijk}^2 - \frac{T_{\bullet\bullet\bullet}^2}{nab}$, $SS_A = \sum_{i=1}^{a}\frac{T_{i\bullet\bullet}^2}{nb} - \frac{T_{\bullet\bullet\bullet}^2}{nab}$, $SS_B = \sum_{j=1}^{b}\frac{T_{\bullet j\bullet}^2}{na} - \frac{T_{\bullet\bullet\bullet}^2}{nab}$, and $SS_{AB} = \sum_{i=1}^{a}\sum_{j=1}^{b}\frac{T_{ij\bullet}^2}{n} - \sum_{i=1}^{a}\frac{T_{i\bullet\bullet}^2}{nb} - \sum_{j=1}^{b}\frac{T_{\bullet j\bullet}^2}{na} + \frac{T_{\bullet\bullet\bullet}^2}{nab}$.
  • 65. and $SS_{Error} = SS_{Total} - SS_A - SS_B - SS_{AB}$.
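These computing formulae are straightforward to apply to the rat weight-gain example (a = 2 protein levels, b = 3 sources, n = 10). A sketch, assuming numpy is available:

import numpy as np

# cells[i][j] = the n = 10 observations for protein level i (0 = High, 1 = Low)
# and source j (0 = Beef, 1 = Cereal, 2 = Pork), i.e. Diets 1-6 of the example.
cells = np.array([
    [[73, 102, 118, 104, 81, 107, 100, 87, 117, 111],
     [98, 74, 56, 111, 95, 88, 82, 77, 86, 92],
     [94, 79, 96, 98, 102, 102, 108, 91, 120, 105]],
    [[90, 76, 90, 64, 86, 51, 72, 90, 95, 78],
     [107, 95, 97, 80, 98, 74, 74, 67, 89, 58],
     [49, 82, 73, 86, 81, 97, 106, 70, 61, 82]],
], dtype=float)
a, b, n = cells.shape

T_ij = cells.sum(axis=2)      # cell totals   T_{ij.}
T_i = T_ij.sum(axis=1)        # level totals  T_{i..}
T_j = T_ij.sum(axis=0)        # source totals T_{.j.}
T = T_ij.sum()                # grand total   T_{...}
cf = T**2 / (n * a * b)       # correction term T_{...}^2 / (nab)

ss_total = (cells**2).sum() - cf
ss_A = (T_i**2).sum() / (n * b) - cf
ss_B = (T_j**2).sum() / (n * a) - cf
ss_AB = (T_ij**2).sum() / n - (T_i**2).sum() / (n * b) - (T_j**2).sum() / (n * a) + cf
ss_error = ss_total - ss_A - ss_B - ss_AB
print(ss_A, ss_B, ss_AB, ss_error)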
  • 67. One way Multivariate Analysis of Variance (MANOVA) Comparing k p-variate Normal Populations
  • 68. The test for comparing k mean vectors. Situation: we have k p-variate normal populations. Let $\vec{\mu}_i$ and $\Sigma$ denote the mean vector and covariance matrix of population i, $i = 1, 2, 3, \dots, k$. Note: we assume that the covariance matrix for each population is the same, $\Sigma_1 = \Sigma_2 = \cdots = \Sigma_k = \Sigma$.
  • 69. We want to test $H_0: \vec{\mu}_1 = \vec{\mu}_2 = \vec{\mu}_3 = \cdots = \vec{\mu}_k$ against $H_A: \vec{\mu}_i \neq \vec{\mu}_j$ for at least one pair $i, j$.
  • 70. The data: assume we have collected data from each of k populations. Let $\vec{x}_{i1}, \vec{x}_{i2}, \dots, \vec{x}_{in}$ denote the n observations from population i, $i = 1, 2, 3, \dots, k$.
  • 71. Computing formulae — compute: 1) $\vec{T}_i = \sum_{j=1}^{n}\vec{x}_{ij} = (T_{1i}, \dots, T_{pi})'$ = total vector for sample i, with components $T_{\ell i} = \sum_{j=1}^{n} x_{\ell ij}$; 2) $\vec{G} = \sum_{i=1}^{k}\vec{T}_i = \sum_{i=1}^{k}\sum_{j=1}^{n}\vec{x}_{ij} = (G_1, \dots, G_p)'$ = grand total vector; 3) $N = kn$ = total sample size.
  • 72. 4) $\sum_{i=1}^{k}\sum_{j=1}^{n}\vec{x}_{ij}\vec{x}_{ij}'$, the $p \times p$ matrix whose $(\ell, m)$ element is $\sum_{i=1}^{k}\sum_{j=1}^{n} x_{\ell ij}\,x_{m ij}$; 5) $\frac{1}{n}\sum_{i=1}^{k}\vec{T}_i\vec{T}_i'$, the $p \times p$ matrix whose $(\ell, m)$ element is $\frac{1}{n}\sum_{i=1}^{k} T_{\ell i}T_{m i}$.
  • 73. Let $H = \frac{1}{n}\sum_{i=1}^{k}\vec{T}_i\vec{T}_i' - \frac{1}{N}\vec{G}\vec{G}'$. Its $(\ell, m)$ element is $\frac{1}{n}\sum_{i=1}^{k} T_{\ell i}T_{m i} - \frac{G_\ell G_m}{N} = n\sum_{i=1}^{k}(\bar{x}_{\ell i\bullet} - \bar{x}_{\ell\bullet\bullet})(\bar{x}_{m i\bullet} - \bar{x}_{m\bullet\bullet})$; H is the between-samples SS and SP (sums of squares and products) matrix.
  • 74. Let $E = \sum_{i=1}^{k}\sum_{j=1}^{n}\vec{x}_{ij}\vec{x}_{ij}' - \frac{1}{n}\sum_{i=1}^{k}\vec{T}_i\vec{T}_i'$. Its $(\ell, m)$ element is $\sum_{i=1}^{k}\sum_{j=1}^{n} x_{\ell ij}x_{m ij} - \frac{1}{n}\sum_{i=1}^{k} T_{\ell i}T_{m i} = \sum_{i=1}^{k}\sum_{j=1}^{n}(x_{\ell ij} - \bar{x}_{\ell i\bullet})(x_{m ij} - \bar{x}_{m i\bullet})$; E is the within-samples SS and SP matrix.
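For data arranged as one array per population, H and E can be assembled directly from the group means, as in the following sketch (assuming numpy; `groups`, a list of n × p arrays, is a hypothetical input name):

import numpy as np

def manova_matrices(groups):
    """Between (H) and within (E) SS and SP matrices for a list of (n_i x p) arrays."""
    X = np.vstack(groups)
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    H = np.zeros((p, p))
    E = np.zeros((p, p))
    for g in groups:
        g = np.asarray(g, dtype=float)
        m = g.mean(axis=0)
        d = (m - grand_mean)[:, None]
        H += len(g) * (d @ d.T)        # n_i (xbar_i - xbar)(xbar_i - xbar)'
        c = g - m
        E += c.T @ c                   # sum over j of (x_ij - xbar_i)(x_ij - xbar_i)'
    return H, E

Feeding it the three regional samples from the later example should reproduce the H and E matrices shown on slides 85 and 86.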
  • 75. The MANOVA table: Between — the SS and SP matrix $H = (h_{\ell m})$, $\ell, m = 1, \dots, p$; Within — the SS and SP matrix $E = (e_{\ell m})$.
  • 76. There are several test statistics for testing $H_0: \vec{\mu}_1 = \vec{\mu}_2 = \vec{\mu}_3 = \cdots = \vec{\mu}_k$ against $H_A: \vec{\mu}_i \neq \vec{\mu}_j$ for at least one pair $i, j$.
  • 77. 1. Roy’s largest root λ1 = largest eigenvalue of HE −1 This test statistic is derived using Roy’s union intersection principle 2. Wilk’s lambda (Λ) E 1 Λ= = H+E HE −1 + I This test statistic is derived using the generalized Likelihood ratio principle
  • 78. 3. Lawley–Hotelling trace statistic: $T_0^2 = \mathrm{tr}(HE^{-1})$ = sum of the eigenvalues of $HE^{-1}$. 4. Pillai trace statistic (V): $V = \mathrm{tr}\big(H(H + E)^{-1}\big)$.
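All four statistics are simple functions of the eigenvalues of $HE^{-1}$, so they can be computed together. A sketch, assuming numpy (the equivalences noted in the comments follow from $|H + E| = |E|\,|I + E^{-1}H|$):

import numpy as np

def manova_statistics(H, E):
    """Roy, Wilks, Lawley-Hotelling and Pillai statistics from the H and E matrices."""
    lam = np.sort(np.linalg.eigvals(np.linalg.solve(E, H)).real)[::-1]  # eigenvalues of E^-1 H
    return {
        "Roy's largest root": lam[0],
        "Wilks' lambda": float(np.prod(1.0 / (1.0 + lam))),    # = det(E) / det(H + E)
        "Lawley-Hotelling trace": lam.sum(),                   # = tr(H E^-1)
        "Pillai trace": float(np.sum(lam / (1.0 + lam))),      # = tr(H (H + E)^-1)
    }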
  • 79. Example: in the following study, n = 15 first-year university students from each of three school regions (A, B and C), each taking the same four courses (Math, Biology, English and Sociology), were observed. The marks on these courses are tabulated on the following slide.
  • 80. The data (marks in Math, Biology, English and Sociology for each student, by educational region):
Student   Region A         Region B         Region C
   1      62 65 67 76      65 55 35 43      47 47 98 78
   2      54 61 75 70      87 81 59 64      57 69 68 45
   3      53 53 53 59      75 67 56 68      65 71 77 62
   4      48 56 73 81      74 70 55 66      41 64 68 58
   5      60 55 49 60      83 71 40 52      56 54 86 64
   6      55 52 34 41      59 48 48 57      63 73 88 76
   7      76 71 35 40      61 47 46 54      43 62 84 78
   8      58 52 58 46      81 77 51 45      28 47 65 58
   9      75 71 60 59      77 68 42 49      47 54 90 78
  10      55 51 69 75      82 84 63 70      42 44 79 73
  11      72 74 64 59      68 64 35 44      50 53 89 89
  12      72 75 51 47      60 53 60 65      46 61 91 82
  13      76 69 69 57      94 88 51 63      74 78 99 86
  14      44 48 65 65      96 88 67 81      63 66 94 86
  15      89 71 59 67      84 75 46 67      69 82 78 73
  • 81. Summary statistics (variables ordered Math, Biology, English, Sociology):
$\bar{x}_A' = (63.267,\ 61.600,\ 58.733,\ 60.133)$, $\bar{x}_B' = (76.400,\ 69.067,\ 50.267,\ 59.200)$, $\bar{x}_C' = (52.733,\ 61.667,\ 83.600,\ 72.400)$,
$\bar{x}_\bullet' = \frac{15\bar{x}_A' + 15\bar{x}_B' + 15\bar{x}_C'}{45} = (64.133,\ 64.111,\ 64.200,\ 63.911)$.
Sample covariance matrices:
S_A:
  160.638   104.829   -32.638   -47.110
  104.829    92.543    -4.900   -22.229
  -32.638    -4.900   155.638   128.967
  -47.110   -22.229   128.967   159.552
S_B:
  141.257   155.829    45.100    60.914
  155.829   185.924    61.767    71.057
   45.100    61.767    96.495    93.371
   60.914    71.057    93.371   123.600
S_C:
  156.067   116.976    53.814    35.257
  116.976   136.381     3.143    -0.429
   53.814     3.143   116.543   114.886
   35.257    -0.429   114.886   156.400
$S_{Pooled} = \frac{14 S_A + 14 S_B + 14 S_C}{42}$:
  152.654   125.878    22.092    16.354
  125.878   138.283    20.003    16.133
   22.092    20.003   122.892   112.408
   16.354    16.133   112.408   146.517
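The pooled covariance matrix above is just the degrees-of-freedom-weighted average of the three sample covariance matrices. A small sketch, assuming numpy (`groups` would be the three 15 × 4 arrays of marks from the previous slide):

import numpy as np

def pooled_covariance(groups):
    """Degrees-of-freedom weighted average of the sample covariance matrices."""
    dfs = [len(g) - 1 for g in groups]
    S = [np.cov(np.asarray(g, dtype=float), rowvar=False) for g in groups]
    return sum(df * Si for df, Si in zip(dfs, S)) / sum(dfs)

# means = [np.mean(g, axis=0) for g in groups]   # xbar_A, xbar_B, xbar_C for the example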
  • 82. Computations: 1) $\vec{T}_i = \sum_{j=1}^{n}\vec{x}_{ij}$ = total vector for sample i; 2) $\vec{G} = \sum_{i=1}^{k}\vec{T}_i = \sum_{i=1}^{k}\sum_{j=1}^{n}\vec{x}_{ij}$ = grand total vector; 3) N = kn = total sample size = 45.
Totals             Math   Biology   English   Sociology
A                   949       924       881         902
B                  1146      1036       754         888
C                   791       925      1254        1086
Grand totals G     2886      2885      2889        2876
  • 83. 4) $\sum_{i=1}^{k}\sum_{j=1}^{n}\vec{x}_{ij}\vec{x}_{ij}'$ =
  195718   191674   180399   182865
  191674   191321   184516   184542
  180399   184516   199641   193125
  182865   184542   193125   191590
  • 84. 5) $\frac{1}{n}\sum_{i=1}^{k}\vec{T}_i\vec{T}_i'$ =
  189306.53   186387.13   179471.13   182178.13
  186387.13   185513.13   183675.87   183864.40
  179471.13   183675.87   194479.53   188403.87
  182178.13   183864.40   188403.87   185436.27
  • 85. Now $H = \frac{1}{n}\sum_{i=1}^{k}\vec{T}_i\vec{T}_i' - \frac{1}{N}\vec{G}\vec{G}'$ =
   4217.733333    1362.466667   -5810.066667   -2269.333333
   1362.466667     552.5777778  -1541.133333    -519.1555556
  -5810.066667   -1541.133333    9005.733333    3764.666667
  -2269.333333    -519.1555556   3764.666667    1627.911111
= the between SS and SP matrix.
  • 86. $E = \sum_{i=1}^{k}\sum_{j=1}^{n}\vec{x}_{ij}\vec{x}_{ij}' - \frac{1}{n}\sum_{i=1}^{k}\vec{T}_i\vec{T}_i'$ =
  6411.467   5286.867    927.867    686.867
  5286.867   5807.867    840.133    677.600
   927.867    840.133   5161.467   4721.133
   686.867    677.600   4721.133   6153.733
= the within SS and SP matrix.
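Plugging these H and E matrices into the eigenvalue computation sketched after slide 78 gives the four test statistics for this example; the printed values should be comparable with the SPSS multivariate table on slide 89. A self-contained sketch, assuming numpy:

import numpy as np

H = np.array([[ 4217.733333,  1362.466667, -5810.066667, -2269.333333],
              [ 1362.466667,   552.577778, -1541.133333,  -519.155556],
              [-5810.066667, -1541.133333,  9005.733333,  3764.666667],
              [-2269.333333,  -519.155556,  3764.666667,  1627.911111]])
E = np.array([[6411.467, 5286.867,  927.867,  686.867],
              [5286.867, 5807.867,  840.133,  677.600],
              [ 927.867,  840.133, 5161.467, 4721.133],
              [ 686.867,  677.600, 4721.133, 6153.733]])

lam = np.sort(np.linalg.eigvals(np.linalg.solve(E, H)).real)[::-1]
print("Roy's largest root     :", lam[0])                    # compare with 4.891 in the SPSS table
print("Wilks' lambda          :", np.prod(1 / (1 + lam)))    # compare with .161
print("Lawley-Hotelling trace :", lam.sum())                 # compare with 4.947
print("Pillai trace           :", np.sum(lam / (1 + lam)))   # compare with .883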
  • 87. Using SPSS to perform MANOVA
  • 88. Selecting the variables and the Factors
  • 89. The output — Multivariate Tests(c):
Effect        Test                 Value    F          Hypothesis df   Error df   Sig.
Intercept     Pillai's Trace       .984     586.890a   4.000           39.000     .000
              Wilks' Lambda        .016     586.890a   4.000           39.000     .000
              Hotelling's Trace    60.194   586.890a   4.000           39.000     .000
              Roy's Largest Root   60.194   586.890a   4.000           39.000     .000
High_School   Pillai's Trace       .883     7.913      8.000           80.000     .000
              Wilks' Lambda        .161     14.571a    8.000           78.000     .000
              Hotelling's Trace    4.947    23.501     8.000           76.000     .000
              Roy's Largest Root   4.891    48.913b    4.000           40.000     .000
a. Exact statistic. b. The statistic is an upper bound on F that yields a lower bound on the significance level. c. Design: Intercept+High_School.
  • 90. Univariate tests — Tests of Between-Subjects Effects:
Source            Dependent Variable   Type III SS    df   Mean Square   F          Sig.
Corrected Model   Math                 4217.733a      2    2108.867      13.815     .000
                  Biology              552.578b       2    276.289       1.998      .148
                  English              9005.733c      2    4502.867      36.641     .000
                  Sociology            1627.911d      2    813.956       5.555      .007
Intercept         Math                 185088.800     1    185088.800    1212.473   .000
                  Biology              184960.556     1    184960.556    1337.555   .000
                  English              185473.800     1    185473.800    1509.241   .000
                  Sociology            183808.356     1    183808.356    1254.515   .000
High_School       Math                 4217.733       2    2108.867      13.815     .000
                  Biology              552.578        2    276.289       1.998      .148
                  English              9005.733       2    4502.867      36.641     .000
                  Sociology            1627.911       2    813.956       5.555      .007
Error             Math                 6411.467       42   152.654
                  Biology              5807.867       42   138.283
                  English              5161.467       42   122.892
                  Sociology            6153.733       42   146.517
Total             Math                 195718.000     45
                  Biology              191321.000     45
                  English              199641.000     45
                  Sociology            191590.000     45
Corrected Total   Math                 10629.200      44
                  Biology              6360.444       44
                  English              14167.200      44
                  Sociology            7781.644       44
a. R Squared = .397 (Adjusted R Squared = .368); b. R Squared = .087 (Adjusted R Squared = .043); c. R Squared = .636 (Adjusted R Squared = .618); d. R Squared = .209 (Adjusted R Squared = .172).
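The same analysis can be run outside SPSS. The sketch below (assuming pandas and a recent statsmodels are installed) rebuilds the 45-student data frame from the table on slide 80 and asks statsmodels' MANOVA for the multivariate tests; the column and factor names simply mirror the SPSS setup, with High_School as the grouping factor.

import pandas as pd
from statsmodels.multivariate.manova import MANOVA

# Marks (Math, Biology, English, Sociology) for the 15 students in each region (slide 80).
marks = {
    "A": [(62,65,67,76),(54,61,75,70),(53,53,53,59),(48,56,73,81),(60,55,49,60),
          (55,52,34,41),(76,71,35,40),(58,52,58,46),(75,71,60,59),(55,51,69,75),
          (72,74,64,59),(72,75,51,47),(76,69,69,57),(44,48,65,65),(89,71,59,67)],
    "B": [(65,55,35,43),(87,81,59,64),(75,67,56,68),(74,70,55,66),(83,71,40,52),
          (59,48,48,57),(61,47,46,54),(81,77,51,45),(77,68,42,49),(82,84,63,70),
          (68,64,35,44),(60,53,60,65),(94,88,51,63),(96,88,67,81),(84,75,46,67)],
    "C": [(47,47,98,78),(57,69,68,45),(65,71,77,62),(41,64,68,58),(56,54,86,64),
          (63,73,88,76),(43,62,84,78),(28,47,65,58),(47,54,90,78),(42,44,79,73),
          (50,53,89,89),(46,61,91,82),(74,78,99,86),(63,66,94,86),(69,82,78,73)],
}
rows = [(region, *m) for region, students in marks.items() for m in students]
df = pd.DataFrame(rows, columns=["High_School", "Math", "Biology", "English", "Sociology"])

fit = MANOVA.from_formula("Math + Biology + English + Sociology ~ High_School", data=df)
print(fit.mv_test())   # reports Wilks' lambda, Pillai's trace, Hotelling-Lawley trace, Roy's root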