Dummy Variable Models
Multiple Regression. Sandy: pages 603-613. Also read the paper titled “Using Dummy Variables in Wage Discrimination Cases”.
Are Male Nurses Discriminated Against? [Figure: wages plotted against years of experience, $X_i$, for male nurses and female nurses, comparing the male-female wage difference not adjusted for experience with the difference adjusted for experience.]
I. Dummy Variables: adjusting the intercept; adjusting the slope; adjusting both intercept and slope.
Intercept Dummy Variables
Dummy variables are binary (0,1): $D_t = 1$ if red car, $D_t = 0$ otherwise.
$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + e_t$
$y_t$ = speed of car in miles per hour; $X_t$ = age of car in years.
Police: red cars travel faster. $H_0: \beta_3 = 0$ vs. $H_1: \beta_3 > 0$
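$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + e_t$
Red cars ($D_t = 1$): $y_t = (\beta_1 + \beta_3) + \beta_2 X_t + e_t$
Other cars ($D_t = 0$): $y_t = \beta_1 + \beta_2 X_t + e_t$
[Figure: two parallel lines with common slope $\beta_2$; vertical axis $y_t$ (miles per hour), horizontal axis $X_t$ (age in years); intercept $\beta_1 + \beta_3$ for red cars and $\beta_1$ for other cars.]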
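The intercept-dummy model above can be sketched in a few lines. This is a minimal illustration on simulated data: the sample size, the coefficient values (70, -1, 5), the noise level, and the helper name `fit_ols` are all assumptions made for the example, not figures from the slides.

```python
import numpy as np

def fit_ols(X, y):
    """Return least-squares coefficients for y = X b + e."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)                 # simulated data, illustration only
n = 200
age = rng.uniform(0, 15, n)                    # X_t: age of car in years
red = (rng.random(n) < 0.3).astype(float)      # D_t: 1 if red car, 0 otherwise
speed = 70.0 - 1.0 * age + 5.0 * red + rng.normal(0, 2.0, n)

# Columns: constant, X_t, D_t
X = np.column_stack([np.ones(n), age, red])
b1, b2, b3 = fit_ols(X, speed)

intercept_other = b1          # other cars: b1 + b2 * age
intercept_red = b1 + b3       # red cars: (b1 + b3) + b2 * age
```

Both groups share the slope `b2`; the dummy only shifts the line up or down by `b3`.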
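Slope Dummy Variables
$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t X_t + e_t$
Stock portfolio ($D_t = 1$): $y_t = \beta_1 + (\beta_2 + \beta_3) X_t + e_t$
Bond portfolio ($D_t = 0$): $y_t = \beta_1 + \beta_2 X_t + e_t$
$\beta_1$ = initial investment.
[Figure: two lines from the common intercept $\beta_1$; vertical axis $y_t$ (value of portfolio), horizontal axis $X_t$ (years); slope $\beta_2 + \beta_3$ for stocks and $\beta_2$ for bonds.]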
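Different Intercepts & Slopes
$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + \beta_4 D_t X_t + e_t$
“Miracle” seed ($D_t = 1$): $y_t = (\beta_1 + \beta_3) + (\beta_2 + \beta_4) X_t + e_t$
Regular seed ($D_t = 0$): $y_t = \beta_1 + \beta_2 X_t + e_t$
[Figure: vertical axis $y_t$ (harvest weight of corn), horizontal axis $X_t$ (rainfall); the “miracle” line has intercept $\beta_1 + \beta_3$ and slope $\beta_2 + \beta_4$, the regular line has intercept $\beta_1$ and slope $\beta_2$.]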
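One way to see what the intercept-and-slope dummy model does is to compare it with fitting each group separately. The sketch below uses simulated data; the rainfall units, coefficient values, and the helper names `fit_ols` and `separate_fit` are assumptions for illustration only.

```python
import numpy as np

def fit_ols(X, y):
    """Return least-squares coefficients for y = X b + e."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)                      # simulated data
n = 120
rain = rng.uniform(10, 40, n)                       # X_t: rainfall (hypothetical units)
miracle = (np.arange(n) % 2 == 0).astype(float)     # D_t: 1 = "miracle" seed, 0 = regular
harvest = 5 + 0.8 * rain + miracle * (2 + 0.3 * rain) + rng.normal(0, 1.5, n)

# One regression with both an intercept dummy and a slope dummy
X = np.column_stack([np.ones(n), rain, miracle, miracle * rain])
b1, b2, b3, b4 = fit_ols(X, harvest)
regular_line = (b1, b2)                 # (intercept, slope) when D_t = 0
miracle_line = (b1 + b3, b2 + b4)       # (intercept, slope) when D_t = 1

def separate_fit(mask):
    """Fit intercept and slope using only the observations selected by mask."""
    Xg = np.column_stack([np.ones(mask.sum()), rain[mask]])
    return tuple(fit_ols(Xg, harvest[mask]))

regular_sep = separate_fit(miracle == 0)
miracle_sep = separate_fit(miracle == 1)
```

The two group lines implied by the single dummy regression coincide with the two separate regressions, which is why this model is described as allowing both intercept and slope to differ.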
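Testing for discrimination in starting wage
$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + e_t$, where $y_t$ = wage rate and $X_t$ = years of experience.
For men, $D_t = 1$: $y_t = (\beta_1 + \beta_3) + \beta_2 X_t + e_t$
For women, $D_t = 0$: $y_t = \beta_1 + \beta_2 X_t + e_t$
$H_0: \beta_3 = 0$ vs. $H_1: \beta_3 > 0$
[Figure: parallel wage lines with slope $\beta_2$; intercept $\beta_1 + \beta_3$ for men and $\beta_1$ for women.]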
$y_t = \beta_1 + \beta_5 X_t + \beta_6 D_t X_t + e_t$, where $y_t$ = wage rate and $X_t$ = years of experience.
For men, $D_t = 1$: $y_t = \beta_1 + (\beta_5 + \beta_6) X_t + e_t$
For women, $D_t = 0$: $y_t = \beta_1 + \beta_5 X_t + e_t$
Men and women have the same starting wage, $\beta_1$, but their wage rates increase at different rates (difference = $\beta_6$). $\beta_6 > 0$ means that men's wage rates are increasing faster than women's wage rates.
[Figure: two wage lines from the common intercept $\beta_1$; slope $\beta_5 + \beta_6$ for men and $\beta_5$ for women.]
An Ineffective Affirmative Action Plan
$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + \beta_4 D_t X_t + e_t$, where $y_t$ = wage rate and $X_t$ = years of experience.
Men ($D_t = 1$): $y_t = (\beta_1 + \beta_3) + (\beta_2 + \beta_4) X_t + e_t$
Women ($D_t = 0$): $y_t = \beta_1 + \beta_2 X_t + e_t$
Women are given a higher starting wage, $\beta_1$, while men get the lower starting wage, $\beta_1 + \beta_3$ (note: $\beta_3 < 0$). But men get a faster rate of increase in their wages, $\beta_2 + \beta_4$, which is higher than the rate of increase for women, $\beta_2$ (since $\beta_4 > 0$).
[Figure: the women's line starts higher at $\beta_1$ with slope $\beta_2$; the men's line starts lower at $\beta_1 + \beta_3$ with the steeper slope $\beta_2 + \beta_4$.]
Testing Qualitative Effects
1. Test for differences in intercept.
2. Test for differences in slope.
3. Test for differences in both intercept and slope.
$Y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + \beta_4 D_t X_t + e_t$, with men: $D_t = 1$; women: $D_t = 0$.
Intercept (testing for discrimination in starting wage): $H_0: \beta_3 = 0$ vs. $H_1: \beta_3 > 0$, using $\dfrac{b_3}{\sqrt{\text{Est.Var}(b_3)}} \sim t_{n-4}$ under $H_0$.
Slope (testing for discrimination in wage increases): $H_0: \beta_4 = 0$ vs. $H_1: \beta_4 > 0$, using $\dfrac{b_4}{\sqrt{\text{Est.Var}(b_4)}} \sim t_{n-4}$ under $H_0$.
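The two t statistics can be sketched as follows. This is an illustration on simulated data; the coefficient values, the helper name `ols_with_cov`, and the use of 1.645 as an approximate large-sample one-sided 5% critical value are assumptions, not part of the slides.

```python
import numpy as np

def ols_with_cov(X, y):
    """Return OLS coefficients and their estimated covariance matrix s^2 (X'X)^-1."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    b = xtx_inv @ X.T @ y
    resid = y - X @ b
    s2 = resid @ resid / (n - k)       # estimated error variance with n - k degrees of freedom
    return b, s2 * xtx_inv

rng = np.random.default_rng(2)         # simulated data
n = 200
exper = rng.uniform(0, 20, n)                      # X_t: years of experience
male = (np.arange(n) < n // 2).astype(float)       # D_t: 1 for men, 0 for women
wage = 10 + 0.5 * exper + 3.0 * male + 0.2 * male * exper + rng.normal(0, 1.0, n)

X = np.column_stack([np.ones(n), exper, male, male * exper])
b, cov = ols_with_cov(X, wage)
t_stats = b / np.sqrt(np.diag(cov))    # each coefficient divided by its standard error
t_intercept, t_slope = t_stats[2], t_stats[3]

critical_value = 1.645                 # assumed approximation for a one-sided 5% test
reject_intercept = t_intercept > critical_value
reject_slope = t_slope > critical_value
```

With a small sample, the critical value would come from the $t_{n-4}$ distribution rather than the normal approximation used here.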
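Why NOW wants a one-sided test and Chauvinist Industries wants a two-sided test: at the same significance level, the one-sided critical value is smaller in the direction of discrimination, so the null hypothesis of no discrimination is easier to reject.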
Are Two Regressions Equal? Variations of “The Chow Test”
I. Assuming equal variances (pooling):
$y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + \beta_4 D_t X_t + e_t$
$y_t$ = wage rate; $X_t$ = years of experience; men: $D_t = 1$; women: $D_t = 0$.
$H_0: \beta_3 = \beta_4 = 0$ vs. $H_1$: otherwise.
This model assumes equal wage rate variance.
Testing $H_0: \beta_3 = \beta_4 = 0$ vs. $H_1$: otherwise (intercept and slope):
$SSE_R = \sum_{t=1}^{T} (y_t - b_1 - b_2 X_t)^2$
$SSE_U = \sum_{t=1}^{T} (y_t - b_1 - b_2 X_t - b_3 D_t - b_4 D_t X_t)^2$
$\dfrac{(SSE_R - SSE_U)/2}{SSE_U/(T-4)} \sim F_{2,\,T-4}$
II. Allowing for unequal variances (running three regressions):
Everyone: $y_t = \beta_1 + \beta_2 X_t + e_t$, giving $SSE_R$ (forcing men and women to have the same $\beta_1$, $\beta_2$).
Men only: $y_{tm} = \beta_1 + \beta_2 X_{tm} + e_{tm}$, giving $SSE_m$.
Women only: $y_{tw} = \beta_1 + \beta_2 X_{tw} + e_{tw}$, giving $SSE_w$.
The separate regressions allow men and women to be different, where $SSE_U = SSE_m + SSE_w$.
$F = \dfrac{(SSE_R - SSE_U)/J}{SSE_U/(T-K)}$, with $J$ = number of restrictions and $K$ = number of unrestricted coefficients; here $J = 2$ and $K = 4$.
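The F statistic can be computed either from the dummy-variable regression or from the three separate regressions. The following sketch uses simulated wage data; the coefficient values and the helper names `sse` and `group_sse` are assumptions for illustration.

```python
import numpy as np

def sse(X, y):
    """Sum of squared residuals from the least-squares fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return float(resid @ resid)

rng = np.random.default_rng(3)         # simulated data
T = 160
exper = rng.uniform(0, 20, T)          # X_t: years of experience
male = (np.arange(T) % 2 == 0)         # boolean mask: True for men
wage = 10 + 0.5 * exper + male * (2.0 + 0.1 * exper) + rng.normal(0, 1.0, T)

ones = np.ones(T)
d = male.astype(float)

# Restricted: everyone shares one intercept and one slope
sse_r = sse(np.column_stack([ones, exper]), wage)

# Unrestricted, version I: one regression with intercept and slope dummies
sse_u_pooled = sse(np.column_stack([ones, exper, d, d * exper]), wage)

# Unrestricted, version II: separate regressions for men and for women
def group_sse(mask):
    return sse(np.column_stack([np.ones(mask.sum()), exper[mask]]), wage[mask])

sse_u_split = group_sse(male) + group_sse(~male)

J, K = 2, 4                            # restrictions, unrestricted coefficients
f_stat = ((sse_r - sse_u_split) / J) / (sse_u_split / (T - K))
```

In this sketch the two ways of obtaining $SSE_U$ agree numerically, so the same F value results from either route.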
Polynomial Terms: Polynomial Regression
$y_t = \beta_1 + \beta_2 X_t + \beta_3 X_t^2 + \beta_4 X_t^3 + e_t$
Linear in parameters but nonlinear in variables: $y_t$ = income; $X_t$ = age.
People retire at different ages or not at all.
[Figure: income, $y_t$, plotted against age, $X_t$, for ages 20 to 90.]
Polynomial Regression
$y_t = \beta_1 + \beta_2 X_t + \beta_3 X_t^2 + \beta_4 X_t^3 + e_t$, with $y_t$ = income and $X_t$ = age.
Rate at which income is changing as we age (the slope changes as $X_t$ changes):
$\dfrac{\partial y_t}{\partial X_t} = \beta_2 + 2\beta_3 X_t + 3\beta_4 X_t^2$
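A quick numerical check of the slope formula, using made-up coefficient values (all four numbers below are hypothetical and chosen only so that income rises and then flattens with age):

```python
def income(age, b1, b2, b3, b4):
    """Cubic income profile: b1 + b2*age + b3*age^2 + b4*age^3."""
    return b1 + b2 * age + b3 * age ** 2 + b4 * age ** 3

def income_slope(age, b2, b3, b4):
    """Analytic derivative of the cubic profile with respect to age."""
    return b2 + 2.0 * b3 * age + 3.0 * b4 * age ** 2

# Hypothetical coefficients, not estimates from any data set
b1, b2, b3, b4 = -20.0, 3.0, -0.02, -0.0001

age, h = 40.0, 1e-5
# Central finite difference as an independent check on the formula
numeric = (income(age + h, b1, b2, b3, b4) - income(age - h, b1, b2, b3, b4)) / (2 * h)
analytic = income_slope(age, b2, b3, b4)
```

The slope depends on age, so a single coefficient no longer summarizes the effect of one more year.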
Continuous Interaction
$y_t = \beta_1 + \beta_2 Z_t + \beta_3 B_t + \beta_4 Z_t B_t + e_t$
Exam grade = f(sleep: $Z_t$, study time: $B_t$)
Sleep and study time do not act independently. More study time will be more effective when combined with more sleep and less effective when combined with less sleep.
Continuous interaction: your mind sorts things out while you sleep (when you have things to sort out).
$y_t = \beta_1 + \beta_2 Z_t + \beta_3 B_t + \beta_4 Z_t B_t + e_t$
Exam grade = f(sleep: $Z_t$, study time: $B_t$)
Your studying is more effective with more sleep:
$\dfrac{\partial y_t}{\partial B_t} = \beta_3 + \beta_4 Z_t$
$\dfrac{\partial y_t}{\partial Z_t} = \beta_2 + \beta_4 B_t$
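$y_t = \beta_1 + \beta_2 Z_t + \beta_3 B_t + \beta_4 Z_t B_t + e_t$; exam grade = f(sleep: $Z_t$, study time: $B_t$).
If $Z_t + B_t = 24$ hours, then $B_t = 24 - Z_t$:
$y_t = \beta_1 + \beta_2 Z_t + \beta_3 (24 - Z_t) + \beta_4 Z_t (24 - Z_t) + e_t$
$y_t = (\beta_1 + 24\beta_3) + (\beta_2 - \beta_3 + 24\beta_4) Z_t - \beta_4 Z_t^2 + e_t$
$y_t = \delta_1 + \delta_2 Z_t + \delta_3 Z_t^2 + e_t$, where $\delta_1 = \beta_1 + 24\beta_3$, $\delta_2 = \beta_2 - \beta_3 + 24\beta_4$, $\delta_3 = -\beta_4$, with $\delta_2 > 0$ and $\delta_3 < 0$.
Sleep needed to maximize your exam grade:
$\dfrac{\partial y_t}{\partial Z_t} = \delta_2 + 2\delta_3 Z_t = 0 \;\Rightarrow\; Z_t = \dfrac{-\delta_2}{2\delta_3}$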
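The substitution $B_t = 24 - Z_t$ and the resulting optimum can be checked numerically. The coefficient values below are hypothetical, picked so that the arithmetic is easy to follow; the names `delta1`, `delta2`, `delta3` stand for the constant, linear, and quadratic coefficients of the collapsed equation.

```python
import numpy as np

# Hypothetical coefficients of grade = b1 + b2*Z + b3*B + b4*Z*B
beta1, beta2, beta3, beta4 = 10.0, 1.0, 3.0, 0.25

# Coefficients after substituting B = 24 - Z
delta1 = beta1 + 24 * beta3            # constant term
delta2 = beta2 - beta3 + 24 * beta4    # coefficient on Z
delta3 = -beta4                        # coefficient on Z^2

# First-order condition delta2 + 2*delta3*Z = 0
z_star = -delta2 / (2 * delta3)

# Independent check: evaluate the original equation on a grid of sleep hours
z = np.linspace(0, 24, 2401)
grade = beta1 + beta2 * z + beta3 * (24 - z) + beta4 * z * (24 - z)
z_grid_best = z[np.argmax(grade)]
```

With these assumed numbers the quadratic is $82 + 4Z - 0.25Z^2$, which peaks at eight hours of sleep.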
Multicollinearity: correlation among the “independent” variables. Note: they are independent of the error term, not of one another.
Let $y_i$ represent the $i$th person's wage rate and $X_i$ represent their months of work experience in the equation:
$y_i = b_1 + b_2 X_i + e_i$   (1)
$b_1$ = intercept (starting wage); $b_2$ = increase in the person's wage for each additional month of work experience; $e_i$ = error term with mean zero and estimated variance $s^2$.
$y_i = b_1 + b_2 X_i + b_3 M_i + b_4 F_i + e_i$   (2)
$M_i = 1$ if male, $M_i = 0$ if female; $F_i = 1$ if female, $F_i = 0$ if male.
Unfortunately, equation (2) contains an underidentified set of parameters ($b_1$, $b_3$, and $b_4$) and cannot be estimated without some restriction on the coefficients.
To see this point, separate out the men's equation implied by equation (2) from the women's equation. For men, $M_i = 1$ and $F_i = 0$, so equation (2) becomes:
$y_i = (b_1 + b_3) + b_2 X_i + e_i$   (3)
For women, $M_i = 0$ and $F_i = 1$, so equation (2) becomes:
$y_i = (b_1 + b_4) + b_2 X_i + e_i$   (4)
Although we get estimates of the intercepts $(b_1 + b_3)$ and $(b_1 + b_4)$, the value of $b_1$ cannot be separated from the values of $b_3$ and $b_4$. Some restriction is needed to achieve identification of $b_1$, $b_3$, and $b_4$.
One such restriction is $b_1 = 0$. We can drop the original intercept term, $b_1$, since men and women already have their own intercept terms, $b_3$ and $b_4$, respectively.
Underidentification of equation (2) can also be expressed in matrix terms. First, rewrite equation (2) putting the explanatory variables in a row vector multiplied by the corresponding column vector of their respective coefficients:
$$y_i = \begin{bmatrix} 1 & X_i & M_i & F_i \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} + e_i \qquad (5)$$
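This represents only the $i$th observation, where $i = 1, \ldots, n$. To represent the entire set of $n$ observations at once, we need to "pull the window shade down" as follows:
$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & X_1 & M_1 & F_1 \\ 1 & X_2 & M_2 & F_2 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & X_n & M_n & F_n \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix} \qquad (6)$$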
Equation (6) presents us with an X matrix whose first column (the column of ones) is an exact linear combination of the last two columns (the M and F columns). Since $M_i$ is always zero when $F_i$ is equal to one, and $M_i$ is always one when $F_i$ is equal to zero, it always holds that $M_i + F_i = 1$. Therefore, the first column is equal to the sum of the last two columns:
$$\begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} = \begin{bmatrix} M_1 \\ M_2 \\ \vdots \\ M_n \end{bmatrix} + \begin{bmatrix} F_1 \\ F_2 \\ \vdots \\ F_n \end{bmatrix} \qquad (9)$$
Equation (6) and, therefore, equation (2) represent a case of perfect multicollinearity. This means that a restriction must be introduced that drops one of these columns out of the regression. One such restriction is $b_1 = 0$, which means dropping the original intercept out of the regression model to provide the following reduced model:
$y_i = b_2 X_i + b_3 M_i + b_4 F_i + e_i$   (10)
Now men and women have separate intercepts and no common intercept is necessary.
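The perfect multicollinearity can be seen directly from the rank of the X matrix. In this sketch the experience values and the gender split are simulated, and the variable names are chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(4)                 # simulated data
n = 50
exper = rng.uniform(0, 240, n)                 # X_i: months of work experience
male = (rng.random(n) < 0.5).astype(float)     # M_i
female = 1.0 - male                            # F_i, so M_i + F_i = 1 for every i

# X matrix of equation (2): constant, X_i, M_i, F_i
X_full = np.column_stack([np.ones(n), exper, male, female])
# X matrix of the reduced model (10): intercept dropped
X_reduced = np.column_stack([exper, male, female])

rank_full = np.linalg.matrix_rank(X_full)          # four columns, but only three independent
rank_reduced = np.linalg.matrix_rank(X_reduced)    # three columns, all independent
```

Because `X_full` has four columns but rank three, $X'X$ is singular and the four coefficients cannot be estimated; the reduced matrix has full column rank.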
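$y_i = b_2 X_i + b_3 M_i + b_4 F_i + e_i$
For males, $M_i = 1$ and $F_i = 0$: $y_i = b_3 + b_2 X_i + e_i$
For females, $M_i = 0$ and $F_i = 1$: $y_i = b_4 + b_2 X_i + e_i$
Males and females have different starting salaries ($b_3 > b_4$), but their salaries increase at the same rate, $b_2$.
[Figure: parallel lines of $y_i$ against $X_i$ (years of experience) with common slope $b_2$; intercept $b_3$ for males and $b_4$ for females.]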
$y_i = b_1 + b_5 M_i X_i + b_6 F_i X_i + e_i$
For males, $M_i = 1$ and $F_i = 0$: $y_i = b_1 + b_5 X_i + e_i$
For females, $M_i = 0$ and $F_i = 1$: $y_i = b_1 + b_6 X_i + e_i$
Males and females have the same starting salary, $b_1$, but their salaries increase at different rates ($b_5$ vs. $b_6$). $b_5 > b_6$ means that men's salaries are increasing faster than women's salaries.
[Figure: two lines of $y_i$ against $X_i$ (years of experience) from the common intercept $b_1$; slope $b_5$ for males and $b_6$ for females.]
Chauvinist Industries Affirmative Action Plan
$y_i = b_3 M_i + b_4 F_i + b_5 M_i X_i + b_6 F_i X_i + e_i$
For males, $M_i = 1$ and $F_i = 0$: $y_i = b_3 + b_5 X_i + e_i$
For females, $M_i = 0$ and $F_i = 1$: $y_i = b_4 + b_6 X_i + e_i$
Females start with a higher starting salary, $b_4$, while men get the lower starting salary, $b_3$. But men get a faster rate of increase in their salaries, $b_5$, which is higher than the rate of increase for females, $b_6$ ($b_5 > b_6$).
[Figure: $y_i$ against $X_i$ (years of experience); the female line starts at $b_4$ with slope $b_6$, the male line starts lower at $b_3$ with the steeper slope $b_5$.]
Back to our basic model: $y_i = b_2 X_i + b_3 M_i + b_4 F_i + e_i$, in which males ($y_i = b_3 + b_2 X_i + e_i$) and females ($y_i = b_4 + b_2 X_i + e_i$) have different starting salaries, $b_3 > b_4$, but salaries that increase at the same rate, $b_2$.
Under our null hypothesis, the raw score test statistic $b_3 - b_4$ has a mean of zero and a variance $\text{Var}(b_3 - b_4)$, so we can standardize it by subtracting the mean (zero) and dividing by the standard deviation (the square root of the variance) to get the standardized test statistic:
$$\frac{b_3 - b_4}{\sqrt{\text{Var}(b_3 - b_4)}}$$
To test the null hypothesis:
$$Z = \frac{(b_3 - b_4) - 0}{\sqrt{\text{Var}(b_3 - b_4)}} \sim N(0, 1)$$
If the variance of the $y_i$, $\sigma^2$, is unknown, then $\text{Var}(b_3 - b_4)$ is also unknown and must be estimated from the expression:
$$\text{Est.Var}(b_3 - b_4) = \text{Est.Var}(b_3) + \text{Est.Var}(b_4) - 2\,\text{Est.Cov}(b_3, b_4)$$
Use the sample variance, $s^2$, as an estimator of the population variance, $\sigma^2$.
The values for the following expression are obtained in practice from the diagonal and off-diagonal elements of the estimated variance-covariance matrix:
$$\text{Est.Var}(b_3 - b_4) = \text{Est.Var}(b_3) + \text{Est.Var}(b_4) - 2\,\text{Est.Cov}(b_3, b_4)$$
Alternative: make women the default group.
$\hat{y}_i = b_1 + b_2 X_i + b_3 M_i$
Males: $\hat{y}_i = (b_1 + b_3) + b_2 X_i$
Females: $\hat{y}_i = b_1 + b_2 X_i$
Males and females have different starting salaries ($b_3 > 0$), but their salaries increase at the same rate, $b_2$.
[Figure: parallel lines of $y_i$ against $X_i$ (years of experience) with common slope $b_2$; intercept $b_1 + b_3$ for males and $b_1$ for females.]
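The sketch below computes the estimated variance of $b_3 - b_4$ from the elements of the estimated variance-covariance matrix in the no-intercept model, then compares the result with the coefficient on $M_i$ when women are the default group. The data are simulated, and the coefficient values and the helper name `ols_with_cov` are assumptions for illustration.

```python
import numpy as np

def ols_with_cov(X, y):
    """Return OLS coefficients and the estimated covariance matrix s^2 (X'X)^-1."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    b = xtx_inv @ X.T @ y
    resid = y - X @ b
    s2 = resid @ resid / (n - k)           # sample variance of the residuals
    return b, s2 * xtx_inv

rng = np.random.default_rng(5)             # simulated data
n = 100
exper = rng.uniform(0, 20, n)
male = (np.arange(n) % 2 == 0).astype(float)
female = 1.0 - male
salary = 0.5 * exper + 12.0 * male + 10.0 * female + rng.normal(0, 1.0, n)

# No-intercept model: columns X_i, M_i, F_i, so b[1] plays the role of b3 and b[2] of b4
b, cov = ols_with_cov(np.column_stack([exper, male, female]), salary)
diff = b[1] - b[2]
est_var_diff = cov[1, 1] + cov[2, 2] - 2.0 * cov[1, 2]   # diagonal and off-diagonal elements
stat = diff / np.sqrt(est_var_diff)

# Women as the default group: columns constant, X_i, M_i
b_alt, cov_alt = ols_with_cov(np.column_stack([np.ones(n), exper, male]), salary)
stat_alt = b_alt[2] / np.sqrt(cov_alt[2, 2])
```

In this sketch the coefficient on $M_i$ in the women-as-default model equals the difference between the two intercepts of the no-intercept model, and the two test statistics coincide.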
Characteristic dummy variables:
$\hat{y}_i = b_1 + b_2 X_i + b_3 M_i + b_4 D_i$   ($M_i = 1$ if male; $D_i = 1$ if college graduate)
Male college grad: $\hat{y}_i = (b_1 + b_3 + b_4) + b_2 X_i$
Female college grad: $\hat{y}_i = (b_1 + b_4) + b_2 X_i$
Male not a grad: $\hat{y}_i = (b_1 + b_3) + b_2 X_i$
Female not a grad: $\hat{y}_i = b_1 + b_2 X_i$
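$\hat{y}_i = b_1 + b_2 X_i + b_3 M_i + b_4 D_i$: a very restrictive assumption, very rigid!
[Figure: four parallel lines of wage rate, $y_i$, against years of experience, $X_i$, with intercepts $b_1$ for F-N (female-no degree), $b_1 + b_3$ for M-N (male-no degree), $b_1 + b_4$ for F-D (female-degree), and $b_1 + b_3 + b_4$ for M-D (male-degree).]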
Creating Composite Dummy Variables (vs. characteristic dummy variables)
Karnaugh map for gender vs. status of job (S = supervisor, I = individual):

Gender       Job: S    I    Total
M (men)           15   25    40
F (women)         13   27    40
Total             28   52    80
Occupation vs. Job vs. Gender (C = Computer, T = Other Technical, U = Untechnical; S = supervisor, I = individual):

Occupation:      C         T         U
Job:           S    I    S    I    S    I    Total
Gender M       2    4    3    5   10   16     40
Gender F       1    6    0    7   12   14     40
Total          3   10    3   12   22   30     80
Karnaugh Map for Occupation, Job Status, Gender, and Degree Status (D = Degree, N = No Degree):

Occupation:           C         T         U
Job:                S    I    S    I    S    I    Total
Degree     M        1    3    2    5    6   13     30
Degree     F        0    3    0    6    7    8     24
No Degree  M        1    1    1    0    4    3     10
No Degree  F        1    3    0    1    5    6     16
Total               3   10    3   12   22   30     80
Composite dummy variables: this defines combined (instead of separate) general characteristics.
$\hat{y}_i = b_1 + b_2 X_i + b_3 MN_i + b_4 FD_i + b_5 MD_i$
[Figure: four parallel lines of wage rate, $y_i$, against years of experience, $X_i$, with intercepts $b_1$ for F-N (female-no degree), $b_1 + b_3$ for M-N (male-no degree), $b_1 + b_4$ for F-D (female-degree), and $b_1 + b_5$ for M-D (male-degree).]
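To make the contrast with characteristic dummies concrete, the sketch below builds the composite dummies from a gender dummy and a degree dummy and fits both models. The data are simulated; the coefficient values, group shares, and the helper name `sse` are assumptions for the example.

```python
import numpy as np

def sse(X, y):
    """Sum of squared residuals from the least-squares fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return float(resid @ resid)

rng = np.random.default_rng(6)                    # simulated data
n = 80
exper = rng.uniform(0, 20, n)                     # X_i: years of experience
male = (rng.random(n) < 0.5).astype(float)        # M_i
degree = (rng.random(n) < 0.6).astype(float)      # D_i
wage = (8 + 0.4 * exper + 1.0 * male + 3.0 * degree
        + 2.0 * male * degree + rng.normal(0, 1.0, n))

# Composite dummies: one per combination of gender and degree status
mn = male * (1 - degree)          # male, no degree
fd = (1 - male) * degree          # female, degree
md = male * degree                # male, degree
fn = (1 - male) * (1 - degree)    # female, no degree (base group, left out of X_comp)

ones = np.ones(n)
X_char = np.column_stack([ones, exper, male, degree])   # characteristic dummies
X_comp = np.column_stack([ones, exper, mn, fd, md])     # composite dummies
sse_char = sse(X_char, wage)
sse_comp = sse(X_comp, wage)
```

The characteristic model forces the male-degree intercept to be $b_1 + b_3 + b_4$, while the composite model gives that group its own coefficient, so the composite fit can never have a larger sum of squared errors.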
Multiple Regression Analysis: value of residential property (buying a home)
$A_i$ = bathrooms; $X_i$ = sq. ft. of living space.
$\hat{Y}_i = b_1 + b_2 X_i + b_3 A_i + b_4 A_i X_i$
$H_0$ vs. $H_1$ for the coefficient on $A_i$: $\dfrac{b_3}{\sqrt{\text{Est.Var}(b_3)}} \sim t_{n-4}$
$H_0$ vs. $H_1$ for the coefficient on $A_i X_i$: $\dfrac{b_4}{\sqrt{\text{Est.Var}(b_4)}} \sim t_{n-4}$
Testing $H_0$: the coefficients on $A_i$ and $A_i X_i$ are both zero, vs. $H_1$: otherwise, using
$SSE_R = \sum_{i=1}^{n} (y_i - b_1 - b_2 X_i)^2$ and
$SSE_U = \sum_{i=1}^{n} (y_i - b_1 - b_2 X_i - b_3 A_i - b_4 A_i X_i)^2$
Sale of House with Bed and Bath Dummies
PRICE = f(SQFEET, D2BED, D3BED, A2BATH)
I. SQFEET = square feet of living space
II. D2BED = dummy = 1 if two-bedroom house
III. D3BED = dummy = 1 if three-bedroom house
IV. A2BATH = dummy = 1 if two-bathroom house

SQFEET (I)   D2BED (II)   D3BED (III)   A2BATH (IV)   PRICE (thousands)
   800           0             0             0             10.000
  1000           0             0             1             20.000
  1200           1             0             0             30.000
  1500           1             0             0             40.000
  1800           1             0             1             50.000
  2000           1             0             1             60.000
  2200           0             1             0             70.000
  2500           0             1             0             80.000
  3000           0             1             1             90.000
  3500           0             1             1            100.000
Sale of House with Bed and Bath Dummies
PRICE = f(SQFEET, D2BED, D3BED, A2BATH)
DEP VAR: PRICE   N: 10   MULTIPLE R: 0.996   SQUARED MULTIPLE R: 0.993
ADJUSTED SQUARED MULTIPLE R: 0.987   STD ERROR OF ESTIMATE: 3.40

ANALYSIS OF VARIANCE
SOURCE        SUM-OF-SQUARES   DF   MEAN-SQ    F-RATIO   P
REGRESSION          8191.943    4   2047.986   176.378   0.000
RESIDUAL              58.057    5     11.611

DURBIN-WATSON D STATISTIC: 2.216
FIRST ORDER AUTOCORRELATION COEFF: -0.153
Sale of House with Bed and Bath Dummies
PRICE = f(SQFEET, D2BED, D3BED, A2BATH)

VARIABLE     COEFF     STD ERR      T      P(2-TAIL)
INTERCEPT    -6.482     4.112    -1.576     0.176
SQFEET        0.021     0.005     3.958     0.011
D2BED        14.662     4.871     3.010     0.030
D3BED        29.803    10.575     2.818     0.037
A2BATH        4.883     3.953     1.235     0.272

(For 1,000 square feet with one bedroom and one bathroom: 0.021 x 1,000 - 6.482 = 21 - 6.482 = 14.518, or $14,518.)
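The ten observations listed for this example are enough to rerun the regression. The sketch below fits the same specification with NumPy's least-squares routine; the variable names are chosen for the example, and the results can be compared with the printed output.

```python
import numpy as np

# Columns: SQFEET, D2BED, D3BED, A2BATH, PRICE (thousands), as listed in the data table
data = np.array([
    [800,  0, 0, 0,  10.0],
    [1000, 0, 0, 1,  20.0],
    [1200, 1, 0, 0,  30.0],
    [1500, 1, 0, 0,  40.0],
    [1800, 1, 0, 1,  50.0],
    [2000, 1, 0, 1,  60.0],
    [2200, 0, 1, 0,  70.0],
    [2500, 0, 1, 0,  80.0],
    [3000, 0, 1, 1,  90.0],
    [3500, 0, 1, 1, 100.0],
])
sqfeet, d2bed, d3bed, a2bath, price = data.T

# Design matrix: intercept, SQFEET, D2BED, D3BED, A2BATH
X = np.column_stack([np.ones(len(price)), sqfeet, d2bed, d3bed, a2bath])
coef, *_ = np.linalg.lstsq(X, price, rcond=None)

resid = price - X @ coef
sse = float(resid @ resid)                               # residual sum of squares
sst = float(((price - price.mean()) ** 2).sum())         # total sum of squares
df_resid = len(price) - X.shape[1]                       # N minus number of coefficients
r_squared = 1.0 - sse / sst
```

The total sum of squares of these prices is 8250, which matches the sum of the regression and residual sums of squares in the analysis of variance table (8191.943 + 58.057).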
Regression Analysis of Sale of Residential Property
For 1,000 square feet (one bedroom, one bathroom): 21 - 6.482 = 14.518, or $14,518.
Add a bathroom: $14,518 + 4,883 = $19,401
Add a bedroom: $14,518 + 14,662 = $29,180
Add 2 bedrooms: $14,518 + 29,803 = $44,321
Add a bath and 2 bedrooms: $14,518 + 4,883 + 29,803 = $49,204
Sales Value of Residential Property
y = sales value of the property (dollars)
X = square feet of living space
D1 = dummy variable for one-bedroom home
D2 = dummy variable for two-bedroom home
D3 = dummy variable for three-bedroom home
A1 = dummy variable for one-bathroom home
A2 = dummy variable for two-bathroom home
$\hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i$
1 bedroom, 1 bathroom (D2 = 0, D3 = 0, A2 = 0): $\hat{y}_i = b_1 + b_2 X_i$
2 bedroom, 1 bathroom (D2 = 1, D3 = 0, A2 = 0): $\hat{y}_i = (b_1 + b_3) + b_2 X_i$
1 bedroom, 2 bathroom (D2 = 0, D3 = 0, A2 = 1): $\hat{y}_i = (b_1 + b_5) + b_2 X_i$
2 bedroom, 2 bathroom (D2 = 1, D3 = 0, A2 = 1): $\hat{y}_i = (b_1 + b_3 + b_5) + b_2 X_i$
3 bedroom, 1 bathroom (D2 = 0, D3 = 1, A2 = 0): $\hat{y}_i = (b_1 + b_4) + b_2 X_i$
3 bedroom, 2 bathroom (D2 = 0, D3 = 1, A2 = 1): $\hat{y}_i = (b_1 + b_4 + b_5) + b_2 X_i$
House Sales Model with Restricted Intercepts: $\hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i$. Rigid!
[Figure: six parallel lines of selling price, $y_i$, against square feet of living space, $X_i$, with intercepts $b_1$ for D1-A1 (one bed, one bath), $b_1 + b_5$ for D1-A2 (one bed, two bath), $b_1 + b_3$ for D2-A1 (two bed, one bath), $b_1 + b_3 + b_5$ for D2-A2 (two bed, two bath), $b_1 + b_4$ for D3-A1 (three bed, one bath), and $b_1 + b_4 + b_5$ for D3-A2 (three bed, two bath).]
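The rigidity of the restricted-intercept model shows up when the six intercepts are written out. This sketch plugs in the coefficients from the printed regression output (in thousands of dollars); the dictionary and function names are hypothetical.

```python
# Coefficients as printed in the regression output (thousands of dollars)
b1, b2 = -6.482, 0.021                          # intercept and SQFEET coefficient
bed_shift = {1: 0.0, 2: 14.662, 3: 29.803}      # D2BED and D3BED coefficients; one bedroom is the base
bath_shift = {1: 0.0, 2: 4.883}                 # A2BATH coefficient; one bathroom is the base

def predicted_price(sqfeet, beds, baths):
    """Predicted price in thousands for a house with the given size and rooms."""
    return b1 + b2 * sqfeet + bed_shift[beds] + bath_shift[baths]

# Intercept of each of the six parallel lines
intercepts = {(beds, baths): b1 + bed_shift[beds] + bath_shift[baths]
              for beds in bed_shift for baths in bath_shift}

# The second-bathroom premium is forced to be identical for every bedroom count
bath_premium = {beds: intercepts[(beds, 2)] - intercepts[(beds, 1)]
                for beds in bed_shift}
```

Whatever the number of bedrooms, the second bathroom adds the same amount to the predicted price, which is the restriction that composite dummy variables relax.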
Creating Composite Dummy Variables (vs. characteristic dummy variables)
How do we create composite dummy variables? We need to account for the interaction effect between bathrooms and bedrooms.

                 Bedrooms
Bathrooms      1     2     3    Total
    1          6     8    26     40
    2          7     7    26     40
Total         13    15    52     80
Composite dummy variables are created for each nonempty cell. Create six composite dummy variables:
D1A1 = 1 if one bed and one bath, otherwise D1A1 = 0
D1A2 = 1 if one bed and two bath, otherwise D1A2 = 0
D2A1 = 1 if two bed and one bath, otherwise D2A1 = 0
D2A2 = 1 if two bed and two bath, otherwise D2A2 = 0
D3A1 = 1 if three bed and one bath, otherwise D3A1 = 0
D3A2 = 1 if three bed and two bath, otherwise D3A2 = 0
Sales Value of Residential Property
y = sales value of the property (dollars)
X = square feet of living space
D1A1 = interaction one-bed & one-bath
D1A2 = interaction one-bed & two-bath
D2A1 = interaction two-bed & one-bath
D2A2 = interaction two-bed & two-bath
D3A1 = interaction three-bed & one-bath
D3A2 = interaction three-bed & two-bath
$\hat{y}_i = b_1 + b_2 X_i + b_3 D1A2_i + b_4 D2A1_i + b_5 D2A2_i + b_6 D3A1_i + b_7 D3A2_i$
This one equation with all these dummy variables actually represents six equations. You must substitute in for each of the dummy variables to generate the six equations that are implied by this one dummy variable equation:
$\hat{y}_i = b_1 + b_2 X_i + b_3 D1A2_i + b_4 D2A1_i + b_5 D2A2_i + b_6 D3A1_i + b_7 D3A2_i$
For a one-bedroom, one-bathroom home, D1A1 = 1 while the others are zero:
1 bedroom, 1 bathroom: $\hat{y}_i = b_1 + b_2 X_i$
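A sketch of how the composite dummies and the design matrix might be built, using the cell counts from the bedrooms-by-bathrooms table. The square-footage values are simulated and the variable names are assumptions; only the counts come from the slides.

```python
import numpy as np

# (bedrooms, bathrooms) -> number of houses, from the bedrooms-by-bathrooms table
cell_counts = {(1, 1): 6, (2, 1): 8, (3, 1): 26,
               (1, 2): 7, (2, 2): 7, (3, 2): 26}

# Expand the table into one row per house
beds = np.concatenate([np.full(c, bed) for (bed, _), c in cell_counts.items()])
baths = np.concatenate([np.full(c, bath) for (_, bath), c in cell_counts.items()])

# Composite dummies in the order D1A2, D2A1, D2A2, D3A1, D3A2;
# D1A1 is left out so that it serves as the base group with intercept b1
cells = [(1, 2), (2, 1), (2, 2), (3, 1), (3, 2)]
dummies = np.column_stack(
    [((beds == bd) & (baths == ba)).astype(float) for bd, ba in cells]
)

rng = np.random.default_rng(7)
sqfeet = rng.uniform(800, 3500, beds.size)     # simulated living space, illustration only

# Design matrix: constant, X_i, then the five composite dummies (coefficients b1..b7)
X = np.column_stack([np.ones(beds.size), sqfeet, dummies])
```

Each house has at most one composite dummy equal to one, so setting a single dummy to one and the rest to zero picks out one of the six implied equations.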
Substituting into $\hat{y}_i = b_1 + b_2 X_i + b_3 D1A2_i + b_4 D2A1_i + b_5 D2A2_i + b_6 D3A1_i + b_7 D3A2_i$ for the remaining cases:
One-bedroom, two-bathroom (D1A2 = 1, while the others are zero): $\hat{y}_i = (b_1 + b_3) + b_2 X_i$
Two-bedroom, one-bathroom (D2A1 = 1, while the others are zero): $\hat{y}_i = (b_1 + b_4) + b_2 X_i$
Two-bedroom, two-bathroom (D2A2 = 1, while the others are zero): $\hat{y}_i = (b_1 + b_5) + b_2 X_i$
House Sales Model with Unrestricted Intercepts
[Figure: six parallel lines of selling price, $y_i$, against square feet of living space, $X_i$, with intercepts $b_1$ for D1-A1 (one bed, one bath), $b_1 + b_3$ for D1-A2 (one bed, two bath), $b_1 + b_4$ for D2-A1 (two bed, one bath), $b_1 + b_5$ for D2-A2 (two bed, two bath), $b_1 + b_6$ for D3-A1 (three bed, one bath), and $b_1 + b_7$ for D3-A2 (three bed, two bath).]
Bedrooms vs. Baths vs. Garage

Bedrooms:              1         2         3
Baths:               1    2    1    2    1    2    Total
Cars in garage: 1    2    4    3    5   10   16     40
Cars in garage: 2    1    6    0    7   12   14     40
Total                3   10    3   12   22   30     80
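Karnaugh Map for Bedrooms, Baths, Garage, and School (A = Adams, J = Saint Joseph):

Bedrooms:                    1         2         3
Baths:                     1    2    1    2    1    2    Total
A   Cars in garage: 1      1    3    2    5    6   13     30
A   Cars in garage: 2      0    3    0    6    7    8     24
J   Cars in garage: 1      1    1    1    0    4    3     10
J   Cars in garage: 2      1    3    0    1    5    6     16
Total                      3   10    3   12   22   30     80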

Dummy Variable Regression

  • 1.
  • 2.
    “ Using DummyVariables in Wage Discrimination Cases” Multiple Regression Sandy: pages 603 - 613 Also read paper titled:
  • 3.
    Are Male NursesDiscriminated Against? male nurses 0 female nurses Years of experience, X i W f _  4 ^ W m _  3 ^ ~ m W  3 ~ W f ~  4 ~   ~ adjusted for experience not adjusted for experience o o o o o o o o o o o o + + + + + + + + + + + + + + + + + + + + + + + + + o o o   ~
  • 4.
    I. DummyVariables - Adjusting the intercept . Adjusting the slope . Adjusting both intercept and slope .
  • 5.
    Intercept Dummy VariablesDummy variables are binary (0,1) D t = 1 if red car, D t = 0 otherwise. y t =  1 +  2 X t +  3 D t + e t y t = speed of car in miles per hour X t = age of car in years Police: red cars travel faster . H 0 :  3 = 0 H 1 :  3 > 0
  • 6.
    y t =  1 +  2 X t +  3 D t + e t red cars : y t = (  1 +  3 ) +  2 X t + e t other cars : y t =  1 +  2 X t + e t y t X t miles per hour age in years 0  1 +  3  1  2  2 red cars other cars
  • 7.
    Slope Dummy Variablesy t =  1 +  2 X t +  3 D t X t + e t y t =  1 + (  2 +  3 )X t + e t y t =  1 +  2 X t + e t y t X t value of porfolio years 0  2 +  3  1  2 stocks bonds Stock portfolio: D t = 1 Bond portfolio: D t = 0  1 = initial investment
  • 8.
    Different Intercepts &Slopes y t =  1 +  2 X t +  3 D t +  4 D t X t + e t y t = (  1 +  3 ) + (  2 +  4 )X t + e t y t =  1 +  2 X t + e t y t X t harvest weight of corn rainfall  2 +  4  1  2 “ miracle” regular “ miracle” seed: D t = 1 regular seed: D t = 0  1 +  3
  • 9.
    y t =  1 +  2 X t +  3 D t + e t  2  1 +  3  2  1 y t X t Men Women 0 y t =  1 +  2 X t + e t For men  D t = 1. For women  D t = 0. years of experience y t = (  1 +  3 ) +  2 X t + e t wage rate . . Testing for discrimination in starting wage H 0 :  3 = 0 H 1 :  3 > 0
  • 10.
    y t =  1 +  5 X t +  6 D t X t + e t  5  5 +  6  1 y t X t Men Women 0 y t =  1 + (  5 +  6 )X t + e t y t =  1 +  5 X t + e t For men D t = 1. For women D t = 0. Men and women have the same starting wage,  1 , but their wage rates increase at different rates (diff.=  6 ).  6 >   means that men’s wage rates are increasing faster than women's wage rates. years of experience wage rate
  • 11.
    y t =  1 +  2 X t +  3 D t +  4 D t X t + e t  1 +  3  1  2  2 +  4 y t X t Men Women 0 y t = (  1 +  3 ) + (  2 +  4 ) X t + e t y t =  1 +  2 X t + e t Women are given a higher starting wage,  1 , while men get the lower starting wage,  1 +  3 , (  3 < 0 ). But, men get a faster rate of increase in their wages,  2 +  4 , which is higher than the rate of increase for women,  2 , (since  4 > 0 ). years of experience An Ineffective Affirmative Action Plan women are started at a higher wage. Note : (  3 < 0 ) wage rate
  • 12.
    Testing Qualitative Effects1. Test for differences in intercept . 2. Test for differences in slope . Test for differences in both intercept and slope .
  • 13.
    H 0 :    vs  1 :    H 0 :    vs  1 :    Y t   1   2 X t   3 D t   4 D t X t b   3 Est . Var b  3 ˜ t n  4 b    4 Est . Var b  4 ˜ t n  4 men: D t = 1 ; women: D t = 0 Testing for discrimination in starting wage. Testing for discrimination in wage increases. intercept slope  e t
  • 14.
    Why NOW wantsone-sided test and Chauvinist Industries wants two-sided.
  • 15.
    Are Two Regressions Equal? y t =  1 +  2 X t +  3 D t +  4 D t X t + e t variations of “The Chow Test” I. Assuming equal variances (pooling): men: D t = 1 ; women: D t = 0 H o :  3 =  4 = 0 vs. H 1 : otherwise y t = wage rate This model assumes equal wage rate variance. X t = years of experience
  • 16.
    Testing  H o :        H 1 : otherwise and SSE R   y t  b 1  b 2 X t  2 t  1 T  SSE U   y t  b 1  b  X t  b  D t  b  D t X t  2 t  1 T   SSE R  SSE U   2 SSE U   T  4   F T  4  intercept and slope
  • 17.
    y t =  1 +  2 X t + e t II. Allowing for unequal variances: y tm =  1 +  2 X tm + e tm y tw =  1 +  2 X tw + e tw Everyone: Men only: Women only: SSE R Forcing men and women to have same  1 ,  2 . Allowing men and women to be different. SSE m SSE w where SSE U = SSE m + SSE w F = (SSE R  SSE U )/J SSE U /(T  K) J = # restrictions K=unrestricted coefs. (running three regressions) J = 2 K = 4
  • 18.
    Polynomial Terms yt =  1 +  2 X t +  3 X 2 t +  4 X 3 t + e t Linear in parameters but nonlinear in variables: y t = income; X t = age Polynomial Regression y t X t People retire at different ages or not at all. 90 20 30 40 50 60 80 70
  • 19.
    y t =  1 +  2 X t +  3 X 2 t +  4 X 3 t + e t y t = income; X t = age Polynomial Regression Rate income is changing as we age : Slope changes as X t changes.  y t  X t =  2 + 2  3 X t + 3  4 X 2 t
  • 20.
    Continuous Interaction yt =  1 +  2 Z t +  3 B t +  4 Z t B t + e t Exam grade = f(sleep: Z t , study time: B t ) Sleep and study time do not act independently. More study time will be more effective when combined with more sleep and less effective when combined with less sleep .
  • 21.
    Your mind sortsthings out while you sleep (when you have things to sort out.) y t =  1 +  2 Z t +  3 B t +  4 Z t B t + e t Exam grade = f(sleep: Z t , study time: B t ) Your studying is more effective with more sleep . continuous interaction  y t  B t =  2 +  4 Z t  y t  Z t =  2 +  4 B t
  • 22.
    y t =  1 +  2 Z t +  3 B t +  4 Z t B t + e t Exam grade = f(sleep: Z t , study time: B t ) If Z t + B t = 24 hours, then B t = (24  Z t ) y t =  1 +  2 Z t +  3 (24  Z t ) +  4 Z t (24  Z t ) + e t y t = (  1 + 24  3 ) + (  2   3 + 24  4 ) Z t   4 Z 2 t + e t y t =  1 +  2 Z t +  3 Z 2 t + e t Sleep needed to maximize your exam grade : where  2 > 0 and  3 < 0  y t  Z t =  2 + 2  3 Z t = 0  2  3 Z t =
  • 23.
    Multicollinearity Correlation amongthe “ independent” variables. Note: They are independent of the error term, and not of one another.
  • 24.
    Let yi represent the ith person's wage rate and Xi represent their months of work experience in the equation: yi = b1 + b2 Xi + ei (1) b1 = intercept (starting wage) b2 = increase in the person's wage for each additional month of work experience. ei = error term with mean zero and estimated variance s2.
  • 25.
    yi = b1 + b2 Xi + b3 Mi + b4 Fi + ei (2) Fi = 1 if female Fi = 0 if male . Mi = 1 if male Mi = 0 if female .
  • 26.
    yi = b1 + b2 Xi + b3 Mi + b4 Fi + ei (2) Unfortunately this equation contains an underidentified set of parameters (b1, b3, and b4) and cannot be estimated without some restriction on the coefficients.
  • 27.
    To see thispoint, separate out the men's equation implied by equation (2) from the women's equation. For the men's equation Mi =1 and Fi =0. For men , equation (2) becomes: yi = (b1 + b3) + b2 Xi + ei (3) yi = b1 + b2 Xi + b3 Mi + b4 Fi + ei (2)
  • 28.
    For women, Mi =0 and Fi =1. For women , equation (2) becomes: yi = (b1 + b4) + b2 Xi + ei (4)
  • 29.
    Unfortunately, although weget estimates of the intercepts (b1 + b3) and (b1 + b4), the value of b1 cannot be separated from the values of b3 and b4. Some restriction is needed to achieve identification of b1, b3 and b4.
  • 30.
    One such restrictionis b1 = 0. We can drop the original intercept term, b1, since men and women already have their own intercept terms, b3 and b4 , respectively.
  • 31.
    Underidentification of equation(2) can also be expressed in matrix terms. First, rewrite equation (2) putting the explanatory variables in a row vector multiplied by the corresponding column vector of their respective coefficients: y i    1  X i  M i  F i     2  3  4    i   5  1
  • 32.
    This only representsthe ith observation where i = 1, ..., n. To represent the entire set of n observations at once, we need to &quot;pull the window shade down&quot; as follows: y 1 y 2 M y n  1 X 1 M 1 F 1 1 X 2 M 2 F 2 M M M M 1 X n M n F n  1  2  3  4   1  2 M  n (6)
  • 33.
    Equation (6) presentsus with an X matrix whose first column (the column of ones) is an exact linear combination of the last two columns (the M and F columns). Since Mi is always zero when Fi is equal to one and Mi is always one when Fi is equal to zero, then it always holds that Mi + Fi = 1. Therefore, the first column is equal to the sum of the last two columns.
  • 34.
    Since Mi isalways zero when Fi is equal to one and Mi is always one when Fi is equal to zero, then it always holds that Mi + Fi = 1. 1 1 M 1  M 1 M 2 M M n  F 1 F 2 M F n ( 9 )
  • 35.
    Equation (6) and,therefore,equation (2), represent a case of perfect multicollinearity . This means that a restriction must be introduced that drops one of these columns out of the regression. One such restriction is b1 = 0 , which means dropping the original intercept out of the regression model to provide the following reduced model: yi = b2 Xi + b3 Mi + b4 Fi + ei (10) Now men and women have separate intercepts and no common intercept is necessary.
  • 37.
    Model: $y_i = b_2 X_i + b_3 M_i + b_4 F_i + e_i$
    For males, $M_i = 1$ and $F_i = 0$: $y_i = b_3 + b_2 X_i + e_i$
    For females, $M_i = 0$ and $F_i = 1$: $y_i = b_4 + b_2 X_i + e_i$
    Males and females have different starting salaries, $b_3 > b_4$, but their salaries increase at the same rate, $b_2$.
    [Figure: $y_i$ against years of experience $X_i$; parallel Male and Female lines with common slope $b_2$ and intercepts $b_3$ and $b_4$.]
  • 38.
    Model: $y_i = b_1 + b_5 M_i X_i + b_6 F_i X_i + e_i$
    For males, $M_i = 1$ and $F_i = 0$: $y_i = b_1 + b_5 X_i + e_i$
    For females, $M_i = 0$ and $F_i = 1$: $y_i = b_1 + b_6 X_i + e_i$
    Males and females have the same starting salary, $b_1$, but their salaries increase at different rates ($b_5$ vs. $b_6$). $b_5 > b_6$ means that men's salaries are increasing faster than women's salaries.
    [Figure: $y_i$ against years of experience $X_i$; Male and Female lines start from the common intercept $b_1$ with slopes $b_5$ and $b_6$.]
  • 39.
    Chauvinist Industries Affirmative Action Plan
    Model: $y_i = b_3 M_i + b_4 F_i + b_5 M_i X_i + b_6 F_i X_i + e_i$
    For males, $M_i = 1$ and $F_i = 0$: $y_i = b_3 + b_5 X_i + e_i$
    For females, $M_i = 0$ and $F_i = 1$: $y_i = b_4 + b_6 X_i + e_i$
    Females get the higher starting salary, $b_4$, while men get the lower starting salary, $b_3$. But men's salaries rise at the rate $b_5$, which is higher than the rate of increase for females, $b_6$ ($b_5 > b_6$).
    [Figure: $y_i$ against years of experience $X_i$; the Female line starts higher at $b_4$ with slope $b_6$, the Male line starts lower at $b_3$ with the steeper slope $b_5$.]
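One consequence of this specification that the slide leaves implicit is the point at which the two salary lines cross. The short derivation below is a sketch that ignores the error terms; the symbol $X^*$ (the crossover year) is introduced here for illustration and does not appear in the slides.

```latex
\begin{align*}
b_3 + b_5 X^* &= b_4 + b_6 X^* \\
(b_5 - b_6)\, X^* &= b_4 - b_3 \\
X^* &= \frac{b_4 - b_3}{b_5 - b_6}
\end{align*}
```

With $b_4 > b_3$ and $b_5 > b_6$, both numerator and denominator are positive, so $X^* > 0$: under these assumptions, men's predicted salaries overtake women's after $X^*$ years of experience.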
  • 40.
    Back to our basic model: $y_i = b_2 X_i + b_3 M_i + b_4 F_i + e_i$
    For males, $M_i = 1$ and $F_i = 0$: $y_i = b_3 + b_2 X_i + e_i$
    For females, $M_i = 0$ and $F_i = 1$: $y_i = b_4 + b_2 X_i + e_i$
    Males and females have different starting salaries, $b_3 > b_4$, but their salaries increase at the same rate, $b_2$.
    [Figure: $y_i$ against years of experience $X_i$; parallel Male and Female lines with common slope $b_2$ and intercepts $b_3$ and $b_4$.]
  • 41.
    Under our null hypothesis, the raw-score test statistic $b_3 - b_4$ has a mean of zero and a variance $\mathrm{Var}(b_3 - b_4)$, so we can standardize it by subtracting the mean (zero) and dividing by the standard deviation (the square root of the variance) to get the standardized test statistic:
    $$ \frac{b_3 - b_4}{\sqrt{\mathrm{Var}(b_3 - b_4)}} $$
  • 42.
    To test the null hypothesis:
    $$ Z = \frac{(b_3 - b_4) - 0}{\sqrt{\mathrm{Var}(b_3 - b_4)}} \sim N(0, 1) $$
  • 43.
    If the variance of the $y_i$, $\sigma^2$, is unknown, then $\mathrm{Var}(b_3 - b_4)$ is also unknown and must be estimated from the expression:
    $$ \mathrm{Est.Var}(b_3 - b_4) = \mathrm{Est.Var}(b_3) + \mathrm{Est.Var}(b_4) - 2\,\mathrm{Est.Cov}(b_3, b_4) $$
  • 44.
    Use the sample variance as an estimator of the population variance :
  • 45.
    In practice, the values for the following expression are obtained from the diagonal and off-diagonal elements of the estimated variance-covariance matrix:
    $$ \mathrm{Est.Var}(b_3 - b_4) = \mathrm{Est.Var}(b_3) + \mathrm{Est.Var}(b_4) - 2\,\mathrm{Est.Cov}(b_3, b_4) $$
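The calculation can be sketched in a few lines of Python. The function name `diff_test_stat` and all numeric inputs are hypothetical, chosen only to show how the two diagonal elements and the off-diagonal element combine. When the variance is estimated rather than known, the resulting ratio is usually compared with a t distribution rather than the standard normal.

```python
import math


def diff_test_stat(b3, b4, var_b3, var_b4, cov_b3_b4):
    """Standardized statistic for H0: b3 - b4 = 0.

    var_b3 and var_b4 are the diagonal elements and cov_b3_b4 the
    off-diagonal element of the estimated variance-covariance matrix.
    """
    est_var_diff = var_b3 + var_b4 - 2.0 * cov_b3_b4
    return (b3 - b4) / math.sqrt(est_var_diff)


# Hypothetical estimates, for illustration only.
stat = diff_test_stat(b3=12.0, b4=10.0, var_b3=0.5, var_b4=0.75, cov_b3_b4=0.125)
print(stat)
```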
  • 46.
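    Alternative: make women the default group.
    Model: $\hat{y}_i = b_1 + b_2 X_i + b_3 M_i$
    For males ($M_i = 1$): $\hat{y}_i = (b_1 + b_3) + b_2 X_i$
    For females ($M_i = 0$): $\hat{y}_i = b_1 + b_2 X_i$
    Males and females have different starting salaries, $b_3 > 0$, but their salaries increase at the same rate, $b_2$.
    [Figure: $y_i$ against years of experience $X_i$; parallel Male and Female lines with common slope $b_2$ and intercepts $b_1 + b_3$ and $b_1$.]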
  • 47.
    Characteristic dummy variables:
    $$ \hat{y}_i = b_1 + b_2 X_i + b_3 M_i + b_4 D_i $$
    male college grad: $\hat{y}_i = (b_1 + b_3 + b_4) + b_2 X_i$
    female college grad: $\hat{y}_i = (b_1 + b_4) + b_2 X_i$
    male not a grad: $\hat{y}_i = (b_1 + b_3) + b_2 X_i$
    female not a grad: $\hat{y}_i = b_1 + b_2 X_i$
  • 48.
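    Very restrictive assumption: the model $\hat{y}_i = b_1 + b_2 X_i + b_3 M_i + b_4 D_i$ is very rigid!
    [Figure: wage rate $y_i$ against years of experience $X_i$; four parallel lines with intercepts $b_1$ for F-N (female-no degree), $b_1 + b_3$ for M-N (male-no degree), $b_1 + b_4$ for F-D (female-degree), and $b_1 + b_3 + b_4$ for M-D (male-degree).]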
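One way to see why this specification is rigid: the male-female intercept gap is forced to be the same whether or not the worker has a degree. The sketch below uses made-up coefficient values (an assumption for illustration, not estimates from the slides), and the helper name `intercept` is hypothetical.

```python
def intercept(b1, b3, b4, male, degree):
    """Implied intercept of y = b1 + b2*X + b3*M + b4*D for one group."""
    return b1 + b3 * male + b4 * degree


# Hypothetical coefficient values, for illustration only.
b1, b3, b4 = 10.0, 2.0, 5.0

gap_no_degree = intercept(b1, b3, b4, male=1, degree=0) - intercept(b1, b3, b4, male=0, degree=0)
gap_degree = intercept(b1, b3, b4, male=1, degree=1) - intercept(b1, b3, b4, male=0, degree=1)

# Both gaps equal b3, whatever values are chosen for b1 and b4.
print(gap_no_degree, gap_degree)
```

Whatever values are plugged in, both gaps equal b3, so this model cannot represent a gender gap that differs by degree status.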
  • 49.
    Creating Composite Dummy Variables (vs. characteristic dummy variables)
  • 50.
    Karnaugh map for gender vs. job status (S = supervisor, I = individual):

                     Job
    Gender          S     I   Total
    M (men)        15    25      40
    F (women)      13    27      40
    Total          28    52      80
  • 51.
    Occupation vs. Job vs. Gender (C = Computer, T = Other Technical, U = Untechnical):

    Occupation:       C           T           U
    Job:            S     I     S     I     S     I   Total
    M               2     4     3     5    10    16      40
    F               1     6     0     7    12    14      40
    Total           3    10     3    12    22    30      80
  • 52.
    Karnaugh Map for Occupation, Job Status, Gender, and Degree Status (D = Degree, N = No Degree):

    Occupation:       C           T           U
    Job:            S     I     S     I     S     I   Total
    D   M           1     3     2     5     6    13      30
    D   F           0     3     0     6     7     8      24
    N   M           1     1     1     0     4     3      10
    N   F           1     3     0     1     5     6      16
    Total           3    10     3    12    22    30      80
  • 53.
    Composite dummy variables: these define combined (instead of separate) general characteristics.
    $$ \hat{y}_i = b_1 + b_2 X_i + b_3 \mathit{MN}_i + b_4 \mathit{FD}_i + b_5 \mathit{MD}_i $$
    [Figure: wage rate $y_i$ against years of experience $X_i$; four parallel lines with intercepts $b_1$ for F-N (female-no degree), $b_1 + b_3$ for M-N (male-no degree), $b_1 + b_4$ for F-D (female-degree), and $b_1 + b_5$ for M-D (male-degree).]
  • 54.
    Multiple Regression Analysis: value of residential property (buying a home)
  • 55.
    $A_i$ = bathrooms, $X_i$ = sq. ft. living space
    $$ \hat{Y}_i = b_1 + b_2 X_i + b_3 A_i + b_4 A_i X_i $$
    $H_0: b_3 = 0$ vs. $H_1: b_3 \neq 0$, using
    $$ \frac{b_3}{\sqrt{\mathrm{Est.Var}(b_3)}} \sim t_{n-4} $$
    $H_0: b_4 = 0$ vs. $H_1: b_4 \neq 0$, using
    $$ \frac{b_4}{\sqrt{\mathrm{Est.Var}(b_4)}} \sim t_{n-4} $$
  • 56.
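    Testing $H_0: b_3 = 0$ and $b_4 = 0$ vs. $H_1$: otherwise, using the restricted and unrestricted sums of squared errors:
    $$ SSE_R = \sum_{i=1}^{n} \left( y_i - b_1 - b_2 X_i \right)^2 $$
    $$ SSE_U = \sum_{i=1}^{n} \left( y_i - b_1 - b_2 X_i - b_3 A_i - b_4 A_i X_i \right)^2 $$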
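The slide stops at the two sums of squared errors. They are commonly combined into an F statistic, with the number of restrictions (here 2) in the numerator and n minus the number of unrestricted coefficients (here 4) in the denominator. The sketch below assumes that standard form; the function name `f_statistic` and the numeric inputs are hypothetical.

```python
def f_statistic(sse_r, sse_u, num_restrictions, n, num_coefs_unrestricted):
    """F = [(SSE_R - SSE_U) / J] / [SSE_U / (n - K)]."""
    numerator = (sse_r - sse_u) / num_restrictions
    denominator = sse_u / (n - num_coefs_unrestricted)
    return numerator / denominator


# Hypothetical sums of squares, for illustration only.
# J = 2 restrictions (b3 = 0 and b4 = 0), K = 4 coefficients in the unrestricted model.
F = f_statistic(sse_r=100.0, sse_u=60.0, num_restrictions=2, n=24, num_coefs_unrestricted=4)
print(F)
```

A large F (relative to the critical value for 2 and n - 4 degrees of freedom) would count against the null hypothesis that bathrooms have no effect on either the intercept or the slope.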
  • 57.
    Sale of House with Bed and Bath Dummies
    PRICE = f(SQFEET, D2BED, D3BED, A2BATH)

    SQFEET   D2BED   D3BED   A2BATH   PRICE (thousands)
       800       0       0        0              10.000
      1000       0       0        1              20.000
      1200       1       0        0              30.000
      1500       1       0        0              40.000
      1800       1       0        1              50.000
      2000       1       0        1              60.000
      2200       0       1        0              70.000
      2500       0       1        0              80.000
      3000       0       1        1              90.000
      3500       0       1        1             100.000

    I.   SQFEET = square feet of living space
    II.  D2BED = dummy = 1 if two-bedroom house
    III. D3BED = dummy = 1 if three-bedroom house
    IV.  A2BATH = dummy = 1 if two-bathroom house
  • 58.
    Sale of House with Bed and Bath Dummies
    PRICE = f(SQFEET, D2BED, D3BED, A2BATH)

    ANALYSIS OF VARIANCE
    SOURCE       SUM-OF-SQUARES   DF    MEAN-SQ   F-RATIO       P
    REGRESSION         8191.943    4   2047.986   176.378   0.000
    RESIDUAL             58.057    5     11.611

    DURBIN-WATSON D STATISTIC: 2.216
    FIRST ORDER AUTOCORRELATION COEFF: -0.153
    DEP VAR: PRICE   N: 10   MULTIPLE R: 0.996   SQUARED MULTIPLE R: 0.993
    ADJUSTED SQUARED MULTIPLE R: 0.987   STD ERROR OF ESTIMATE: 3.40
  • 59.
    Sale of House with Bed and Bath Dummies
    PRICE = f(SQFEET, D2BED, D3BED, A2BATH)
    DEP VAR: PRICE   N: 10   MULTIPLE R: 0.996   SQUARED MULTIPLE R: 0.993
    ADJUSTED SQUARED MULTIPLE R: 0.987   STD ERROR OF ESTIMATE: 3.40

    VARIABLE      COEFF   STD ERR        T   P(2-TAIL)
    INTERCEPT    -6.482     4.112   -1.576       0.176
    SQFEET        0.021     0.005    3.958       0.011
    D2BED        14.662     4.871    3.010       0.030
    D3BED        29.803    10.575    2.818       0.037
    A2BATH        4.883     3.953    1.235       0.272

    (For 1,000 square feet: 0.021 × 1,000 = 21, and 21 - 6.482 = 14.518, or $14,518.)
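For readers who want to re-estimate the model rather than rely on the package printout, the following is a minimal sketch using NumPy's least-squares routine on the ten observations listed for this example. The variable names are this sketch's own, and it makes no claim about which software produced the original output.

```python
import numpy as np

# The ten observations from the slide (PRICE in thousands of dollars).
sqfeet = np.array([800, 1000, 1200, 1500, 1800, 2000, 2200, 2500, 3000, 3500], dtype=float)
d2bed = np.array([0, 0, 1, 1, 1, 1, 0, 0, 0, 0], dtype=float)
d3bed = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
a2bath = np.array([0, 1, 0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
price = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100], dtype=float)

# Design matrix: intercept, SQFEET, D2BED, D3BED, A2BATH.
# One-bedroom, one-bathroom homes are the omitted base group.
X = np.column_stack([np.ones_like(sqfeet), sqfeet, d2bed, d3bed, a2bath])

coef, _, rank, _ = np.linalg.lstsq(X, price, rcond=None)

residuals = price - X @ coef
sse = float(residuals @ residuals)                  # residual sum of squares
sst = float(((price - price.mean()) ** 2).sum())    # total sum of squares
r_squared = 1.0 - sse / sst

for name, value in zip(["INTERCEPT", "SQFEET", "D2BED", "D3BED", "A2BATH"], coef):
    print(f"{name:10s} {value: .3f}")
print("R-squared:", round(r_squared, 3))
```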
  • 60.
    Regression Analysis of Sale of Residential Property
    Using the estimated coefficients (INTERCEPT -6.482, SQFEET 0.021, D2BED 14.662, D3BED 29.803, A2BATH 4.883):

    for 1,000 square feet:    21 - 6.482 = 14.518, or $14,518
    add a bathroom:           $14,518 + $4,883 = $19,401
    add a bedroom:            $14,518 + $14,662 = $29,180
    add 2 bedrooms:           $14,518 + $29,803 = $44,321
    add bath and 2 bedrooms:  $14,518 + $4,883 + $29,803 = $49,204
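The same arithmetic can be written as a small prediction function. This sketch simply plugs the reported coefficients into the fitted equation; the function name `predicted_price` is hypothetical, and prices are in thousands of dollars.

```python
# Coefficients as reported in the regression output (PRICE in thousands).
COEF = {
    "INTERCEPT": -6.482,
    "SQFEET": 0.021,
    "D2BED": 14.662,
    "D3BED": 29.803,
    "A2BATH": 4.883,
}


def predicted_price(sqfeet, d2bed=0, d3bed=0, a2bath=0):
    """Predicted price in thousands of dollars for one house."""
    return (
        COEF["INTERCEPT"]
        + COEF["SQFEET"] * sqfeet
        + COEF["D2BED"] * d2bed
        + COEF["D3BED"] * d3bed
        + COEF["A2BATH"] * a2bath
    )


base = predicted_price(1000)                              # one bed, one bath
with_bath = predicted_price(1000, a2bath=1)               # add a bathroom
two_bed = predicted_price(1000, d2bed=1)                  # add a bedroom
three_bed = predicted_price(1000, d3bed=1)                # add 2 bedrooms
three_bed_bath = predicted_price(1000, d3bed=1, a2bath=1)  # add bath and 2 bedrooms

for label, value in [
    ("base", base),
    ("add a bathroom", with_bath),
    ("add a bedroom", two_bed),
    ("add 2 bedrooms", three_bed),
    ("add bath and 2 bedrooms", three_bed_bath),
]:
    print(f"{label:25s} ${value * 1000:,.0f}")
```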
  • 61.
    Sales Value of Residential Property
    y = sales value of the property (dollars)
    X = square feet of living space
    D1 = dummy variable for one-bedroom home
    D2 = dummy variable for two-bedroom home
    D3 = dummy variable for three-bedroom home
    A1 = dummy variable for one-bathroom home
    A2 = dummy variable for two-bathroom home
    $$ \hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i $$
    For a one-bedroom, one-bathroom home, D2 = 0, D3 = 0, and A2 = 0, so we have:
    $$ \hat{y}_i = b_1 + b_2 X_i \qquad \text{(1 bedroom, 1 bathroom)} $$
  • 62.
    Sales Value of Residential Property
    $$ \hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i $$
    For a 2-bedroom, 1-bathroom home, we have D2 = 1, D3 = 0, and A2 = 0:
    $$ \hat{y}_i = (b_1 + b_3) + b_2 X_i \qquad \text{(2 bedroom, 1 bathroom)} $$
  • 63.
    Sales Value of Residential Property
    $$ \hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i $$
    For a 1-bedroom, 2-bathroom home, we have D2 = 0, D3 = 0, and A2 = 1:
    $$ \hat{y}_i = (b_1 + b_5) + b_2 X_i \qquad \text{(1 bedroom, 2 bathroom)} $$
  • 64.
    Sales Value of Residential Property
    $$ \hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i $$
    For a 2-bedroom, 2-bathroom home, we have D2 = 1, D3 = 0, and A2 = 1:
    $$ \hat{y}_i = (b_1 + b_3 + b_5) + b_2 X_i \qquad \text{(2 bedroom, 2 bathroom)} $$
    $$ \hat{y}_i = (b_1 + b_4 + b_5) + b_2 X_i \qquad \text{(3 bedroom, 2 bathroom)} $$
    $$ \hat{y}_i = (b_1 + b_4) + b_2 X_i \qquad \text{(3 bedroom, 1 bathroom)} $$
  • 65.
    House Sales Model with Restricted Intercepts: $\hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i$. Rigid!
    [Figure: selling price $y_i$ against square feet of living space $X_i$; six parallel lines with intercepts $b_1$ for D1-A1 (one bed, one bath), $b_1 + b_5$ for D1-A2 (one bed, two bath), $b_1 + b_3$ for D2-A1 (two bed, one bath), $b_1 + b_3 + b_5$ for D2-A2 (two bed, two bath), $b_1 + b_4$ for D3-A1 (three bed, one bath), and $b_1 + b_4 + b_5$ for D3-A2 (three bed, two bath).]
  • 66.
    Creating Composite Dummy Variables (vs. characteristic dummy variables)
  • 67.
    How do we create composite dummy variables? We need to account for the interaction effect between bathrooms and bedrooms.

                     Bedrooms
    Bathrooms       1     2     3   Total
    1               6     8    26      40
    2               7     7    26      40
    Total          13    15    52      80
  • 68.
    Composite dummy variables are created for each nonempty cell. Create six composite dummy variables:
    D1A1 = 1 if one bed and one bath, otherwise D1A1 = 0
    D1A2 = 1 if one bed and two bath, otherwise D1A2 = 0
    D2A1 = 1 if two bed and one bath, otherwise D2A1 = 0
    D2A2 = 1 if two bed and two bath, otherwise D2A2 = 0
    D3A1 = 1 if three bed and one bath, otherwise D3A1 = 0
    D3A2 = 1 if three bed and two bath, otherwise D3A2 = 0
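The six definitions above can be generated mechanically from a house's bedroom and bathroom counts. The sketch below is one way to do it in plain Python; the function name `composite_dummies` is hypothetical. When the model keeps a common intercept, one of the six (for example D1A1) has to be left out of the regression as the base group.

```python
from itertools import product

BEDROOMS = (1, 2, 3)
BATHROOMS = (1, 2)


def composite_dummies(bedrooms, bathrooms):
    """Return the six composite dummies, e.g. {"D1A1": 0, ..., "D3A2": 1}."""
    return {
        f"D{d}A{a}": int(bedrooms == d and bathrooms == a)
        for d, a in product(BEDROOMS, BATHROOMS)
    }


# Example: a two-bedroom, one-bathroom house.
house = composite_dummies(bedrooms=2, bathrooms=1)
print(house)
```

Exactly one of the six dummies equals one for any house in the table, which is why including all six together with an intercept would reproduce the perfect multicollinearity problem.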
  • 69.
    Sales Value of Residential Property
    y = sales value of the property (dollars)
    X = square feet of living space
    D1A1 = interaction one-bed & one-bath
    D1A2 = interaction one-bed & two-bath
    D2A1 = interaction two-bed & one-bath
    D2A2 = interaction two-bed & two-bath
    D3A1 = interaction three-bed & one-bath
    D3A2 = interaction three-bed & two-bath
    $$ \hat{y}_i = b_1 + b_2 X_i + b_3 D1A2_i + b_4 D2A1_i + b_5 D2A2_i + b_6 D3A1_i + b_7 D3A2_i $$
  • 70.
    This one equation, with all these dummy variables, actually represents six equations. You must substitute in for each of the dummy variables to generate the six equations that are implied by this one dummy variable equation.
    $$ \hat{y}_i = b_1 + b_2 X_i + b_3 D1A2_i + b_4 D2A1_i + b_5 D2A2_i + b_6 D3A1_i + b_7 D3A2_i $$
    For a one-bedroom, one-bathroom home, D1A1 = 1 while the others are zero:
    $$ \hat{y}_i = b_1 + b_2 X_i \qquad \text{(1 bedroom, 1 bathroom)} $$
  • 71.
    [Figure: House Sales Model with Unrestricted Intercepts. Selling price $y_i$ against square feet of living space $X_i$, with one line for each of D1-A1 (one bed, one bath), D1-A2 (one bed, two bath), D2-A1 (two bed, one bath), D2-A2 (two bed, two bath), D3-A1 (three bed, one bath), and D3-A2 (three bed, two bath); so far only the D1-A1 intercept, $b_1$, is labeled.]
  • 72.
    One-bedroom, two-bathroom: D1A2 = 1, while the others are zero:
    $$ \hat{y}_i = b_1 + b_2 X_i + b_3 D1A2_i + b_4 D2A1_i + b_5 D2A2_i + b_6 D3A1_i + b_7 D3A2_i $$
    $$ \hat{y}_i = (b_1 + b_3) + b_2 X_i \qquad \text{(1 bedroom, 2 bathroom)} $$
    Now graph it!
  • 73.
    [Figure: House Sales Model with Unrestricted Intercepts. Selling price $y_i$ against square feet of living space $X_i$; labeled intercepts so far are $b_1$ for D1-A1 (one bed, one bath) and $b_1 + b_3$ for D1-A2 (one bed, two bath).]
  • 74.
    Two-bedroom, one-bathroom: D2A1 = 1, while the others are zero:
    $$ \hat{y}_i = b_1 + b_2 X_i + b_3 D1A2_i + b_4 D2A1_i + b_5 D2A2_i + b_6 D3A1_i + b_7 D3A2_i $$
    $$ \hat{y}_i = (b_1 + b_4) + b_2 X_i \qquad \text{(2 bedroom, 1 bathroom)} $$
    Now graph it!
  • 75.
    [Figure: House Sales Model with Unrestricted Intercepts. Selling price $y_i$ against square feet of living space $X_i$; labeled intercepts so far are $b_1$ for D1-A1 (one bed, one bath), $b_1 + b_3$ for D1-A2 (one bed, two bath), and $b_1 + b_4$ for D2-A1 (two bed, one bath).]
  • 76.
    Two-bedroom, two-bathroom: D2A2 = 1, while the others are zero:
    $$ \hat{y}_i = b_1 + b_2 X_i + b_3 D1A2_i + b_4 D2A1_i + b_5 D2A2_i + b_6 D3A1_i + b_7 D3A2_i $$
    $$ \hat{y}_i = (b_1 + b_5) + b_2 X_i \qquad \text{(2 bedroom, 2 bathroom)} $$
    Now graph it!
  • 77.
    [Figure: House Sales Model with Unrestricted Intercepts. Selling price $y_i$ against square feet of living space $X_i$; labeled intercepts so far are $b_1$ for D1-A1 (one bed, one bath), $b_1 + b_3$ for D1-A2 (one bed, two bath), $b_1 + b_4$ for D2-A1 (two bed, one bath), and $b_1 + b_5$ for D2-A2 (two bed, two bath).]
  • 78.
    [Figure: House Sales Model with Unrestricted Intercepts. Selling price $y_i$ against square feet of living space $X_i$; six parallel lines with intercepts $b_1$ for D1-A1 (one bed, one bath), $b_1 + b_3$ for D1-A2 (one bed, two bath), $b_1 + b_4$ for D2-A1 (two bed, one bath), $b_1 + b_5$ for D2-A2 (two bed, two bath), $b_1 + b_6$ for D3-A1 (three bed, one bath), and $b_1 + b_7$ for D3-A2 (three bed, two bath).]
  • 81.
    Bedrooms vs. Baths vs. Garage

    Bedrooms:             1           2           3
    Cars in Garage:     1     2     1     2     1     2   Total
    Baths 1             2     4     3     5    10    16      40
    Baths 2             1     6     0     7    12    14      40
    Total               3    10     3    12    22    30      80
  • 82.
    Karnaugh Map for Bedrooms, Baths, Garage, and School (A = Adams, J = Saint Joseph):

    Bedrooms:             1           2           3
    Cars in Garage:     1     2     1     2     1     2   Total
    A   Baths 1         1     3     2     5     6    13      30
    A   Baths 2         0     3     0     6     7     8      24
    J   Baths 1         1     1     1     0     4     3      10
    J   Baths 2         1     3     0     1     5     6      16
    Total               3    10     3    12    22    30      80