Joshua Shea
Wage gap and Variations Between
Gender and Race
Dr Natalia Zhivan
Wage inequality and variation
Everybody's main goal at work is to be as efficient as possible and to make as much money as they can. Yet a person who is more efficient, or "better at the job," than a coworker may still be paid less because of biased opinions. People hold biases about men, women, and ethnicity. Some biases are negative and some are positive, but almost all of them misinterpret a person's actual strengths. For example, a woman who works at a collection agency may be rated below a male counterpart doing the same job. The gap may widen if she has children or is of a different race than he is. An employer might assume that a mother has less energy than a man, cannot spend as much time at work, or simply is not as good because she is a woman. These misogynistic opinions keep women out of competitive positions and can lower their salaries. Members of many racial groups may also face discrimination because companies do not consider them as capable as their white or other counterparts.
The Huffington Post provides a useful comparison of women's earnings across ethnic groups and against their male counterparts. Its table shows that women in every ethnic group earn less than men of the same ethnicity. The gaps within each ethnic group are fairly similar: women earn between 78% and 89% of what men of the same ethnicity earn. This suggests that misogynistic behavior or bias against women is a real possibility. The gap could also be explained in part by women being unable to work full time because of obligations to children or other family members. The table also shows that earnings are not equal across races.
But how do women of different ethnicities compare with white men? The article compares women's earnings in each ethnic category with white men's earnings, and the gaps become much larger. The gap between women and men of the same ethnicity was 22% at its largest. Measured against white men, the largest gap rises to 47%. This suggests that race plays a large role in earnings and that the effect is even greater for women. Asian women come closest, earning 87% of what white men earn. Hispanic women are furthest behind, earning only 53% of what white men earn.
We should also look at how other ethnic groups fare against their white counterparts regardless of gender. Rakesh Kochhar states in his article that "the median wealth of white households is 20 times that of black households and 18 times that of hispanic households". The numbers he provides are staggering: black households held $5,677 in assets, Hispanic households $6,325, and white households $113,149. These figures measure wealth rather than wages, but they show how large the differences between groups are. We hope our model can explain why wages differ; if it cannot, we will adjust the parameters and explanatory variables until it can.
Data of Our Empirical Model
In our model we run regressions on wage and on the natural log of wage to see how wage varies with different variables. We try to explain why wage varies with these variables and determine how "good" our model and parameters are. In Stata we run the regression of ln_wage on age female college hisp black hours part_time child6. Each of these names stands for a variable in the data set. Age is a continuous variable: the individual's age in years. Female is a dummy variable equal to 1 if the person is female and 0 if male. College is a dummy variable equal to 1 if the person graduated from college and 0 otherwise. Black is a dummy variable equal to 1 if the person is black and 0 otherwise. Hisp is a dummy variable equal to 1 if the person is Hispanic and 0 otherwise. Health is a categorical variable coded 1 = excellent, 2 = very good, 3 = good, 4 = fair, 5 = poor. Child6 is a count variable: the number of children under the age of 6. Wage is a continuous variable: annual earnings measured in $1 increments. Hours is a continuous variable: the number of hours worked last week, measured in 1-hour increments. Part_time is a dummy variable equal to 1 if the person works part time and 0 otherwise. July_temp is a continuous variable: the average July temperature in the county of residence, in degrees Fahrenheit. Pop65 is a continuous variable: the share of the population older than 65.
Summary statistics (count = 1047)
Variable     Obs    Mean        Std. dev.   Min        Max
female       1047   .4555874    .4982616    0          1
college      1047   .2941738    .4558882    0          1
age          1047   33.53391    7.855929    18         45
wage         1047   39653.52    48878.27    12         619221
hours        1047   37.19962    15.24388    0          99
july_temp    1047   80.03084    4.991196    62.23553   94.44621
pop65        1047   .252078     .0500064    .0978648   .3961477
Describing The Data
Our sample contains 1047 individuals. The mean of female is 0.456, so about 45.6% of the sample is female. The mean of college is 0.2942, so 29.42% of the sample has graduated from college. The mean age is 33.53, so the average person in the sample is about 33.5 years old; the minimum age is 18 and the maximum is 45. The mean wage is 39653.52, so the average individual earns $39,653.52 a year; the minimum is $12 a year and the maximum is $619,221. The mean of hours is 37.19962, so the average individual worked about 37.2 hours last week; the minimum is 0 and the maximum is 99. The mean of july_temp is 80.03084, so the average July temperature is about 80 degrees Fahrenheit, with a minimum of 62 degrees and a maximum of 94 degrees. The mean of pop65 is 0.252078, so on average 25.2% of the population is older than 65.
Histogram.
[Figure: histogram of person's total earnings; x-axis is person's total earnings from 0 to 600,000, y-axis is density from 0 to 2.0e-05]
We create histograms of wage, age, and hours worked. The wage histogram is skewed to the right, so wage does not have a normal distribution; a normal distribution would have the shape of a bell. A right-skewed dependent variable is a warning that the error term may not be normally distributed either, which is one motivation for also modeling the natural log of wage. A non-normal error term does not by itself stop OLS from being the Best Linear Unbiased Estimator, but it makes hypothesis tests less trustworthy in small samples.
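
To illustrate why a right-skewed wage variable is often modeled in logs, the following sketch computes the skewness of a small set of wages before and after taking the natural log. The ten wage values and the helper name skewness are hypothetical; they are not taken from the sample.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical annual wages with a long right tail (not the paper's data).
wages = [9000, 15000, 22000, 28000, 35000, 42000, 55000, 80000, 150000, 600000]
ln_wages = [math.log(w) for w in wages]

skew_raw = skewness(wages)
skew_log = skewness(ln_wages)
print(f"skewness of wage:    {skew_raw:.2f}")
print(f"skewness of ln wage: {skew_log:.2f}")
```

For these made-up values the log transformation pulls in the long right tail, so ln wage is much closer to symmetric than wage itself.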
Correlation Matrix
Variable    Age      Black    College  Wage     Hours
Age         1.000
Black       0.0039   1.000
College     0.1926   0.0315   1.000
Wage        0.2532   0.0174   0.2743   1.000
Hours       0.1406   0.0083   0.0305   0.1949   1.000
In our correlation matrix a few pairs of variables stand out, although none of the correlations is large. Wage and age are positively correlated at 0.2532. This makes sense because older workers are more likely to be skilled in their positions and more likely to be educated. College and age are positively correlated at 0.1926. We expect the likelihood of holding a college degree to rise with age, because most college students start at 18 and do not graduate until their early to mid twenties. College and wage are also positively correlated at 0.2743. This makes sense because a college graduate has more skills. Hours and wage are positively correlated at 0.1949. We expect that the more hours someone works, the higher their earnings will be, in part because of overtime, where someone is paid time and a half after working 40 hours.
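
One way to judge how strong these correlations are is to square them: for two variables, r squared is the share of the variation in one that a straight-line relationship with the other accounts for. The sketch below applies this to the four correlations discussed above; the dictionary names are only illustrative.

```python
# Pairwise correlations reported in the correlation matrix.
correlations = {
    ("wage", "age"): 0.2532,
    ("college", "age"): 0.1926,
    ("college", "wage"): 0.2743,
    ("hours", "wage"): 0.1949,
}

# r**2 is the share of variation in one variable that is
# linearly associated with the other.
shared_variation = {pair: r ** 2 for pair, r in correlations.items()}

for (a, b), share in shared_variation.items():
    print(f"{a} and {b}: r^2 = {share:.3f} ({share:.1%} of variation)")
```

Even the largest of these, college and wage, accounts for under 8% of the variation, so the relationships are positive but weak.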
Ttest wage, by(black)
Group      Obs    Mean        Std. err.   Std. dev.   [95% conf. interval]
0          869    39268.2     1582.823    46659.74    36161.59    42374.8
1          178    41534.7     4394.655    58632.01    32862.03    50207.36
Combined   1047   39653.52    1510.576    48878.27    36689.42    42617.63
diff              -2266.501   4022.639                -10159.87   5626.869
Ha: diff != 0     Pr(|T| > |t|) = 0.5733
When we run the t-test of wage by black, we find that black individuals in the sample earn $2,266.50 more a year on average. With a p-value of 0.5733 the difference is not statistically significant, so we cannot reject the hypothesis that the two groups have the same mean wage.
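
The standard error and p-value in this t-test can be reproduced from the group summary statistics alone. The sketch below assumes the equal-variance (pooled) form of the two-sample t-test, which matches the reported standard error, and it uses a normal approximation for the p-value, which is reasonable with more than 1000 degrees of freedom. The function name pooled_ttest is illustrative.

```python
import math

def pooled_ttest(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t-test with pooled (equal) variance from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    # Normal approximation to the two-sided p-value (assumption: large df).
    p = math.erfc(abs(t) / math.sqrt(2))
    return se, t, p

# Group 0 (not black) and group 1 (black) from the t-test table.
se, t, p = pooled_ttest(869, 39268.2, 46659.74, 178, 41534.7, 58632.01)
print(f"std. err. of diff = {se:.1f}, t = {t:.3f}, p = {p:.3f}")
```

A t statistic this close to zero is what produces the large p-value: the $2,266.50 difference is small relative to its standard error of about $4,023.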
Ttest wage, by(female)
Group      Obs    Mean        Std. err.   Std. dev.   [95% conf. interval]
0          570    46504.73    2164.591    51678.91    42253.17    50756.29
1          477    31466.55    2013.623    43978.19    27509.86    35423.23
Combined   1047   39653.52    1510.576    48878.27    36689.42    42617.63
diff              15038.18    2889.723                9153.981    20922.39
Ha: diff != 0     Pr(|T| > |t|) = 0.00
When we run the t-test of wage by female, we find that women earn $15,038.18 less a year on average. This could reflect discrimination, less education, or fewer hours worked. The difference is statistically significant, with a p-value of 0.00, so we can reject the hypothesis that men and women have the same mean wage.
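
The figures in this t-test can be cross-checked against each other. The sketch below recovers the male mean from the midpoint of its confidence interval, then confirms that it agrees with the reported difference and with the combined mean. The variable names are illustrative.

```python
# Values reported in the t-test of wage by female.
n_male, n_female = 570, 477
ci_male = (42253.17, 50756.29)   # 95% confidence interval for the male mean
mean_female = 31466.55
reported_diff = 15038.18
reported_combined = 39653.52

# A confidence interval for a mean is symmetric, so its midpoint is the mean.
mean_male = sum(ci_male) / 2

diff = mean_male - mean_female
combined = (n_male * mean_male + n_female * mean_female) / (n_male + n_female)

print(f"male mean     = {mean_male:.2f}")
print(f"difference    = {diff:.2f} (reported {reported_diff})")
print(f"combined mean = {combined:.2f} (reported {reported_combined})")
```

The male mean implied by the interval is 46,504.73, which is the only value consistent with both the reported difference and the combined mean.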
Ttest hours, by(female)
Group      Obs    Mean        Std. err.   Std. dev.   [95% conf. interval]
0          570    41.59123    .603649     14.41158    40.4056     42.77685
1          477    31.95178    .6661574    14.5491     30.64281    33.26075
Combined   1047   37.19962    .47111      15.24388    36.27519    38.12405
diff              9.639446    .8982078                7.87695     11.40194
Ha: diff != 0     Pr(|T| > |t|) = 0.00
When we run the t-test of hours by female, we find that women work 31.95 hours a week on average while men work 41.59. This supports one explanation from the wage t-test above: part of the reason women earn less is that they work fewer hours than their male counterparts. Women may work fewer hours because they have children or care for other family members, while men may work more hours because they are paid at a higher rate. The difference is statistically significant, so we can have confidence in it.
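
A rough way to see how much of the wage gap the hours gap could account for is to convert the mean annual wages into an implied hourly rate. This sketch assumes 52 paid weeks a year and that last week's hours are typical, neither of which the data confirm. The male mean wage of 46,504.73 is the midpoint of the confidence interval reported in the wage t-test, and the result is a ratio of group means, not an average of individual hourly rates.

```python
WEEKS_PER_YEAR = 52  # assumption: everyone is paid for 52 weeks

# Group means from the two t-tests (wage by female, hours by female).
mean_wage = {"male": 46504.73, "female": 31466.55}
mean_hours = {"male": 41.59123, "female": 31.95178}

# Implied hourly rate for each group under the assumptions above.
hourly = {g: mean_wage[g] / (mean_hours[g] * WEEKS_PER_YEAR) for g in mean_wage}

annual_ratio = mean_wage["female"] / mean_wage["male"]
hours_ratio = mean_hours["female"] / mean_hours["male"]
hourly_ratio = hourly["female"] / hourly["male"]

print(f"female/male annual wage: {annual_ratio:.3f}")
print(f"female/male weekly hours: {hours_ratio:.3f}")
print(f"female/male implied hourly rate: {hourly_ratio:.3f}")
```

Under these assumptions, fewer hours account for much of the annual gap but not all of it: the implied hourly rate for women is still about 88% of the rate for men.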
How Relevant Are Our Variables?
When we run our regression model we want to make sure that each variable belongs in the equation and is theoretically sound. Otherwise we may violate the classical assumptions, OLS will no longer be BLUE, and our results will not be trustworthy. In our regression the natural log of wage is the dependent variable, or "Y". Examining the variables one at a time tells us which ones to include. We include age because wage tends to rise with age. In the regression of ln_wage on age female college hisp black hours part_time child6, the coefficient on age is .0345, so each additional year of age raises wage by about 3.45%. We include female because we want to find out why wages differ between men and women. We include college because it shows how wage rises with education. We include hours and part_time because they help explain how pay varies with the amount of time worked. We include child6 because having young children can explain why people work fewer hours. All of the variables listed above are statistically significant. Black is not statistically significant when we include it. The overall fit of the model is not very good: adjusted R-squared = 0.3435. To improve the fit we need to add variables that may have been omitted, such as IQ, quality of university, place of residence, and job position. Place of residence matters because the cost of living is higher in some cities than in others. We do not include july_temp because it has no effect on wage in our model, and we do not include pop65 because its effect is too small to matter in our regression.
For the regression model we also want to state the expected signs of the coefficients in advance, so that we can tell whether each variable behaves as theory predicts. B:age: we expect a positive sign, because wage should rise with age. B:hisp: we expect a negative sign because of barriers to entry and discrimination. B:black: we expect a positive sign; we think it will be associated with more money earned each year. B:female: all else being equal we expect a negative sign, because we expect women to earn less than their male counterparts. B:college: all else being equal we expect a positive sign, because the skills college provides should bring in more money each year. B:hours: we expect a positive sign, because the more hours you work the more money you will make. B:part_time: all else being equal we expect a negative sign, because part-time employees work fewer hours than full-time employees, and the fewer hours worked the less money is earned in a year. B:child6: all else being equal we expect a positive sign.
Variable     B hat    Standard error   Critical t at 1%   Critical t at 5%   Critical t at 10%
Age           0.032   0.003            2.326              1.645              1.282
Hisp         -0.134   0.063            2.326              1.645              1.282
Black        -0.058   0.064            2.326              1.645              1.282
Female       -0.283   0.048            2.326              1.645              1.282
College       0.509   0.053            2.326              1.645              1.282
Hours         0.013   0.002            2.326              1.645              1.282
Part_time    -0.283   0.075            2.326              1.645              1.282
ln_wage      Coef.       Std. err.   t       P>|t|   [95% conf. interval]
Age           .0345185   .0032632    10.58   0.000    .0281153    .0409217
Female       -.2850152   .0527574    -5.40   0.000   -.3885385   -.1814918
College       .5071715   .0564339     8.99   0.000    .3964341    .617909
Hisp         -.198478    .0676282    -2.93   0.003   -.3311816   -.0657743
Black         .0259772   .0664785     0.39   0.696   -.1044703    .1564248
Hours         .0053968   .0025044     2.15   0.031    .0004826    .0103111
Part_time    -.5076908   .0834793    -6.08   0.000   -.6714982   -.3438834
Child6        .119445    .0350736     3.41   0.001    .0506217    .1882683
_cons        8.929974    .1646518    54.24   0.000   8.606885    9.253062
Number of obs = 1047
F(8, 1038) = 69.43
R-squared = 0.3486
Adj R-squared = 0.3435
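
The fit statistics reported with the regression are tied together by standard formulas, so they can be checked against one another. The sketch below computes adjusted R-squared and the overall F statistic from the reported R-squared, the sample size, and the number of explanatory variables. The variable names are illustrative, and small differences come from R-squared being rounded to four decimals.

```python
n = 1047            # number of observations
k = 8               # explanatory variables, not counting the constant
r_squared = 0.3486  # reported R-squared

# Adjusted R-squared penalizes R-squared for the number of regressors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Overall F statistic for the null that all k slope coefficients are zero.
f_stat = (r_squared / k) / ((1 - r_squared) / (n - k - 1))

print(f"adjusted R-squared = {adj_r_squared:.4f}")
print(f"F({k}, {n - k - 1}) = {f_stat:.2f}")
```

Both values agree with the reported output to within rounding, which confirms that 0.3486 is R-squared and 0.3435 is adjusted R-squared.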
The Regression Results
Our estimated regression equation is
\[
\ln(\text{WAGE}_i) = 8.930 + 0.035\,\text{AGE}_i - 0.285\,\text{FEMALE}_i + 0.507\,\text{COLLEGE}_i - 0.198\,\text{HISP}_i + 0.026\,\text{BLACK}_i + 0.005\,\text{HOURS}_i - 0.508\,\text{PART\_TIME}_i + 0.119\,\text{CHILD6}_i
\]
The constant, 8.930, is the predicted natural log of wage when every explanatory variable equals zero; it is not a growth rate. Because the dependent variable is in logs, each coefficient multiplied by 100 is approximately the percentage change in wage from a one-unit change in that variable, holding the others constant. Each additional year of age raises wage by about 3.5%. Being female lowers wage by about 28.5%. Having graduated from college raises wage by about 50.7%. Being Hispanic lowers wage by about 19.8%. Being black raises wage by about 2.6%, although this coefficient is not statistically significant. Each additional hour worked per week raises wage by about 0.5%. Working part time lowers wage by about 50.8%. Each additional child under the age of six raises wage by about 11.9%. For the larger coefficients these are rough approximations; the exact percentage change is 100 times (e^b - 1). Our R-squared is .3486, which means the model explains about 35% of the variation in the natural log of wage. This is a weak fit. Adjusted R-squared is slightly lower at .3435. To improve the model we could correct for heteroskedasticity, increase the sample size, drop redundant variables, or add more theoretically sound variables.
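
Because the dependent variable is the natural log of wage, a coefficient b corresponds to a percentage change in wage of 100 times (e^b - 1) for a one-unit change in the variable. The sketch below applies that conversion to the reported coefficients; the function name percent_effect is illustrative.

```python
import math

# Coefficients from the ln_wage regression output.
coefficients = {
    "age": 0.0345185,
    "female": -0.2850152,
    "college": 0.5071715,
    "hisp": -0.198478,
    "black": 0.0259772,
    "hours": 0.0053968,
    "part_time": -0.5076908,
    "child6": 0.119445,
}

def percent_effect(b):
    """Exact percentage change in wage for a one-unit change, log-level model."""
    return 100 * (math.exp(b) - 1)

pct = {name: percent_effect(b) for name, b in coefficients.items()}

for name, value in pct.items():
    print(f"{name:10s} {value:+.1f}%")
```

For small coefficients such as age and hours, 100 times b is a close approximation. For the large dummy coefficients the exact figures differ noticeably: about -25% for female, +66% for college, and -40% for part_time.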
i. 35(age) + 1(female) + 1(college) + 2(child6) + 1(part_time) = $62,764
ii. 45(age) + 1(college) = $75,531
iii. 25(age) + 1(black) + 1(part_time) = $21,366
iv. 25(age) + 1(part_time) = $19,777
v. 55(age) + 1(black) + 1(part_time) = $53,921
Our model shows that the 25 year old black male earns $1,589 more a year than his white male counterpart (case iii versus case iv). The model also shows how much college and full-time work increase wage. Comparing the 45 year old male in case ii with the 55 year old male in case v, there is a large gap that the 55 year old will not close unless he also graduates from college and starts working full time. We can also see how having children increases wage in case i.
             Wage     Age      Hisp     Black    Female   College  Hours    Part_time  Child6
Wage         1.000
Age          0.2532   1.000
Hisp        -0.0685  -0.0768   1.000
Black        0.0174   0.0039  -0.1278   1.000
Female      -0.1533  -0.0490  -0.0368   0.0301   1.000
College      0.2743   0.1926  -0.1501   0.0315   0.0618   1.000
Hours        0.1949   0.1406   0.0174   0.0083  -0.3151   0.0305   1.000
Part_time   -0.2083  -0.1659  -0.0389  -0.0030   0.2911  -0.0685  -0.7542   1.000
Child6       0.0821  -0.0570   0.0038  -0.0456  -0.0134  -0.0086  -0.0594   0.0461     1.000
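When we examine the correlation matrix we find little evidence of multicollinearity in the model. Almost all of the correlations between explanatory variables are below .50 in absolute value. The one exception is hours and part_time, which are correlated at -.7542; this is expected, since part-time workers work fewer hours by definition, and it means these two variables partly measure the same thing. Apart from that pair, none of the explanatory variables explains much of the movement in another, so our selection of explanatory variables is theoretically sound.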
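
The .50 rule of thumb can be applied mechanically to the correlation matrix. The sketch below scans every pair of explanatory variables (wage is skipped because it is the dependent variable) and lists the pairs whose correlation exceeds .50 in absolute value. The names lower and flagged are illustrative.

```python
names = ["wage", "age", "hisp", "black", "female",
         "college", "hours", "part_time", "child6"]

# Lower triangle of the reported correlation matrix, one row per variable.
lower = [
    [1.0],
    [0.2532, 1.0],
    [-0.0685, -0.0768, 1.0],
    [0.0174, 0.0039, -0.1278, 1.0],
    [-0.1533, -0.0490, -0.0368, 0.0301, 1.0],
    [0.2743, 0.1926, -0.1501, 0.0315, 0.0618, 1.0],
    [0.1949, 0.1406, 0.0174, 0.0083, -0.3151, 0.0305, 1.0],
    [-0.2083, -0.1659, -0.0389, -0.0030, 0.2911, -0.0685, -0.7542, 1.0],
    [0.0821, -0.0570, 0.0038, -0.0456, -0.0134, -0.0086, -0.0594, 0.0461, 1.0],
]

THRESHOLD = 0.5  # rule of thumb used in the text

flagged = []
for i in range(1, len(names)):      # start at 1 to skip wage
    for j in range(1, i):           # explanatory variables before i
        r = lower[i][j]
        if abs(r) > THRESHOLD:
            flagged.append((names[i], names[j], r))

print(flagged)
```

Only one pair is flagged, part_time and hours at -0.7542. The sign does not matter for multicollinearity, which is why the comparison uses the absolute value.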
When we test for heteroskedasticity with the Breusch-Pagan/Cook-Weisberg test, we find clear evidence of heteroskedasticity: Prob > chi2 = 0.00. This means the variance of the error term is not constant across observations. With heteroskedasticity this strong, the standard errors are biased and hypothesis testing is no longer reliable. We correct for this by adding the robust option to the regress command in Stata. With robust standard errors most of the t values go up. R-squared and the coefficients stay the same; only the standard errors change. Robust standard errors remain valid under heteroskedasticity, which makes hypothesis testing more reliable.
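
To show what a robust correction changes and what it leaves alone, the following sketch fits a one-variable regression and computes both the classical standard error and a heteroskedasticity-robust (Huber-White, with the n/(n-k) small-sample factor) standard error for the slope. The data are synthetic, built so that the error variance grows with x; they are not the paper's data, and the function name simple_ols is hypothetical.

```python
import math

def simple_ols(x, y):
    """Slope of y on x with classical and robust (HC1) standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classical: one common error variance for every observation.
    sigma2 = sum(e ** 2 for e in resid) / (n - 2)
    se_classical = math.sqrt(sigma2 / sxx)

    # Robust: each observation contributes its own squared residual.
    meat = sum((xi - x_bar) ** 2 * e ** 2 for xi, e in zip(x, resid))
    se_robust = math.sqrt(n / (n - 2) * meat) / sxx
    return slope, se_classical, se_robust

# Synthetic data: true slope 3, error size proportional to x.
x = list(range(1, 41))
y = [2 + 3 * xi + (0.5 * xi if xi % 2 == 0 else -0.5 * xi) for xi in x]

slope, se_classical, se_robust = simple_ols(x, y)
print(f"slope = {slope:.3f}")
print(f"classical SE = {se_classical:.4f}, robust SE = {se_robust:.4f}")
```

The slope is computed once and is the same under either method; only the standard error differs. In this synthetic example the robust standard error is larger than the classical one, but in other data it can be smaller.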
Regress wage age female black hisp (adjusted R-squared = .0837)
Wage      Coef.       Std. err.   t       P>|t|   [95% conf. interval]
Age        1505.38    184.94       8.14   0.000     1142.47     1868.29
Female   -14109.34    2910.27     -4.85   0.000   -19820.01    -8398.68
Black      1814.01    3882.92      0.47   0.640    -5805.22     9433.26
Hisp      -6984.56    3913.39     -1.78   0.075   -14663.61      694.47
_cons     -3534.23    6696.82     -0.53   0.598   -16675.04     9606.57
Regress wage age female black hisp college (adjusted R-squared = .1385)
Wage      Coef.       Std. err.   t       P>|t|   [95% conf. interval]
Age        1225       182.56       6.71   0.000      866.76     1583.24
Female   -15672.88    2828.39     -5.54   0.000   -21222.89   -10122.88
Black      1441.915   3765.372     0.38   0.702    -5946.67     8830.49
Hisp      -2802.351   3828.77     -0.73   0.464   -10315.34     4710.63
College   26023.57    3173.647     8.20   0.000    19796.1     32251.04
_cons     -1714.822   6497.403    -0.26   0.792   -14464.32    11034.68
Regress wage age female black hisp college part_time (adjusted R-squared = .1522)
Wage        Coef.       Std. err.   t       P>|t|   [95% conf. interval]
Age          1111.20    183.09       6.07   0.000      751.9356   1470.47
Female     -12075.66    2931.81     -4.12   0.000   -17828.6     -6322.71
Black        1182.848   3735.7       0.32   0.752    -6147.51     8513.21
Hisp        -3657.44    3803.46     -0.96   0.336   -11120.78     3805.89
College     25120.85    3155.44      7.96   0.000    18929.09    31312.6
Part_time  -13699.73    3239.19     -4.23   0.000   -20055.83    -7343.63
_cons        4959.04    6635.68      0.75   0.455    -8061.82    17979.91
Comparing these models, we see that adding college changes the constant from -3534.23 to -1714.82, which is still negative. When we add part_time the constant turns positive. A negative constant would imply negative earnings, which could only make sense for someone who is in school, not working, and spending thousands of dollars without any income. However, the constant is not statistically significant in any of these models, so we should not read much into its sign. Looking at all three models, every time we add a variable the overall fit of the equation goes up. Even so, the fit remains very poor: the largest model explains only about 15% of the variation in wage. As we add explanatory variables the coefficients fluctuate, while the standard errors mostly stay about the same. The two explanatory variables that we add, college and part_time, are theoretically sound, and both are significant with p-values of 0.00. When we add college, wage goes up as expected. Part_time lowers wage, also as expected. Our models need to be improved: we cannot rely on them because the adjusted R-squared is so low. We can improve the model by adding more explanatory variables or increasing the sample size.
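
The significance statements above rest on the t statistics, and each t statistic is simply the coefficient divided by its standard error. The sketch below recomputes t for every row of the first wage regression (wage on age, female, black, and hisp) and compares it with the reported value. The dictionary layout is illustrative.

```python
# (coefficient, standard error, reported t) from the first wage regression.
rows = {
    "age": (1505.38, 184.94, 8.14),
    "female": (-14109.34, 2910.27, -4.85),
    "black": (1814.01, 3882.92, 0.47),
    "hisp": (-6984.56, 3913.39, -1.78),
    "_cons": (-3534.23, 6696.82, -0.53),
}

# t statistic = coefficient / standard error, rounded as in the output.
computed_t = {name: round(coef / se, 2) for name, (coef, se, _) in rows.items()}

for name, (_, _, reported) in rows.items():
    print(f"{name:7s} computed t = {computed_t[name]:6.2f}, reported t = {reported:6.2f}")
```

The same check is a quick way to spot a transcription error in a regression table: a coefficient, standard error, and t statistic that do not fit together cannot all be right.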
Conclusion
I conclude that the model we were given is a very basic model with a few problems. It is not very reliable: its adjusted R-squared shows that it explains only about 34% of the variation in the natural log of wage. Two of the classical assumptions are violated: the error term does not appear to be normally distributed (the wage distribution is skewed to the right), and it does not have a constant variance. Because of the heteroskedasticity, OLS is no longer the Best Linear Unbiased Estimator and we can no longer trust ordinary hypothesis testing. We can address these problems by using robust standard errors, running a White test, dropping redundant variables, and increasing the sample size.
Our model only tells us that women are paid less than men and Hispanics less than non-Hispanics, holding the other variables constant. It does not have enough variables to tell us whether women and people of different races are discriminated against in the workplace. If we want our model to be as reliable as those of the Huffington Post or other researchers, we need to add omitted variables. We should add more racial categories, such as Asian and Native American, to see whether discrimination varies. We also need variables that would help us judge whether people are being discriminated against, for example IQ, work ethic, living cost, job position, female-to-male ratio, and work environment. Then our model will be more descriptive and give us more information about wage inequality. We could also run a time series to see whether wage inequality has increased or decreased. Expanding our model to these limits takes a lot more research but also provides more descriptive and reliable results.
References
Rakesh Kochhar, Richard Fry, and Paul Taylor, "Wealth Gaps Rise to Record Highs Between Whites, Blacks, Hispanics," Pew Research, July 26, 2011.
http://www.pewsocialtrends.org/2011/07/26/wealth-gaps-rise-to-record-highs-between-whites-blacks-hispanics/
Catherine Hill, "How Does Race Affect the Gender Wage Gap?" The Huffington Post.
http://www.huffingtonpost.com/catherine-hill/how-does-race-affect-the-gender-wage-gap_b_5087132.html

More Related Content

Viewers also liked

Chemistry Unit 4 PPT
Chemistry Unit 4 PPTChemistry Unit 4 PPT
Chemistry Unit 4 PPT
jk_redmond
 
Сбалансированные показатели сайтов - цель сайта
Сбалансированные показатели сайтов - цель сайтаСбалансированные показатели сайтов - цель сайта
Сбалансированные показатели сайтов - цель сайтаSEO_Experts
 
Anexo 2
Anexo 2Anexo 2
ELOY ALFARO
ELOY ALFARO ELOY ALFARO
ELOY ALFARO
Patricio Salazar
 
Animales en peligro de extinción!
Animales en peligro de extinción!Animales en peligro de extinción!
Animales en peligro de extinción!
Aaron Vejar
 
Tecnicas e instrumentos de evaluación
Tecnicas e instrumentos de evaluaciónTecnicas e instrumentos de evaluación
Tecnicas e instrumentos de evaluaciónElimar Poyer
 
Game as a teaching strategy, Spain, KA2
Game as a teaching strategy, Spain, KA2Game as a teaching strategy, Spain, KA2
Game as a teaching strategy, Spain, KA2
Jolanta Varanaviciene
 
εισηγηση σε σεμιναριο
εισηγηση σε σεμιναριοεισηγηση σε σεμιναριο
εισηγηση σε σεμιναριοΕλενη Ζαχου
 
Trabajo escrito discapacidad
Trabajo escrito discapacidadTrabajo escrito discapacidad
Trabajo escrito discapacidad
PakOo JaiMmes
 
Take Your Opportunity
Take Your OpportunityTake Your Opportunity
Diagnosing individuals' different styles
Diagnosing individuals' different stylesDiagnosing individuals' different styles
Diagnosing individuals' different styles
Chantelle Jones
 
Saudi Aramco's Marketing Strategy. - Free Online Library
Saudi Aramco's Marketing Strategy. - Free Online LibrarySaudi Aramco's Marketing Strategy. - Free Online Library
Saudi Aramco's Marketing Strategy. - Free Online Library
damagingspectat26
 
French revolution journal entries
French revolution journal entriesFrench revolution journal entries
French revolution journal entries
joannewang329
 
Budget for Children in Meghalaya 2015-2016
Budget for Children in Meghalaya 2015-2016Budget for Children in Meghalaya 2015-2016
Budget for Children in Meghalaya 2015-2016
HAQ: Centre for Child Rights
 

Viewers also liked (17)

Chemistry Unit 4 PPT
Chemistry Unit 4 PPTChemistry Unit 4 PPT
Chemistry Unit 4 PPT
 
Сбалансированные показатели сайтов - цель сайта
Сбалансированные показатели сайтов - цель сайтаСбалансированные показатели сайтов - цель сайта
Сбалансированные показатели сайтов - цель сайта
 
Anexo 2
Anexo 2Anexo 2
Anexo 2
 
ELOY ALFARO
ELOY ALFARO ELOY ALFARO
ELOY ALFARO
 
Animales en peligro de extinción!
Animales en peligro de extinción!Animales en peligro de extinción!
Animales en peligro de extinción!
 
Tecnicas e instrumentos de evaluación
Tecnicas e instrumentos de evaluaciónTecnicas e instrumentos de evaluación
Tecnicas e instrumentos de evaluación
 
CV_Deepak_Khandelwal
CV_Deepak_KhandelwalCV_Deepak_Khandelwal
CV_Deepak_Khandelwal
 
Game as a teaching strategy, Spain, KA2
Game as a teaching strategy, Spain, KA2Game as a teaching strategy, Spain, KA2
Game as a teaching strategy, Spain, KA2
 
εισηγηση σε σεμιναριο
εισηγηση σε σεμιναριοεισηγηση σε σεμιναριο
εισηγηση σε σεμιναριο
 
Lore e
Lore eLore e
Lore e
 
Trabajo escrito discapacidad
Trabajo escrito discapacidadTrabajo escrito discapacidad
Trabajo escrito discapacidad
 
Chernobylpp
ChernobylppChernobylpp
Chernobylpp
 
Take Your Opportunity
Take Your OpportunityTake Your Opportunity
Take Your Opportunity
 
Diagnosing individuals' different styles
Diagnosing individuals' different stylesDiagnosing individuals' different styles
Diagnosing individuals' different styles
 
Saudi Aramco's Marketing Strategy. - Free Online Library
Saudi Aramco's Marketing Strategy. - Free Online LibrarySaudi Aramco's Marketing Strategy. - Free Online Library
Saudi Aramco's Marketing Strategy. - Free Online Library
 
French revolution journal entries
French revolution journal entriesFrench revolution journal entries
French revolution journal entries
 
Budget for Children in Meghalaya 2015-2016
Budget for Children in Meghalaya 2015-2016Budget for Children in Meghalaya 2015-2016
Budget for Children in Meghalaya 2015-2016
 

Similar to project econ 436511

Gender gap article
Gender gap articleGender gap article
Gender gap articleLisa Schmidt
 
Rohr_CPS_ResearchPaper
Rohr_CPS_ResearchPaperRohr_CPS_ResearchPaper
Rohr_CPS_ResearchPaperRebecca Rohr
 
© 2018 Laureate Education, Inc. Course 6336 Crisis, T.docx
© 2018 Laureate Education, Inc.   Course 6336 Crisis, T.docx© 2018 Laureate Education, Inc.   Course 6336 Crisis, T.docx
© 2018 Laureate Education, Inc. Course 6336 Crisis, T.docx
gerardkortney
 
Labor Market Impact on Women and Men
Labor Market Impact on Women and MenLabor Market Impact on Women and Men
Labor Market Impact on Women and MenDIANN MOORMAN
 
Introduction to Descriptive Statistics-Central Tendency & Dispersion-FA2013
Introduction to Descriptive Statistics-Central Tendency & Dispersion-FA2013Introduction to Descriptive Statistics-Central Tendency & Dispersion-FA2013
Introduction to Descriptive Statistics-Central Tendency & Dispersion-FA2013ambadar
 
BUS308 – Week 1 Lecture 2 Describing Data Expected Out.docx
BUS308 – Week 1 Lecture 2 Describing Data Expected Out.docxBUS308 – Week 1 Lecture 2 Describing Data Expected Out.docx
BUS308 – Week 1 Lecture 2 Describing Data Expected Out.docx
curwenmichaela
 
Merit Pay Essay
Merit Pay EssayMerit Pay Essay
The Pay Gap Is Because of Gender, Not Jobs By ALANOOD ALOTAIBI
 The Pay Gap Is Because of Gender, Not Jobs By ALANOOD ALOTAIBI  The Pay Gap Is Because of Gender, Not Jobs By ALANOOD ALOTAIBI
The Pay Gap Is Because of Gender, Not Jobs By ALANOOD ALOTAIBI Alanood Alotaibi
 
A few thoughts on work life-balance
A few thoughts on work life-balanceA few thoughts on work life-balance
A few thoughts on work life-balance
NishanthiKariyawasam1
 
Effects of ability on determining wages of individuals
Effects of ability on determining wages of individualsEffects of ability on determining wages of individuals
Effects of ability on determining wages of individualsMuhammad Farhan Anwaar
 
Case Study Hereditary AngioedemaAll responses must be in your .docx
Case Study  Hereditary AngioedemaAll responses must be in your .docxCase Study  Hereditary AngioedemaAll responses must be in your .docx
Case Study Hereditary AngioedemaAll responses must be in your .docx
cowinhelen
 
Research methods(2)
Research methods(2)Research methods(2)
Research methods(2)geoghanm
 

Similar to project econ 436511 (17)

The Racial Gap on Wages
The Racial Gap on Wages The Racial Gap on Wages
The Racial Gap on Wages
 
Gender gap article
Gender gap articleGender gap article
Gender gap article
 
Rohr_CPS_ResearchPaper
Rohr_CPS_ResearchPaperRohr_CPS_ResearchPaper
Rohr_CPS_ResearchPaper
 
the paper
the paperthe paper
the paper
 
© 2018 Laureate Education, Inc. Course 6336 Crisis, T.docx
© 2018 Laureate Education, Inc.   Course 6336 Crisis, T.docx© 2018 Laureate Education, Inc.   Course 6336 Crisis, T.docx
© 2018 Laureate Education, Inc. Course 6336 Crisis, T.docx
 
Labor Market Impact on Women and Men
Labor Market Impact on Women and MenLabor Market Impact on Women and Men
Labor Market Impact on Women and Men
 
Prob ^0 Stats Proj 1
Prob ^0 Stats Proj 1Prob ^0 Stats Proj 1
Prob ^0 Stats Proj 1
 
Introduction to Descriptive Statistics-Central Tendency & Dispersion-FA2013
Introduction to Descriptive Statistics-Central Tendency & Dispersion-FA2013Introduction to Descriptive Statistics-Central Tendency & Dispersion-FA2013
Introduction to Descriptive Statistics-Central Tendency & Dispersion-FA2013
 
ECTM RP
ECTM RPECTM RP
ECTM RP
 
BUS308 – Week 1 Lecture 2 Describing Data Expected Out.docx
BUS308 – Week 1 Lecture 2 Describing Data Expected Out.docxBUS308 – Week 1 Lecture 2 Describing Data Expected Out.docx
BUS308 – Week 1 Lecture 2 Describing Data Expected Out.docx
 
Merit Pay Essay
Merit Pay EssayMerit Pay Essay
Merit Pay Essay
 
The Pay Gap Is Because of Gender, Not Jobs By ALANOOD ALOTAIBI
 The Pay Gap Is Because of Gender, Not Jobs By ALANOOD ALOTAIBI  The Pay Gap Is Because of Gender, Not Jobs By ALANOOD ALOTAIBI
The Pay Gap Is Because of Gender, Not Jobs By ALANOOD ALOTAIBI
 
Cms498.chapter9
Cms498.chapter9Cms498.chapter9
Cms498.chapter9
 
A few thoughts on work life-balance
A few thoughts on work life-balanceA few thoughts on work life-balance
A few thoughts on work life-balance
 
Effects of ability on determining wages of individuals
Effects of ability on determining wages of individualsEffects of ability on determining wages of individuals
Effects of ability on determining wages of individuals
 
Case Study Hereditary AngioedemaAll responses must be in your .docx
Case Study  Hereditary AngioedemaAll responses must be in your .docxCase Study  Hereditary AngioedemaAll responses must be in your .docx
Case Study Hereditary AngioedemaAll responses must be in your .docx
 
Research methods(2)
Research methods(2)Research methods(2)
Research methods(2)
 

project econ 436511

  • 1. 1 Joshua Shea Wage gap and Variations Between Gender and Race Dr Natalia Zhivan
  • 2. 2 Wage inequality and variation Everybodys main goal at work is to be as efficient as possible and to make as much money as they can. Although some people may be more efficient or “better at their job” than the other person they still might get paid less because of biased opinions. People will have biased opinions on males, females, and their ethnicity. Some biases will be negative some will be posotive. Allmost all of the biased opinions will be misenterprenting the persons assets. For example if a women works at a collection agency and is compared to her counterpart male she might be held at a level below him. These levels might even increase if she has children or is a different race than him. They might have biased opinions about her if she has children because they think she might not be as energized as him, cant spend as much time at work,or just isn’t as good as him because she is a “women”. These mysoganist opinions keep women out of competitive work positions and can cause negative variations in their salry. Many races also might be discriminated against because the companies might not think that they are good as their “white” or other counterparts. The Huffington post provides a good regression or comparison of how women’s earnings compare between their different ethnicities and agianst their male counterparts also. As we can see from looking at this table that all women “as a whole” held up against their same ethnicity male counterparts have lower earnings. The gap between women and same ethnicity males are relatively close ranging from 89%-78%. This shows that there is some a large posibailty of mysoginistic behavior or biased towards women. The variation also could be explained by women not being able to work full time because of their obligations to children or other family members. This chart also shows that people do not view race equally. But how do different ethnicity women hold up agianst their white male counterparts. 
They regress white men's earnings against women's earnings in different ethnicity categories. We find that the variation between women's earnings and their male counterparts' earnings becomes much larger. The gap between women and their same-ethnicity male counterparts was 22% at the largest. When women are compared against their white male counterparts, it increases to a maximum variation of 57%. This shows that there is a great amount of biased opinion on race and it only
  • 3. 3 becomes greater when you factor in women. We find that Asian women are the closest to their white male counterparts at 87% of white male earnings. Hispanic women are the furthest away, making only 53% of what their white male counterparts earn. We should also look at how other ethnicities aren't paid as much as their white counterparts, setting aside gender. Rakesh Kochhar states in his article that “the median wealth of white households is 20 times that of black households and 18 times that of hispanic households”. The numbers that he provides are staggering. He states that black households had $5,677 in assets, Hispanic households $6,325, and white households $113,149. These figures measure wealth rather than wages, but they show how large the differences are. We hope our model can explain why this is and, if not, to readjust the parameters and explanatory variables until it can. Data for Our Empirical Model In our model we are going to run regressions on wage and the natural log of wage to see how wage varies across different variables. We will try to explain why wage varies between these variables and determine how “good” our model and parameters are. In our model we run
  • 4. 4 the regression ln_wage= age female college hisp black hours part_time child6. Each of the names that we enter in the Stata regression stands for a different variable. Age is a continuous variable: the age of the individual measured in years. Female is a dummy variable for gender: 1 if female and 0 if male. College is a dummy variable: 1 if the person graduated college and 0 if not. Black is a dummy variable for race: 1 if black and 0 if other. Hispanic is a dummy variable for ethnicity: 1 if the person is Hispanic and 0 if other. Health is a categorical variable: 1 = excellent, 2 = very good, 3 = good, 4 = fair, 5 = poor. Child6 is a categorical variable: the number of children under the age of 6. Wage is a continuous variable: annual earnings measured in $1 increments. Hours is a continuous variable: the number of hours worked last week measured in 1 hour increments. Part_time is a dummy variable: 1 if part time and 0 if not. July_temp is a continuous variable: the average July temperature in the county of residence measured in Fahrenheit. Pop65 is a continuous variable: the share of the population older than the age of 65. Running count gives 1047 observations, and running sum on each variable gives:
    Variable     Obs    Mean        Std. dev.   Min        Max
    female       1047   .4555874    .4982616    0          1
    college      1047   .2941738    .4558882    0          1
    age          1047   33.53391    7.855929    18         45
    wage         1047   39653.52    48878.27    12         619221
    hours        1047   37.19962    15.24388    0          99
    july_temp    1047   80.03084    4.991196    62.23553   94.44621
    pop65        1047   .252078     .0500064    .0978648   .3961477
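As a quick sanity check on the summary table, the mean of a 0/1 dummy variable multiplied by the number of observations gives the number of individuals in that group. The sketch below is written in Python rather than the Stata used for the paper, and the function name is illustrative only; the means and the sample size are the reported ones.

```python
# Means of the 0/1 dummy variables as reported in the summary table.
N_OBS = 1047
DUMMY_MEANS = {"female": 0.4555874, "college": 0.2941738}


def implied_count(mean: float, n_obs: int) -> int:
    """Number of observations with dummy == 1 implied by the sample mean."""
    return round(mean * n_obs)


counts = {name: implied_count(m, N_OBS) for name, m in DUMMY_MEANS.items()}

for name, count in counts.items():
    print(f"{name}: {count} of {N_OBS} ({100 * count / N_OBS:.1f}%)")
```

The implied number of women, 477, agrees with the group size reported for women in the t-tests later in the paper.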
  • 5. 5 Describing The Data Our sample size is 1047 individuals. For female our mean is .4556, so our sample is about 45.6% female. For college our mean is .2942, so 29.4% of our sample has graduated college. For age our mean is 33.53, which means that the average age in our sample is about 33.5 years. The minimum age in our sample is 18 and the maximum is 45. The mean for wage is 39653.52, which means the average individual in our sample earns $39,653.52 a year. The minimum annual wage in our sample is $12 and the maximum is $619,221. The mean for hours is 37.19962, which means that the average individual in our sample worked 37.2 hours last week. The minimum number of hours worked in a week by an individual in our sample is 0 and the maximum is 99. The mean for July_temp is 80.03084, which means that the average July temperature was 80 degrees Fahrenheit. The minimum temperature in our sample is 62 degrees and the maximum is 94 degrees. The mean for pop65 is .252078, which means that the average share of the population older than 65 is 25.2%. Histogram.
  • 6. 6 [Figure: histogram of the density of person's total earnings, horizontal axis from 0 to 600,000] We create a histogram of wage, age, and hours worked. Wage is skewed to the right, so it does not have a normal distribution, which suggests the error term of a wage regression is unlikely to be normal either. If the error term were normally distributed it would have the shape of a bell. Normality is not one of the conditions OLS needs to be the Best Linear Unbiased Estimator, but without it our hypothesis tests rest on large-sample approximations and we should trust them less. Correlation Matrix
    Variables   Age      Black    College   Wage     Hours
    Age         1.000
    Black       0.0039   1.000
    College     0.1926   0.0315   1.000
    Wage        0.2532   0.0174   0.2743    1.000
    Hours       0.1406   0.0083   0.0305    0.1949   1.000
In our correlation matrix there are a few variables that are positively correlated, although none of these correlations is strong. Wage and age are correlated at 0.2532. This makes sense because the older someone is, the more likely they are to be skilled in their position and the more likely they are to be educated. College and age are correlated at 0.1926. We can expect as the person's age goes up it is more
  • 7. 7 likely they have a college degree. This makes sense because most college students start at 18 and don't graduate until their mid twenties. College and wage are also correlated at 0.2743. This makes sense because after graduating college the individual is more knowledgeable and has a wider set of skills. Hours and wage are also correlated at 0.1949. We can expect that the more hours someone works, the higher their wage is going to be. This makes sense because of the possibility of overtime, where someone gets paid time and a half after working 40 hours.
ttest wage, by(black)
    Group      Obs    Mean        Std. err.   Std. dev.   95% conf. interval
    0          869    39268.2     1582.823    46659.74    36161.59    42374.8
    1          178    41534.7     4394.655    58632.01    32862.03    50207.36
    Combined   1047   39653.52    1510.576    48878.27    36689.42    42617.63
    diff              -2266.501   4022.639                -10159.87   5626.869
    Ha: diff != 0   Pr(|T|>|t|) = 0.5733
When we run the t-test of wage by black we find that black individuals in our sample make $2,266.50 more a year on average. The difference is not statistically significant (p = 0.5733), so we cannot conclude that the true difference is different from zero.
ttest wage, by(female)
    Group      Obs    Mean        Std. err.   Std. dev.   95% conf. interval
    0          570    46504.73    2164.591    51678.91    42253.17    50756.29
    1          477    31466.55    2013.623    43978.19    27509.86    35423.23
    Combined   1047   39653.52    1510.576    48878.27    36689.42    42617.63
    diff              15038.18    2889.723                9153.981    20922.39
    Ha: diff != 0   Pr(|T|>|t|) = 0.00
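The t-test of wage by black can be reproduced from the group summaries alone. The sketch below assumes the equal-variance (pooled) two-sample t-test and is written in Python rather than Stata; the function name is illustrative, and the inputs are the reported group sizes, means, and standard deviations.

```python
import math


def pooled_t_test(n0, mean0, sd0, n1, mean1, sd1):
    """Equal-variance two-sample t-test computed from group summaries.

    Returns (difference in means, standard error, t statistic, degrees of freedom).
    """
    df = n0 + n1 - 2
    pooled_var = ((n0 - 1) * sd0**2 + (n1 - 1) * sd1**2) / df
    se = math.sqrt(pooled_var * (1 / n0 + 1 / n1))
    diff = mean0 - mean1
    return diff, se, diff / se, df


# Group summaries from the t-test of wage by black (0 = not black, 1 = black).
diff, se, t_stat, df = pooled_t_test(869, 39268.2, 46659.74, 178, 41534.7, 58632.01)
print(f"diff = {diff:.2f}, se = {se:.2f}, t = {t_stat:.3f}, df = {df}")
```

The standard error comes out close to the reported 4022.639, and the t statistic of roughly -0.56 is far inside the usual critical value of 1.96 in absolute value, which is consistent with the reported p-value of 0.5733.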
  • 8. 8 When we run the t-test of wage by female we find that women make $15,038.18 less a year. This could reflect discrimination, less education, or women working fewer hours. The difference is statistically significant (p = 0.00), so we can trust this result.
ttest hours, by(female)
    Group      Obs    Mean       Std. err.   Std. dev.   95% conf. interval
    0          570    41.59123   .603649     14.41158    40.4056    42.77685
    1          477    31.95178   .6661574    14.5491     30.64281   33.26075
    Combined   1047   37.19962   .47111      15.24388    36.27519   38.12405
    diff              9.639446   .8982078                7.87695    11.40194
    Ha: diff != 0   Pr(|T|>|t|) = 0.00
When we run the t-test of hours by female we find that women work 31.95 hours a week while men work 41.59 hours a week. This supports our reading of the t-test of wage by female above: part of the reason women make less money is that they work fewer hours than their male counterparts. Women working less is likely the result of having children and having to take care of other family members, while men work more hours because they get paid at a higher rate. The difference is statistically significant, so we can use these results with confidence. How Relevant Are Our Variables? When we run our regression model we want to make sure that all of our variables are good fits and that they make sense in the equation. If we don't do this we might break the classical assumptions, OLS will no longer be BLUE, and our results will not be significant or trustworthy. When we run our regression to find out how wage varies
  • 9. 9 we need to make sure everything is theoretically sound. In our regression, wage is the dependent variable or “Y”. If we examine the variables in our model we can determine which ones should be included. We should include age because wage tends to go up as age increases. We can see this when we run our regression ln_wage= age female college hisp black hours part_time child6. Age's coefficient is .0345, so each additional year of age raises wage by about 3.45%. We should include female because we want to find out why there is a difference in wages earned. We should include college in our model also. College tells us how wage goes up with education. We should include hours and part time because they can help explain why people get paid differently as hours worked vary. We should also include child6 because it helps explain why people work fewer hours. All of the variables that I have listed above are statistically significant. When we include black in the equation it is not statistically significant. The overall fit of our model isn't very good: adjusted R-squared = 0.3435. I believe that if we want a better overall fit we need to add variables that might have been omitted. Some of the variables that I thought might be omitted are IQ, quality of university, location of living, and job position. One reason I wanted to add location of living is that in some cities the cost of living is higher than in others. We do not include july_temp because it has no effect on our model, and we don't include pop65 because it is too small to have any effect in our regression model. For the regression model we want to make sure that we know the expected signs, so we can tell whether each variable would be a good fit for our model. B:age I expect age to have a positive sign because with an increase in age we expect an increase in wage. 
B:hisp we expect Hispanic to have a negative sign because of barriers to entry and discrimination. B:black we expect black to have a positive sign because we think it will bring in more money every year. B:female all else being equal we expect female to have a negative sign. We expect women to make less than their male counterparts. B:college all else being equal we expect college to have a positive sign. We expect college to bring in more money yearly because of the skills it provides to people. B:hours we expect hours to have a positive sign. We expect that the more hours you work the more money you will make. B:part_time all else being equal we expect part time to have a negative sign. We expect part
  • 10. 10 time to have a negative sign because part-time employees work fewer hours than full-time employees. The fewer hours worked, the less money brought in yearly. B:child6 all else being equal we expect child6 to have a positive sign.
    Variable    B hat    Standard error   Critical t, 1%   Critical t, 5%   Critical t, 10%
    Age         0.032    .003             2.326            1.645            1.282
    Hisp        -0.134   .063             2.326            1.645            1.282
    Black       -.058    .064             2.326            1.645            1.282
    Female      -.283    .048             2.326            1.645            1.282
    College     0.509    .053             2.326            1.645            1.282
    Hours       0.013    .002             2.326            1.645            1.282
    Part_time   -0.283   .075             2.326            1.645            1.282

    ln_wage     Coef.       Std. err.   t       P>|t|   95% conf. interval
    Age         .0345185    .0032632    10.58   0.000   .0281153    .0409217
    Female      -.2850152   .0527574    -5.40   0.000   -.3885385   -.1814918
    College     .5071715    .0564339    8.99    0.000   .3964341    .617909
  • 11. 11
    Hisp        -.198478    .0676282    -2.93   0.003   -.3311816   -.0657743
    Black       .0259772    .0664785    0.39    0.696   -.1044703   .1564248
    Hours       .0053968    .0025044    2.15    0.031   .0004826    .0103111
    Part_time   -.5076908   .0834793    -6.08   0.000   -.6714982   -.3438834
    Child6      .119445     .0350736    3.41    0.001   .0506217    .1882683
    _cons       8.929974    .1646518    54.24   0.000   8.606885    9.253062
    Number of obs = 1047   F(8, 1038) = 69.43   R-squared = 0.3486   Adj. R-squared = 0.3435
The Regression Results Our estimated regression equation is ln(wage_i) = 8.930 + 0.035 AGE_i - 0.285 FEMALE_i + 0.507 COLLEGE_i - 0.198 HISP_i + 0.026 BLACK_i + 0.005 HOURS_i - 0.508 PART_TIME_i + 0.119 CHILD6_i. Because the dependent variable is the natural log of wage, each coefficient multiplied by 100 is the approximate percentage change in wage for a one-unit change in that variable, holding the others constant. The approximation is rough for the larger coefficients, where the exact change is 100 * (exp(b) - 1). The constant, 8.93, is the predicted ln_wage when every explanatory variable equals zero. For every year that age goes up, wage goes up by about 3.5%. If the person is female, wage goes down by about 28.5%. If the person has graduated college, wage goes up by about 50.7%. If the person is Hispanic, wage goes down by about 19.8%. If the person is black, wage goes up by about 2.6%, although this coefficient is not statistically significant. For every additional hour worked, wage goes up by about 0.5%. If the person is part time, wage goes down by about 50.8%. For each child under the age of six, wage goes up by about 11.9%. Our R-squared is .
  • 12. 12 3486, which means our model accounts for only about 35% of the variation in ln_wage. This means that our model is very poor. When we calculate adjusted R-squared it gets even worse, .3435. This means that we need to make some adjustments to the model, like correcting heteroskedasticity, increasing the sample size, dropping redundant variables, or adding more theoretically fit variables.
    i. 35(age)+1(female)+1(college)+2(child6)+1(part_time) = $62,764
    ii. 45(age)+1(college) = $75,531
    iii. 25(age)+1(black)+1(part_time) = $21,366
    iv. 25(age)+1(part_time) = $19,777
    v. 55(age)+1(black)+1(part_time) = $53,921
Our model shows that the 25 year old black male makes $1,589 more a year than his white male counterpart. Our model also shows how much college and working full time increase wage. We compare the 45 year old male with the 55 year old male and there is a big lead that the 55 year old man won't be able to catch unless he also graduates and starts working full time. We can also see how having children increases wage in case i.
                wage     age      hisp     black    female   college   hours    part_time   child6
    Wage        1.000
    Age         0.2532   1.000
    Hisp        -.0685   -.0768   1.000
    Black       0.0174   0.0039   -.1278   1.000
    Female      -.1533   -.0490   -.0368   0.0301   1.000
  • 13. 13
    College     .2743    .1926    -.1501   .0315    .0618    1.000
    Hours       .1949    .1406    .0174    .0083    -.3151   .0305     1.000
    Part_time   -.2083   -.1659   -.0389   -.0030   .2911    -.0685    -.7542   1.000
    Child6      .0821    -.0570   .0038    -.0456   -.0134   -.0086    -.0594   .0461       1.000
When we check the correlation matrix we find little evidence of multicollinearity in the model. Almost all of the pairwise correlations are below .50 in absolute value. The exception is hours and part_time at -.7542, which is expected because part-time employees work fewer hours. Apart from that pair, none of the explanatory variables explains much of the movement in the others, which suggests that our selection of explanatory variables is theoretically sound. When we test for heteroskedasticity with the Breusch-Pagan/Cook-Weisberg test we find clear evidence of heteroskedasticity: Prob > chi2 = 0.00. This means that the variance of the error term is not constant across observations. Since there is a strong case of heteroskedasticity, the standard errors are biased and hypothesis testing is no longer reliable. We correct for heteroskedasticity by adding the robust option to the regression command in Stata. With robust standard errors the coefficients and R-squared stay the same, while the standard errors change and, in our case, most of the t values go up. Using robust standard errors reduces the chance that the standard errors are biased, which makes hypothesis testing more useful.
regress wage age female black hispanic   adjusted R-squared = .0837
    Wage      Coef.       Std. err.   t       P>|t|   95% conf. interval
    Age       1505.38     184.94      8.14    0.00    1142.47     1868.29
    Female    -14109.34   2910.27     -4.85   0.00    -19820.01   -8398.68
    Black     1814.01     3882.92     0.47    0.640   -5805.22    9433.26
    Hisp      -6984.56    3913.39     -1.78   0.075   -14663.61   694.47
    _cons     -3534.23    6696.82     -0.53   0.598   -16675.04   9606.57
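Each row of the regression table above can be checked from its coefficient and standard error: the t statistic is their ratio, and the 95% interval is roughly the coefficient plus or minus 1.96 standard errors. The sketch below is Python, not Stata, and it assumes the large-sample normal critical value of 1.96, whereas Stata uses the t distribution, so the bounds differ slightly from the reported ones. The function name is illustrative.

```python
def t_and_interval(coef: float, std_err: float, critical: float = 1.96):
    """t statistic and confidence interval for one regression coefficient.

    critical = 1.96 is the large-sample normal value for a 95% interval
    (an assumption made here; Stata uses the t distribution instead).
    """
    t_stat = coef / std_err
    half_width = critical * std_err
    return t_stat, (coef - half_width, coef + half_width)


# Coefficient and standard error for age in the regression of wage on
# age, female, black, and Hispanic.
t_stat, (low, high) = t_and_interval(1505.38, 184.94)
print(f"t = {t_stat:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

For age this gives a t statistic of about 8.14 and an interval within a dollar of the reported bounds of 1142.47 and 1868.29.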
  • 14. 14
regress wage age female black hispanic college   adjusted R-squared = .1385
    Wage        Coef.       Std. err.   t       P>|t|   95% conf. interval
    Age         1225        182.56      6.71    0.00    866.76      1583.24
    Female      -15672.88   2828.39     -5.54   0.00    -21222.89   -10122.88
    Black       1441.915    3765.372    0.38    0.702   -5946.67    8830.49
    Hisp        -2802.351   3828.77     -.73    0.464   -10315.34   4710.63
    College     26023.57    3173.647    8.20    0.00    19796.1     32251.04
    _cons       -1714.822   6497.403    -.26    0.792   -14464.32   11034.68
regress wage age female black hispanic college part_time   adjusted R-squared = .1522
    Wage        Coef.       Std. err.   t       P>|t|   95% conf. interval
    Age         1111.20     183.09      6.07    0.00    751.9356    1470.47
    Female      -12075.66   2931.81     -4.12   0.00    -17828.6    -6322.71
    Black       1182.848    3735.7      0.32    0.752   -6147.51    8513.21
    Hisp        -3657.44    3803.46     -.96    0.336   -11120.78   3805.89
    College     25120.85    3155.44     7.96    0.00    18929.09    31312.6
    Part_time   -13699.73   3239.19     -4.23   0.00    -20055.83   -7343.63
    _cons       4959.04     6635.68     .75     0.455   -8061.82    17979.91
Comparing these models, we see that when we add college to the equation the cost of living appears to go way up: the constant is -1714.82, which implies negative earnings. When we add part time to the
  • 15. 15 equation we see that the constant turns positive. This makes sense because when you are not working and going to school you're spending thousands of dollars without any positive income. Looking at all three models we see that every time we add a variable the overall fit of the equation goes up. Although the overall fit of these equations does rise, they're still very poor, accounting for only about 15% of the variation in wage. As we add more explanatory variables the coefficients fluctuate while the standard errors mainly remain the same. The two explanatory variables that we add are theoretically sound. They are both significant, with p-values of 0.00. The constant is not significant in these models. We see that when we add college to the equation wages go up as expected. Part time also decreases wage as we expected. Our models need to be improved. We cannot rely on them because the adjusted R-squared is so low. We can improve the model by adding more explanatory variables or increasing the sample size. Conclusion I conclude that the model that has been given to us is a very basic model, and it has a few problems. It isn't very reliable: the adjusted R-squared shows that it accounts for only about 34% of the variation in ln_wage. OLS is no longer the Best Linear Unbiased Estimator because the error term does not have a constant variance, and the error term also does not appear to be normally distributed. Its distribution is skewed to the right. Because the error term has more than one variance, we can no longer trust hypothesis testing based on the usual standard errors. We can address these problems. We can use robust standard errors in the regression, run a White test, drop redundant variables, and increase the sample size. Our model only tells us that women and Hispanics get paid less than their counterparts. It doesn't have enough variables to tell us whether women and different races are discriminated against in the work environment. 
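The size of the gaps for women and Hispanics can be read off the ln_wage coefficients more precisely than by multiplying by 100. For a one-unit change in a regressor, the exact percentage change in wage is 100 * (exp(b) - 1). The sketch below is Python, not Stata, the function name is illustrative, and the coefficients are the ones reported in the ln_wage regression table.

```python
import math

# Coefficients from the ln_wage regression table.
COEFFICIENTS = {
    "age": 0.0345185,
    "female": -0.2850152,
    "college": 0.5071715,
    "hisp": -0.198478,
    "black": 0.0259772,
    "hours": 0.0053968,
    "part_time": -0.5076908,
    "child6": 0.119445,
}


def percent_effect(coef: float) -> float:
    """Exact percentage change in wage for a one-unit increase in a regressor."""
    return 100 * (math.exp(coef) - 1)


effects = {name: percent_effect(b) for name, b in COEFFICIENTS.items()}

for name, pct in effects.items():
    print(f"{name:10s} {pct:+6.1f}%")
```

Under this reading, being female is associated with wages about 24.8% lower and being Hispanic with wages about 18.0% lower, holding the other variables constant. For small coefficients such as age and hours, the exact figure is almost the same as the coefficient times 100.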
If we want our model to be as reliable as those of the Huffington Post or other researchers, we need to add omitted variables. We should add more races, like Asian and Native American, to see if discrimination varies. We also need to add more variables that will give us an idea of whether people are being discriminated against, for example IQ, work ethic, living cost, job position, female to male ratio, and work environment. Then our model will be more descriptive and give us more
  • 16. 16 information about wage inequality. We can also run a time series to see how wage inequality has increased or decreased. Expanding our model to these limits takes a lot more research but also provides us with more descriptive and reliable results. References
Rakesh Kochhar, Richard Fry, Paul Taylor, Wealth Gaps Rise to Record Highs Between Whites, Blacks, Hispanics, Pew Research, July 26, 2011, http://www.pewsocialtrends.org/2011/07/26/wealth-gaps-rise-to-record-highs-between-whites-blacks-hispanics/
Catherine Hill, How does Race Affect the Gender Wage Gap?, http://www.huffingtonpost.com/catherine-hill/how-does-race-affect-the-gender-wage-gap_b_5087132.html