This document analyzes the effects of language barriers on labor force status and occupation in the United States using data from the 2010-2011 American Community Survey. It begins with an introduction on the motivation and research question. Descriptive statistics are then provided on variables like age, gender, education level, English fluency, and whether individuals are in the labor force or what industry they work in. Conditional statistics examine how being out of the labor force or working in manual labor is affected by independent variables. Logistic regression models are then used to analyze the relationships between the variables.
Language Barriers Impact on Employment
1. Language Barriers In The United States
The Effect On Labor Force Status And Occupation
Kyle Downey
Ryan King
Edward Maynard
2. Table of Contents
Part 1
I. Motivation
II. Supporting Documentation
III. Research Question
IV. Data Description
V. Modeling Dataset
VI. Variables
Part 2
VII. Descriptive Statistics
Part 3
VIII. Conditional Statistics
Part 4
Missing Values
Dummy Variable Creation
Part 5
Logistic Regression
Model 1
Model 2
Simulations
Part 6
Conclusion
Appendix
3. Motivation
Immigration is a significant cause of population growth in the United States. With immigration comes the potential for
language barriers. This has led many states to enact laws declaring English their official language.
A growing number of language barriers could have legal, political, and economic ramifications.
Government services are increasingly likely to adopt bilingual policies to accommodate the influx of immigrants. If
individuals have a hard time finding employment, the result could be higher unemployment, fewer people in the labor
force, higher crime, or greater government dependency.
We will not examine these broader potential effects of language barriers in this study. We
simply want to know how language barriers relate to labor force status and the type of industry in which these
individuals work. This way, it can be better understood how a lack of English fluency in a predominantly English-speaking
country can affect an individual’s employment.
4. Supporting Documentation
An article by CBS News titled “Language Barriers Cause Problems” claims that language barriers are an
issue in the workplace and affect the employment of immigrants. It notes that companies do not have the
resources to provide adequate training for potential workers who do not speak English. Also, there are not people on the
job who can frequently translate for individuals with language barriers.
Source: http://www.cbsnews.com/2100-201_162-517706.html
Franchise Business Review published an article called “Language Barriers Affecting Business” that explains how an
inability to communicate in the workplace can cause issues. These issues stem from cultural differences and
miscommunications that lead to errors and quality problems, along with employee frustration. This may make it
more difficult for people with language barriers to find employment.
Source: http://www.franchisebusinessreview.com/content/Language-Barriers-Affecting-Business
5. Research Question
How does the existence of language barriers affect whether or not an individual is in the labor force? Does the existence of
these language barriers affect what type of industry the person works in?
6. Data Description
The data for this research were collected by the United States Census Bureau through the American Community Survey.
The American Community Survey (ACS) is conducted every month and compiled on an annual basis.
It is administered primarily by mail, with phone and in-person follow-ups. Approximately one out of three people
who do not respond by mail are randomly selected for in-person interviews.
The survey is sent to approximately 250,000 homes each month, or about 3 million homes annually. It is
administered to people of every age, ranging from 0 to 135.
The data for our research cover the years 2010 and 2011. They contain 241 variables and 6,173,709 observations.
The dataset was extracted from the Integrated Public Use Microdata Series (IPUMS) website. The website is run by the
Minnesota Population Center and is designed to collect and distribute census data to users for free. We were able to choose
all of our variables and the sample years we wanted to research.
7. Modeling Dataset
We specified a narrower age range, 18 to 65, before downloading the dataset. Since we are looking at labor force
participation, we want to include only those who are most eligible to work. This includes adults aged 18 and over, up to
the retirement age of 65.
When looking at industries, there were too many different choices. We narrowed the variable down to general industries
rather than specific ones by categorizing the various industries into larger groups.
We followed the same process for birthplace as we did for industry. Rather than tracking every single state or country,
we put them into broader categories.
A dependent dummy variable representing whether or not the participant is out of the labor force was created from a
labor force status variable. We recoded it to represent only whether they are or are not in the labor force.
We made a second dependent dummy variable from the industry variable to represent people working in manual labor. We
decided to look at this specific industry group rather than each industry individually, so our models and statistics
reflect this group.
There are 3,856,734 observations and a total of 32 variables in our dataset.
10. Dependent Variable: Out of Labor Force
This variable describes how many people are and are not in
the labor force. There are only two outcomes: a person is either
in the labor force or out of the labor force.
The pie chart gives the percentage of people in each group.
Almost 1 in 4 of those surveyed are not in the labor force,
which is plausible since the national average has long been
around 65% of people being in the labor force. This suggests
that our data are a good sample.
11. Dependent Variable: Industry
This graph shows the industries of those who were
surveyed. The original data listed responses by job type,
but we created a new variable that categorizes the jobs
into industries.
“Not Applicable” represents those not in the labor force.
The largest portion is in educational, health, and social
services. Compared with the whole population, these shares
represent the national averages well.
About 1/3 of people work in retail or educational, health,
and social services.
12. Independent Variable: Age Group
This chart shows the independent variable for age. It is broken down into groups, each spanning 10 years
except for the 18-25 category.
The graph shows that age is fairly evenly distributed, with the older groups having a slightly higher
presence.
Almost half of the people in our sample are between 46 and 65.
13. Independent Variable: Sex
This variable tells us how many males and females took the
survey.
The graph shows about a 50/50 split by sex in our
sample, with slightly fewer males.
14. Independent Variable: Race
This is a general indicator of the person's race,
with only five categories to choose from.
The bar graph shows the distribution of races
among those surveyed. White makes up the
highest percentage at 81.88%, but there are
still about 700,000 people in the sample who are not
White.
About twice as many Black respondents as
Asian respondents took the survey, and about 5 times
as many Asian respondents as American Indian respondents.
15. Independent Variable: Number of Children in Household
This chart shows the number of children in each household. At first glance it seems strange that such a high
percentage have zero, but there is a plausible explanation.
Almost half of those surveyed, as seen earlier, are over 45, which means it is probable that any children they
had have moved out or soon will. Also, the 18-25 group likely has not had children yet, which leaves a smaller
group in the years of having children at home.
16. Independent Variable: Marital Status
This chart shows the marital status of the sample.
About half are married with their spouse present.
This variable will be valuable in the conditional
statistics for seeing the effect marriage has on
handling and overcoming language barriers. For
example, if a person's spouse speaks English, a language
barrier may have less of an impact.
About 1 in 3 of those surveyed are single or never
married. Fewer than 1 in 5 have had a spouse at
some point who is no longer present.
17. Independent Variable: Income
This chart shows the income level for each individual
in the sample. Our sample consists mainly of
low- to middle-income individuals. Income can later
be tied to success in the workforce based on
language barriers.
18. Independent Variable: Education
This variable gives a general look at the education level of the participants.
The graph shows that most of our sample has at least graduated from high school. 37%
stopped after high school, while about half went on to college.
19. Independent Variable: English Fluency
English fluency is one of our key independent variables.
It will be important when looking at the effects
of language barriers. It tells us how well, if at all, a
participant speaks English.
While only about 9% of our sample do not speak
English very well or fluently, this is still a large number
of people since our sample is so big.
20. Independent Variable: Citizenship Status
This variable indicates whether a person
was born in the United States or abroad.
The chart shows the percentages of those surveyed who
are not natural-born citizens.
About 15% of our sample are not natural-born citizens, which
makes it a good sample for our project.
Those recorded as N/A were likely born in the United States and
had no need to answer the question.
21. Independent Variable: Birthplace
This chart represents where each person was born.
About 85% were born in the US, but that still leaves 15% of the sample who come from a different culture.
Of those born in a foreign country, the majority come from Central America and Asia.
22. Independent Variable: Years in United States
This is the independent variable for how many years each person
in the sample has been in the USA. Responses are grouped into
intervals of years.
N/A represents those who were born here, so they won't have as
great an impact on our project, but the rest will be usable.
The distribution is fairly even among the ranges of time. However,
most have been in the US less than 20 years.
23. Independent Variable: School Attendance
This pie chart represents whether or not each person
surveyed was a student at the time
of the survey. The overwhelming majority, about 9 out
of 10 people, are not in school.
25. Effects of Age on Being Out of The Labor Force
The graph illustrates the percentage of people out
of the labor force within each age group,
from 18 to 65.
40% of people aged 56 to 65, the oldest group, are out of
the labor force. The next closest group
is the youngest, at 33%.
The remaining age groups have the lowest
percentages of people out of the labor force, and
their percentages are very similar.
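A conditional percentage like the one on this slide can be computed as the mean of the 0/1 dependent dummy within each group. The following is a minimal sketch, not the authors' actual code: it assumes the modeling dataset is stored under the hypothetical name work.modeling and already contains the OutOfLF and AgeGroup variables created in the appendix.

```sas
/* Share of each age group that is out of the labor force.          */
/* work.modeling is an assumed dataset name, not from the original. */
proc means data=work.modeling mean maxdec=3;
  class AgeGroup;   /* one row of output per age group              */
  var OutOfLF;      /* mean of a 0/1 dummy equals the proportion    */
run;
```

Replacing AgeGroup with another classification variable (SEX, RACESING, SPEAKENG, and so on), or OutOfLF with ManualLabor, would give the corresponding figures for the other slides in this part.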
26. Effects of Sex on Being Out of The Labor Force
The graph illustrates the percentage of people out of the labor force
based on sex.
The link between gender and percentage of people out of the labor force
is somewhat defined.
Females make up a greater majority of people out of the labor force
with 30%. Males
27. Effects of Race on Being Out of The Labor Force
The graph illustrates the percentage of people out of the labor force based on race.
There is not a very apparent link between race and labor force status.
American Indian/Alaska Native and Black respondents have the highest percentages of people outside of the labor
force.
At least one quarter of each race is out of the labor force.
28. Effects of Children on Being Out of The Labor Force
The graph illustrates the number of children in each household and its effect on the percentage
of people out of the labor force.
For the most part, as the number of children in a household increases, the percentage of people out of the
labor force increases.
Households with two children typically have the lowest percentage of people out of the labor force.
29. Effects of Marital Status on Being Out of The Labor Force
The graph illustrates the percentage of people out of the labor force based on
marital status.
People who are widowed have the highest chance of being out of the labor force,
followed by those who are married with the spouse absent.
People who are married with a spouse present are the least likely to be out of the
labor force.
30. Effects of Income on Being Out of The Labor Force
The graph illustrates the difference in total personal income between
people in the labor force and people not in the labor force.
There is a very apparent link between total personal income and
labor force status.
Those in the labor force make over four times as much as
those out of the labor force.
31. Effects of Education on Being Out of The Labor Force
The graph illustrates the percentage of people out of the labor force based on the amount of
education each person has.
On average, those with lower education levels (high school and below) have a much
higher chance of not being in the labor force.
People with no education have the highest percentage out of the labor force, at
49%.
Those with 5+ years of college have the smallest chance of being out of the labor force.
32. Effects of English Fluency on Being Out of The Labor Force
The graph illustrates the percentage of people out of the labor force based on their
English fluency.
On average, people who do not speak English, or who speak it but not very well, are the most likely
to be out of the labor force.
Those who speak English very well are the least likely to be out of the
labor force.
About 2 out of 5 people who do not speak English are out of the labor force.
33. Effects of Citizenship on Being Out of The Labor Force
The graph illustrates the percentage of people out of the labor force
based on citizenship status.
Just under 30% of people who are not citizens of the
United States are out of the labor force.
On average, those who are citizens of the United States have the
smallest chance of finding themselves out of the labor force.
34. Effects of Birthplace on Being Out of The Labor Force
The graph illustrates the percentage of people out of the labor force
based on their birthplace.
People born in US territories have the highest chance of being out of
the labor force, followed by those born in the Middle East.
People born in Africa are the least likely to be out of the labor
force.
35. Effects of Years spent in the United States on Being Out of The Labor Force
The graph illustrates the percentage of people out of the labor force based on the
number of years they have been in the United States.
On average, people in their first five years in the United States are the most likely to be out
of the labor force.
People who have spent 11 or more years in the United States have a roughly equal chance of being
out of the labor force, whichever range of years they fall in.
36. Effects of School Attendance on Being Out of The Labor Force
The graph illustrates the percentage of people out of the labor force based on whether or not
they are currently enrolled in school.
There is a strong link between school attendance and labor force status.
On average, those in school are about one and a half times as likely to be out of the labor force
as those not currently enrolled in school.
38. Effects of Age on Working in Manual Labor
This chart shows the percentage of people in the manual labor
industry based on each person's age group.
The percentage is balanced, at 9%, between ages 26 and
55. This makes sense since younger and older people are
each less likely to work in manual labor, while 26- to 55-year-olds
are considered to be in their working years.
39. Effects of Sex on Working in Manual Labor
This graph shows the rate of males and females working
in manual labor.
As expected, the percentage is much higher in the male
population, with almost 1 in 7 males working in
this field compared with only 1 in 50 females.
40. Effects of Race on Working in Manual Labor
This chart shows the effect race has on the likelihood of working in manual
labor.
The results show a low correlation between the two.
American Indian and White were the two highest, with rates between
9% and 11%. These groups make up about 1/5th of the total. Asian and
Black were the two lowest, with rates around 3%-4%.
41. Effects of Children on Working in Manual Labor
This chart shows the likelihood of working in manual labor based on the number of children in the household.
The two are positively related: the more children, the higher the percentage working in manual labor.
Almost ¾ of people who have more than 5 children work in manual labor.
42. Effects of Marital Status on Working in Manual Labor
This graph shows the effect marital status has on working in manual labor.
The lowest percentage, 4%, is among the widowed, which makes sense since that is likely an older population.
1 in 5 married individuals work in manual labor.
43. Effects of Income on Working in Manual Labor
This graph shows the effect of income on working in manual
labor.
Manual labor workers have a slightly higher total personal
income, $39,781.67, than workers in other industries, who average
$36,046.47.
44. Effects of Education on Working in Manual Labor
The graph illustrates the percentage of people working in manual labor
based on the amount of education each person has.
On average, those with lower education levels (high school and below)
have a much higher chance of working in manual labor.
People with an education level up to grade 4 have the highest percentage
working in manual labor, at 19%.
Those with 5+ years of college have the smallest chance of working in
manual labor.
45. Effects of English Fluency on Working in Manual Labor
The graph illustrates the percentage of people working in manual
labor based on their English fluency.
On average, people who do not speak English, or who speak it but not
very well, are the most likely to work in manual labor.
Those who speak English very well are the least likely
to work in manual labor.
About 2 out of 5 people who do not speak English work in manual labor.
46. Effects of Citizenship on Working in Manual Labor
The graph illustrates the percentage of people working in manual
labor based on their citizenship status.
Just under 20% of people working in manual labor are a citizen
of the United States.
On average, those that are naturalized citizens of the United
States have the smallest chance of finding themselves working in
manual labor.
47. Effects of Birthplace on Working in Manual Labor
The graph illustrates the percentage of people working in
manual labor based on where they were born.
On average, people born in Central America were about twice as likely
to work in manual labor as those born in any other
region.
People born in Africa and Asia had the lowest percentages of
people working in manual labor.
48. Effects of Years Spent in United States on Working in Manual Labor
The graph illustrates the percentage of people working in manual labor based on
the number of years they have lived in the United States.
There is not a strong link between years in the United States and the percentage of people
working in manual labor.
On average, those with 6-15 years in the United States had the largest percentage of
people working in manual labor.
49. Effects of School Attendance on Working in Manual Labor
The graph illustrates the percentage of people
working in manual labor based on whether or not they
attend school.
9% of the people not currently enrolled in school
work in manual labor. 4% of people currently
attending school work in manual labor.
There is not a strong relationship between school
attendance and the percentage of people working in
manual labor.
51. Missing Values: Discussion
Missing values are commonly responses that were left blank by a survey participant. However, in some cases values
may have been entered but are still considered missing. For this study, answers that were noted as N/A, or not applicable, can
be considered missing because the question did not apply to the participant.
There were no truly missing values in this study. All were coded in some way as not applicable, as
mentioned. To reduce the effect of missing values, we refined the data before downloading to include only 18- to 65-year-olds.
We chose this age range to represent people who are eligible and likely to work. In turn, there are fewer not applicable
responses than we would otherwise have had, making our data more accurate.
Very few variables had a response that denoted N/A. Therefore, we did not have to do extensive missing value
imputation. We dropped missing values for only one variable.
We considered dropping missing, or not applicable, values from the variables representing birthplace and years in the
United States. For these variables, the question was not applicable because the respondents were born in the United States. We decided to include
these values in our baseline instead.
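One way to see how many responses fall into an N/A category is a simple frequency table per variable. The sketch below is illustrative only: work.modeling is a hypothetical dataset name, and it assumes N/A is stored as code 0, as the appendix does for EDUC (the same coding is assumed here for the other variables).

```sas
/* Frequency of every code, including the N/A code (assumed to be 0), */
/* for the variables discussed above. Dataset name is an assumption.  */
proc freq data=work.modeling;
  tables EDUC CITIZEN YRSUSA2 / missing;  /* MISSING also lists true blanks */
run;
```

A large count at code 0 for CITIZEN or YRSUSA2 would correspond to respondents born in the United States, which is why those values are kept in the baseline rather than dropped.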
52. Variable: Education
The one variable where we dropped missing values was the education variable. It did not seem necessary to keep the N/A category
because it had no significance. There was no reason to keep it in the data or group it with the baseline. There were
enough possible responses that a person could reasonably have chosen another category; therefore, we do not know exactly
what N/A would represent and do not want to include it. To see exactly how we did this, see the slide in the appendix
section.
54. Dummy Variables
Variable | Dummy Variable(s) | Baseline
LABFORCE | OutOfLF | In Labor Force
INDGRP | ManualLabor | Not Applicable
AgeGroup | Age18to25, Age26to35, Age36to45, Age56to65 | Age 46 to 55
SEX | Female | Male
RACESING | Black, Indian, Asian, Other | White
NCHILD | OneChild, TwoChildren, ThreeOrMoreChildren | No Children
MARST | Separated, Divorced, Widowed, Single | Married
EDUC | DidNotGraduate, OneYearCollege, TwoYearsCollege, ThreeYearsCollege, FourYearsCollege, FiveYearsCollege | Graduated from high school
SPEAKENG | NoEnglish, SpeakVWell, SpeakWell, PoorEnglish | Speaks Only English
CITIZEN | BornAbroad, Naturalized, NotACitizen | N/A (Born in United States)
BirthPlace | USTerritory, OtherNorthAmerica, CentralAmerica, SouthAmerica, Europe, Russia, Asia, MiddleEast, Africa, Oceania | Born in United States
YRSUSA2 | LessThan5Yrs, SixTo10Yrs, ElevenTo15Yrs, SixteenTo20Yrs | 21+ Years in United States
SCHOOL | NotInSchool | In School
56. Regression Analysis: Discussion
Model 1
For our first dependent variable, we wanted a model that looked at people who are out of the labor
force.
The main purpose of our regression analysis is to see which variables are statistically significant in
determining whether or not someone is out of the labor force. If a variable is significant, what are the odds or
probability that a person will be out of the labor force, given that independent variable? We will also find
predicted probabilities and display them in a histogram. For further analysis, we will run simulations with
different variables to see how predicted probabilities change when variables change.
Model 2
For our second dependent variable, we wanted to look more specifically at one type of industry,
manual labor. If the surveyed person worked in agriculture, mining, utilities, or construction,
we placed them in this category. If we wanted to take an in-depth look at every industry, we could run a
regression for each one. Our interest for this model is the likelihood that someone with language
barriers will work in manual labor.
All variables in these regressions are statistically significant at the 1% level.
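A logistic regression of this kind could be set up in SAS roughly as follows. This is a hedged sketch rather than the authors' actual program: the dataset names work.modeling and work.pred1 and the variable name PredProb are hypothetical, and the predictor list is only an illustrative subset of the dummy variables defined in the appendix.

```sas
/* Model 1 sketch: probability of being out of the labor force.       */
/* work.modeling, work.pred1 and PredProb are assumed names; the      */
/* predictors shown are an illustrative subset, not the full model.   */
proc logistic data=work.modeling;
  model OutOfLF(event='1') = Female Age18to25 Age26to35 Age36to45 Age56to65
                             NoEnglish PoorEnglish SpeakWell SpeakVWell
                             DidNotGraduate NotACitizen LessThan5Yrs;
  output out=work.pred1 p=PredProb;   /* save predicted probabilities */
run;

/* Histogram of the predicted probabilities */
proc univariate data=work.pred1 noprint;
  histogram PredProb;
run;
```

Swapping OutOfLF for ManualLabor in the MODEL statement would give the analogous setup for Model 2.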
59. Interpretation: Out of Labor Force
[Regression table legend: Statistically significant at the 1% level; Statistically Significant; Statistically Insignificant]
60. Probability Distribution: Out of Labor Force
About 1 in 100 people have a
10%-25% chance of not being in the labor force.
About 25 in 10,000 people have over a 56%
probability of not being in the labor force.
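For reference, predicted probabilities from a logistic regression follow the standard logistic form. In the expression below, $x_1,\dots,x_k$ stand for the dummy variables and $\beta_0,\dots,\beta_k$ for the estimated coefficients; these are generic symbols, not values taken from the original output.

```latex
\[
\hat{p} = \Pr(\text{OutOfLF} = 1 \mid x)
        = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}},
\qquad
\frac{\hat{p}}{1 - \hat{p}} = e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}
\]
```

Under this form, switching a single dummy $x_j$ from 0 to 1 multiplies the odds by $e^{\beta_j}$, which is why variables with larger estimates move the predicted probability more.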
64. Probability Distribution: Industry – Manual Labor
The distribution chart shows that an
average of 8% of the population as a whole works in
manual labor.
When additional variables are added,
besides simply being in the workforce,
this rate varies from 0% to 53%, where
0% indicates no statistical significance.
66. Simulation: Model 1
Using our modeling dataset for the OutOfLF variable, we found the following observations where the predicted
probability of being out of the labor force was extremely high, over 90%. It is important to notice that
they all have certain characteristics in common. They are all females between the ages of 56 and 65 who speak no
English, did not graduate from high school, and have been in the United States less than 5 years.
Using our modeling dataset again, we found these observations where the probability of being
out of the labor force was low, a little less than 3%. They all spent at least 5 years in college, are
naturalized, and are not currently in school. They also speak English very well.
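Observations with very high predicted probabilities can be listed by filtering a dataset of saved predictions. The sketch below assumes such predictions have already been written out, for example by the OUTPUT statement of PROC LOGISTIC, to a hypothetical dataset work.pred1 with the probability stored in a hypothetical variable PredProb.

```sas
/* List up to 20 observations with a predicted probability above 90%. */
/* work.pred1 and PredProb are assumed names, not from the original.  */
proc print data=work.pred1 (obs=20);
  where PredProb > 0.90;
  var PredProb Female Age56to65 NoEnglish DidNotGraduate LessThan5Yrs;
run;
```

Changing the condition to PredProb < 0.03 would list the low-probability cases described in the second paragraph instead.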
67. Simulation: Model 2
When using our modeling dataset for this variable, the likelihood that someone works in manual labor is
strongly related to several variables. These variables are noted above in the table. All of the
participants with these characteristics had over a 50% probability of working in manual
labor.
There were also several variables that made it very unlikely that an individual worked in manual labor. The
least likely individuals had a 1/10th of a percent chance and were usually female, Black, and single.
69. Conclusion
There were several variables that were significant in both models and had a large impact on the predicted
probabilities. People who were not citizens, spoke no English, or did not graduate from high school all had
a higher probability of being out of the labor force. These were all statistically significant variables with high estimate
values, which makes them more important. This was also true for the second model, for manual labor.
In both cases, there were many statistically significant variables. However, many of those that were deemed significant
at the 1% level had little impact on the predicted probability because of their smaller estimate values.
Essentially, they are less important in determining whether or not someone is out of the labor force or working in
manual labor. Among individuals with high predicted probabilities, these variables appear less consistently than the more
important variables mentioned in the previous paragraph. The majority of the variables that were identified as insignificant
when running the regressions were the same for both models, with the exception of being age 36 to 45, being of
a race other than those listed, having one child, speaking English well, and living in the United States less than 5 years.
Two major differences between the people in the two models were their sex and the time they have been in the United
States. Those who were female were likely to be out of the labor force but very unlikely to be in manual labor. Those
who have been in the US less than 5 years were very unlikely to work in manual labor but were likely to be out of the
labor force.
Our research question asked whether or not language barriers affect labor force status and industry type. It
appears that they do, given the probabilities we found in the modeling dataset. Immigrants who speak
poor or no English are likely to be out of the labor force or, if they are in the labor force, to work in manual labor.
71. Categorizing Variables
/* Creating INDGRP Dependent */
if 0170<=IND<=0290 then INDGRP=1;
else if 0370<=IND<=0490 then INDGRP=2;
else if 0570<=IND<=0690 then INDGRP=3;
else if IND=0770 then INDGRP=4;
else if 1070<=IND<=3990 then INDGRP=5;
else if 4070<=IND<=4590 then INDGRP=6;
else if 4670<=IND<=5790 then INDGRP=7;
else if 6070<=IND<=6390 then INDGRP=8;
else if 6470<=IND<=6780 then INDGRP=9;
else if 6870<=IND<=7190 then INDGRP=10;
else if 7270<=IND<=7790 then INDGRP=11;
else if 7860<=IND<=8470 then INDGRP=12;
else if 8560<=IND<=8690 then INDGRP=13;
else if 8770<=IND<=9290 then INDGRP=14;
else if 9370<=IND<=9590 then INDGRP=15;
else if 9670<=IND<=9870 then INDGRP=16;
else if IND=9920 then INDGRP=17;
else INDGRP=0;
/* Creating AgeGroup Independent */
if 18<=AGE<=25 then AgeGroup=1;
else if 26<=AGE<=35 then AgeGroup=2;
else if 36<=AGE<=45 then AgeGroup=3;
else if 46<=AGE<=55 then AgeGroup=4;
else if 56<=AGE<=65 then AgeGroup=5;
else AgeGroup=0;
/* Creating Birthplace Independent */
if 001<=BPL<=056 then BirthPlace=1;
else if 100<=BPL<=120 then BirthPlace=2;
else if 150<=BPL<=199 then BirthPlace=3;
else if 200<=BPL<=299 then BirthPlace=4;
else if BPL=300 then BirthPlace=5;
else if 400<=BPL<=459 then BirthPlace=6;
else if 460<=BPL<=499 then BirthPlace=7;
else if 500<=BPL<=524 then BirthPlace=8;
else if 530<=BPL<=599 then BirthPlace=9;
else if BPL=600 then BirthPlace=10;
else if 700<=BPL<=950 then BirthPlace=11;
else BirthPlace=.;
72. Adding Formats
/* VALUE statements are only valid inside PROC FORMAT */
proc format;
value OutOfLF_f
1='Not In Labor Force'
0='In Labor Force';
value ManualLabor_f
1='Yes Manual Labor'
0='No Manual Labor';
value INDGRP_f
1='Agriculture'
2='Mining'
3='Utilities'
4='Construction'
5='Manufacturing'
6='Wholesale Trade'
7='Retail'
8='Transportation and Warehousing'
9='Information and Communications'
10='Finance, Insurance, Real Estate'
11='Professional Services'
12='Educational, Health, and Social Services'
13='Entertainment and Food Services'
14='Other Services'
15='Public Admin'
16='Armed Forces'
17='Unemployed'
0='Not Applicable';
value AgeGroup_f
1='18 to 25'
2='26 to 35'
3='36 to 45'
4='46 to 55'
5='56 to 65';
value BirthPlace_f
1='United States'
2='US Territory'
3='Other North America'
4='Central America'
5='South America'
6='Europe'
7='Russia'
8='Asia'
9='Middle East'
10='Africa'
11='Oceania'
.='Missing';
run;
73. Creating Dependent Dummy Variables
/* Creating Dummy for LABFORCE Dependent */
if LABFORCE=1 then OutOfLF=1;
else OutOfLF=0;
/* Creating Dummy for INDGRP Dependent */
if INDGRP=1 or INDGRP=2 or INDGRP=3 or INDGRP=4 then
ManualLabor=1;
else ManualLabor=0;
/* Creating Dummies for the remaining INDGRP categories */
if INDGRP=5 then Manufacturing=1;
else Manufacturing=0;
if INDGRP=6 then WholeSaleTrade=1;
else WholeSaleTrade=0;
if INDGRP=7 then Retail=1;
else Retail=0;
/* Same variable name in both branches (the original mixed two names) */
if INDGRP=8 then TransportationWarehousing=1;
else TransportationWarehousing=0;
if INDGRP=9 then InformationCommunications=1;
else InformationCommunications=0;
if INDGRP=10 then FinanceInsuranceRealEstate=1;
else FinanceInsuranceRealEstate=0;
if INDGRP=11 or INDGRP=13 then ProfEnterFoodServ=1;
else ProfEnterFoodServ=0;
if INDGRP=12 then EducHealthSocialServ=1;
else EducHealthSocialServ=0;
if INDGRP=14 or INDGRP=15 then AdminOtherServ=1;
else AdminOtherServ=0;
if INDGRP=16 then ArmedForces=1;
else ArmedForces=0;
if INDGRP=17 then Unemployed=1;
else Unemployed=0;
74. Creating Independent Dummy Variables
/* Creating Dummy for AgeGroup Independent */
if AgeGroup=1 then Age18to25=1;
else Age18to25=0;
if AgeGroup=2 then Age26to35=1;
else Age26to35=0;
if AgeGroup=3 then Age36to45=1;
else Age36to45=0;
if AgeGroup=5 then Age56to65=1;
else Age56to65=0;
/* Creating Dummy for Sex Independent */
if SEX=2 then Female=1;
else Female=0;
/* Creating Dummy for RACESING Independent */
If RACESING=2 then Black=1;
else Black=0;
If RACESING=3 then Indian=1;
else Indian=0;
If RACESING=4 then Asian=1;
else Asian=0;
If RACESING=5 then Other=1;
else Other=0;
/* Creating Dummy for NCHILD Independent */
If NCHILD=1 then OneChild=1;
else OneChild=0;
If NCHILD=2 then TwoChildren=1;
else TwoChildren=0;
If 3<=NCHILD<=9 then ThreeOrMoreChildren=1;
else ThreeOrMoreChildren=0;
/* Creating Dummy for MARST Independent */
if MARST=3 then Separated=1;
else Separated=0;
if MARST=4 then Divorced=1;
else Divorced=0;
if MARST=5 then Widowed=1;
else Widowed=0;
if MARST=6 then Single=1;
else Single=0;
75. Creating Independent Dummy Variables
/* Creating Dummy for EDUC Independent */
/* Dropping missing values for Education */
if EDUC=00 then delete;
if EDUC=01 or EDUC=02 or EDUC=03 or EDUC=04
or EDUC=05 then DidNotGraduate=1;
else DidNotGraduate=0;
if EDUC=07 then OneYearCollege=1;
else OneYearCollege=0;
if EDUC=08 then TwoYearsCollege=1;
else TwoYearsCollege=0;
if EDUC=10 then FourYearsCollege=1;
else FourYearsCollege=0;
if EDUC=11 then FiveYearsCollege=1;
else FiveYearsCollege=0;
/* Creating Dummy for SPEAKENG Independent */
if SPEAKENG=1 then NoEnglish=1;
else NoEnglish=0;
if SPEAKENG=4 then SpeakVWell=1;
else SpeakVWell=0;
if SPEAKENG=5 then SpeakWell=1;
else SpeakWell=0;
if SPEAKENG=6 then PoorEnglish=1;
else PoorEnglish=0;
/* Creating Dummy for CITIZEN Independent */
if CITIZEN=1 then BornAbroad=1;
else BornAbroad=0;
if CITIZEN=2 then Naturalized=1;
else Naturalized=0;
if CITIZEN=3 then NotACitizen=1;
else NotACitizen=0;
/* Creating Dummy for BirthPlace Independent */
if BirthPlace=2 then USTerritory=1;
else USTerritory=0;
if BirthPlace=3 then OtherNorthAmerica=1;
else OtherNorthAmerica=0;
if BirthPlace=4 then CentralAmerica=1;
else CentralAmerica=0;
if BirthPlace=5 then SouthAmerica=1;
else SouthAmerica=0;
if BirthPlace=6 then Europe=1;
else Europe=0;
if BirthPlace=7 then Russia=1;
else Russia=0;
if BirthPlace=8 then Asia=1;
else Asia=0;
if BirthPlace=9 then MiddleEast=1;
else MiddleEast=0;
if BirthPlace=10 then Africa=1;
else Africa=0;
if BirthPlace=11 then Oceania=1;
else Oceania=0;
76. Creating Independent Dummy Variables
/* Creating Dummy for YRSUSA2 Independent */
if YRSUSA2=1 then LessThan5Yrs=1;
else LessThan5Yrs=0;
if YRSUSA2=2 then SixTo10Yrs=1;
else SixTo10Yrs=0;
if YRSUSA2=3 then ElevenTo15Yrs=1;
else ElevenTo15Yrs=0;
if YRSUSA2=4 then SixteenTo20Yrs=1;
else SixteenTo20Yrs=0;
/* Creating Dummy for SCHOOL Independent */
if SCHOOL=1 then NotInSchool=1;
else NotInSchool=0;
run;