This document summarizes a study that replicates and expands on a previous study of immigrant wages in Canada using 2001 and 2006 census data. It divides Canada into three labor markets: Alberta, Quebec, and the rest of Canada (ROC). The summary is:
1) The study finds immigrant wage gaps exist in all three regions in both years, with the gap widening over time in Quebec and the ROC but narrowing in Alberta.
2) Regression analysis shows immigrants have lower returns to characteristics like education and experience compared to non-immigrants. Speaking a non-official language at home reduces wages while citizenship increases wages.
3) A Blinder-Oaxaca decomposition finds unexplained factors contribute to wider
Introduction
In 2010, Serge Nadeau and Aylin Seckin decomposed the
immigrant wage gap in Canada using census data from the years
1981, 1991, and 2001. In their study, the country was divided
into two distinct labour markets, that of Quebec and that of the
rest of Canada (henceforth known as the ROC), and the immigrant
wage gaps of each region were decomposed by means of a
customised variant of the Blinder-Oaxaca method (Nadeau 266):
Equation 1:
In black are the terms of the standard Blinder-Oaxaca
decomposition. The difference in the mean log wages of two
groups (i.e. immigrants and non-immigrants) is decomposed into a
difference explained by the groups’ respective labour market
characteristics (e.g. education, experience) and a difference
that cannot be explained by labour market characteristics and is
therefore attributed to labour market discrimination (Jann 2).
It is worth noting that in addition to discrimination, the
unexplained difference is likely to capture the effects of
factors either not specified in the decomposition (e.g. distance
to work) or factors difficult to measure (e.g. cultural
attitudes towards work).
In red is the element added to the decomposition by Nadeau
and Seckin. It is an additional term – thus making the model a
decomposition into three components in place of the normal two –
containing parameters unique to immigrants that are known to
affect their labour market potential (e.g. citizenship, age at
immigration).
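As a point of reference, the standard two-component decomposition described above can be written as follows. This is a sketch of the textbook form only. It assumes that the Canadian-born coefficients serve as the reference returns and that the difference is taken as Canadian-born (B) minus immigrants (I); the symbols are chosen here for illustration, and the third, immigrant-specific term of Nadeau and Seckin is not reproduced.

```latex
% \bar{X}_g      : mean characteristics of group g (B = Canadian-born, I = immigrants)
% \hat{\beta}_g  : OLS returns to those characteristics for group g
\begin{equation*}
\overline{\ln w}_B - \overline{\ln w}_I
  = \underbrace{(\bar{X}_B - \bar{X}_I)'\,\hat{\beta}_B}_{\text{explained}}
  + \underbrace{\bar{X}_I'\,(\hat{\beta}_B - \hat{\beta}_I)}_{\text{unexplained}}
\end{equation*}
```

The identity follows from the fact that an OLS regression with an intercept passes through the group means, so that each mean log wage equals the mean characteristics multiplied by the estimated returns.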
Reproduction of Study
A particularly interesting recent economic trend in Canada
is that of the oil boom in Alberta, which began in the early
2000s when the market price of petroleum products became
sufficiently elevated as to render profitable the development of
the Athabasca oil sands (National Energy Board 11). With the
boom has come a tremendous increase in provincial GDP and,
potentially, an increase in real wages and a decrease in the
immigrant wage gap. To explore these possibilities, this study,
which is based on that of Nadeau and Seckin, treats Alberta as a
third distinct labour market within Canada, in addition to those
of Quebec and the ROC (which, naturally, no longer includes
Alberta). In order to capture the effects of the oil boom, the
data are drawn from the 2001 and
2006 Canadian Censuses. The selection criteria for workers are
the same as in the original article: men (in order to isolate
the immigrant wage gap from a potential male-female wage gap),
aged between 20 and 64, not self-employed, and with a strong
labour force attachment, which is defined as working more than
20 hours per week and more than 26 weeks per year (Nadeau 267).
Because Nadeau and Seckin chose 1981, 1991, and 2001 so as to
have years at similar stages in the business cycle (they are
considered peak years) (Nadeau 282), it is fortunate that 2001
and 2006 share this trait.
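The selection criteria can be illustrated with a short Stata sketch. In this study the filters were applied when extracting the data, so the code below is only an assumed equivalent run on a handful of invented rows. The variable names and ranges are those listed for the 2001 census in Appendix A, where sexp(2) selects men.

```stata
// Toy rows only: invented values, not census records
clear
input sexp agep hrswkp wkswkp selfip
2 35 40 52 0
2 19 40 52 0
1 40 40 52 0
2 50 15 52 0
2 45 38 20 0
2 30 40 48 500
2 64 20 26 0
end

// Same ranges as the 2001 selection filters listed in Appendix A
keep if sexp == 2 & inrange(agep, 20, 64) & inrange(hrswkp, 20, 100) ///
    & inrange(wkswkp, 26, 52) & selfip == 0

// Only the first and last toy rows satisfy every criterion
count
assert r(N) == 2
```

The `///` line continuation works in a do-file but not at the interactive prompt.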
While finding data in keeping with the criteria laid out by
Nadeau and Seckin is quite simple, reproducing the custom
Blinder-Oaxaca model of their original article is unfortunately
beyond the scope of this course. Consequently, the decomposition
method used in this 2001-2006 study is the standard Blinder-
Oaxaca decomposition and the third term for immigrant-specific
traits will not be included, precluding the analysis of the
impact of factors such as age of immigration.
Initial Analysis
Before proceeding to the Blinder-Oaxaca decomposition and
the OLS regression on which it is based, a preliminary analysis
of the data was undertaken (Table 1). Most notably one observes
that the mean real wage (in 2001 Canadian dollars) increased by
approximately two dollars for all groups in the ROC and for non-
immigrants in Quebec. The fact that the mean real wage for
immigrants in Quebec is essentially unchanged from 2001 to 2006
plays into a greater narrative of the immigrant wage gap in
Quebec (discussed below) and the capacity of the province to
integrate its immigrants. Alberta differs markedly from Quebec
and the ROC as both immigrants and non-immigrants saw a mean
real wage increase of $5.6 CAD.
Looking at factors other than mean real wages, one
observes, for instance, that immigrants in all three labour
markets have a higher level of education than their Canadian-
born counterparts, a trend in keeping with Canada’s policies of
selected immigration. Likewise, no surprises are found when
looking at ‘languages spoken at home’ and ‘knowledge of official
languages’: the vast majority of non-immigrants speak English at
home in Alberta and the ROC and speak French at home in Quebec.
Immigrants are less likely to speak the dominant language of the
region at home. More people live in bilingual households in
Quebec than in Alberta and the ROC. Moreover, as is
frequently a subject of debate in Quebec, immigrants to the
province are less likely to speak French than immigrants to
Alberta and the ROC are to speak English, a fact which may point
to a greater failure of immigrants to integrate in Quebec and
which, as will be seen in the Blinder-Oaxaca decomposition,
contributes to a widening of the immigrant wage gap. Finally, it is
interesting to note that in Alberta, Quebec, and the ROC,
approximately 65% of non-immigrants live in cities. Immigrants,
on the other hand, are far more likely to live in cities than
non-immigrants with rates near 90% in Alberta and the ROC and
near 96% in Quebec.
In looking at factors unique to immigrants, one observes
that immigrants to Quebec are far less likely to originate from
the United States and the United Kingdom than immigrants to
Alberta and the ROC and are more likely to originate from
‘other’ countries. As the U.S. and the U.K. share strong cultural ties to
Canada and typically provide the best-assimilating immigrants,
this may be amongst the root causes of Quebec’s integration
difficulties.
The Immigrant Wage Gap
The immigrant wage gap is found simply by calculating the
difference of mean log wages between immigrants and non-
immigrants. A negative value therefore indicates an advantage to
Canadian-born individuals. Results are displayed in Table 2.
Table 2: Wage Gaps, Immigrants vs. Those Born in Canada
            2001                 2006
Province    Gap         |t|      Gap         |t|      Δ
Alberta     -0.050**    3.26     -0.038*     2.33     +0.012
Québec      -0.129***   10.9     -0.168***   14.32    -0.039
ROC         -0.050***   8.86     -0.065***   11.3     -0.015
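The calculation behind each entry of Table 2 can be sketched in a few lines of Stata. The six wages below are invented for illustration and are not census figures. The variable names follow the do-files of the appendices, and the use of ttest to obtain the t-statistic is an assumption about how the significance levels were produced.

```stata
// Toy data only: six invented hourly wages, not census figures
clear
input immig hour_wage
0 20
0 25
0 30
1 18
1 24
1 27
end

gen log_hour_wage = ln(hour_wage)

// Mean log wage of each group
summarize log_hour_wage if immig == 1
scalar mean_immig = r(mean)
summarize log_hour_wage if immig == 0
scalar mean_born = r(mean)

// Gap = immigrants minus Canadian-born: a negative value favours the Canadian-born
display "Wage gap (immigrants - Canadian-born): " scalar(mean_immig) - scalar(mean_born)
assert scalar(mean_immig) < scalar(mean_born)

// Two-sample t-test on the difference in mean log wages.
// Note: ttest reports mean(immig==0) - mean(immig==1), i.e. the opposite sign.
ttest log_hour_wage, by(immig)
```

The Δ column of Table 2 is simply the 2006 gap minus the 2001 gap, e.g. -0.168 - (-0.129) = -0.039 for Quebec.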
In both 2001 and 2006, all results are found to be
statistically significant. All three regions are found to have
negative wage gaps over the period, indicating an advantage to
Canadian-born workers. One observes that in Quebec and the ROC,
the immigrant wage gap widens (i.e. becomes more negative) over
the period. Quebec, which has the widest wage gap in 2001, also
has the largest change in wage gap amongst the three labour
markets as it grows from -0.129 in 2001 to -0.168 in 2006. This
result is hinted at in Table 1 as one observes that while the
mean real wage for Canadian-born workers in the province
increases by approximately two dollars over the period, the mean
real wage for immigrants increases by only 0.3 dollars, the
smallest increase of all three labour markets. The wage gap in
the ROC, in contrast, widens by less than half as much as that of Quebec
over the same period, a result also predicted by the data in
Table 1 as the mean real wage grows slightly more slowly (1.9 vs.
2.3 dollars) for immigrants than for non-immigrants. Only in Alberta did
the wage gap shrink as it progressed from -0.050 in 2001 to
-0.038 in 2006, a result not only expected due to the recent oil
boom but also foreshadowed by Alberta’s relatively strong mean
real wage increase as seen in Table 1: exactly 5.6 dollars for
both groups. See Graph 1 for a visual representation of the wage
gaps and their evolution.
Regression Results
The standard Blinder-Oaxaca decomposition contains, as
shown in Equation 1, vectors β_B and β_I, which contain the returns
to various labour market characteristics (e.g. education,
experience) as determined by an OLS regression for Canadian-born
and immigrant workers, respectively. It is worth noting that the
regression results are not merely an intermediate step of little
import but are in and of themselves an interesting point of
analysis allowing one to compare returns amongst the two groups
and across the labour markets. Note that the dependent variable
(wage) is logarithmic and the independent variables are in levels,
meaning that regression coefficients are interpreted as the
decimal expression of the percent change in the dependent
variable. For example, a coefficient of 0.05 indicates that each
additional unit of the factor is estimated to increase the wage by
approximately 5%.
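The percentage reading of a log-level coefficient is an approximation that is close for small coefficients. As a brief worked check, using a generic coefficient β on a regressor x (symbols chosen here for illustration), the exact proportional change implied by a one-unit increase is:

```latex
\begin{align*}
\ln w = \alpha + \beta x + \varepsilon
  \;&\Rightarrow\; \frac{w(x+1)}{w(x)} - 1 = e^{\beta} - 1 \\
\beta = 0.05 \;&\Rightarrow\; e^{0.05} - 1 \approx 0.0513 \quad (5.1\%) \\
\beta = -0.2 \;&\Rightarrow\; e^{-0.2} - 1 \approx -0.1813 \quad (-18.1\%)
\end{align*}
```

The approximation is therefore very close for a coefficient of 0.05, whereas a coefficient of about -0.2 corresponds to an exact decrease of roughly 18% rather than 20%.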
Alberta
Turning first to the Alberta regression (Table 3) and
looking only at statistically significant results, one observes
that education (educ) has a positive return amongst all groups
and in both 2001 and 2006; the same is true for potential
experience (exp_poten). The returns for immigrants, however, are
lower than those for Canadian-born workers, a result which holds
true for all three labour markets. The coefficients of potential
experience squared (exp_poten_sq) are all negative, indicating
(as expected) that potential experience has decreasing marginal
returns. Amongst linguistic factors, only the coefficients for
speaking a non-official language at home (other_home) are
statistically significant. As English is the reference language
and given the fact that the language spoken at home is a good
indicator of an individual’s fluency (Nadeau 267), it is not
surprising that the other_home coefficient is negative (i.e. it
is estimated to decrease one’s wage). More interesting, however,
is the size of the coefficient which, at approximately -0.2 for
both groups and both years, is the largest single coefficient of
the regression. One can therefore conclude that there is a high
premium placed on fluency in English in Alberta. Another
interesting result is that of the return to living in a
metropolitan region (CMA). There is a premium of approximately
10% in 2001 but a premium of only approximately 4% in 2006, a
trend likely due to the fact that many jobs associated with the
oil boom, particularly those related to extraction and
transportation, are found outside of metropolitan areas.
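The decreasing marginal returns to potential experience noted above can be made explicit. The sketch below assumes that the experience terms enter the regression as they are constructed in the do-files of the appendices, with the squared term divided by 100. The labels x (potential experience), β_1 and β_2 are chosen here for illustration.

```latex
\begin{align*}
\ln w &= \dots + \beta_1\,x + \beta_2\,\frac{x^2}{100} + \dots \\
\frac{\partial \ln w}{\partial x} &= \beta_1 + \frac{2\,\beta_2}{100}\,x \\
x^{*} &= -\frac{50\,\beta_1}{\beta_2} \qquad (\beta_1 > 0,\ \beta_2 < 0)
\end{align*}
```

With a positive β_1 and a negative β_2, the return to an additional year of experience falls as experience accumulates and reaches zero at x*.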
Amongst factors unique to immigrants, one observes that
there is a premium associated with becoming a Canadian citizen:
an 8.8% premium in 2001 and an 11% premium in 2006. This finding
is in keeping with other empirical studies of the Canadian
labour market (Nadeau 272); an explanation of the mechanism
behind the phenomenon is beyond the scope of this study. The
other two statistically significant coefficients, those of
‘other’ countries of origin (autre) and Asian foreign labour
market experience (exp_asie), are both negative relative to the
reference countries, the U.S. and the U.K., indicating that
immigrants from these regions may have less transferable
skills and/or work experience that employers
find less applicable to Canadian jobs.
Quebec
Turning to Quebec (Table 4), one observes a similar
positive return to education and potential experience for both
groups and a higher return for Canadian-born workers. Potential
experience is also found to have decreasing marginal returns.
Amongst linguistic factors, always a hot-button topic in the
province, there is a similar negative return to speaking a non-
official language at home, indicating that fluency in an
official language is as important in Quebec as it is in Alberta.
A phenomenon not seen in Alberta is that of a positive return on
being bilingual. An increase in wage of 7.9% is predicted for
Canadian-born workers in 2001 and an increase of 13.0% is
predicted for immigrants in 2006. This finding is likely related
to English’s position as a global lingua franca and the fact
that Quebec is, as a province, more bilingual than either
Alberta or the ROC (Table 1). A final interesting characteristic
of Quebec is that relative to the U.S. and the U.K., all other
countries of origin have a negative return. As expected, ‘other’
countries have the most negative coefficient, but interestingly,
the coefficients for countries in Europe and countries in Asia
are of approximately the same magnitude in both periods, a trend
which may indicate that cultural factors (other than language)
are not necessarily advantageous to Europeans despite sharing
cultural roots with Quebec.
ROC
Unique to the regression for the ROC (Table 5) are
variables for the Prairie Provinces (taken here as Saskatchewan
and Manitoba) and British Columbia. The regression indicates that
living in either the Prairies or B.C. reduces one’s wage
relative to other parts of the ROC (essentially Ontario, as the
Territories and Atlantic Provinces are excluded because they are home
to sufficiently few immigrants that confidentiality cannot be
assured). For education and potential experience, one
observes results comparable to those of the other two labour
markets: a positive return on education and potential
experience, higher returns for Canadian-born workers, and
decreasing marginal returns to potential experience. Amongst
statistically significant linguistic factors are speaking a non-
official language at home (other_home) and not having a
knowledge of either official language (none_know), both of which
have, as expected, negative returns.
Amongst factors unique to immigrants it is interesting to
note that the ROC regression has more statistically significant
coefficients than the previous two regressions. Relative to the
U.S. and U.K., for instance, all other countries of origin are
estimated to have a negative impact on an individual’s real
wage. The same pattern of negative returns is found when looking
at foreign work experience by country: work experience in all
regions, aside from the U.S. and the U.K., is expected to
diminish one’s real wage.
Blinder-Oaxaca Decomposition
In looking at the results of the decomposition (Table 6),
one first notices that all three labour markets have a positive
unexplained difference, signifying that it serves to widen the
immigrant wage gap. The existence of an unexplained difference
is due not only to the presence of labour market discrimination,
to which it is most often attributed in the literature, but also to
factors either not specified in the decomposition (e.g. distance
to work) or factors difficult to measure (e.g. cultural
attitudes towards work, motivation). Assuming that the authors
of the original study made the most of available Canadian Census
data in formulating their decomposition, the positive
unexplained terms imply that Canadian immigrants have difficulty
integrating in the labour market due to factors not easily
measured by census-type surveys.
Turning to the explained term of the decomposition, one
observes a negative overall coefficient for both Alberta and the
ROC, indicating that the traits of immigrants included in the
decomposition serve to shrink the immigrant wage gap. In
Alberta, Quebec, and the ROC, for instance, education and
potential experience have statistically significant negative
coefficients in all periods, indicating that they are two areas
in which immigrants perform well. Also in Alberta, Quebec, and
the ROC, speaking a non-official language at home has a positive
coefficient, signifying that it is estimated to widen the
immigrant wage gap, a logical conclusion due to the importance
of fluid communication in most forms of employment. Unique to
Quebec is the peculiar fact that speaking French at home is
actually estimated at a statistically significant level
(although only in 2006) to widen the immigrant wage gap. One
reason for which this could be the case is that foreign dialects
of French are arguably more varied than foreign dialects of
English. A European immigrant who speaks Occitan or a Caribbean
immigrant who speaks a French-based Creole could potentially
indicate that they speak French at home in completing the census
yet have difficulty communicating with speakers of Quebec
French. Also unique to Quebec is the advantage of bilingualism,
as seen in the negative coefficient of having a knowledge of
both official languages (both_know). Finally, living in a
metropolitan area (cma) is estimated at a statistically
significant level to shrink the immigrant wage gap in all three
labour markets.
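For readers unfamiliar with the procedure, a standard two-component decomposition of this kind can be sketched with the user-written oaxaca command (installed with "ssc install oaxaca"). The data below are simulated and the coefficients used to generate them are invented. The weight(1) option, which takes the coefficients of the first group (here the Canadian-born, immig == 0) as the reference returns, is an assumption and not necessarily the specification used for Table 6.

```stata
// Requires the user-written package: ssc install oaxaca
// Simulated data only: all values and coefficients below are invented
clear
set seed 12345
set obs 400
gen immig = (_n > 200)
gen educ = 8 + floor(10*runiform())
gen poten_exp = floor(40*runiform())
gen poten_exp_sq = (poten_exp^2)/100
gen log_hour_wage = 2 + 0.06*educ + 0.03*poten_exp - 0.05*poten_exp_sq ///
    - 0.05*immig + rnormal(0, 0.3)

// Group-specific OLS regressions on which the decomposition is based
regress log_hour_wage educ poten_exp poten_exp_sq if immig == 0
regress log_hour_wage educ poten_exp poten_exp_sq if immig == 1

// Two-component decomposition of (Canadian-born - immigrants):
// group 1 is the lower value of the by() variable, i.e. immig == 0
oaxaca log_hour_wage educ poten_exp poten_exp_sq, by(immig) weight(1)
```

Because the difference is taken as Canadian-born minus immigrants, a positive explained or unexplained term is one that widens the immigrant wage gap, which matches the sign convention used in the discussion above.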
Conclusion
The immigrant wage gap is an important measure of the
capacity of Canadian immigration policies to identify foreign
workers that are able to successfully integrate into Canada’s
labour market. It is also, to a certain degree, a measure of
immigrants’ capacities to adapt to the realities of life in
Canada, be the factors cultural, political, or linguistic.
Historical analysis of the immigrant wage gap, as performed in
the study of Nadeau and Seckin, reveals that the gap has been
widening in all of Canada from 1981 to 2001 (Nadeau 269). This
study found that the trend has continued in most regions of
Canada over the 2001-2006 period. While the manner in which the
wage gaps were decomposed varies between the two studies as
Nadeau and Seckin made use of a custom Blinder-Oaxaca method
which was not able to be reproduced in the current study, the
means of determining the wage gaps in both studies were the
same. A careful listing of the census criteria (men aged 20-64,
etc.) in the original study made possible a faithful selection
of data in this study. Additionally, many of the quantitative
methods of analysis (e.g. average wage, percent living in a
metropolitan area) were sufficiently standard as to also be
reliably reproduced. This includes the calculation of the wage
gap itself, defined simply as the difference of mean log wages.
A point of comparison between the studies is found in Quebec in
the year 2001 (the ROC may not be used due to the separation of
Alberta in this study). As expected, one finds essentially the
same, although not identical, figures. The 2001 Quebec wage gap was
found by Nadeau and Seckin to be -0.128 (Nadeau 269), whereas
the figure found in this study was a similar -0.129 (Table 2).
The same comparison could be made for the contents of
Table 1, for which a ‘Check’ column was added. One observes that
all figures vary from those of the original study by less than
1 in absolute value, with the sole exception of the percentage of immigrants
who arrived in Canada before age 13, a figure which differs from
that of the original by a remarkable 12.9 percentage points. It
is possible that this one large exception is due to a
calculation error on my part or on that of the original authors.
In summary, the immigrant wage gap from 2001 to 2006 was
found to have widened in both Quebec, where the wage gap has
historically been relatively large, and the ROC. Alberta, in
contrast, was found to have an immigrant wage gap that shrank
over the same period. All three trends are likely due not only
to the decline of traditional sectors like manufacturing in the
ROC and Quebec and the rapid growth of the petroleum sector in
Alberta but also potentially to the increased tendency of
immigrants to originate from countries with greater cultural and
linguistic differences than was the case for past generations.
Appendix A: Data
The following data were found by means of the Canadian Census
Analyser (Cf. bibliography):
2001 Census:
Selection Filters (as outlined by Nadeau and Seckin) [1]:
    sexp(2), agep(20-64), hrswkp(20-100), wkswkp(26-52), selfip(0),
    totincp(-50000-200000)
+ Alta/Que, immigrants:             provp(48/24), yrimmig(1-6)
+ Alta/Que, non-immigrants:         provp(48/24), yrimmig(9)
+ Rest of Canada, immigrants:       provp(35,46,47,59) [2], yrimmig(1-6)
+ Rest of Canada, non-immigrants:   provp(35,46,47,59) [2], yrimmig(9)
Variables Downloaded:
    totincp, hrswkp, wkswkp, totschp, agep, hlnp, olnp, cmap, citizenp,
    immiagep, pobp
2006 Census:
Selection Filters [1]:
    sex(2), agegrp(8-16), hrswrk(20-98), wkswrk(26-52), sempi(0),
    totinc(-50000-1285586)
+ Alta/Que, immigrants:             pr(48/24), yrimm(1-7, 1980-2006)
+ Alta/Que, non-immigrants:         pr(48/24), yrimm(9999)
+ Rest of Canada, immigrants:       pr(35,46,47,59) [2], yrimm(1-7, 1980-2006)
+ Rest of Canada, non-immigrants:   pr(35,46,47,59) [2], yrimm(9999)
Variables Downloaded:
    totinc, hrswrk, wkswrk, hdgree, agegrp, hlaen, hlafr, hlano, kol, cma,
    citizen, ageimm, pob
[1]: “men aged between 20 and 64, who work more than 20 hours per week and more than
26 weeks per year, and who are not self-employed” (Nadeau, 2010)
[2]: The Atlantic Provinces are excluded for reasons of confidentiality (Nadeau,
2010)
Appendix B: Variable Names
Variable in 2001   Meaning                            Equivalent in 2006
sexp               Sex                                sex
agep               Age                                agegrp
hrswkp             Hours worked per week              hrswrk
wkswkp             Weeks worked per year              wkswrk
selfip             Self-employment income             sempi
totincp            Total income                       totinc
provp              Province                           pr
totschp            Education                          hdgree
hlnp               Language(s) spoken at home         hlaen (English), hlafr (French), hlano (other)
olnp               Knowledge of official languages    kol
cmap               CMA (census metropolitan area)     cma
citizenp           Citizenship                        citizen
immiagep           Age at immigration                 ageimm
pobp               Country of birth                   pob
Appendix C: Do File, Construction of Initial Analysis
Table, 2001 Census
/*
Selection filters:
Alberta/Quebec: sexp(2), agep(20-64), hrswkp(20-100), wkswkp(26-
52), selfip(0), totincp(-50000-200000), provp(48/24)
ROC (rest of Canada): sexp(2), agep(20-64), hrswkp(20-100),
wkswkp(26-52), selfip(0), totincp(-50000-200000),
provp(35,46,47,59)
yrimmig(1-6) for immigrants, yrimmig(9) for non-immigrants
Variables required:
totincp, hrswkp, wkswkp, totschp, agep, hlnp, olnp, cmap,
citizenp, immiagep, pobp
*/
// Average wage:
gen hour_wage = totincp/(hrswkp*wkswkp)
summarize hour_wage
// i.e. total income in 2001 divided by hours worked in 2001
// Median wage (reported as the 50% percentile):
summarize hour_wage, detail
// Average education (years):
gen educ = 0
replace educ = 3 if(totschp==1)
replace educ = 6.5 if(totschp==2)
replace educ = 9 if(totschp==3)
replace educ = 10 if(totschp==4)
replace educ = 11 if(totschp==5)
replace educ = 12 if(totschp==6)
replace educ = 13 if(totschp==7)
replace educ = 15.5 if(totschp==8)
replace educ = 18 if(totschp==9)
summarize educ
// Average age (years):
summarize agep
// Language(s) spoken at home:
// % English:
gen en_home = 0
replace en_home = 1 if(hlnp==1)
// N.-B. one divides the number of “real changes made” by the
// sample size in order to calculate the percentage
// % French:
gen fr_home = 0
replace fr_home = 1 if(hlnp==2)
// % Both:
gen both_home = 0
replace both_home = 1 if(hlnp==3)
// % Other:
gen other_home = 0
replace other_home = 1 if(hlnp==4 | hlnp==5)
// i.e. aboriginal languages (4), others (5)
// Knowledge of official languages
// % English:
gen en_work = 0
replace en_work = 1 if(olnp==1)
// % French:
gen fr_work = 0
replace fr_work = 1 if(olnp==2)
// % Both:
gen both_work = 0
replace both_work = 1 if(olnp==3)
// % Neither:
gen none_work= 0
replace none_work = 1 if(olnp==4)
// CMA (census metropolitan area):
gen cma = 0
replace cma = 1 if(cmap!=999)
// if countryside == 999, then in town != 999
// Unique to immigrants:
// % Canadian citizen:
gen citizen = 0
replace citizen = 1 if(citizenp==1 | citizenp==2)
// i.e. by birth, by naturalisation
// % Immigrated before age 13:
gen young = 0
replace young = 1 if(immiagep==1 | immiagep==2)
// i.e. 0-4 + 5-12 for "under 13"
// Foreign work experience (years):
gen age_immigration = 0
replace age_immigration = 2 if(immiagep==1)
replace age_immigration = 8.5 if(immiagep==2)
replace age_immigration = 16 if(immiagep==3)
replace age_immigration = 22 if(immiagep==4)
replace age_immigration = 27 if(immiagep==5)
replace age_immigration = 32 if(immiagep==6)
replace age_immigration = 37 if(immiagep==7)
replace age_immigration = 42 if(immiagep==8)
replace age_immigration = 47 if(immiagep==9)
replace age_immigration = 52 if(immiagep==10)
replace age_immigration = 57 if(immiagep==11)
replace age_immigration = 60 if(immiagep==12)
replace age_immigration = 0 if(age_immigration<0)
// gen years_since_immigration = agep - age_immigration
// gen pre_immig_exp = agep - educ - 6 - years_since_immigration
// which simplifies to:
gen pre_immig_exp = age_immigration - educ - 6
replace pre_immig_exp = 0 if(pre_immig_exp<0)
summarize pre_immig_exp
// Country of origin:
// % U.S. and U.K.:
gen us_uk = 0
replace us_uk = 1 if(pobp==6 | pobp==7)
// % Other European:
gen rest_europe = 0
replace rest_europe = 1 if(pobp==8 | pobp==9 | pobp==10)
// % Asia:
gen asia = 0
replace asia = 1 if(pobp==11)
// % Others:
gen other = 0
replace other = 1 if(pobp==12)
Appendix D: Do File, Construction of
Initial Analysis Table, 2006 Census
/*
Selection filters:
Alberta/Quebec: sex(2), agegrp(8-16), hrswrk(20-98), wkswrk(26-
52), sempi(0), totinc(-50000-1285586), pr(48/24)
ROC: sex(2), agegrp(8-16), hrswrk(20-98), wkswrk(26-52),
sempi(0), totinc(-50000-1285586), pr(35,46,47,59)
yrimm(1-7, 1980-2006) for immigrants, yrimm(9999) for non-
immigrants
Variables required:
totinc, hrswrk, wkswrk, hdgree, agegrp, hlaen, hlafr, hlano,
kol, cma, citizen, ageimm, pob
*/
// Average wage:
gen hour_wage = (totinc/(hrswrk*wkswrk))*0.9
summarize hour_wage
// i.e. total income in 2006 divided by hours worked in 2006
// CPI base year = 2001, therefore *0.9 as recommended by the
// Bank of Canada
// Median wage (reported as the 50% percentile):
summarize hour_wage, detail
// Average education (years)
gen educ = 0
replace educ = 8 if(hdgree==1)
replace educ = 12 if(hdgree==2)
replace educ = 13 if(hdgree==3 | hdgree==4 | hdgree==5)
replace educ = 14 if(hdgree==6 | hdgree==7)
replace educ = 15 if(hdgree==8)
replace educ = 16 if(hdgree==9)
replace educ = 17 if(hdgree==10)
replace educ = 18 if(hdgree==12)
replace educ = 22 if(hdgree==11 | hdgree==13)
summarize educ
// Average age (years):
gen age = 0
replace age = 2 if(agegrp==1)
replace age = 5.5 if(agegrp==2)
replace age = 8 if(agegrp==3)
replace age = 10.5 if(agegrp==4)
replace age = 13 if(agegrp==5)
replace age = 16 if(agegrp==6)
replace age = 18.5 if(agegrp==7)
replace age = 22 if(agegrp==8)
replace age = 27 if(agegrp==9)
replace age = 32 if(agegrp==10)
replace age = 37 if(agegrp==11)
replace age = 42 if(agegrp==12)
replace age = 47 if(agegrp==13)
replace age = 52 if(agegrp==14)
replace age = 57 if(agegrp==15)
replace age = 62 if(agegrp==16)
replace age = 67 if(agegrp==17)
replace age = 72 if(agegrp==18)
replace age = 77 if(agegrp==19)
replace age = 82 if(agegrp==20)
replace age = 85 if(agegrp==21)
summarize age
// Language(s) spoken at home
// % English:
gen en_home = 0
replace en_home = 1 if(hlaen==1)
// % French:
gen fr_home = 0
replace fr_home = 1 if(hlafr==1)
// % Both:
gen both_home = 0
replace both_home = 1 if(hlaen==1 & hlafr==1)
// % Other:
gen other_home = 0
replace other_home = 1 if(hlano!=1)
// Knowledge of official languages:
// % English:
gen en_work = 0
replace en_work = 1 if(kol==1)
// % French:
gen fr_work = 0
replace fr_work = 1 if(kol==2)
// % Both:
gen both_work = 0
replace both_work = 1 if(kol==3)
// % Neither:
gen none_work= 0
replace none_work = 1 if(kol==4)
// CMA (census metropolitan area):
gen metro_area = 0
replace metro_area = 1 if(cma!=999)
// if countryside == 999, then in town != 999
// Unique to immigrants:
// % Canadian citizen:
gen can_citizen = 0
replace can_citizen = 1 if(citizen==1 | citizen==2)
// % Immigrated before age 13:
gen young = 0
replace young = 1 if(ageimm==1 | ageimm==2 | ageimm==3)
// i.e. 0-4 + 5-9 + 10-14 to approximate "under 13"
// Foreign work experience (years):
gen age_immigration = 0
replace age_immigration = 2 if(ageimm==1)
replace age_immigration = 7 if(ageimm==2)
replace age_immigration = 12 if(ageimm==3)
replace age_immigration = 17 if(ageimm==4)
replace age_immigration = 22 if(ageimm==5)
replace age_immigration = 27 if(ageimm==6)
replace age_immigration = 32 if(ageimm==7)
replace age_immigration = 37 if(ageimm==8)
replace age_immigration = 42 if(ageimm==9)
replace age_immigration = 47 if(ageimm==10)
replace age_immigration = 52 if(ageimm==11)
replace age_immigration = 57 if(ageimm==12)
replace age_immigration = 60 if(ageimm==13)
gen pre_immig_exp = age_immigration - educ - 6
replace pre_immig_exp = 0 if(pre_immig_exp<0)
summarize pre_immig_exp
// Country of origin:
// % U.S. and U.K.:
gen us_uk = 0
replace us_uk = 1 if(pob==2 | pob==7)
// % Other European:
gen rest_europe = 0
replace rest_europe = 1 if(pob==8 | pob==9 | pob==10 | pob==11 | ///
    pob==12 | pob==13 | pob==14)
// % Asia:
gen asia = 0
replace asia = 1 if(pob==18 | pob==19 | pob==20 | pob==21 | ///
    pob==22 | pob==23 | pob==24 | pob==25 | pob==26)
// % Others:
gen other = 0
replace other = 1 if(pob==3 | pob==4 | pob==5 | pob==6 | pob==15 ///
    | pob==16 | pob==17 | pob==27)
Appendix E: Do File, Oaxaca Decomposition, 2001 Census
/*
- one must first execute the command "ssc install oaxaca" in
order to install the plug-in
Selection filters:
Alberta: sexp(2), agep(20-64), hrswkp(20-100), wkswkp(26-52),
selfip(0), wagesp(0-200000), provp(48)
ROC: sexp(2), agep(20-64), hrswkp(20-100), wkswkp(26-52),
selfip(0), wagesp(0-200000), provp(35,46,47,59)
yrimmig(1-6) for immigrants, yrimmig(9) for non-immigrants
Variables required:
totincp, hrswkp, wkswkp, totschp, agep, hlnp, olnp, cmap,
citizenp, immiagep, pobp
*/
// Dependent variable:
gen hour_wage = totincp/(hrswkp*wkswkp)
gen log_hour_wage = log(hour_wage)
replace log_hour_wage = 0 if(log_hour_wage<0)
// Variable by (what distinguishes the two groups):
gen immig = 0
replace immig = 1 if(yrimmig!=9)
// Prairies:
gen prairies = 0
replace prairies = 1 if(provp==46 | provp==47)
// Quebec:
gen quebec = 0
replace quebec = 1 if(provp==24)
// B.C.:
gen bc = 0
replace bc = 1 if(provp==59)
// Education:
gen educ = 0
replace educ = 3 if(totschp==1)
replace educ = 6.5 if(totschp==2)
replace educ = 9 if(totschp==3)
replace educ = 10 if(totschp==4)
replace educ = 11 if(totschp==5)
replace educ = 12 if(totschp==6)
replace educ = 13 if(totschp==7)
replace educ = 15.5 if(totschp==8)
replace educ = 18 if(totschp==9)
// Potential experience:
gen poten_exp = agep - educ - 6
replace poten_exp = 0 if(poten_exp<0)
// negative values are removed as they have no practical
// interpretation
// Potential experience, squared, over 100:
gen poten_exp_sq = (poten_exp^2)/100
// Language(s) spoken at home:
// Reference: English
// French:
gen fr_home = 0
replace fr_home = 1 if(hlnp==2)
// Both:
gen both_home = 0
replace both_home = 1 if(hlnp==3)
// Other:
gen other_home = 0
replace other_home = 1 if(hlnp==4 | hlnp==5)
// Knowledge of official languages
// Reference: English
// French:
gen fr_work = 0
replace fr_work = 1 if(olnp==2)
// Both:
gen both_work = 0
replace both_work = 1 if(olnp==3)
// Neither:
gen other_work = 0
replace other_work = 1 if(olnp==4)
// CMA (census metropolitan area):
gen cma = 0
replace cma = 1 if(cmap!=999)
// if countryside == 999, then in town != 999
// Canadian citizen:
gen citizen = 0
replace citizen = 1 if(citizenp==1 | citizenp==2)
// Immigrated before age 13:
gen young = 0
replace young = 1 if(immiagep==1 | immiagep==2)
// Immigrated before age 13, education:
gen young_educ = young*educ
// Country of origin:
// Reference: U.S. and U.K.
// Other European:
gen rest_europe = 0
replace rest_europe = 1 if(pobp==8 | pobp==9 | pobp==10)
// Asia:
gen asia = 0
replace asia = 1 if(pobp==11)
// Others:
gen other = 0
replace other = 1 if(pobp==12)
// Foreign work experience
gen age_immigration = 0
replace age_immigration = 2 if(immiagep==1)
replace age_immigration = 8.5 if(immiagep==2)
replace age_immigration = 16 if(immiagep==3)
replace age_immigration = 22 if(immiagep==4)
replace age_immigration = 27 if(immiagep==5)
replace age_immigration = 32 if(immiagep==6)
replace age_immigration = 37 if(immiagep==7)
replace age_immigration = 42 if(immiagep==8)
replace age_immigration = 47 if(immiagep==9)
replace age_immigration = 52 if(immiagep==10)
replace age_immigration = 57 if(immiagep==11)
replace age_immigration = 60 if(immiagep==12)
// gen years_since_immig = agep - age_immigration
// gen pre_immig_exp = poten_exp - years_since_immig
// which simplifies to:
gen pre_immig_exp = age_immigration - educ - 6
replace pre_immig_exp = 0 if(pre_immig_exp<0)
// U.S. and U.K.
gen us_uk = 0
replace us_uk = 1 if(pobp==6 | pobp==7)
gen us_uk_exp = us_uk*pre_immig_exp
// Other European:
gen rest_europe_exp = rest_europe*pre_immig_exp
// Asia:
gen asia_exp = asia*pre_immig_exp
// Others:
gen other_exp = other*pre_immig_exp
// Foreign work experience, squared, over 100:
// U.S. and U.K.
gen us_uk_exp_sq = (us_uk_exp^2)/100
// Other European:
gen rest_europe_exp_sq = (rest_europe_exp^2)/100
// Asia:
gen asia_exp_sq = (asia_exp^2)/100
// Others:
gen other_exp_sq = (other_exp^2)/100
// Foreign work experience * experience in Canada, over 100:
gen dom_exp = agep - educ - 6 - pre_immig_exp
replace dom_exp = 0 if(dom_exp<0)
// U.S. and U.K.:
gen us_uk_exp_dom = (us_uk_exp*dom_exp)/100
// Other European:
gen rest_europe_exp_dom = (rest_europe_exp*dom_exp)/100
// Asia:
gen asia_exp_dom = (asia_exp*dom_exp)/100
// Others:
gen other_exp_dom = (other_exp*dom_exp)/100
// OAXACA:
// Regression for immigrants:
regress log_hour_wage prairies bc educ poten_exp poten_exp_sq ///
    fr_home both_home other_home fr_work both_work other_work cma ///
    citizen young young_educ rest_europe asia other us_uk_exp ///
    rest_europe_exp asia_exp other_exp us_uk_exp_sq ///
    rest_europe_exp_sq asia_exp_sq other_exp_sq us_uk_exp_dom ///
    rest_europe_exp_dom asia_exp_dom other_exp_dom if(immig==1), ///
    vce(robust)
// Regression for non-immigrants:
regress log_hour_wage prairies bc educ poten_exp poten_exp_sq ///
    fr_home both_home other_home fr_work both_work other_work cma ///
    if(immig==0), vce(robust)
// Oaxaca decomposition, immigrant coefficients as reference:
oaxaca log_hour_wage prairies bc educ poten_exp poten_exp_sq ///
    fr_home both_home other_home fr_work both_work other_work cma, ///
    by(immig) weight(0) detail
Appendix F: Do File, Oaxaca Decomposition, 2006 Census
/*
Selection filters:
Alberta: sex(2), agegrp(8-16), hrswrk(20-98), wkswrk(26-52),
sempi(0), wages(0-1226490), pr(48)
ROC: sex(2), agegrp(8-16), hrswrk(20-98), wkswrk(26-52),
sempi(0), wages(0-1226490), pr(35,46,47,59)
yrimm(1-7, 1980-2006) for immigrants, yrimm(9999) for non-
immigrants
Variables required:
totinc, hrswrk, wkswrk, hdgree, agegrp, hlaen, hlafr, hlano,
kol, cma, citizen, ageimm, pob
*/
// Dependent variable:
gen hour_wage = (totinc/(hrswrk*wkswrk))*0.9
gen log_hour_wage = log(hour_wage)
replace log_hour_wage = 0 if(log_hour_wage<0)
// Grouping variable for by() (distinguishes immigrants from non-immigrants):
gen immig = 0
replace immig = 1 if(yrimm!=9999)
// Prairies:
gen prairies = 0
replace prairies = 1 if(pr==46 | pr==47)
// Quebec:
gen quebec = 0
replace quebec = 1 if(pr==24)
// B.C.:
gen bc = 0
replace bc = 1 if(pr==59)
// Education:
gen educ = 0
replace educ = 8 if(hdgree==1)
replace educ = 12 if(hdgree==2)
replace educ = 13 if(hdgree==3 | hdgree==4 | hdgree==5)
replace educ = 14 if(hdgree==6 | hdgree==7)
replace educ = 15 if(hdgree==8)
replace educ = 16 if(hdgree==9)
replace educ = 17 if(hdgree==10)
replace educ = 18 if(hdgree==12)
replace educ = 22 if(hdgree==11 | hdgree==13)
// Potential experience:
gen age = 0
replace age = 2 if(agegrp==1)
replace age = 5.5 if(agegrp==2)
replace age = 8 if(agegrp==3)
replace age = 10.5 if(agegrp==4)
replace age = 13 if(agegrp==5)
replace age = 16 if(agegrp==6)
replace age = 18.5 if(agegrp==7)
replace age = 22 if(agegrp==8)
replace age = 27 if(agegrp==9)
replace age = 32 if(agegrp==10)
replace age = 37 if(agegrp==11)
replace age = 42 if(agegrp==12)
replace age = 47 if(agegrp==13)
replace age = 52 if(agegrp==14)
replace age = 57 if(agegrp==15)
replace age = 62 if(agegrp==16)
replace age = 67 if(agegrp==17)
replace age = 72 if(agegrp==18)
replace age = 77 if(agegrp==19)
replace age = 82 if(agegrp==20)
replace age = 85 if(agegrp==21)
gen poten_exp = age - educ - 6
replace poten_exp = 0 if(poten_exp<0)
// negative values are set to zero, as they have no practical interpretation
// Potential experience, squared, over 100:
gen poten_exp_sq = (poten_exp^2)/100
// Language(s) spoken at home:
// Reference: English
// French:
gen fr_home = 0
replace fr_home = 1 if(hlafr==1)
// Both:
gen both_home = 0
replace both_home = 1 if(hlaen==1 & hlafr==1)
// Other:
gen other_home = 0
replace other_home = 1 if(hlano!=1)
// Knowledge of official languages
// Reference: English
// French:
gen fr_work = 0
replace fr_work = 1 if(kol==2)
// Both:
gen both_work = 0
replace both_work = 1 if(kol==3)
// Neither:
gen none_work = 0
replace none_work = 1 if(kol==4)
// CMA (census metropolitan area):
gen metro_area = 0
replace metro_area = 1 if(cma!=999)
// Canadian citizen:
gen can_citizen = 0
replace can_citizen = 1 if(citizen==1 | citizen==2)
// Immigrated before age 13 (approximated by the three youngest ageimm brackets):
gen young = 0
replace young = 1 if(ageimm==1 | ageimm==2 | ageimm==3)
// Immigrated before age 13, education:
gen young_educ = young*educ
// Country of origin:
// Reference: U.S. and U.K.
// Other European:
gen rest_europe = 0
replace rest_europe = 1 if(pob==8 | pob==9 | pob==10 | pob==11 | ///
    pob==12 | pob==13 | pob==14)
// Asia:
gen asia = 0
replace asia = 1 if(pob==18 | pob==19 | pob==20 | pob==21 | ///
    pob==22 | pob==23 | pob==24 | pob==25 | pob==26)
// Others:
gen other = 0
replace other = 1 if(pob==3 | pob==4 | pob==5 | pob==6 | pob==15 ///
    | pob==16 | pob==17 | pob==27)
// Foreign work experience:
gen age_immigration = 0
replace age_immigration = 2 if(ageimm==1)
replace age_immigration = 7 if(ageimm==2)
replace age_immigration = 12 if(ageimm==3)
replace age_immigration = 17 if(ageimm==4)
replace age_immigration = 22 if(ageimm==5)
replace age_immigration = 27 if(ageimm==6)
replace age_immigration = 32 if(ageimm==7)
replace age_immigration = 37 if(ageimm==8)
replace age_immigration = 42 if(ageimm==9)
replace age_immigration = 47 if(ageimm==10)
replace age_immigration = 52 if(ageimm==11)
replace age_immigration = 57 if(ageimm==12)
replace age_immigration = 60 if(ageimm==13)
gen pre_immig_exp = age_immigration - educ - 6
replace pre_immig_exp = 0 if(pre_immig_exp<0)
// U.S. and U.K.:
gen us_uk = 0
replace us_uk = 1 if(pob==2 | pob==7)
gen us_uk_exp = us_uk*pre_immig_exp
// Other European:
gen rest_europe_exp = rest_europe*pre_immig_exp
// Asia:
gen asia_exp = asia*pre_immig_exp
// Others:
gen other_exp = other*pre_immig_exp
// Foreign work experience, squared, over 100:
// U.S. and U.K.:
gen us_uk_exp_sq = (us_uk_exp^2)/100
// Other European:
gen rest_europe_exp_sq = (rest_europe_exp^2)/100
// Asia:
gen asia_exp_sq = (asia_exp^2)/100
// Others:
gen other_exp_sq = (other_exp^2)/100
// Foreign work experience * experience in Canada, over 100:
gen dom_exp = age - educ - 6 - pre_immig_exp
replace dom_exp = 0 if(dom_exp<0)
// U.S. and U.K.:
gen us_uk_exp_dom = (us_uk_exp*dom_exp)/100
// Other European:
gen rest_europe_exp_dom = (rest_europe_exp*dom_exp)/100
// Asia:
gen asia_exp_dom = (asia_exp*dom_exp)/100
// Others:
gen other_exp_dom = (other_exp*dom_exp)/100
// OAXACA:
// Regression for immigrants:
regress log_hour_wage prairies bc educ poten_exp poten_exp_sq ///
    fr_home both_home other_home fr_work both_work none_work ///
    metro_area can_citizen young young_educ rest_europe asia other ///
    us_uk_exp rest_europe_exp asia_exp other_exp us_uk_exp_sq ///
    rest_europe_exp_sq asia_exp_sq other_exp_sq us_uk_exp_dom ///
    rest_europe_exp_dom asia_exp_dom other_exp_dom if(immig==1), ///
    vce(robust)
// Regression for non-immigrants:
regress log_hour_wage prairies bc educ poten_exp poten_exp_sq ///
    fr_home both_home other_home fr_work both_work none_work ///
    metro_area if(immig==0), vce(robust)
// Oaxaca decomposition, immigrant coefficients as reference:
oaxaca log_hour_wage prairies bc educ poten_exp poten_exp_sq ///
    fr_home both_home other_home fr_work both_work none_work ///
    metro_area, by(immig) weight(0) detail
// Oaxaca decomposition, non-immigrant coefficients as reference:
oaxaca log_hour_wage prairies bc educ poten_exp poten_exp_sq ///
    fr_home both_home other_home fr_work both_work none_work ///
    metro_area, by(immig) weight(1) detail
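The two oaxaca calls above differ only in weight(), which selects whose coefficients are used to value the difference in mean characteristics. For readers without Stata, the following is a minimal Python sketch of that two-fold decomposition. It is not the code used in this study and it omits the standard errors and detailed output that the oaxaca command reports. The function names, the synthetic data, and the coefficient values are all assumptions made for illustration; no census data is involved.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients of y on X (X must already contain a constant column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def twofold_oaxaca(X1, y1, X2, y2, weight):
    """Two-fold Blinder-Oaxaca decomposition of mean(y1) - mean(y2).

    weight=1 takes group 1 coefficients as the reference,
    weight=0 takes group 2 coefficients as the reference.
    Returns (gap, explained, unexplained).
    """
    b1, b2 = ols(X1, y1), ols(X2, y2)
    xbar1, xbar2 = X1.mean(axis=0), X2.mean(axis=0)
    b_ref = weight * b1 + (1 - weight) * b2
    explained = (xbar1 - xbar2) @ b_ref
    unexplained = xbar1 @ (b1 - b_ref) + xbar2 @ (b_ref - b2)
    gap = y1.mean() - y2.mean()
    return gap, explained, unexplained

# Synthetic example (hypothetical numbers, not taken from the census files).
rng = np.random.default_rng(0)

def make_group(n, educ_mean, beta):
    educ = rng.normal(educ_mean, 2.0, n)      # years of schooling
    exper = rng.uniform(0, 30, n)             # potential experience
    X = np.column_stack([np.ones(n), educ, exper])
    y = X @ beta + rng.normal(0, 0.3, n)      # log hourly wage
    return X, y

# Group 1: non-immigrants (immig == 0); group 2: immigrants (immig == 1).
X_nat, y_nat = make_group(2000, 13.0, np.array([1.5, 0.08, 0.02]))
X_imm, y_imm = make_group(2000, 14.0, np.array([1.5, 0.05, 0.01]))

res_w0 = twofold_oaxaca(X_nat, y_nat, X_imm, y_imm, weight=0)  # immigrant reference
res_w1 = twofold_oaxaca(X_nat, y_nat, X_imm, y_imm, weight=1)  # non-immigrant reference

for label, (gap, expl, unexpl) in [("weight(0)", res_w0), ("weight(1)", res_w1)]:
    print(f"{label}: gap={gap:.4f} explained={expl:.4f} unexplained={unexpl:.4f}")
```

Because each regression includes a constant, the explained and unexplained parts sum to the raw gap in mean log wages under either weight. Only the split between the two parts changes with the choice of reference coefficients.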
Bibliography
Nadeau, S. and Seckin, A. 2010. “The Immigrant Wage Gap in
Canada: Quebec and the Rest of Canada.” Canadian Public
Policy 36(3): 265-285. University of Toronto Press. Last accessed
10/03/2014, from the Project MUSE database.
Nadeau, S. and Seckin, A. 2010. “Online Appendix:
Regression Coefficients.” The Canadian Public Policy Archive.
Last accessed 10/03/2014.
http://economics.ca/cgi/jab?journal=cpp&view=v36n3/CPPv36n3p265appx.pdf
Canada’s Oil Sands: Opportunities and Challenges to 2015.
National Energy Board. Government of Canada. Last accessed
30/03/2014.
http://www.neb-one.gc.ca/clf-nsi/rnrgynfmtn/nrgyrprt/lsnd/pprtntsndchllngs20152006/pprtntsndchllngs20152006-eng.pdf
Jann, B. 2008. “The Blinder-Oaxaca Decomposition for Linear
Regression Models.” The Stata Journal 8(4): 453-479. Stata
Press. Last accessed 30/03/2014.
http://www.stata-journal.com/article.html?article=st0151
Grenier, G. 2013. “Exemple de la décomposition Blinder-Oaxaca
for les écarts de salaires entre les hommes et les
femmes.” BlackBoard Learn. University of Ottawa. Last accessed
30/03/2014.
Canadian Census Analyser. Computing in the Humanities and
Social Sciences (CHASS). University of Toronto.
http://datacentre.chass.utoronto.ca.proxy.bib.uottawa.ca/census/