OEUS Lawley

Alternative Allocation Design for
the Occupational Employment
Statistics (OES) Survey
Ernest Lawley, Bureau of Labor Statistics
Marie C. Stetser, Bureau of Labor Statistics
Dr. Eduardas Valaitis, American University
OEUS ANNUAL MEETING 2007
Washington, DC

Alternative Allocation Design for the
Occupational Employment Statistics (OES) Survey
• Occupational Employment Statistics (OES)
Survey
• Frame Development
• Frame Stratification
• Sample Requirements
• Prior Allocation Design
• Current Allocation Design
• Calculating Sh (standard error)
• Reliability

OES Survey
• Partnership with 50 States + DC, Guam,
Puerto Rico, US Virgin Islands
• Measures occupational employment and
wages within 300+ industry groups*
– Approximately 800 detailed occupations
(SOC)
– Broken down by MSA—aggregated Statewide
and Nationwide
*using 4-digit and 5-digit NAICS codes

Frame Development
• Quarterly Census of Employment and Wages (QCEW)
– Collects non-railroad data for all business establishments for 50
States + DC, PR, USVI
– Data includes pertinent information for each establishment such
as: Trade Name, Legal Name, Address information, and Monthly
Employment for the past 12 months
– Data compiled into Bureau’s Longitudinal Database (LDB)
• Railroad Frame File
– Collected by Bureau’s Office of Safety and Health (OSH)
• Guam Frame File
– Collected by one of the BLS Regional Offices
All three elements combined; OES Frame≈6.7 million
business establishments

Frame Stratification
• Frame initially stratified geographically
– Approximately 600 geographic areas
• Approximately 400 State/Metropolitan Statistical Areas (MSAs)
• Approximately 200 Non-MSA Areas (“rural”)
• Frame further stratified by detailed industry (NAICS 4-
digit, selected NAICS 5-digit)
– Approximately 350 industries
– Industry is related to occupation
• Approximately 170,000 total non-empty strata
– Each business establishment in the nation fits into exactly one of
these defined strata
– Each non-empty stratum contains one business establishment to
hundreds of business establishments

Frame Stratification
State 1
  MSA X: Industry 1, Industry 2
  MSA Y: Industry 1, Industry 2
State 2
  MSA X: Industry 1, Industry 2
  MSA Z: Industry 1, Industry 2

Sample Requirements
• Sample allocated by stratum
• Sample Allocation≈1.2 million establishments
• Individual State Sample Sizes (∑≈1.2 million)
– Confidential value for each State
– Based on State employment population
– Last modified in 1996
Example (hypothetical; exact values are confidential):
State         State Sample Size
California    120,000
Texas         100,000
New York      100,000
Florida        85,000
And so forth… Σ≈1.2 million

Prior Allocation Design
“Proportional-to-Employment”
• Maximum Employment
– Maximum monthly employment value in LDB for each
establishment
STEPS:
1. Sum max employment values across stratum, Nh
2. Sum max employment values across state, ΣNh
3. Look up Individual State Sample Size, n
4. Calculate stratum allocation: nh=n∙(Nh/ΣNh)
5. Repeat calculation for all strata, approx. 170,000
times
Note: n may require iterative reduction to meet the
minimum sample allocation requirements for each stratum
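
Steps 1 through 4 can be sketched in Python as follows. This is a minimal illustration, not the production OES procedure: the function name `proportional_allocation`, the toy frame, and the state sample size of 20 are hypothetical, and rounding and minimum-allocation rules are left out.

```python
from collections import defaultdict


def proportional_allocation(establishments, state_sample_size):
    """Allocate a state's sample to strata in proportion to employment.

    establishments: iterable of (stratum_id, list_of_monthly_employment).
    Returns {stratum_id: unrounded allocation n_h}.
    """
    # Step 1: N_h = sum over the stratum of each establishment's
    # maximum monthly employment.
    N = defaultdict(int)
    for stratum, monthly_employment in establishments:
        N[stratum] += max(monthly_employment)

    # Step 2: sum of N_h across the state.
    state_total = sum(N.values())

    # Steps 4 and 5: n_h = n * (N_h / sum of N_h), repeated for every stratum.
    return {h: state_sample_size * N_h / state_total for h, N_h in N.items()}


# Hypothetical frame: four establishments in three strata.
frame = [
    ("MSA X / Industry 1", [40, 42, 50]),
    ("MSA X / Industry 1", [10, 10, 10]),
    ("MSA X / Industry 2", [25, 30, 28]),
    ("MSA Y / Industry 1", [5, 8, 10]),
]
allocation = proportional_allocation(frame, state_sample_size=20)
for stratum, n_h in allocation.items():
    print(stratum, n_h)
```

In this toy frame the stratum employment totals are 60, 30, and 10, so the sample of 20 splits in the same 6:3:1 proportion.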

• Advantages
– Simple
– Strata with larger populations are allocated
more sample
• Is this necessarily an advantage?

“A sample should allocate most heavily to those
strata where the least amount of certainty
exists.”
Causes for uncertainty (less reliability)
within a sampled stratum:
• Undersampling a large population
• Undersampling where there is a large
variability in occupations

• Disadvantage
– Estimates in smaller strata that have large
occupational variability may not be reliable
due to allocation of smaller sample size

EXAMPLE
Which of these cells should be allocated more sample?
Accommodation/Food Services Industry
• 90% of all employees work in 88 occupations
• 12.8 million workers in this industry
Wholesale Trade Industry
• 90% of all employees work in 175 occupations
• 6.1 million workers in this industry
Using “Proportional Allocation”:
Accom/Food Services       Wholesale Trade
120,000 establishments    72,000 establishments

Current Allocation Design
Neyman Allocation
$$n_h = n \cdot \frac{N_h \cdot S_h}{\sum_{h=1}^{H} (N_h \cdot S_h)}$$
n = Individual State “fixed” sample size
Nh = sum of stratum frame employees
Sh = occupational variability measure within a stratum
– Occupations for each stratum (or cell) obtained from the most
recent estimates file (weighted data)
Denominator summed over all H strata in the State
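
The Neyman formula on this slide can be sketched in a few lines of Python. The function name `neyman_allocation` and the three strata with their Nh and Sh values are hypothetical, chosen only to make the arithmetic easy to follow; no rounding or minimum-allocation rule is applied.

```python
def neyman_allocation(n, N, S):
    """Return {stratum: n_h} with n_h = n * N_h*S_h / sum(N_h*S_h).

    n: fixed state sample size
    N: {stratum: frame employment N_h}
    S: {stratum: occupational variability S_h}
    """
    products = {h: N[h] * S[h] for h in N}
    denominator = sum(products.values())  # summed over all strata in the state
    return {h: n * p / denominator for h, p in products.items()}


# Hypothetical strata: A is large but homogeneous, B is smaller but diverse.
N = {"A": 600, "B": 300, "C": 100}
S = {"A": 0.05, "B": 0.20, "C": 0.10}
neyman = neyman_allocation(50, N, S)
for stratum, n_h in neyman.items():
    print(stratum, round(n_h, 2))
```

Here stratum B receives twice the sample of stratum A despite having half the employment, because its variability measure is four times larger.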

Current Allocation Design
Neyman Allocation
$$n_h = n \cdot \frac{N_h \cdot S_h}{\sum_{h=1}^{H} (N_h \cdot S_h)}$$
Proportional Allocation
$$n_h = n \cdot \frac{N_h}{\sum_{h=1}^{H} N_h}$$
Sh is the “Occupational Variability” measure; notice that it is the
only “adjustment” from the Proportional Allocation formula.
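
A short check, assuming Sh takes the same value S in every stratum of the State (an illustrative assumption, not a claim about the OES data), shows how the two formulas relate:

$$n_h = n \cdot \frac{N_h \cdot S}{\sum_{h=1}^{H} (N_h \cdot S)} = n \cdot \frac{N_h}{\sum_{h=1}^{H} N_h}$$

With equal variability everywhere, Neyman allocation coincides with proportional allocation. In general, a stratum whose Sh is above the employment-weighted average of Sh gains sample relative to proportional allocation, and a stratum below that average loses it.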

Calculating Sh
1. Calculate a “coefficient of variation” for each occupation within
an industry.
2. Determine 90th-percentile of occupations within each industry.
3. Sh (for each industry) is calculated by obtaining the weighted mean
of CVs for the 90th-percentile of occupations within each industry.

Calculating Sh
Step 1: Calculating a “coefficient of variation” for
each occupation within stratum
– Using most recent weighted estimates file:
• Count # of employees in each occupation for each business
establishment (call this yi)
• Count # of employees total for each business establishment
(call this xi)
• Sample weight, wi, represents the number of business
establishments that each establishment on the estimates
file (i) represents
• Create a “weighted ratio” Rw=Σ(wi∙yi)/Σ(wi∙xi), summed over
a defined cell
– Note: This is the ratio of occupational employment to
overall employment, so it will always be ≤ 1.

Calculating Sh
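• CV formula (unweighted)
– Derived from variance formula
– Relative variance (CV²) for an original variate Yi:
$$CV_Y^2 = \frac{S_Y^2}{\bar{Y}^2} = \frac{\sum_{i}^{N} (Y_i - \bar{Y})^2}{(N-1) \cdot \bar{Y}^2}$$
– Using a little algebra (remember R = ȳ/x̄):
$$CV_Y = \frac{S_y}{\bar{y}} = \frac{S_y}{R \cdot \bar{x}} = \frac{1}{\bar{x}} \cdot \frac{S_y}{R}$$
$$CV_Y = \frac{1}{\bar{x}} \cdot \frac{\sqrt{\dfrac{\sum_{i=1}^{N} (y_i - R \cdot x_i)^2}{N-1}}}{R}$$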


Calculating Sh
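• CV formula (for each defined “Sh cell”), summed by cell (including weights):
$$CV_{Y_R} \approx \frac{1}{\bar{x}} \cdot \frac{\sqrt{\dfrac{\sum_{i=1}^{n} \left[ w_i (y_i - R_w \cdot x_i) \right]^2}{\sum_{i=1}^{n} w_i - 1}}}{R_w}$$
• Note: x-bar is a weighted average.
$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$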

Calculating Sh
EXAMPLE (hypothetical cell w/ 2 sampled business establishments)
• Restaurant ABC; represents 5 businesses
• What is ABC’s weight?
• Restaurant XYZ; represents itself (1 business)
• What is XYZ’s weight?

ABC’s Staffing Pattern
Occupation         # employed
Waitress/Waiter     8
Cook                4
Dishwasher          2
Janitor             1
Manager             1
TOTAL              16

XYZ’s Staffing Pattern
Occupation         # employed
Waitress/Waiter    32
Cook               15
Dishwasher         10
Manager             3
TOTAL              60

Calculations for ABC
         Waitress/Waiter   Cook       Dishwasher   Janitor    Manager
yi       8                 4          2            1          1
wi∙yi    5∙8=40            5∙4=20     5∙2=10       5∙1=5      5∙1=5
xi       16                16         16           16         16
wi∙xi    5∙16=80           5∙16=80    5∙16=80      5∙16=80    5∙16=80

Calculations for XYZ
         Waitress/Waiter   Cook       Dishwasher   Manager
yi       32                15         10           3
wi∙yi    1∙32=32           1∙15=15    1∙10=10      1∙3=3
xi       60                60         60           60
wi∙xi    1∙60=60           1∙60=60    1∙60=60      1∙60=60
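
The weighted ratio Rw=Σ(wi∙yi)/Σ(wi∙xi) for this hypothetical cell can be computed directly from the two staffing patterns. The sketch below uses the slide's numbers; the data layout and the function name `weighted_ratios` are illustrative choices, and an occupation missing from an establishment (Janitor at XYZ) is treated as yi=0.

```python
from fractions import Fraction

# The two sampled establishments from the example, with their sample weights.
sample = [
    {"name": "ABC", "weight": 5,
     "staff": {"Waitress/Waiter": 8, "Cook": 4, "Dishwasher": 2,
               "Janitor": 1, "Manager": 1}},
    {"name": "XYZ", "weight": 1,
     "staff": {"Waitress/Waiter": 32, "Cook": 15, "Dishwasher": 10,
               "Manager": 3}},
]


def weighted_ratios(sample):
    """Return {occupation: R_w} with R_w = sum(w_i*y_i) / sum(w_i*x_i)."""
    # x_i is the establishment's total employment; the denominator is the
    # same for every occupation in the cell.
    denominator = sum(e["weight"] * sum(e["staff"].values()) for e in sample)
    occupations = list(dict.fromkeys(o for e in sample for o in e["staff"]))
    return {
        occ: Fraction(
            sum(e["weight"] * e["staff"].get(occ, 0) for e in sample),
            denominator,
        )
        for occ in occupations
    }


ratios = weighted_ratios(sample)
for occupation, r_w in ratios.items():
    print(occupation, r_w, round(float(r_w), 2))
```

Exact fractions are used so that the ratios can be checked against the slide values by hand, for example (40+32)/(80+60)=72/140 for Waitress/Waiter.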

Calculating Sh
$$CV_{Y_R} \approx \frac{1}{\bar{x}} \cdot \frac{\sqrt{\dfrac{\sum_{i=1}^{n} \left[ w_i (y_i - R_w \cdot x_i) \right]^2}{\sum_{i=1}^{n} w_i - 1}}}{R_w}$$

Waitress/Waiter inputs
        yi    wi∙yi      xi    wi∙xi
ABC     8     5∙8=40     16    5∙16=80
XYZ     32    1∙32=32    60    1∙60=60

Waitress/Waiter
$$CV_{Y_R} \approx \frac{1}{\frac{80+60}{5+1}} \cdot \frac{\sqrt{\dfrac{\left[ 40 - \frac{40+32}{140} \cdot 80 \right]^2 + \left[ 32 - \frac{40+32}{140} \cdot 60 \right]^2}{6-1}}}{\frac{40+32}{140}} \approx 0.060$$

Example: CVs for Occupations
Occupation         CV
Waitress/Waiter    0.060
Cook               0
Dishwasher         0.271
Janitor            1.626
Manager            0.203

The smaller the CV value, the less diverse the occupation is within
the defined cell.
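
The CV table can be reproduced with a short Python sketch of the weighted formula. The function name `weighted_cv` and the record layout `(w_i, y_i, x_i)` are illustrative choices; the numbers come from the ABC and XYZ staffing patterns, with Janitor entered as yi=0 for XYZ.

```python
import math


def weighted_cv(records):
    """Weighted CV of one occupation in a cell.

    records: list of (w_i, y_i, x_i) tuples, one per sampled establishment,
    where y_i is employment in the occupation and x_i is total employment.
    """
    sum_w = sum(w for w, _, _ in records)
    sum_wy = sum(w * y for w, y, _ in records)
    sum_wx = sum(w * x for w, _, x in records)
    r_w = sum_wy / sum_wx      # weighted ratio R_w
    x_bar = sum_wx / sum_w     # weighted average establishment size
    squares = sum((w * (y - r_w * x)) ** 2 for w, y, x in records)
    return (1 / x_bar) * math.sqrt(squares / (sum_w - 1)) / r_w


# (weight, occupation employment, total employment) for ABC and XYZ.
cell = {
    "Waitress/Waiter": [(5, 8, 16), (1, 32, 60)],
    "Cook": [(5, 4, 16), (1, 15, 60)],
    "Dishwasher": [(5, 2, 16), (1, 10, 60)],
    "Janitor": [(5, 1, 16), (1, 0, 60)],
    "Manager": [(5, 1, 16), (1, 3, 60)],
}
cvs = {occupation: weighted_cv(records) for occupation, records in cell.items()}
for occupation, cv in cvs.items():
    print(f"{occupation}: {cv:.3f}")
```

Cook comes out at exactly 0 because both establishments employ cooks at the same 25% share of staff, while Janitor has the largest CV because only one of the two establishments employs one.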

Calculating Sh
Step 2: Avoiding “atypical” occupations within each cell:
• Conservative approach: utilize 90th-percentile until further
research is done
• Exclude bottom 10th percentile of occupations

Calculating Sh
Step 3: A CV is created for each occupation
within a defined cell—How are occupations
within a cell “combined” to create one value
for the cell?
– Weighted mean of 90th-percentile occupations
• Obtain occupational proportion for each cell
• Obtain Sh by calculating weighted mean of the top-90th-percentile
of occupations
– Less prevalent (bottom 10%) occupations are eliminated
– Sh=weighted mean of 90th-percentile CVs within defined cell

Calculating Sh
Example (sorted in “proportional order”)
Weighted Mean of CVs of All Occupations
Occupation         CV       Proportion      Product
Waitress/Waiter    0.060    72/140≈0.51     0.060*0.51≈0.03
Cook               0        35/140=0.25     0*0.25=0
Dishwasher         0.271    20/140≈0.14     0.271*0.14≈0.04
Manager            0.203    8/140≈0.06      0.203*0.06≈0.01
Janitor            1.626    5/140≈0.04      1.626*0.04≈0.07
90th percentile (look at proportions): Waitress/Waiter, Cook, Dishwasher
• 90th-percentile Occupations
– Weighted mean=Sh=Σ “products” ≈ 0.03 + 0 + 0.04 = 0.07
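
The selection and weighted mean can be sketched in Python. The cutoff rule used here is an assumption read off the example: occupations are taken in descending order of proportion until their cumulative share reaches 90%. The function name `occupational_variability` is hypothetical, and the products are summed without renormalizing, as on the slide.

```python
def occupational_variability(rows, coverage=0.90):
    """Return (S_h, kept_occupations) for one cell.

    rows: list of (occupation, cv, proportion).
    Assumed rule: keep the most prevalent occupations until their
    cumulative proportion reaches `coverage`, then sum cv * proportion.
    """
    ordered = sorted(rows, key=lambda row: row[2], reverse=True)
    kept, cumulative, s_h = [], 0.0, 0.0
    for occupation, cv, proportion in ordered:
        if cumulative >= coverage:
            break  # remaining, less prevalent occupations are excluded
        kept.append(occupation)
        s_h += cv * proportion
        cumulative += proportion
    return s_h, kept


rows = [
    ("Waitress/Waiter", 0.060, 72 / 140),
    ("Cook", 0.0, 35 / 140),
    ("Dishwasher", 0.271, 20 / 140),
    ("Manager", 0.203, 8 / 140),
    ("Janitor", 1.626, 5 / 140),
]
s_h, kept = occupational_variability(rows)
print(kept, round(s_h, 2))
```

Under this rule the three most prevalent occupations cover about 90.7% of employment, so Manager and Janitor drop out and the high Janitor CV does not inflate Sh.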

Calculating Sh
Defining Sh “cell”
– Normality of individual CVs
– Sufficient amount of data to create reliable estimate of
occupational variability (Sh)

Calculating Sh
Aggregation by National Industry (Industry-only)
Concerns:
– Assumption that national aggregates of industry will produce
accurate CVs and Sh values
• Aggregation necessary due to lack of data for finely-detailed cells
• 88.6% of industry MSA-BOS staffing patterns were similar to
corresponding nationally-aggregated industry staffing patterns
(α=0.10)

Reliability
• Problem of small populations in geographic areas
• Desire to produce similar reliability in large and small areas
– Example: Utilizing the Neyman Allocation method illustrated,
Chicago takes up approximately 54% of Illinois’s sample allocation;
this may lead to an unreliable sample in non-Chicago areas
within Illinois

Reliability
How to “spread out” sample allocation?
Bankier (1988): Power Allocations: Determining Sample Sizes for
Subnational Areas
• Adjust exponent for Nh (numerator and denominator) in the
Neyman Allocation
• Drops Chicago’s value to approx. 34% of IL’s sample allocation
$$n_h = n \cdot \frac{\sqrt{N_h} \cdot S_h}{\sum_{h=1}^{H} (\sqrt{N_h} \cdot S_h)}$$
Nh = sum of stratum frame employees
Sh represents an occupational
variability measure within a stratum
Occupations for each stratum (or
cell) obtained from recent
estimates file; weighted data
Denominator summed over all H strata in the State
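
The effect of adjusting the exponent on Nh can be sketched in Python. The exponent of 0.5 is an assumption taken from the “SqRoot” label in the Illinois chart, and the three areas with their Nh and Sh values are hypothetical, not Illinois data. An exponent of 1 gives back the ordinary Neyman allocation.

```python
def power_allocation(n, N, S, exponent=0.5):
    """Return {area: n_h} with n_h = n * N_h**q * S_h / sum(N_h**q * S_h).

    exponent (q) = 1.0 reproduces Neyman allocation; smaller values
    shift sample away from the largest areas.
    """
    products = {h: (N[h] ** exponent) * S[h] for h in N}
    denominator = sum(products.values())
    return {h: n * p / denominator for h, p in products.items()}


# Hypothetical state with one dominant metropolitan area.
N = {"Large metro": 810_000, "Small metro": 90_000, "Rural": 10_000}
S = {"Large metro": 0.1, "Small metro": 0.1, "Rural": 0.1}

neyman = power_allocation(1000, N, S, exponent=1.0)
square_root = power_allocation(1000, N, S, exponent=0.5)
for area in N:
    print(area, round(neyman[area]), round(square_root[area]))
```

In this toy state the dominant area's share falls from roughly 89% to roughly 69% of the sample, which is the same direction of change the slide reports for Chicago.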

Reliability
Total Allocation for Illinois
[Bar chart. Vertical axis: Allocation, 0 to 20,000. Areas: Bloomington, Champaign, Chicago, Danville, Rock Island, Decatur, Kankakee, Peoria, Rockford, E. St. Louis, Springfield, BOS 1, BOS 2, BOS 3, BOS 4. Series: Neyman 90th; Neyman 90th (SqRoot).]

Alternative Allocation Design for
the Occupational Employment Statistics (OES) Survey
QUESTIONS?
lawley.ernest@bls.gov
