Outline Project overview Regional yield modeling State yield modeling Concluding remarks
A Bayesian Approach to Estimating Agricultural
Yield Based on Multiple Repeated Surveys
Jianqiang (Jay) Wang
National Institute of Statistical Sciences
July 9th, 2010
Joint work with Scott Holan (Univ. of Missouri), Wendy Barboza (NASS-USDA),
Balgobin Nandram (Worcester Polytech), Criselda Toto, Dilli Bhatta (Worcester
Polytech), and Edwin Anderson (NASS-USDA)
Outline
1 Project overview
2 Regional yield modeling
3 State yield modeling
4 Concluding remarks
Project background
National Agricultural Statistics Service (NASS)-USDA:
Provide timely, accurate and useful statistics in service to U.S.
agriculture.
Hundreds of surveys and over five hundred reports covering
virtually every aspect of U.S. agriculture.
Monthly Crop Production report:
Area planted and harvested, yield, and production at state/national
level for multiple crops.
Projected value or an end-of-season estimate
e.g. forecasted corn yield (Aug-Nov), and end-of-season yield
estimate (Dec or Jan).
Agricultural Statistics Board (ASB)
Panel of experts that analyzes the survey data and publishes official
estimates based on a combination of survey data and expert opinion
Information from multiple surveys is combined into a single
forecasted/estimated value.
Members of ASB adjust the predicted number using their
expert knowledge.
ASB “lockup” process
Literally locked in an area with armed guards outside.
Windows covered with metal shades.
No phones, PDAs, or laptops.
Starts around 10:00pm and lasts until the report is released at
8:30am.
Project overview
Drawbacks of the ASB Process
Influenced by subjective judgement of ASB members
No quantification of the uncertainty associated with ASB
forecast/estimate
Lacks transparency and repeatability
Goal of the project
Build a statistically rigorous forecasting model that uses all
available information including expert opinion of ASB
Major commodities: corn, soybeans, cotton, wheat
Variables: planted and harvested acres, yield, stocks
Levels: state, region, US
Make the crop production model transparent and repeatable;
provide measures of uncertainty.
Regional and State Yield Modeling
Regional yield modeling
Forecast end-of-year corn yield at speculative (“spec”) region
level
Spec states: IL, IN, IA, MN, NE, OH, WI
State yield modeling
Model yield for individual state and aggregate up to regional
level
$$\text{Spec yield} = \frac{\sum_{\text{states}} \text{State yield} \times \text{State harvested acres}}{\text{Spec harvested acres}}$$
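The aggregation from state yields to the spec-region yield is an acreage-weighted average. A minimal sketch follows, assuming yields in bushels per acre and harvested acres in a common unit; the function name `spec_region_yield` and all numbers are hypothetical illustrations, not NASS data.

```python
def spec_region_yield(state_yields, harvested_acres):
    """Acreage-weighted average of state yields."""
    if set(state_yields) != set(harvested_acres):
        raise ValueError("yield and acreage must cover the same states")
    # Spec harvested acres: total over the states in the region.
    total_acres = sum(harvested_acres.values())
    # State yield x state harvested acres gives state production.
    production = sum(state_yields[s] * harvested_acres[s] for s in state_yields)
    return production / total_acres


# Hypothetical values for three of the seven spec states.
yields = {"IL": 170.0, "IA": 180.0, "OH": 150.0}   # bushels per acre
acres = {"IL": 12.0, "IA": 13.0, "OH": 3.0}        # million harvested acres
regional = spec_region_yield(yields, acres)
```

States with more harvested acres pull the regional figure toward their own yield, which is why state-level forecasts of both yield and acreage are needed before aggregating.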
Sources of Data
Survey indications and associated s.e.’s
Objective Yield (OYS): Aug-Dec, field measurement survey
Number of corn ears for sample fields
Grain weight per ear
Estimate harvest loss
Agricultural Yield (AYS): Aug-Nov, farmer opinion survey
December Ag Survey (DAS): Dec, farmer opinion survey
Weather-related auxiliary variables (National Climatic Data Center)
Temperature, precipitation, soil moisture
Levels of resolution (district, state)
Aggregated hourly, daily, weekly and monthly
Conceptualization of the Problem
Applies to both regional and state model
Specify:
Data Model
$[\text{Survey indications} \mid \text{true yield}, \Theta_d]$
Process Model
$[\text{true yield} \mid \Theta_p]$
Parameter Model (Prior)
$[\Theta_d, \Theta_p]$
Bayesian hierarchical model (BHM)
Provides a natural framework for incorporating data from
multiple sources
BHM
(Regional model) Conditional independence of surveys:
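$$[\text{OYS}, \text{AYS}, \text{DAS} \mid \text{true yield}, \Theta_d] = [\text{OYS} \mid \text{true yield}, \Theta_d]\,[\text{AYS} \mid \cdot]\,[\text{DAS} \mid \cdot].$$
Posterior Distribution:
$$[\text{true yield}, \Theta_d, \Theta_p \mid \text{OYS}, \text{AYS}, \text{DAS}] \propto [\text{OYS} \mid \text{true yield}, \Theta_d]\,[\text{AYS} \mid \cdot]\,[\text{DAS} \mid \cdot]\,[\text{true yield} \mid \Theta_p]\,[\Theta_d]\,[\Theta_p].$$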
Composite estimation: Graybill and Deal (1959), Keller and Olkin
(2002)
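To illustrate what conditional independence buys, here is a minimal sketch of a precision-weighted composite in the spirit of Graybill and Deal (1959). It assumes unbiased, independent normal indications with known standard errors and a flat prior on the true yield, in which case the posterior mean is the inverse-variance weighted average. The function name and numbers are hypothetical, and the survey bias terms used later in the talk are ignored here.

```python
def precision_weighted_mean(indications, std_errors):
    """Combine independent unbiased estimates by inverse-variance weighting."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_precision = sum(weights)
    mean = sum(w * y for w, y in zip(weights, indications)) / total_precision
    variance = 1.0 / total_precision   # posterior variance under a flat prior
    return mean, variance


# Hypothetical OYS, AYS and DAS indications with their standard errors.
combined, combined_var = precision_weighted_mean([150.0, 140.0, 146.0],
                                                 [4.0, 4.0, 2.0])
```

The most precise survey dominates the composite, and the combined variance is smaller than that of any single survey.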
Model Development
1 Exploratory Data Analysis
2 Propose working models for forecasting regional yield
“Bias-adjusted” models (RM1-RM5)
3 Bayesian inference
Implement Markov chain Monte Carlo (McMC) in R
Simulate data to examine frequentist properties of Bayes
estimator
4 Application to the NASS corn yield
“Bias-adjusted” Models
[Figure: Yield Indications over Time (Years). Yield vs. year (1995-2005) for OYS, AYS, DAS and ASB, with monthly indications Aug-Dec.]
DAS: most reliable
Corn has been harvested in all
states
Sample size is much larger
Objective yield (OYS) has positive bias
compared to DAS.
Ag yield (AYS) has negative bias, and
the bias decreases (in absolute value) as
we approach the end of year.
The survey bias is “nearly” consistent
across years, with a few exceptions.
Notation
For each survey $k \in \{\text{OYS}, \text{AYS}, \text{DAS}\}$, in month $m \in \{8, \ldots, 12\}$
and year $t = 1, \ldots, T$:
$y_{tmk}$: survey indication of corn yield;
$s_{tmk}$: standard error associated with $y_{tmk}$;
$\theta_{tmk}$: finite population mean of $y_{tmk}$;
$\mu_t$: true (unobserved) yield in the $t$th year.
Models for Survey Indications (1)
Consider five versions of “bias-adjusted” models: RM1-RM5
Survey indication ($y_{tmk}$) = population mean ($\theta_{tmk}$) + sampling error;
Population mean ($\theta_{tmk}$) = true yield ($\mu_t$) + bias ($b_{mk}$) + forecast error
Change of true yield over years (process model) and repeated
surveys within a year (longitudinal)
Data model for DAS: for $k = \text{DAS}$,
$$y_{tmk} \overset{\text{ind}}{\sim} N(\mu_t, s_{tmk}^2)$$
Models for Survey Indications (2)
Vector notation
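$y_{t,\text{OYS}} = (y_{t,8,\text{OYS}}, y_{t,9,\text{OYS}}, \cdots, y_{t,12,\text{OYS}})'$.
Similarly $\theta_{t,\text{OYS}}$, $s_{t,\text{OYS}}$ and AYS data.
Data model for panel survey indications:
$$\begin{pmatrix} y_{t,\text{OYS}} \\ y_{t,\text{AYS}} \end{pmatrix} \Big|\, \cdot \;\overset{\text{ind}}{\sim}\; N\left( \begin{pmatrix} \theta_{t,\text{OYS}} \\ \theta_{t,\text{AYS}} \end{pmatrix},\; \mathrm{diag}(s_t) \begin{pmatrix} R_5(\rho_1) & 0 \\ 0 & R_4(\rho_1) \end{pmatrix} \mathrm{diag}(s_t) \right),$$
where $s_t = (s_{t,\text{OYS}}, s_{t,\text{AYS}})'$ and $R(\rho_1) = \left(\rho_1^{|m-m'|}\right)$ denotes the AR(1)
corr. matrix.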
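The sampling covariance of one survey's monthly indications, $\mathrm{diag}(s)\,R(\rho_1)\,\mathrm{diag}(s)$, can be built directly from the reported standard errors. The sketch below is an illustration only; the helper names and the standard errors are hypothetical.

```python
def ar1_correlation(n, rho):
    """n x n AR(1) correlation matrix with entries rho^|m - m'|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def sampling_covariance(std_errors, rho):
    """diag(s) R(rho) diag(s) for one survey's monthly indications."""
    n = len(std_errors)
    corr = ar1_correlation(n, rho)
    return [[std_errors[i] * corr[i][j] * std_errors[j] for j in range(n)]
            for i in range(n)]


# Hypothetical Aug-Dec standard errors for OYS.
s_oys = [6.0, 5.0, 4.0, 3.0, 2.5]
cov_oys = sampling_covariance(s_oys, rho=0.5)
```

Setting `rho=0` recovers the independent-months case used in RM1 and RM2, since the correlation matrix then reduces to the identity.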
Model for Finite Population Means
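Average survey bias: $b_{\text{OYS}} = (b_{8,\text{OYS}}, b_{9,\text{OYS}}, \cdots, b_{12,\text{OYS}})'$, and
similarly $b_{\text{AYS}}$.
$$\begin{pmatrix} \theta_{t,\text{OYS}} \\ \theta_{t,\text{AYS}} \end{pmatrix} \Big|\, \cdot \;\overset{\text{ind}}{\sim}\; N\left( \mu_t \mathbf{1} + \begin{pmatrix} b_{\text{OYS}} \\ b_{\text{AYS}} \end{pmatrix},\; \Delta \begin{pmatrix} R_5(\rho_2) & 0 \\ 0 & R_4(\rho_2) \end{pmatrix} \Delta' \right),$$
where $\Delta = \mathrm{diag}(\sigma_{8,\text{OYS}}, \cdots, \sigma_{12,\text{OYS}}, \sigma_{8,\text{AYS}}, \cdots, \sigma_{11,\text{AYS}})$.
Forecast error s.d. $\sigma_{mk}$: RM2: $\sigma_{mk} = \sigma_k$; rest: $\sigma_{mk} = \sigma_m$.
Correlation between months: $\rho_1 = \rho_2 = 0$ for RM1-RM2; $\rho_2 = 0$ for
RM3; $\rho_1 = 0$ for RM4 and RM5.
RM6 with conditional corr. between OYS and AYS: upper-left $9 \times 9$
submatrix of
$$\begin{pmatrix} R_5(\rho_2) & \tau R_5(\rho_2) \\ \tau R_5(\rho_2) & R_5(\rho_2) \end{pmatrix}.$$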
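As an illustration of the RM6 structure, the following sketch builds the upper-left 9 x 9 block of the 10 x 10 matrix with blocks $R_5(\rho_2)$ and $\tau R_5(\rho_2)$. Rows 0-4 are taken to be OYS (Aug-Dec) and rows 5-8 AYS (Aug-Nov); the function name and parameter values are hypothetical.

```python
def rm6_correlation(rho, tau):
    """Upper-left 9 x 9 block of [[R5, tau * R5], [tau * R5, R5]]."""
    def entry(i, j):
        # AR(1) term depends only on the month offsets within each survey.
        month_term = rho ** abs(i % 5 - j % 5)
        same_survey = (i // 5) == (j // 5)
        return month_term if same_survey else tau * month_term

    return [[entry(i, j) for j in range(9)] for i in range(9)]


corr = rm6_correlation(rho=0.6, tau=0.3)
```

Here `corr[0][5]` is the correlation between the August OYS and August AYS forecast errors (equal to `tau`), while `corr[1][5]` links September OYS with August AYS and is damped by one factor of `rho`.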
Model for True Yield
Applies to RM1-RM5
Spec-region weather covariates: state level weather variables
weighted by planted acres
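$\text{PCP}_t$: average July precipitation
$\text{Temp}_t$: average July temperature
$\text{Prog}_t$: percentage of corn planted by the 20th week
Linear regression model
$$\mu_t = \beta_0 + \beta_1 t + \beta_2 \text{PCP}_t + \beta_3 \text{Temp}_t + \beta_4 \text{Prog}_t + \beta_5 (t \cdot \text{PCP}_t) + \beta_6 (t \cdot \text{Temp}_t) + \beta_7 (t \cdot \text{Prog}_t) + \eta_t$$
No weather×year interaction (i.e. $\beta_5 = \beta_6 = \beta_7 = 0$) except in RM5
Alternative process model:
$$\mu_t - \mu_{t-1} = \phi_0 + \phi_1 (\mu_{t-1} - \mu_{t-2}) + \eta_t$$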
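A small sketch of the regression mean may help show how the interaction terms switch on only for RM5. The coefficient values and covariate scaling below are hypothetical (the slides do not state how the covariates are scaled), and the error term $\eta_t$ is omitted.

```python
def expected_yield(t, pcp, temp, prog, beta, interactions=False):
    """Mean of mu_t under the linear yield-weather regression (no eta_t)."""
    b0, b1, b2, b3, b4, b5, b6, b7 = beta
    mean = b0 + b1 * t + b2 * pcp + b3 * temp + b4 * prog
    if interactions:
        # Weather x year terms, used only in RM5.
        mean += t * (b5 * pcp + b6 * temp + b7 * prog)
    return mean


# Hypothetical coefficients and covariate values.
beta = (110.0, 3.0, -2.0, -5.0, 4.0, 0.1, -0.2, 0.05)
main_effects_only = expected_yield(10, 0.5, -1.0, 1.0, beta)
with_interactions = expected_yield(10, 0.5, -1.0, 1.0, beta, interactions=True)
```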
Application to NASS Data
Multiple data sources with temporal misalignment:
OYS (1993-2009)
AYS (2001-2009)
DAS (1996-2009)
Derive McMC sampling algorithms for fitting the model
Model comparison: Deviance Information Criterion (DIC) and
delete-one cross validation (CV) criterion
Parameter estimation with whole data set
Point Estimate: Posterior Mean (PM) of the parameters
Measures of Uncertainty: 95% Credible Intervals (CI)
Forecasting the end-of-year yield (Aug 2004 - Dec 2009)
Computational Details
Model parameters: $\mu_t$, $b$, $\theta$, $\beta$; $\sigma^2_{mk}$, $\sigma^2_\eta$; $\rho_1$, $\rho_2$
Prior specifications
$N(0, 10^6)$ on $\beta$ and $b$
Inverse Gamma(0.01, 0.01) on $\sigma^2_{mk}$, $\sigma^2_\eta$
Uniform[−1, 1] prior on $\rho_1$ and $\rho_2$
Priors for µt and θ are specified in hierarchical structure (e.g., prior
of µt : yield weather model).
Fit BHM via McMC.
RM1-RM2: Gibbs sampling
RM3-RM5: Metropolis-Hastings algorithm
Sampling algorithm for RM4
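Generate $\mu_t$, $b$, $\theta$, $\beta$, $\sigma^2_\eta$ from standard distributions.
e.g. Generate $\mu_t$ from
$$[\mu_t \mid \cdot] \propto N\left( \frac{\Delta_{2t}}{\Delta_{1t}},\; \frac{1}{\Delta_{1t}} \right),$$
where
$$\Delta_{1t} = \mathbf{1}_5' \Sigma_{\text{OYS}}^{-1} \mathbf{1}_5 + \mathbf{1}_4' \Sigma_{\text{AYS}}^{-1} \mathbf{1}_4 + s_{12,t,\text{DAS}}^{-2} + \sigma_\eta^{-2}$$
and
$$\Delta_{2t} = \mathbf{1}_5' \Sigma_{\text{OYS}}^{-1} (\theta_{t,\text{OYS}} - b_{\text{OYS}}) + \mathbf{1}_4' \Sigma_{\text{AYS}}^{-1} (\theta_{t,\text{AYS}} - b_{\text{AYS}}) + y_{12,t,\text{DAS}} / s^2_{12,t,\text{DAS}} + (\beta_0 + \beta_1 t + \cdots + \beta_4 \text{Prog}_t) / \sigma^2_\eta.$$
Transform $(\sigma^2_8, \cdots, \sigma^2_{12}, \rho_2)$ into
$$\gamma = \left( \log(\sigma^2_8), \cdots, \log(\sigma^2_{12}), \log\left(\frac{1+\rho_2}{1-\rho_2}\right) \right),$$
and use random-walk Metropolis algorithm with multivariate normal proposal distribution
on $\gamma$,
$$J(\gamma^{(k)} \mid \gamma^{(k-1)}) = N(\gamma^{(k-1)}, \sigma^2_\gamma I).$$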
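The two ingredients above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the authors' R implementation: $\Sigma_{\text{OYS}}$ and $\Sigma_{\text{AYS}}$ are taken to be the forecast-error covariance blocks of $\theta_t$, `prior_mean` stands for the regression mean $\beta_0 + \beta_1 t + \cdots + \beta_4 \text{Prog}_t$, all function names and numbers are hypothetical, and the Metropolis acceptance step (which needs the full conditional density and the Jacobian of the transformation) is left out.

```python
import numpy as np


def mu_t_conditional(theta_oys, theta_ays, b_oys, b_ays, cov_oys, cov_ays,
                     y_das, s_das, prior_mean, sigma2_eta):
    """Mean and variance of [mu_t | .] = N(Delta_2t / Delta_1t, 1 / Delta_1t)."""
    ones5, ones4 = np.ones(5), np.ones(4)
    prec_oys = np.linalg.inv(cov_oys)   # Sigma_OYS^{-1}
    prec_ays = np.linalg.inv(cov_ays)   # Sigma_AYS^{-1}
    delta1 = (ones5 @ prec_oys @ ones5 + ones4 @ prec_ays @ ones4
              + 1.0 / s_das ** 2 + 1.0 / sigma2_eta)
    delta2 = (ones5 @ prec_oys @ (theta_oys - b_oys)
              + ones4 @ prec_ays @ (theta_ays - b_ays)
              + y_das / s_das ** 2 + prior_mean / sigma2_eta)
    return delta2 / delta1, 1.0 / delta1


def to_gamma(sigma2, rho2):
    """Map (sigma^2_8, ..., sigma^2_12, rho_2) to the unconstrained gamma."""
    return np.append(np.log(sigma2), np.log((1.0 + rho2) / (1.0 - rho2)))


def from_gamma(gamma):
    """Inverse map: variances are positive, correlation lies in (-1, 1)."""
    return np.exp(gamma[:-1]), np.tanh(gamma[-1] / 2.0)


def propose(gamma, step_sd, rng):
    """Random-walk proposal drawn from N(gamma, step_sd^2 I)."""
    return gamma + step_sd * rng.standard_normal(gamma.shape)


rng = np.random.default_rng(0)
mean, var = mu_t_conditional(
    theta_oys=np.full(5, 160.0), theta_ays=np.full(4, 140.0),
    b_oys=np.full(5, 10.0), b_ays=np.full(4, -10.0),
    cov_oys=np.eye(5), cov_ays=np.eye(4),
    y_das=150.0, s_das=1.0, prior_mean=150.0, sigma2_eta=1.0)
mu_draw = rng.normal(mean, np.sqrt(var))          # one Gibbs draw of mu_t

gamma = to_gamma(np.array([60.0, 47.0, 28.0, 7.0, 6.0]), 0.5)
gamma_new = propose(gamma, 0.1, rng)              # candidate for the MH step
```

Working on the $\gamma$ scale means every proposal maps back to valid variances and a valid correlation, so no proposal has to be rejected merely for leaving the parameter space.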
Model Comparison
                            RM1                     RM2                     RM3                              RM4               RM5
Monthly correlation in      No                      No                      $[y_{tmk} \mid \theta_{tmk}]$    $\theta_{tmk}$    $\theta_{tmk}$
Correlation constraint      $\rho_1 = \rho_2 = 0$   $\rho_1 = \rho_2 = 0$   $\rho_2 = 0$                     $\rho_1 = 0$      $\rho_1 = 0$
Forecast error variance     $\sigma^2_m$            $\sigma^2_k$            $\sigma^2_m$                     $\sigma^2_m$      $\sigma^2_m$
Weather*year interaction    No                      No                      No                               No                Yes
DIC                         782.29                  788.35                  771.67                           700.58            701.09
pD                          29.30                   27.22                   29.00                            30.02             30.72
CV                          4.10                    4.10                    3.35                             1.81              1.79
Deviance Information Criterion (DIC), Gelman et al (2003).
pD: effective number of parameters.
CV:
$$\mathrm{CV} = \frac{1}{\#\{y_{tmk}\}} \sum_{m,t,k} \left| y_{tmk} - E(y_{tmk} \mid y_{(tmk)}) \right|,$$
where $y_{(tmk)}$ denotes the data set with $y_{tmk}$ removed (importance
sampling).
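For readers unfamiliar with DIC, the sketch below computes it from posterior draws using the usual definition: $\mathrm{DIC} = \bar{D} + p_D$ with $p_D = \bar{D} - D(\bar\theta)$, where $D$ is minus twice the log-likelihood. It uses a toy iid normal model with known unit variance rather than RM1-RM5; the data, draws and function names are hypothetical.

```python
import math


def deviance(y, mu, sigma=1.0):
    """-2 * log-likelihood of iid N(mu, sigma^2) observations."""
    return sum(math.log(2.0 * math.pi * sigma ** 2) + ((yi - mu) / sigma) ** 2
               for yi in y)


def dic(y, mu_draws):
    """Return (DIC, pD) from posterior draws of the mean."""
    mean_deviance = sum(deviance(y, m) for m in mu_draws) / len(mu_draws)
    deviance_at_mean = deviance(y, sum(mu_draws) / len(mu_draws))
    p_d = mean_deviance - deviance_at_mean   # effective number of parameters
    return mean_deviance + p_d, p_d


# Toy data and a handful of hypothetical posterior draws.
y_obs = [1.0, 2.0, 3.0]
draws = [1.5, 2.0, 2.5]
dic_value, p_d = dic(y_obs, draws)
```

Smaller DIC indicates a better trade-off between fit and complexity, which is how RM4 and RM5 stand out in the table.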
Yield Estimates by RM4
Year PM CI(lb) CI(ub) Width I(BD ∈ CI) I(DAS ∈ CI)
1993 97.93 92.20 104.21 12.01 1 NA
1994 142.01 136.95 146.77 9.82 1 NA
1995 110.52 105.65 115.55 9.90 0 NA
1996 130.96 130.09 131.80 1.72 1 1
1997 131.86 131.10 132.60 1.50 1 1
1998 143.48 142.61 144.40 1.79 1 1
1999 141.21 140.33 142.08 1.75 1 1
2000 142.54 141.69 143.41 1.72 1 1
2001 144.99 144.02 145.98 1.96 1 1
2002 139.55 138.11 141.00 2.89 1 1
2003 151.47 150.07 152.81 2.74 1 1
2004 169.68 168.69 170.68 2.00 1 1
2005 157.35 156.14 158.51 2.38 1 1
2006 158.18 157.15 159.18 2.03 0 1
2007 160.87 159.75 161.97 2.22 1 1
2008 164.73 163.72 165.75 2.03 1 1
2009 174.84 173.92 175.76 1.84 1 1
Yield Estimates by RM4
[Figure: Yield Estimates over Time (Years). Yield vs. year (1995-2005), showing PM(mu), CI and DAS.]
Our estimates follow data
pattern very well.
The ASB yield and DAS are
captured within the CI most
of the time.
Posterior Summaries of RM4
Survey bias (PM):
        Aug      Sep      Oct      Nov      Dec
OYS     14.48    12.57    14.04    14.84    14.83
AYS    -15.72   -16.70   -11.71    -4.50
Model parameters:
Parameter                PM       CI (lb)   CI (ub)
Forecast error variance
$\sigma^2_8$             60.62    32.26     102.76
$\sigma^2_9$             47.14    25.16     78.72
$\sigma^2_{10}$          27.83    14.92     46.92
$\sigma^2_{11}$          7.08     3.57      12.35
$\sigma^2_{12}$          6.41     2.60      14.03
Latent process model
$\beta_0$                113.68   105.39    122.17
$\beta_1$ (year)         3.47     2.68      4.29
$\beta_2$ (PCP)          -3.57    -7.98     0.63
$\beta_3$ (Temp)         -5.60    -9.86     -1.36
$\beta_4$ (Prog)         5.04     1.06      9.65
$\sigma^2_\eta$          64.01    25.27     155.48
Forecast RMSE of Monthly Board, Keller and Olkin (2002), and RM4
KO estimator: $y^*_{tm} = (y^*_{tm,\text{OYS}}, y^*_{tm,\text{AYS}})'$, and $S_{tm}$ denotes the sample
covariance matrix of $(y^*_{tm,\text{OYS}}, y^*_{tm,\text{AYS}})'$:
$$\hat\mu^{KO}_t = \frac{\mathbf{1}' S_{tm}^{-1} y^*_{tm}}{\mathbf{1}' S_{tm}^{-1} \mathbf{1}}$$
RMSE: root mean squared difference between each forecast and DAS.
Month   RMSE.ASB   Keller & Olkin   RMSE.RM4
8       7.79       5.93             6.18
9       7.52       6.76             6.53
10      4.48       5.04             3.87
11      2.64       3.80             2.74
8-11    6.01       5.49             5.08
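For two surveys the estimator $\mathbf{1}' S^{-1} y^* / \mathbf{1}' S^{-1} \mathbf{1}$ has a closed form, which the sketch below uses together with an RMSE helper. Function names and inputs are hypothetical; how $y^*$ and $S_{tm}$ are obtained from the data is not specified on the slide and is not modeled here.

```python
def ko_estimate(y_oys, y_ays, var_oys, var_ays, cov):
    """1'S^{-1}y / 1'S^{-1}1 for S = [[var_oys, cov], [cov, var_ays]]."""
    denom = var_oys + var_ays - 2.0 * cov   # equals Var(y_oys - y_ays)
    if not denom > 0:
        raise ValueError("covariance matrix gives no usable weights")
    w_oys = (var_ays - cov) / denom          # weight on the OYS indication
    return w_oys * y_oys + (1.0 - w_oys) * y_ays


def rmse(forecasts, targets):
    """Root mean squared difference between forecasts and targets."""
    n = len(forecasts)
    return (sum((f - d) ** 2 for f, d in zip(forecasts, targets)) / n) ** 0.5


# Hypothetical indications with uncorrelated errors.
estimate = ko_estimate(150.0, 140.0, var_oys=16.0, var_ays=4.0, cov=0.0)
error = rmse([2.0, 4.0], [0.0, 0.0])
```

With zero covariance the weights reduce to inverse-variance weights; a positive covariance shifts additional weight toward the more precise survey.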
Forecasting End-of-year Yield via RM4
[Figure: six panels, "Plot of Forecast for 2004" through "Plot of Forecast for 2009"; forecasted yield vs. month (8-12). Legend: • DAS; △ BHM; + Monthly board; ⋄ Keller & Olkin estimator.]
Modeling State Survey Indications
Review regional yield model
Ramifications of “bias-adjusted” models
Model comparison
Forecasting performance of BHM
Downscaling to state level
OYS (Aug - Dec), AYS (Aug - Nov) and DAS (Dec)
Exploratory analysis and proposed working model
Linear mixed model for state indications
Random coefficient model for yield-weather relationship
(Dempster et al, 1981)
State indications (IA and OH)
[Figure: two panels, "Yield Indications over Time (Years), IA" and "Yield Indications over Time (Years), OH"; yield vs. year (1995-2005) for OY, AY and Ag indications, by month Aug-Dec.]
Notation and Data Model
For each survey $k \in \{\text{OYS}, \text{AYS}, \text{DAS}\}$ in month $m \in \{8, \cdots, 12\}$,
year $t = 1, 2, \cdots, T$ and in state $l \in \{\text{IL}, \cdots, \text{WI}\}$:
$y_{tlmk}$: survey indication of corn yield;
$s_{tlmk}$: s.e. associated with $y_{tlmk}$;
$\theta_{tlmk}$: finite population mean of $y_{tlmk}$;
$\mu_{tl}$: true yield.
Vector notation: $y_{tl}$ denotes the 9-dim column vector with OYS
(Aug-Dec) and AYS (Aug-Nov) indications, similarly $\theta_{tl}$, $s_{tl}$.
Data model
OYS and AYS indications:
$$[y_{tl} \mid \theta_{tl}] \overset{\text{ind}}{\sim} N(\theta_{tl}, \mathrm{diag}(s^2_{tl}))$$
DAS indications:
$$y_{t,l,12,\text{DAS}} \overset{\text{ind}}{\sim} N(\mu_{tl}, s^2_{t,l,12,\text{DAS}})$$
Modeling Finite Population Means
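Survey bias: $b_{tl} = (b_{tl,8,\text{OYS}}, \cdots, b_{tl,12,\text{OYS}}, b_{tl,8,\text{AYS}}, \cdots, b_{tl,11,\text{AYS}})'$
$$\theta_{tl} = \mu_{tl} \mathbf{1}_9 + b_{tl}.$$
Biases $b_{tl}$ from state $l$:
$$(b_{1,l}', \cdots, b_{T,l}')' \sim N\left( \mathbf{1}_T \otimes b,\; (I_T + \lambda \mathbf{1}_T \mathbf{1}_T') \otimes (\Delta R \Delta') \right) \quad (*)$$
where $\Delta = \mathrm{diag}(\sigma_{8,\text{OYS}}, \cdots, \sigma_{12,\text{OYS}}, \sigma_{8,\text{AYS}}, \cdots, \sigma_{11,\text{AYS}})$ and
$$R = \begin{pmatrix} R_5(\rho_1) & 0 \\ 0 & R_4(\rho_2) \end{pmatrix}.$$
(*) is equivalent to
$$[b_{t,l} \mid b_l] \overset{\text{ind}}{\sim} N(b_l, \Delta R \Delta'), \quad \text{and} \quad b_l \overset{\text{iid}}{\sim} N(b, \lambda (\Delta R \Delta')),$$
where $\lambda \mathbf{1}_T \mathbf{1}_T'$ introduces within-state correlation.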
Spatial correlation; covariates
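The equivalence between (*) and the two-stage form can be checked numerically. In the sketch below, `V` is a small hypothetical stand-in for $\Delta R \Delta'$ (2 x 2 instead of 9 x 9), and the function name is an assumption made for illustration.

```python
import numpy as np


def stacked_bias_covariance(V, lam, T):
    """Covariance of (b_1l', ..., b_Tl')' implied by the two-stage form."""
    p = V.shape[0]
    # Maps the state-level mean b_l to T stacked copies of itself.
    stack = np.kron(np.ones((T, 1)), np.eye(p))
    between = stack @ (lam * V) @ stack.T   # shared state effect b_l
    within = np.kron(np.eye(T), V)          # independent year-level deviations
    return between + within


V = np.array([[4.0, 1.0], [1.0, 2.0]])      # hypothetical Delta R Delta'
T, lam = 3, 0.5
direct = np.kron(np.eye(T) + lam * np.ones((T, T)), V)   # covariance in (*)
two_stage = stacked_bias_covariance(V, lam, T)
```

The two matrices agree entry by entry: biases from the same state in different years share covariance $\lambda \Delta R \Delta'$ through the common $b_l$.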
Process Model for True Yield
Random regression-coefficient model:
State-level covariates, xtl ,
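$$\mu_{tl} = \gamma_{0l} + \gamma_{1l} t + \gamma_{2l} x_{tl} + \eta_{tl}$$
where $\gamma_{0l} \overset{\text{iid}}{\sim} N(\gamma_0, \sigma^2_{0\gamma})$, $\gamma_{1l} \overset{\text{iid}}{\sim} N(\gamma_1, \sigma^2_{1\gamma})$, $\gamma_{2l} \overset{\text{iid}}{\sim} N(\gamma_2, \sigma^2_{2\gamma})$ and
$\eta_{tl} \overset{\text{iid}}{\sim} N(0, \sigma^2_\eta)$.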
Dynamic regression model using real-time weather.
Large number of covariates: ridge regression, principal component
regression (PCR).
Spatial correlation: spatially correlated errors vs spatially varying
coefficient.
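To make the random regression-coefficient model concrete, the following sketch simulates true yields for a few states with a single covariate. All hyperparameter values, covariate values and names are hypothetical; spatial correlation and the dynamic extensions listed above are not included.

```python
import random


def simulate_true_yields(covariates, hyper, seed=0):
    """Simulate mu_tl = gamma_0l + gamma_1l * t + gamma_2l * x_tl + eta_tl.

    covariates maps each state to its list of x_tl values for t = 1..T.
    """
    rng = random.Random(seed)
    yields = {}
    for state, x in covariates.items():
        # State-specific coefficients scatter around the common means.
        g0 = rng.gauss(hyper["gamma0"], hyper["sd0"])
        g1 = rng.gauss(hyper["gamma1"], hyper["sd1"])
        g2 = rng.gauss(hyper["gamma2"], hyper["sd2"])
        yields[state] = [
            g0 + g1 * t + g2 * x_t + rng.gauss(0.0, hyper["sd_eta"])
            for t, x_t in enumerate(x, start=1)
        ]
    return yields


# Hypothetical hyperparameters and standardized weather covariate.
hyper = {"gamma0": 120.0, "sd0": 10.0, "gamma1": 3.0, "sd1": 0.5,
         "gamma2": -4.0, "sd2": 1.0, "sd_eta": 5.0}
covariates = {"IA": [0.2, -0.5, 1.0], "OH": [0.0, 0.3, -0.8]}
simulated = simulate_true_yields(covariates, hyper)
```

Each state gets its own intercept, trend and weather slope, while the shared means and variances let states borrow strength from one another when the model is fitted.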
Estimated Yields (IA and OH)
[Figure: two panels, "Yield Estimates over Time (Years), IA" and "Yield Estimates over Time (Years), OH"; yield vs. year (1995-2005), showing PM(mu), CI and DAS.]
Concluding Remarks
Solving a problem of national (and international) interest,
estimating/forecasting agricultural commodities to aid in setting
national figures.
Bayesian hierarchical model that combines multiple repeated surveys
with auxiliary variables to form a single forecast/estimate.
Future directions: state-level modeling, joint modeling of multiple crops,
and modeling of planted/harvested acreage.
Selected Publications and Working Papers
Survey statistics
Wang, J., & Opsomer, J. (2010), On the Asymptotic Normality of
Nondifferentiable Survey Estimators, tentatively accepted by Biometrika.
Wang, J., Wang, H., & Opsomer, J. (2010+), Bagging Non-differentiable
Estimators in Complex Surveys, under review.
Wang, J., & Opsomer, J. (2010), Characterizing Population Dispersion
and Identifying Outliers in Multivariate Survey Data, under review.
Wang, J. (2010), Semiparametric Interval Estimation under Nested-error
Regression Model, under review.
Shape restricted inference
Wang, J., & Meyer, M. (2010), Testing the Monotonicity or Convexity of
a Function using Regression Splines, tentatively accepted by Canadian
Journal of Statistics.
Meyer, M., & Wang, J. (2010), Hypothesis Tests in Constrained
Parametric Regression, under review by Biometrika.
Bayesian analysis
Wang, J., et al (2010), A Bayesian Approach to Estimating Agricultural
Yield Based on Multiple Repeated Surveys, in preparation for JABES.
Wang, J., & Holan, S. (2010+), Bayesian Smooth-transition Regression
with Ordered Categorical Response, in preparation.
Thank you very much!
Standardized Delete-one Residuals (SDR) in RM4
[Figure: two panels, "OY Residual vs year" and "AY Residual vs Year"; standardized residuals (−3 to 3) vs. year (1995-2005), by month 8-12.]
$$r_{tmk} = \frac{y_{tmk} - E(y_{tmk} \mid y_{(tmk)})}{\sqrt{\mathrm{Var}(y_{tmk} \mid y_{(tmk)})}}$$
Model for Bias
Applies to RM1-RM5
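For $m = 8, 9, \ldots, 12$, and $k \in \{\text{OYS}, \text{AYS}\}$,
$$b_{m,\text{OYS}} \overset{\text{iid}}{\sim} \text{Normal}(0, \sigma^2_b)\, I(0, \infty)$$
$$b_{m,\text{AYS}} \overset{\text{iid}}{\sim} \text{Normal}(0, \sigma^2_b)\, I(-\infty, 0)$$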
OYS bias: inherent in the survey indications
AYS bias: pessimism of farmers
McMC Implementation: generate from unconstrained conditional
distribution and retain the values if the constraint is met (Gelfand et
al 1992)
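The constrained draw described above amounts to simple rejection sampling: draw from the unconstrained normal conditional and keep the value only if it satisfies the sign constraint. A minimal sketch follows, with hypothetical conditional means and standard deviations and a hypothetical function name.

```python
import random


def draw_constrained(mean, sd, lower, upper, rng, max_tries=100000):
    """Draw from N(mean, sd^2) restricted to the open interval (lower, upper)."""
    for _ in range(max_tries):
        x = rng.gauss(mean, sd)
        if upper > x > lower:
            return x   # constraint met, keep the value
    raise RuntimeError("constraint rarely met; rejection sampling is inefficient here")


rng = random.Random(1)
# OYS biases are constrained to be positive, AYS biases to be negative.
b_oys = [draw_constrained(12.0, 3.0, 0.0, float("inf"), rng) for _ in range(5)]
b_ays = [draw_constrained(-10.0, 4.0, float("-inf"), 0.0, rng) for _ in range(4)]
```

This works well when most of the conditional mass already lies on the allowed side; if it does not, a dedicated truncated-normal sampler would be more efficient.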
How to model true yield? (2)
$$\text{State yield} = \frac{\sum_{\text{districts}} \text{District yield} \times \text{District harvested acres}}{\text{State harvested acres}}$$
[Figure: nine panels, IA districts 10-90; proportion of state harvested acres (0.02 to 0.14) vs. year (1995-2005).]
District/State harvested acres for Iowa are within ±1%
Official district yields (IA and IL)
[Figure: four panels for IA and IL; official district yields vs. year and residuals vs. year (1995-2005). Circle: state (detrended) yield.]
Fitting FLR on official district yields
Same pattern of weather effects for ALL districts
[Figure: nine panels, WI districts 10-90; yield vs. year (1995-2005).]
State-specific pattern of weather effects
[Figure: nine panels, WI districts 10-90; yield vs. year (1995-2005).]
Black: official; red: fitted
An economic analysis of potato production in Achham district of NepalAn economic analysis of potato production in Achham district of Nepal
An economic analysis of potato production in Achham district of Nepal
SushilSapkota5
 
U.S.NationalAgriculturalLandCoverMonitoring_Mueller.pptx
U.S.NationalAgriculturalLandCoverMonitoring_Mueller.pptxU.S.NationalAgriculturalLandCoverMonitoring_Mueller.pptx
U.S.NationalAgriculturalLandCoverMonitoring_Mueller.pptx
grssieee
 
U.S.NationalAgriculturalLandCoverMonitoring_Mueller.pptx
U.S.NationalAgriculturalLandCoverMonitoring_Mueller.pptxU.S.NationalAgriculturalLandCoverMonitoring_Mueller.pptx
U.S.NationalAgriculturalLandCoverMonitoring_Mueller.pptx
grssieee
 

A Bayesian Approach to Estimating Agricultural Yield Based on Multiple Repeated Surveys

  • 1. A Bayesian Approach to Estimating Agricultural Yield Based on Multiple Repeated Surveys
      Jianqiang (Jay) Wang, National Institute of Statistical Sciences, July 9th, 2010
      Joint work with Scott Holan (Univ. of Missouri), Wendy Barboza (NASS-USDA), Balgobin Nandram (Worcester Polytech), Criselda Toto, Dilli Bhatta (Worcester Polytech), and Edwin Anderson (NASS-USDA)
  • 2. Outline
      1 Project overview
      2 Regional yield modeling
      3 State yield modeling
      4 Concluding remarks
  • 4. Project background
      National Agricultural Statistics Service (NASS), USDA:
        Provides timely, accurate and useful statistics in service to U.S. agriculture.
        Hundreds of surveys and over five hundred reports covering virtually every aspect of U.S. agriculture.
      Monthly Crop Production report:
        Area planted, area harvested, yield and production at state/national level for multiple crops.
        Projected value or an end-of-season estimate, e.g. forecasted corn yield (Aug-Nov) and end-of-season yield estimate (Dec or Jan).
  • 5. Agricultural Statistics Board (ASB)
      Panel of experts that analyzes the survey data and publishes official estimates based on a combination of survey data and expert opinion.
        Information from multiple surveys is combined into a single forecasted/estimated value.
        Members of ASB adjust the predicted number using their expert knowledge.
      ASB “lockup” process:
        Literally locked in an area with armed guards outside.
        Windows covered with metal shades.
        No phones, PDAs, or laptops.
        Starts around 10:00pm and lasts until the report is released at 8:30am.
  • 6. Project overview
      Drawbacks of the ASB process:
        Influenced by subjective judgement of ASB members.
        No quantification of the uncertainty associated with the ASB forecast/estimate.
        Lacks transparency and repeatability.
      Goal of the project:
        Build a statistically rigorous forecasting model that uses all available information, including expert opinion of ASB.
          Major commodities: corn, soybeans, cotton, wheat
          Variables: planted and harvested acres, yield, stocks
          Levels: state, region, US
        Make the crop production model transparent and repeatable; provide measures of uncertainty.
  • 7. Regional and State Yield Modeling
      Regional yield modeling:
        Forecast end-of-year corn yield at speculative (“spec”) region level.
        Spec states: IL, IN, IA, MN, NE, OH, WI
      State yield modeling:
        Model yield for each individual state and aggregate up to regional level:
        $\text{Spec yield} = \frac{\sum_{\text{spec states}} \text{State yield} \times \text{State harvested acres}}{\text{Spec harvested acres}}$
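The state-to-region aggregation on slide 7 is an acreage-weighted average. A minimal sketch, assuming the regional harvested acreage is the sum of the state acreages; the function name `spec_yield` and all numbers are made up for illustration and are not NASS figures.

```python
def spec_yield(state_yields, state_harvested_acres):
    """Acreage-weighted regional yield: sum(yield_i * acres_i) / sum(acres_i)."""
    if state_yields.keys() != state_harvested_acres.keys():
        raise ValueError("yield and acreage must cover the same states")
    total_acres = sum(state_harvested_acres.values())
    if total_acres <= 0:
        raise ValueError("total harvested acres must be positive")
    production = sum(
        state_yields[s] * state_harvested_acres[s] for s in state_yields
    )
    return production / total_acres


# Hypothetical inputs: yield in bushels/acre, acreage in million acres.
yields = {"IL": 170.0, "IA": 180.0}
acres = {"IL": 12.0, "IA": 13.0}
regional = spec_yield(yields, acres)
print(round(regional, 2))
```

States with more harvested acres pull the regional figure toward their own yield, which is why the state model needs acreage as well as yield.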
  • 9. Sources of Data
      Survey indications and associated standard errors (s.e.):
        Objective Yield (OYS): Aug-Dec, field measurement survey
          Number of corn ears for sample fields
          Grain weight per ear
          Estimated harvest loss
        Agricultural Yield (AYS): Aug-Nov, farmer opinion survey
        December Ag Survey (DAS): Dec, farmer opinion survey
      Weather-related auxiliary variables (National Climatic Data Center):
        Temperature, precipitation, soil moisture
        Levels of resolution (district, state)
        Aggregated hourly, daily, weekly and monthly
  • 11. Conceptualization of the Problem
      Applies to both the regional and the state model. Specify:
        Data model: $[\text{survey indications} \mid \text{true yield}, \Theta_d]$
        Process model: $[\text{true yield} \mid \Theta_p]$
        Parameter model (prior): $[\Theta_d, \Theta_p]$
      Bayesian hierarchical model (BHM):
        Provides a natural framework for incorporating data from multiple sources.
  • 12. BHM (Regional model)
      Conditional independence of surveys:
        $[\text{OYS}, \text{AYS}, \text{DAS} \mid \text{true yield}, \Theta_d] = [\text{OYS} \mid \text{true yield}, \Theta_d]\,[\text{AYS} \mid \cdot]\,[\text{DAS} \mid \cdot]$
      Posterior distribution:
        $[\text{true yield}, \Theta_d, \Theta_p \mid \text{OYS}, \text{AYS}, \text{DAS}] \propto [\text{OYS} \mid \text{true yield}, \Theta_d]\,[\text{AYS} \mid \cdot]\,[\text{DAS} \mid \cdot]\,[\text{true yield} \mid \Theta_p]\,[\Theta_d]\,[\Theta_p]$
      Composite estimation: Graybill and Deal (1959), Keller and Olkin (2002)
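To see why conditional independence makes combining surveys tractable, consider a deliberately simplified case: unbiased normal indications with known standard errors and a flat prior on the true yield. Under those assumptions (which drop the bias terms and hyperparameters of the talk's models) the posterior mean reduces to the inverse-variance weighted average, the same form as the classical composite estimator. The function name and the numbers below are hypothetical.

```python
import math


def composite_estimate(indications, std_errors):
    """Inverse-variance weighted mean of independent, unbiased indications.

    Returns the combined estimate and its standard error.
    """
    if len(indications) != len(std_errors) or not indications:
        raise ValueError("need one positive standard error per indication")
    weights = [1.0 / (se * se) for se in std_errors]  # precisions
    total_precision = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, indications)) / total_precision
    return estimate, math.sqrt(1.0 / total_precision)


# Made-up indications (bushels/acre) and standard errors for three surveys.
estimate, se = composite_estimate([160.0, 150.0, 156.0], [4.0, 4.0, 2.0])
print(round(estimate, 3), round(se, 3))
```

The most precise indication receives the largest weight, and the combined standard error is smaller than any single survey's.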
  • 15. Model Development
      1 Exploratory data analysis
      2 Propose working models for forecasting regional yield
          “Bias-adjusted” models (RM1-RM5)
      3 Bayesian inference
          Implement Markov chain Monte Carlo (McMC) in R
          Simulate data to examine frequentist properties of the Bayes estimator
      4 Application to the NASS corn yield
  • 16. “Bias-adjusted” Models
      [Figure: Yield Indications over Time (Years). x-axis: Year; y-axis: Yield; series: OYS, AYS, DAS, ASB; months: Aug-Dec.]
      DAS is the most reliable:
        Corn has been harvested in all states.
        Sample size is much larger.
      Objective yield (OYS) has positive bias compared to DAS.
      Ag yield (AYS) has negative bias, and the bias decreases (in absolute value) as we approach the end of the year.
      The survey bias is “nearly” consistent across years, with a few exceptions.
  • 17. Notation
      For each survey $k \in \{\text{OYS}, \text{AYS}, \text{DAS}\}$, in month $m \in \{8, \dots, 12\}$ and year $t = 1, \dots, T$:
        $y_{tmk}$: survey indication of corn yield;
        $s_{tmk}$: standard error associated with $y_{tmk}$;
        $\theta_{tmk}$: finite population mean of $y_{tmk}$;
        $\mu_t$: true (unobserved) yield in the $t$-th year.
  • 18. Models for Survey Indications (1)
      Consider five versions of “bias-adjusted” models, RM1-RM5:
        Survey indication ($y_{tmk}$) = population mean ($\theta_{tmk}$) + sampling error;
        Population mean ($\theta_{tmk}$) = true yield ($\mu_t$) + bias ($b_{mk}$) + forecast error.
      The models cover the change of true yield over years (process model) and repeated surveys within a year (longitudinal).
      Data model for DAS: for $k = \text{DAS}$, $y_{tmk} \overset{ind}{\sim} N(\mu_t, s_{tmk}^2)$.
  • 19. Models for Survey Indications (2)
      Vector notation: $y_{t,OYS} = (y_{t,8,OYS}, y_{t,9,OYS}, \dots, y_{t,12,OYS})'$; similarly $\theta_{t,OYS}$, $s_{t,OYS}$ and the AYS data.
      Data model for panel survey indications:
        $\begin{pmatrix} y_{t,OYS} \\ y_{t,AYS} \end{pmatrix} \Big|\, \cdot \overset{ind}{\sim} N\!\left( \begin{pmatrix} \theta_{t,OYS} \\ \theta_{t,AYS} \end{pmatrix},\ \mathrm{diag}(s_t) \begin{pmatrix} R_5(\rho_1) & 0 \\ 0 & R_4(\rho_1) \end{pmatrix} \mathrm{diag}(s_t) \right)$,
      where $s_t = (s_{t,OYS}, s_{t,AYS})'$ and $R(\rho_1) = \big(\rho_1^{|m-m'|}\big)$ denotes the AR(1) correlation matrix.
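The covariance on slide 19 has the form diag(s) R(ρ) diag(s) for each survey block. A minimal pure-Python sketch of one block follows; the helper names `ar1_corr` and `survey_cov` and the numeric values are illustrative assumptions.

```python
def ar1_corr(n, rho):
    """n x n AR(1) correlation matrix with entries rho ** |m - m'|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def survey_cov(std_errors, rho):
    """Covariance diag(s) R(rho) diag(s) for one survey's monthly indications."""
    n = len(std_errors)
    corr = ar1_corr(n, rho)
    return [
        [std_errors[i] * corr[i][j] * std_errors[j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical standard errors for three consecutive months.
cov = survey_cov([2.0, 1.5, 1.0], 0.5)
for row in cov:
    print(row)
```

The diagonal reproduces the squared standard errors, and correlation decays geometrically with the gap between months. Setting `rho` to 0 gives the independent-months case used in RM1-RM2.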
  • 21. Model for Finite Population Means
      Average survey bias: $b_{OYS} = (b_{8,OYS}, b_{9,OYS}, \dots, b_{12,OYS})'$, and similarly $b_{AYS}$.
        $\begin{pmatrix} \theta_{t,OYS} \\ \theta_{t,AYS} \end{pmatrix} \Big|\, \cdot \overset{ind}{\sim} N\!\left( \mu_t 1 + \begin{pmatrix} b_{OYS} \\ b_{AYS} \end{pmatrix},\ \Delta \begin{pmatrix} R_5(\rho_2) & 0 \\ 0 & R_4(\rho_2) \end{pmatrix} \Delta' \right)$,
      where $\Delta = \mathrm{diag}(\sigma_{8,OYS}, \dots, \sigma_{12,OYS}, \sigma_{8,AYS}, \dots, \sigma_{11,AYS})$.
      Forecast error s.d. $\sigma_{mk}$: RM2: $\sigma_{mk} = \sigma_k$; rest: $\sigma_{mk} = \sigma_m$.
      Correlation between months: $\rho_1 = \rho_2 = 0$ for RM1-RM2; $\rho_2 = 0$ for RM3; $\rho_1 = 0$ for RM4 and RM5.
      RM6 with conditional correlation between OYS and AYS: upper-left $9 \times 9$ submatrix of $\begin{pmatrix} R_5(\rho_2) & \tau R_5(\rho_2) \\ \tau R_5(\rho_2) & R_5(\rho_2) \end{pmatrix}$.
  • 24. Model for True Yield
      Applies to RM1-RM5.
      Spec-region weather covariates: state-level weather variables weighted by planted acres.
        $PCP_t$: average July precipitation
        $Temp_t$: average July temperature
        $Prog_t$: percentage of corn planted by the 20th week
      Linear regression model:
        $\mu_t = \beta_0 + \beta_1 t + \beta_2 PCP_t + \beta_3 Temp_t + \beta_4 Prog_t + \beta_5 (t \cdot PCP_t) + \beta_6 (t \cdot Temp_t) + \beta_7 (t \cdot Prog_t) + \eta_t$
      No weather × year interaction (i.e. $\beta_5 = \beta_6 = \beta_7 = 0$) except in RM5.
      Alternative process model: $\mu_t - \mu_{t-1} = \phi_0 + \phi_1 (\mu_{t-1} - \mu_{t-2}) + \eta_t$
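The two process models on slide 24 can be written as small functions that return the expected true yield (the error term η_t has mean zero and is omitted). This is only a sketch: the coefficient values are invented for illustration and are not estimates from the talk.

```python
def expected_yield(t, pcp, temp, prog, beta):
    """Mean of the weather regression; beta = (beta_0, ..., beta_7)."""
    b0, b1, b2, b3, b4, b5, b6, b7 = beta
    main_effects = b0 + b1 * t + b2 * pcp + b3 * temp + b4 * prog
    interactions = b5 * t * pcp + b6 * t * temp + b7 * t * prog
    return main_effects + interactions


def expected_yield_ar(mu_prev, mu_prev2, phi0, phi1):
    """Mean of the alternative model: an AR(1) on year-to-year differences."""
    return mu_prev + phi0 + phi1 * (mu_prev - mu_prev2)


# Hypothetical coefficients; the last three are zero, as in RM1-RM4.
beta = (100.0, 2.0, 1.0, -0.5, 0.1, 0.0, 0.0, 0.0)
mean_weather = expected_yield(t=10, pcp=4.0, temp=75.0, prog=80.0, beta=beta)
mean_ar = expected_yield_ar(mu_prev=160.0, mu_prev2=155.0, phi0=1.0, phi1=0.2)
print(mean_weather, mean_ar)
```

Setting the last three coefficients to nonzero values gives the RM5 variant with weather × year interactions.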
  • 26. Application to NASS Data
      Multiple data sources with temporal misalignment:
        OYS (1993-2009)
        AYS (2001-2009)
        DAS (1996-2009)
      Derive McMC sampling algorithms for fitting the model.
      Model comparison: Deviance Information Criterion (DIC) and delete-one cross validation (CV) criterion.
      Parameter estimation with the whole data set:
        Point estimate: posterior mean (PM) of the parameters
        Measures of uncertainty: 95% credible intervals (CI)
      Forecasting the end-of-year yield (Aug 2004 - Dec 2009).
  • 30. Computational Details
      • Model parameters: $\mu_t, b, \theta, \beta$; $\sigma^2_{mk}, \sigma^2_\eta$; $\rho_1, \rho_2$
      • Prior specifications
          • $N(0, 10^6)$ on $\beta$ and $b$
          • Inverse Gamma(0.01, 0.01) on $\sigma^2_{mk}$, $\sigma^2_\eta$
          • Uniform[−1, 1] prior on $\rho_1$ and $\rho_2$
          • Priors for $\mu_t$ and $\theta$ are specified in a hierarchical structure (e.g., prior of $\mu_t$: yield-weather model)
      • Fit BHM via McMC
          • RM1-RM2: Gibbs sampling
          • RM3-RM5: Metropolis-Hastings algorithm
  • 31. Sampling algorithm for RM4
      • Generate $\mu_t, b, \theta, \beta, \sigma^2_\eta$ from standard distributions, e.g. generate $\mu_t$ from
        $$[\mu_t \mid \cdot] \propto N\!\left(\frac{\Delta_{2t}}{\Delta_{1t}}, \frac{1}{\Delta_{1t}}\right),$$
        where
        $$\Delta_{1t} = 1_5' \Sigma_{OYS}^{-1} 1_5 + 1_4' \Sigma_{AYS}^{-1} 1_4 + s_{12,t,DAS}^{-2} + \sigma_\eta^{-2}$$
        and
        $$\Delta_{2t} = 1_5' \Sigma_{OYS}^{-1} (\theta_{t,OYS} - b_{OYS}) + 1_4' \Sigma_{AYS}^{-1} (\theta_{t,AYS} - b_{AYS}) + \frac{y_{12,t,DAS}}{s_{12,t,DAS}^2} + \frac{\beta_0 + \beta_1 t + \cdots + \beta_4 \mathrm{Prog}_t}{\sigma_\eta^2}.$$
      • Transform $(\sigma_8^2, \ldots, \sigma_{12}^2, \rho_2)$ into
        $$\gamma = \left(\log \sigma_8^2, \ldots, \log \sigma_{12}^2, \log\frac{1 + \rho_2}{1 - \rho_2}\right),$$
        and use a random-walk Metropolis algorithm with a multivariate normal proposal distribution on $\gamma$: $J(\gamma^{(k)} \mid \gamma^{(k-1)}) = N(\gamma^{(k-1)}, \sigma_\gamma^2 I)$.
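The two steps on this slide can be sketched in code. The following is a minimal sketch, not the authors' implementation: `mu_conditional` computes the mean $\Delta_{2t}/\Delta_{1t}$ and variance $1/\Delta_{1t}$ of the full conditional of $\mu_t$, and `to_gamma` / `from_gamma` / `propose` cover the reparameterization and the random-walk proposal. All function and argument names are hypothetical, `prior_mean` stands for $\beta_0 + \beta_1 t + \cdots + \beta_4 \mathrm{Prog}_t$, and the Metropolis accept/reject step (including the Jacobian of the transform) is deliberately left out.

```python
import numpy as np

def mu_conditional(theta_oys, b_oys, Sigma_oys, theta_ays, b_ays, Sigma_ays,
                   y_das, s_das, prior_mean, sigma2_eta):
    """Mean and variance of [mu_t | .] = N(Delta2 / Delta1, 1 / Delta1)."""
    ones_o = np.ones(len(theta_oys))
    ones_a = np.ones(len(theta_ays))
    P_oys = np.linalg.inv(Sigma_oys)   # precision of OYS block
    P_ays = np.linalg.inv(Sigma_ays)   # precision of AYS block
    delta1 = (ones_o @ P_oys @ ones_o + ones_a @ P_ays @ ones_a
              + 1.0 / s_das**2 + 1.0 / sigma2_eta)
    delta2 = (ones_o @ P_oys @ (theta_oys - b_oys)
              + ones_a @ P_ays @ (theta_ays - b_ays)
              + y_das / s_das**2 + prior_mean / sigma2_eta)
    return delta2 / delta1, 1.0 / delta1

def to_gamma(sigma2, rho2):
    """Map (sigma^2_8..12, rho_2) to the unconstrained vector gamma."""
    return np.append(np.log(sigma2), np.log((1.0 + rho2) / (1.0 - rho2)))

def from_gamma(gamma):
    """Inverse map: log((1+rho)/(1-rho)) = g  <=>  rho = tanh(g / 2)."""
    return np.exp(gamma[:-1]), np.tanh(gamma[-1] / 2.0)

def propose(gamma, step_sd, rng):
    """Random-walk proposal N(gamma, step_sd^2 * I)."""
    return gamma + step_sd * rng.standard_normal(gamma.shape)
```

A Gibbs update for $\mu_t$ would then draw `rng.normal(mean, np.sqrt(var))` using the two values returned by `mu_conditional`.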
  • 32. Model Comparison
      • Model structure rows with merged cells, entries listed in slide order (cells may span several models):
          • First row (label not recoverable): No, No, $[y_{tmk} \mid \theta_{tmk}]$, $\theta_{tmk}$, $\theta_{tmk}$
          • Monthly correlation: $\rho_1 = \rho_2 = 0$; $\rho_2 = 0$; $\rho_1 = 0$
      • Remaining rows:

                                      RM1       RM2       RM3       RM4       RM5
        Forecast error variance       σ²_m      σ²_k      σ²_m      σ²_m      σ²_m
        Weather×year interaction      No        No        No        No        Yes
        DIC                           782.29    788.35    771.67    700.58    701.09
        pD                            29.30     27.22     29.00     30.02     30.72
        CV                            4.10      4.10      3.35      1.81      1.79

      • Deviance Information Criterion (DIC): Gelman et al (2003).
      • pD: effective number of parameters.
      • CV: $\mathrm{CV} = \frac{1}{\#\{y_{tmk}\}} \sum_{m,t,k} \left| y_{tmk} - E(y_{tmk} \mid y_{(tmk)}) \right|$, where $y_{(tmk)}$ denotes the data set with $y_{tmk}$ removed (computed by importance sampling).
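For readers unfamiliar with DIC and pD, here is a minimal sketch of how both can be computed from posterior draws, using the standard definitions $p_D = \bar{D} - D(\bar\theta)$ and $\mathrm{DIC} = \bar{D} + p_D$. It assumes an independent normal likelihood with known standard deviations, which is a simplification of the models compared in the table; the function names are hypothetical.

```python
import numpy as np

def normal_deviance(y, mean, sd):
    """Deviance = -2 * log-likelihood for independent normal observations."""
    z = (y - mean) / sd
    return float(np.sum(z**2 + np.log(2.0 * np.pi * sd**2)))

def dic(y, sd, mean_draws):
    """Return (DIC, pD) from an (S, n) array of posterior draws of the mean."""
    dev_draws = np.array([normal_deviance(y, m, sd) for m in mean_draws])
    d_bar = dev_draws.mean()                                 # posterior mean deviance
    d_hat = normal_deviance(y, mean_draws.mean(axis=0), sd)  # deviance at posterior mean
    p_d = d_bar - d_hat                                      # effective number of parameters
    return d_bar + p_d, p_d
```

In a one-parameter toy model (a common mean with a flat prior), pD computed this way should come out close to 1, which is a quick sanity check on the draws.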
  • 33. Yield Estimates by RM4

        Year    PM        CI(lb)    CI(ub)    Width    I(BD ∈ CI)    I(DAS ∈ CI)
        1993    97.93     92.20     104.21    12.01    1             NA
        1994    142.01    136.95    146.77    9.82     1             NA
        1995    110.52    105.65    115.55    9.90     0             NA
        1996    130.96    130.09    131.80    1.72     1             1
        1997    131.86    131.10    132.60    1.50     1             1
        1998    143.48    142.61    144.40    1.79     1             1
        1999    141.21    140.33    142.08    1.75     1             1
        2000    142.54    141.69    143.41    1.72     1             1
        2001    144.99    144.02    145.98    1.96     1             1
        2002    139.55    138.11    141.00    2.89     1             1
        2003    151.47    150.07    152.81    2.74     1             1
        2004    169.68    168.69    170.68    2.00     1             1
        2005    157.35    156.14    158.51    2.38     1             1
        2006    158.18    157.15    159.18    2.03     0             1
        2007    160.87    159.75    161.97    2.22     1             1
        2008    164.73    163.72    165.75    2.03     1             1
        2009    174.84    173.92    175.76    1.84     1             1
  • 34. Yield Estimates by RM4
      [Figure: "Yield Estimates over Time (Years)". x-axis: Year; y-axis: Yield; legend: PM(mu), CI, DAS]
      • Our estimates follow the data pattern very well.
      • The ASB yield and DAS are captured within the CI most of the time.
  • 35. Posterior Summaries of RM4
      • Survey bias (PM):

                Aug       Sep       Oct       Nov       Dec
        OYS     14.48     12.57     14.04     14.84     14.83
        AYS     -15.72    -16.70    -11.71    -4.50

      • Model parameters:

        Parameter           PM        CI(lb)    CI(ub)
        Forecast error variance
          σ²_8              60.62     32.26     102.76
          σ²_9              47.14     25.16     78.72
          σ²_10             27.83     14.92     46.92
          σ²_11             7.08      3.57      12.35
          σ²_12             6.41      2.60      14.03
        Latent process model
          β0                113.68    105.39    122.17
          β1 (year)         3.47      2.68      4.29
          β2 (PCP)          -3.57     -7.98     0.63
          β3 (Temp)         -5.60     -9.86     -1.36
          β4 (Prog)         5.04      1.06      9.65
          σ²_η              64.01     25.27     155.48
  • 36. Forecast RMSE of Monthly Board, Keller and Olkin (2002), and RM4
      • KO estimator: $y^*_{tm} = (y^*_{tm,OYS}, y^*_{tm,AYS})'$, and $S_{tm}$ denotes the sample covariance matrix of $(y^*_{tm,OYS}, y^*_{tm,AYS})'$:
        $$\hat\mu^{KO}_t = \frac{1' S_{tm}^{-1} y^*_{tm}}{1' S_{tm}^{-1} 1}$$
      • RMSE: root mean squared divergence between each forecast and DAS.

        Month    RMSE.ASB    Keller & Olkin    RMSE.RM4
        8        7.79        5.93              6.18
        9        7.52        6.76              6.53
        10       4.48        5.04              3.87
        11       2.64        3.80              2.74
        8-11     6.01        5.49              5.08
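The KO estimator is a covariance-weighted average of the survey indications, and the RMSE compares forecasts against DAS. A minimal sketch of both follows; the function names are hypothetical and the inputs in the usage note are made-up numbers, not NASS data.

```python
import numpy as np

def keller_olkin(y, S):
    """Combined estimate 1' S^{-1} y / (1' S^{-1} 1) for symmetric S."""
    y = np.asarray(y, dtype=float)
    ones = np.ones(len(y))
    w = np.linalg.solve(np.asarray(S, dtype=float), ones)  # S^{-1} 1
    return float(w @ y / (w @ ones))

def rmse(forecast, target):
    """Root mean squared divergence between forecasts and targets."""
    diff = np.asarray(forecast, dtype=float) - np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean(diff**2)))
```

With a diagonal `S` the estimator reduces to inverse-variance weighting: for indications `[10, 20]` with variances 4 and 1, the weights are proportional to 0.25 and 1, so the more precise indication dominates.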
  • 37. Forecasting End-of-year Yield via RM4
      [Figure: six panels, "Plot of Forecast for 2004" through "Plot of Forecast for 2009". x-axis: Months (8-12); y-axis: Forecasted Yield. Legend: • DAS; △ BHM; + Monthly board; ⋄ Keller & Olkin estimator]
  • 39. Modeling State Survey Indications
      • Review regional yield model
          • Ramifications of “bias-adjusted” models
          • Model comparison
          • Forecasting performance of BHM
      • Downscaling to state level
          • OYS (Aug - Dec), AYS (Aug - Nov) and DAS (Dec)
          • Explanatory analysis and proposed working model
          • Linear mixed model for state indications
          • Random coefficient model for yield-weather relationship (Dempster et al, 1981)
  • 41. State indications (IA and OH)
      [Figure: two panels, "Yield Indications over Time (Years), IA" and "Yield Indications over Time (Years), OH". x-axis: Year; y-axis: Yield. Legend: OY, AY, Ag; Aug, Sep, Oct, Nov, Dec]
  • 42. Notation and Data Model
      • For each survey $k \in \{OYS, AYS, DAS\}$ in month $m \in \{8, \ldots, 12\}$, year $t = 1, 2, \ldots, T$ and state $l \in \{IL, \ldots, WI\}$:
          • $y_{tlmk}$: survey indication of corn yield; $s_{tlmk}$: s.e. associated with $y_{tlmk}$
          • $\theta_{tlmk}$: finite population mean of $y_{tlmk}$
          • $\mu_{tl}$: true yield
      • Vector notation: $y_{tl}$ denotes the 9-dim column vector with OYS (Aug-Dec) and AYS (Aug-Nov) indications; similarly $\theta_{tl}$, $s_{tl}$.
      • Data model
          • OYS and AYS indications: $[y_{tl} \mid \theta_{tl}] \overset{ind}{\sim} N(\theta_{tl}, \mathrm{diag}(s^2_{tl}))$
          • DAS indications: $y_{t,l,12,DAS} \overset{ind}{\sim} N(\mu_{tl}, s^2_{t,l,12,DAS})$
  • 45. Modeling Finite Population Means
      • Survey bias: $b_{tl} = (b_{tl,8,OYS}, \ldots, b_{tl,12,OYS}, b_{tl,8,AYS}, \ldots, b_{tl,11,AYS})'$, with $\theta_{tl} = \mu_{tl} 1_9 + b_{tl}$.
      • Biases $b_{tl}$ from state $l$:
        $$(b_{1,l}, \ldots, b_{T,l})' \sim N\big(1_T \otimes b,\ (I_T + \lambda 1_T 1_T') \otimes (\Delta R \Delta')\big) \quad (*)$$
        where $\Delta = \mathrm{diag}(\sigma_{8,OYS}, \ldots, \sigma_{12,OYS}, \sigma_{8,AYS}, \ldots, \sigma_{11,AYS})$ and
        $$R = \begin{pmatrix} R_5(\rho_1) & 0 \\ 0 & R_4(\rho_2) \end{pmatrix}.$$
      • (*) is equivalent to $[b_{t,l} \mid b_l] \overset{ind}{\sim} N(b_l, \Delta R \Delta')$ and $b_l \overset{iid}{\sim} N(b, \lambda (\Delta R \Delta'))$, where $\lambda 1_T 1_T'$ introduces within-state correlation.
      • Spatial correlation; covariates
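The equivalence between (*) and the two-stage form can be checked numerically. The sketch below builds the covariance both ways and compares them. It assumes, purely for illustration, that $R_5$ and $R_4$ are AR(1) correlation matrices (this slide does not define them), and all function names and numeric values are hypothetical.

```python
import numpy as np

def ar1_corr(n, rho):
    """AR(1) correlation matrix (an assumed form for R_5 and R_4)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def block_diag2(A, B):
    """Block-diagonal matrix with blocks A and B."""
    out = np.zeros((A.shape[0] + B.shape[0], A.shape[1] + B.shape[1]))
    out[:A.shape[0], :A.shape[1]] = A
    out[A.shape[0]:, A.shape[1]:] = B
    return out

def joint_cov(T, lam, Sigma):
    """Covariance in (*): (I_T + lam * 1 1') kron Sigma."""
    return np.kron(np.eye(T) + lam * np.ones((T, T)), Sigma)

def hierarchical_cov(T, lam, Sigma):
    """Covariance implied by b_tl = b_l + e_tl with
    e_tl ~ N(0, Sigma) independent and b_l ~ N(b, lam * Sigma)."""
    p = Sigma.shape[0]
    A = np.kron(np.ones((T, 1)), np.eye(p))  # copies b_l into every year
    return np.kron(np.eye(T), Sigma) + A @ (lam * Sigma) @ A.T
```

Both constructions give $(1 + \lambda)\Sigma$ on the diagonal blocks and $\lambda\Sigma$ off the diagonal, with $\Sigma = \Delta R \Delta'$, which is why the shared state-level term $b_l$ induces within-state correlation across years.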
  • 47. Process Model for True Yield
      • Random regression-coefficient model with state-level covariates $x_{tl}$:
        $$\mu_{tl} = \gamma_{0l} + \gamma_{1l} t + \gamma_{2l} x_{tl} + \eta_{tl},$$
        where $\gamma_{0l} \overset{iid}{\sim} N(\gamma_0, \sigma^2_{0\gamma})$, $\gamma_{1l} \overset{iid}{\sim} N(\gamma_1, \sigma^2_{1\gamma})$, $\gamma_{2l} \overset{iid}{\sim} N(\gamma_2, \sigma^2_{2\gamma})$ and $\eta_{tl} \overset{iid}{\sim} N(0, \sigma^2_\eta)$.
      • Dynamic regression model using real-time weather.
      • Large number of covariates: ridge regression, principal component regression (PCR).
      • Spatial correlation: spatially correlated errors vs spatially varying coefficient.
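To make the random regression-coefficient model concrete, the sketch below simulates $\mu_{tl}$ for a single covariate. The function name, argument layout, and any numeric values are hypothetical; only the model structure follows the slide.

```python
import numpy as np

def simulate_state_yields(x, gamma, sd_gamma, sd_eta, rng):
    """Simulate mu_tl = g0_l + g1_l * t + g2_l * x_tl + eta_tl.

    x        : (T, L) array of state-level covariates
    gamma    : (gamma0, gamma1, gamma2), population means of the coefficients
    sd_gamma : standard deviations of the three state-specific coefficients
    sd_eta   : standard deviation of eta_tl
    """
    T, L = x.shape
    t = np.arange(1, T + 1)[:, None]                 # year index as a column
    g0 = rng.normal(gamma[0], sd_gamma[0], size=L)   # one intercept per state
    g1 = rng.normal(gamma[1], sd_gamma[1], size=L)   # one trend per state
    g2 = rng.normal(gamma[2], sd_gamma[2], size=L)   # one weather effect per state
    eta = rng.normal(0.0, sd_eta, size=(T, L))
    return g0 + g1 * t + g2 * x + eta
```

Each state gets its own coefficients drawn once and reused across years, so states share information through the common means $\gamma_0, \gamma_1, \gamma_2$.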
  • 49. Estimated Yields (IA and OH)
      [Figure: two panels, "Yield Estimates over Time (Years), IA" and "Yield Estimates over Time (Years), OH". x-axis: Year; y-axis: Yield. Legend: PM(mu), CI, DAS]
  • 51. Concluding Remarks
      • Solving a problem of national (and international) interest: estimating/forecasting agricultural commodities to aid in setting national figures.
      • Bayesian hierarchical model that combines multiple repeated surveys with auxiliary variables to form a single forecast/estimate.
      • Future directions: state-level modeling, jointly modeling multiple crops, and modeling planted/harvested acreage.
  • 52. Selected Publications and Working Papers
      • Survey statistics
          • Wang, J., & Opsomer, J. (2010), On the Asymptotic Normality of Nondifferentiable Survey Estimators, tentatively accepted by Biometrika.
          • Wang, J., Wang, H., & Opsomer, J. (2010+), Bagging Non-differentiable Estimators in Complex Surveys, under review.
          • Wang, J., & Opsomer, J. (2010), Characterizing Population Dispersion and Identifying Outliers in Multivariate Survey Data, under review.
          • Wang, J. (2010), Semiparametric Interval Estimation under Nested-error Regression Model, under review.
      • Shape restricted inference
          • Wang, J., & Meyer, M. (2010), Testing the Monotonicity or Convexity of a Function using Regression Splines, tentatively accepted by Canadian Journal of Statistics.
          • Meyer, M., & Wang, J. (2010), Hypothesis Tests in Constrained Parametric Regression, under review by Biometrika.
      • Bayesian analysis
          • Wang, J., et al (2010), A Bayesian Approach to Estimating Agricultural Yield Based on Multiple Repeated Surveys, in preparation for JABES.
          • Wang, J., & Holan, S. (2010+), Bayesian Smooth-transition Regression with Ordered Categorical Response, in preparation.
  • 53. Thank you very much!
  • 54. Standardized Delete-one Residuals (SDR) in RM4
      [Figure: two panels, "OY Residual vs year" (series YOY8.s, YOY9.s, YOY10.s, YOY11.s, YOY12.s) and "AY Residual vs Year" (series YAY8.s, YAY9.s, YAY10.s, YAY11.s, YAg12.s). x-axis: year; y-axis: Residuals (−3 to 3)]
      $$r_{tmk} = \frac{y_{tmk} - E(y_{tmk} \mid y_{(tmk)})}{\sqrt{\mathrm{Var}(y_{tmk} \mid y_{(tmk)})}}$$
  • 55. Model for Bias (applies to RM1-RM5)
      • For $m = 8, 9, \ldots, 12$ and $k \in \{OYS, AYS\}$:
          $b_{m,OYS} \overset{iid}{\sim} \mathrm{Normal}(0, \sigma^2_b)\, I_{(0,\infty)}$
          $b_{m,AYS} \overset{iid}{\sim} \mathrm{Normal}(0, \sigma^2_b)\, I_{(-\infty,0)}$
      • OYS bias: inherent in the survey indications
      • AYS bias: pessimism of farmers
      • McMC implementation: generate from the unconstrained conditional distribution and retain the values if the constraint is met (Gelfand et al 1992)
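The constrained draw described in the last bullet can be sketched as simple rejection sampling: draw from the unconstrained normal and keep the value only if it satisfies the sign constraint. The function name, the `max_tries` guard, and the numbers in the usage note are hypothetical additions for illustration.

```python
import numpy as np

def draw_constrained(mean, sd, lower, upper, rng, max_tries=10000):
    """Draw from N(mean, sd^2) restricted to (lower, upper) by rejection."""
    for _ in range(max_tries):
        x = rng.normal(mean, sd)
        if lower < x < upper:   # retain the value only if the constraint is met
            return x
    raise RuntimeError("no draw satisfied the constraint; acceptance rate too low")
```

For example, a positive OYS bias would use `draw_constrained(m, s, 0.0, np.inf, rng)` and a negative AYS bias `draw_constrained(m, s, -np.inf, 0.0, rng)`, where `m` and `s` are the unconstrained conditional mean and standard deviation. Rejection becomes slow when the unconstrained distribution puts little mass on the allowed region.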
  • 56. How to model true yield? (2)
      $$\text{State yield} = \sum_{\text{districts}} \text{District yield} \times \frac{\text{District harvested acres}}{\text{State harvested acres}}$$
      [Figure: nine panels, "IA, 10" through "IA, 90" (Iowa districts). x-axis: year (1995-2005 ticks); y-axis: proportion of harvested acres (0.02 to 0.14)]
      • District/State harvested acres for Iowa are within ±1%
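The aggregation formula is an acreage-weighted average of district yields. A minimal sketch follows, assuming state harvested acres equal the sum of the district harvested acres; the function name and the numbers in the usage note are made up for illustration.

```python
import numpy as np

def state_yield(district_yield, district_acres):
    """Acreage-weighted average of district yields.

    Assumes state harvested acres = sum of district harvested acres.
    """
    y = np.asarray(district_yield, dtype=float)
    a = np.asarray(district_acres, dtype=float)
    return float(np.sum(y * a) / np.sum(a))
```

For two districts with yields 100 and 200 and harvested acres in a 1:3 ratio, the state yield is 0.25 × 100 + 0.75 × 200 = 175. If the acreage proportions are nearly constant over time, as the slide reports for Iowa, the weights can be treated as fixed.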
  • 57. Official district yields (IA and IL)
      [Figure: four panels, "IA" (yield vs year), "IA, residuals" (residuals vs year), "IL" (yield vs year) and "IL, residuals" (residuals vs year). Circle: State (detrended) yield]
  • 58. Fitting FLR on official district yields
      [Figure: two sets of nine panels, "WI, District 10" through "WI, District 90", titled "Same pattern of weather effects for ALL districts" and "State-specific pattern of weather effects". x-axis: Year; y-axis: yield. Black: official; red: fitted]