
Core Training Presentations- 3 Estimating an Ag Database using CE Methods


The Global Futures & Strategic Foresight (GFSF) program enhances and uses a coordinated suite of biophysical and socioeconomic models to assess potential returns to investments in new agricultural technologies and policies. These models include IFPRI's International Model for Policy Analysis of Agricultural Commodities and Trade (IMPACT), hydrology and water supply-demand models, and the DSSAT suite of process-based crop models.

The program also provides tools and trainings to scientists and policy makers to undertake similar assessments.

The GFSF program is a Consultative Group on International Agricultural Research (CGIAR) program led by the International Food Policy Research Institute (IFPRI).



  1. National Accounts and SAM Estimation Using Cross-Entropy Methods (Sherman Robinson)
  2. Estimation Problem • Partial equilibrium models such as IMPACT require balanced and consistent datasets that represent disaggregated production and demand by commodity • Estimating such a dataset requires an efficient method to incorporate and reconcile information from a variety of sources
  3. Primary Data Sources for IMPACT Base Year • FAOSTAT for country totals for: – Production: area, yields, and supply – Demand: total, food, intermediate, feed, and other demands – Trade: exports, imports, and net trade – Nutrition: calories per capita and calories per kg of commodity • AQUASTAT for country irrigated and rainfed production • SPAM for pixel-level estimation of the global allocation of production
  4. Estimating a Consistent and Disaggregated Database [flow diagram]: (1) estimate the IMPACT country database, from FAOSTAT; (2) estimate technology-disaggregated production, from the IMPACT country database and FAO AQUASTAT; (3) estimate geographically disaggregated production, from the technology-disaggregated data and SPAM.
  5. Bayesian Work Plan [diagram]: priors on values and estimation errors of production, demand, and trade → estimation by the cross-entropy method → check results against priors and identify potential data problems → new information to correct the identified problems.
  6. Information Theory Approach • Goal is to recover parameters and data we observe imperfectly: estimation rather than prediction. • Assume very little information about the error-generating process and nothing about the functional form of the error distribution. • Very different from standard statistical approaches (e.g., econometrics), which usually assume lots of data.
  7. Estimation Principles • Use all the information you have. • Do not use or assume any information you do not have. • Arnold Zellner: "Efficient Information Processing Rule (IPR)." • Close links to Bayesian estimation.
  8. Information Theory • Need to be flexible in incorporating information in parameter/data estimation – Lots of different forms of information • In classical statistics, the "information" in a data set can be summarized by the moments of the distribution of the data – Summarizes what is needed for estimation • We need a broader view of "estimation" and need to define "information"
  9. An Analogy from Physics [diagram]: a force takes a system from an initial state of motion to a final state of motion. Force is whatever induces a change of motion: $F = \frac{dp}{dt}$
  10. Inference Is Dynamics as Well [diagram]: information takes us from old beliefs to new beliefs. "Information" is what induces a change in rational beliefs.
  11. Information Theory • Suppose an event E will occur with probability p. What is the information content of a message stating that E occurs? • If p is high, the event's occurrence carries little information; if p is low, its occurrence is a surprise and contains a lot of information – The content of the message is not the issue: amount, not meaning, of information
  12. Information Theory • Shannon (1948) developed a formal measure of the "information content" of the arrival of a message (he worked for AT&T): $h(p) = \log(1/p)$, so that if $p \to 1$ then $h(p) \to 0$, and if $p \to 0$ then $h(p) \to \infty$
  13. Information Theory • For a set of events, the expected information content of a message before it arrives is the entropy measure: $H = \sum_{k=1}^{n} p_k\, h(p_k) = -\sum_{k=1}^{n} p_k \log(p_k)$ where $\sum_{k=1}^{n} p_k = 1$
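
Slides 12 and 13 are easy to check numerically. Below is a minimal sketch (not part of the deck) in Python/NumPy; natural logarithms are assumed, since the slides do not fix a base:

```python
import numpy as np

def info_content(p):
    """Shannon information content of an event with probability p: h(p) = log(1/p)."""
    return np.log(1.0 / p)

def entropy(p):
    """Entropy H = sum_k p_k * h(p_k), treating 0 * log(1/0) as 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(np.sum(p[nz] * np.log(1.0 / p[nz])))

print(info_content(0.99))  # ~0.01: a near-certain event carries little information
print(info_content(0.01))  # ~4.61: a surprise carries a lot

print(entropy([0.25, 0.25, 0.25, 0.25]))  # log(4) ~ 1.386, the maximum for 4 events
print(entropy([0.97, 0.01, 0.01, 0.01]))  # ~0.17, a nearly certain outcome
```
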
  14. [Photo: Claude Shannon]
  15. E.T. Jaynes • Jaynes proposed using the Shannon entropy measure in estimation • Maximum entropy (MaxEnt) principle: – Out of all probability distributions that are consistent with the constraints, choose the one that has maximum uncertainty (maximizes the Shannon entropy metric) • Idea of estimating probabilities (or frequencies) – In the absence of any constraints, entropy is maximized by the uniform distribution
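
A small illustration of the MaxEnt principle, using Jaynes's classic example of a die whose mean roll is constrained to 4.5 (the example and numbers are illustrative, not from the deck); with no constraint beyond adding-up, the optimizer would simply return the uniform distribution:

```python
import numpy as np
from scipy.optimize import minimize

faces = np.arange(1, 7, dtype=float)

def neg_entropy(p):
    return float(np.sum(p * np.log(p)))  # minimizing -H(p) maximizes entropy

constraints = [
    {"type": "eq", "fun": lambda p: p.sum() - 1.0},    # probabilities add to 1
    {"type": "eq", "fun": lambda p: p @ faces - 4.5},  # observed mean roll is 4.5
]
res = minimize(neg_entropy, np.full(6, 1 / 6), method="SLSQP",
               bounds=[(1e-9, 1.0)] * 6, constraints=constraints)
print(res.x)          # probabilities tilt smoothly toward the high faces
print(res.x @ faces)  # 4.5: the moment constraint holds
```
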
  16. [Photo: E.T. Jaynes]
  17. Estimation With a Prior • The estimation problem is to estimate a set of probabilities that are "close" to a known prior and that satisfy various known moment constraints. • Jaynes suggested using the criterion of minimizing the Kullback-Leibler "cross-entropy" (CE) "divergence" between the estimated probabilities and the prior.
  18. Cross-Entropy Estimation: Minimize $\sum_k p_k \log\left(\frac{p_k}{\bar{p}_k}\right) = \sum_k p_k \log p_k - \sum_k p_k \log \bar{p}_k$ where $\bar{p}_k$ is the prior probability. "Divergence", not "distance": the measure is not symmetric and does not satisfy the triangle inequality. It is not a "norm".
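
The asymmetry claim on slide 18 can be verified directly. A sketch (natural logs; it assumes the prior q is strictly positive wherever p is):

```python
import numpy as np

def kl_divergence(p, q):
    """Kullback-Leibler cross-entropy D(p || q) = sum_k p_k log(p_k / q_k)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / q[nz])))

p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
print(kl_divergence(p, q), kl_divergence(q, p))  # unequal: not symmetric, not a distance
print(kl_divergence(p, p))                       # 0.0: no divergence from the prior itself
```
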
  19. MaxEnt vs Cross-Entropy • If the prior is specified as a uniform distribution, the CE estimate is equivalent to the MaxEnt estimate • Laplace's Principle of Insufficient Reason: in the absence of any information, you should choose the uniform distribution, which has maximum uncertainty – The uniform distribution as a prior is an admission of "ignorance", not knowledge
  20. Cross-Entropy Measure • Two kinds of information – Prior distribution of the probabilities – Moments of the distribution • Any moments can be used – Inequalities can also be specified – Moments with error will be considered – Summary statistics such as quantiles
  21. Cross-Entropy Measure: Minimize $\sum_{k=1}^{K} p_k \ln\left(\frac{p_k}{\bar{p}_k}\right)$ subject to constraints (information) about moments, $\sum_{k=1}^{K} x_{t,k}\, p_k = y_t$, and the adding-up constraint for a finite distribution, $\sum_{k=1}^{K} p_k = 1$
  22. Lagrangian: $L = \sum_{k=1}^{K} p_k \ln\left(\frac{p_k}{\bar{p}_k}\right) + \sum_{t=1}^{T} \lambda_t \left( y_t - \sum_{k=1}^{K} x_{t,k}\, p_k \right) + \mu \left( 1 - \sum_{k=1}^{K} p_k \right)$
  23. First-Order Conditions: $\ln p_k - \ln \bar{p}_k + 1 - \sum_{t=1}^{T} \lambda_t x_{t,k} - \mu = 0$; $\quad y_t - \sum_{k=1}^{K} x_{t,k}\, p_k = 0$; $\quad 1 - \sum_{k=1}^{K} p_k = 0$
  24. Solution: $p_k = \frac{\bar{p}_k \exp\left(\sum_{t=1}^{T} \lambda_t x_{t,k}\right)}{\Omega(\lambda_1, \lambda_2, \ldots, \lambda_T)}$ where $\Omega(\lambda_1, \lambda_2, \ldots, \lambda_T) = \sum_{k=1}^{K} \bar{p}_k \exp\left(\sum_{t=1}^{T} \lambda_t x_{t,k}\right)$
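
Slides 21-24 can be wired together in a few lines. The sketch below (an illustration with invented data, not the IMPACT implementation) solves the CE problem through its dual: the gradient of $\log \Omega(\lambda) - \lambda \cdot y$ with respect to $\lambda_t$ is exactly the moment-constraint violation, so an unconstrained minimization of the dual enforces the constraints:

```python
import numpy as np
from scipy.optimize import minimize

def ce_estimate(p_bar, X, y):
    """Min sum_k p_k ln(p_k / p_bar_k) s.t. X @ p = y and sum(p) = 1,
    using the closed form p_k = p_bar_k exp(sum_t lambda_t x_tk) / Omega."""
    def dual(lam):
        w = p_bar * np.exp(lam @ X)       # unnormalized solution for given lambda
        return np.log(w.sum()) - lam @ y  # log partition function minus lam . y
    lam = minimize(dual, np.zeros(len(y)), method="BFGS").x
    w = p_bar * np.exp(lam @ X)
    return w / w.sum()

# Toy problem: uniform prior over 5 support points, one moment constraint.
p_bar = np.full(5, 0.2)
X = np.arange(1.0, 6.0).reshape(1, 5)  # x_{t,k}: one constraint row
y = np.array([3.5])                    # target moment y_t
p = ce_estimate(p_bar, X, y)
print(p)         # estimated probabilities, tilted away from the prior
print(p @ X[0])  # 3.5: the moment constraint is satisfied
```
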
  25. Cross-Entropy (CE) Estimates • Ω is called the "partition function". • Can be viewed as a limiting (non-parametric) form of a Bayesian estimator, transforming prior and sample information into posterior estimates of probabilities. • Not strictly Bayesian, because the prior is specified not as a frequency function but as a discrete set of probabilities.
  26. From Probabilities to Parameters • From information theory, we now have a way to use "information" to estimate probabilities • But in economics, we want to estimate the parameters of a model or a "consistent" data set • How do we move from estimating probabilities to estimating parameters and/or data?
  27. Types of Information • Values: – Areas, production, demand, trade • Coefficients: technology – Crop and livestock yields – Input-output coefficients for processed commodities (sugar, oils) • Prior distribution of measurement error: – Mean – Standard error of measurement – "Informative" or "uninformative" prior distribution
  28. Data Estimation • Generate a prior "best" estimate of all entries: values and/or coefficients. • A "prototype" based on: – Values and aggregates • Historical and current data • Expert knowledge – Coefficients: technology and behavior • Current and/or historical data • Assumption of behavioral and technical stability
  29. Estimation Constraints • Nationally – Area × Yield = Production, by crop – Total area = sum of area over crops – Total demand = sum of demand over types of demand – Net trade = Supply − Demand • Globally – Net trade sums to 0
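
These identities are the constraints fed to the estimator. A toy consistency check (all units and numbers are hypothetical, not IMPACT data):

```python
import numpy as np

area = np.array([2.0, 1.5, 0.5])    # area by crop, million ha
yld = np.array([3.0, 2.0, 4.0])     # yield by crop, tons/ha
production = area * yld             # Area x Yield = Production, by crop
demand = np.array([5.5, 3.5, 1.5])  # total demand by crop, million tons

assert np.isclose(area.sum(), 4.0)  # total area = sum of area over crops
net_trade = production - demand     # Net trade = Supply - Demand
print(net_trade)                    # positive entries are net exports

# Globally, net trade must sum to zero across countries for each commodity:
world_net_trade = np.array([[0.5, -0.2, -0.3],   # country A, by crop
                            [-0.5, 0.2, 0.3]])   # country B, by crop
print(world_net_trade.sum(axis=0))               # [0, 0, 0]: world trade balances
```
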
  30. Measurement Error • Error specification – Errors on coefficients or values – Additive or multiplicative errors • Multiplicative errors – Logarithmic distribution – Errors cannot be negative • Additive errors – Possibility of entries changing sign
  31. Error Specification: the typical (additive) error specification is $x_i = \bar{x}_i + e_i$ where $e_i = \sum_k W_{i,k}\, v_{i,k}$, $0 \le W_{i,k} \le 1$, $\sum_k W_{i,k} = 1$, and $v_i$ is the "support set" for the errors
  32. Error Specification • Errors are weighted averages of support-set values – The v parameters are fixed and have the units of the item being estimated. – The W variables are probabilities that need to be estimated. • This converts the problem of estimating errors into one of estimating probabilities.
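
A minimal sketch of the W-v bridge (the support points and weights below are invented for illustration): the error is a probability-weighted average of fixed support values, so estimating e reduces to estimating W:

```python
import numpy as np

s = 2.0
v = np.array([-3, -2, -1, 0, 1, 2, 3]) * s                 # fixed support points, in data units
W = np.array([0.05, 0.10, 0.15, 0.40, 0.15, 0.10, 0.05])   # probabilities to be estimated

assert np.all(W >= 0) and np.isclose(W.sum(), 1.0)  # W behaves like a distribution
e = W @ v                                           # e_i = sum_k W_{i,k} v_{i,k}
x_bar = 100.0                                       # prior value of the item
print(x_bar + e)  # x_i = x_bar_i + e_i (here e = 0, because the weights are symmetric)
```
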
  33. Error Specification • The technique provides a bridge between standard estimation, where the parameters to be estimated are in "natural" units, and the information approach, where the parameters are probabilities. – The specified support set provides the link.
  34. Error Specification • Conversion of a "standard" stochastic specification with continuous random variables into a specification with a discrete set of probabilities (Golan, Judge, and Miller) • The problem is to estimate a discrete probability distribution
  35. Uninformative Prior • The prior incorporates only information about the bounds between which the errors must fall. • The uniform distribution is the continuous uninformative prior in Bayesian analysis. – Laplace: Principle of Insufficient Reason • We specify a finite probability distribution that approximates the uniform distribution.
  36. Uninformative Prior • Assume that the bounds are set at ±3s, where s is a constant. • For the uniform distribution on [−3s, 3s], the variance is: $\frac{\left(3s - (-3s)\right)^2}{12} = \frac{(6s)^2}{12} = 3s^2$
  37. 7-Element Support Set: $v_1 = -3s$, $v_2 = -2s$, $v_3 = -s$, $v_4 = 0$, $v_5 = s$, $v_6 = 2s$, $v_7 = 3s$, with prior weights $\bar{w}_k = \frac{1}{7}$, so the prior variance is $\sum_k \bar{w}_k v_k^2 = \frac{(9+4+1+0+1+4+9)\, s^2}{7} = 4s^2$
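
The prior moments on slide 37 check out numerically (s = 1 below is arbitrary):

```python
import numpy as np

s = 1.0
v = np.array([-3, -2, -1, 0, 1, 2, 3]) * s  # 7-element support set
w_bar = np.full(7, 1 / 7)                   # finite uniform prior

print(w_bar @ v)     # 0.0: prior mean
print(w_bar @ v**2)  # 4.0 = 4 s^2: prior variance, above the continuous limit of 3 s^2
```
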
  38. Uninformative Prior • A finite uniform prior with a 7-element support set is a conservative uninformative prior. • Adding more elements would more closely approximate the continuous uniform distribution, reducing the prior variance toward the limit of 3s². • The posterior distribution is essentially unconstrained.
  39. Informative Prior • Start with a prior on both the mean and the standard deviation of the error distribution – The prior mean is normally zero. – The standard deviation of e is the prior on the standard error of measurement of the item. • Define the support set with s = σ, so that the bounds are now ±3σ.
  40. Informative Prior, 2 Parameters: Mean: $\sum_k W_{i,k}\, v_{i,k} = 0$; Variance: $\sum_k W_{i,k}\, v_{i,k}^2 = \sigma_i^2$
  41. 3-Element Support Set: $v_{i,1} = -3\sigma_i$, $v_{i,2} = 0$, $v_{i,3} = 3\sigma_i$
  42. Informative Prior, 2 Parameters: $9\sigma_i^2\, W_{i,1} + 0 \cdot W_{i,2} + 9\sigma_i^2\, W_{i,3} = \sigma_i^2$ with $W_{i,1} = W_{i,3}$ (zero mean), giving prior weights $\bar{W}_{i,1} = \bar{W}_{i,3} = \frac{1}{18}$ and $\bar{W}_{i,2} = \frac{16}{18}$
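
Checking the 3-element prior against its two moment conditions (σ below is arbitrary):

```python
import numpy as np

sigma = 2.5
v = np.array([-3 * sigma, 0.0, 3 * sigma])  # 3-element support set
W = np.array([1 / 18, 16 / 18, 1 / 18])     # prior weights from slide 42

print(W.sum())   # 1.0: adding-up
print(W @ v)     # 0.0: prior mean
print(W @ v**2)  # 6.25 = sigma^2: prior variance
```
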
  43. Informative Prior: 4 Parameters • Must specify priors for additional statistics – Skewness and kurtosis • Assume a symmetric distribution: – Skewness is zero. • Specify a normal prior: – Kurtosis is a function of σ. • Can recover additional information on the error distribution.
  44. Informative Prior, 4 Parameters: Mean: $\sum_k W_{i,k}\, v_{i,k} = 0$; Variance: $\sum_k W_{i,k}\, v_{i,k}^2 = \sigma_i^2$; Skewness: $\sum_k W_{i,k}\, v_{i,k}^3 = 0$; Kurtosis: $\sum_k W_{i,k}\, v_{i,k}^4 = 3\sigma_i^4$
  45. 5-Element Support Set: $v_{i,1} = -3.0\sigma_i$, $v_{i,2} = -1.5\sigma_i$, $v_{i,3} = 0$, $v_{i,4} = 1.5\sigma_i$, $v_{i,5} = 3.0\sigma_i$
  46. Informative Prior, 4 Parameters: imposing the variance condition $9\sigma_i^2 (W_{i,1} + W_{i,5}) + 2.25\sigma_i^2 (W_{i,2} + W_{i,4}) = \sigma_i^2$ and the kurtosis condition $81\sigma_i^4 (W_{i,1} + W_{i,5}) + \frac{81}{16}\sigma_i^4 (W_{i,2} + W_{i,4}) = 3\sigma_i^4$, together with symmetry and adding-up, gives $\bar{W}_{i,1} = \bar{W}_{i,5} = \frac{1}{162}$, $\bar{W}_{i,2} = \bar{W}_{i,4} = \frac{16}{81}$, $\bar{W}_{i,3} = \frac{48}{81}$
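
And the 5-element prior reproduces all four moments, including the normal kurtosis of 3σ⁴ (σ = 1 below is arbitrary):

```python
import numpy as np

sigma = 1.0
v = np.array([-3.0, -1.5, 0.0, 1.5, 3.0]) * sigma            # 5-element support set
W = np.array([1 / 162, 16 / 81, 48 / 81, 16 / 81, 1 / 162])  # prior weights from slide 46

print(W.sum())   # 1.0: adding-up
print(W @ v)     # 0.0: mean
print(W @ v**2)  # 1.0 = sigma^2: variance
print(W @ v**3)  # 0.0: skewness
print(W @ v**4)  # 3.0 = 3 sigma^4: the kurtosis of a normal distribution
```
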
  47. Implementation • Implement the program in GAMS – A large, difficult estimation problem – Major advances in solvers: solution is now robust and routine. • The CE minimand is similar to maximum-likelihood estimators. • Excel front end for the GAMS program – Easy to use
  48. Implementation [flow diagram]: data collection (FAOSTAT database: commodity balance, food balance) → data cleaning and setting priors (crop production, livestock production, commodity demand and trade, processed commodities such as oilseeds and sugar) → data estimation with cross-entropy, subject to: nationally, Area × Yield = Supply and Trade = Supply − Demand; globally, Supply = Demand → IMPACT 3
