Improved County Level Estimation of Crop Yield Using Model-Based Methodology With a Spatial Component

1
Improved County Level Estimation of Crop Yield
Using Model-Based Methodology With a Spatial
Component
  • Michael E. Bellow, USDA/NASS


2
Outline
  • Background
  • Simulation Methodology
  • Results of Ten State Study
  • Convergence Evaluation
  • Summary

3
County Level Commodity Estimation
  • NASS program since 1917
  • Estimates used by private sector, academia,
    government
  • Data from various sources used
  • NASS County Estimates System developed to
    facilitate the estimation process

4
Available Data Sources
  • Voluntary response surveys of farm operators
  • List frame control data (lists of known farming
    operations)
  • Previous year official estimates
  • Census of Agriculture data (NASS conducts Census
    every five years)
  • Earth resources satellite data

5
County Crop Yield Estimation
  • Yield is ratio of crop production to harvested
    area (acres)
  • Accurate estimation is challenging due to:
      - reliable administrative data seldom available
      - high year-to-year variability of yields
        (weather sensitive)
      - lack of adequate sample survey data

6
Desirable Features of a County Yield Estimation
Method
  • Repeatability
  • Accurate variance estimation
  • Produce estimates for counties having no survey
    data

7
Ratio (R) Estimator
  • Traditional crop yield estimator used by NASS
  • Computed as ratio between production and
    harvested area estimates (with minor adjustment)
  • Can produce inconsistent yields due to
    fluctuations in harvested acreage
  • No utilization of survey data from counties
    other than the one being estimated
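
The sensitivity of the ratio estimator to harvested acreage can be illustrated with a minimal Python sketch. The function name and the county figures are hypothetical, and the "minor adjustment" mentioned on the slide is not specified in the source, so it is left out.

```python
def ratio_yield(production, harvested_acres):
    """Yield as production divided by harvested area; None if no acreage."""
    if harvested_acres <= 0:
        return None
    return production / harvested_acres


# Hypothetical county figures (bushels and acres), for illustration only.
production = 120_000.0
yield_a = ratio_yield(production, 1_000.0)  # acreage estimate A
yield_b = ratio_yield(production, 800.0)    # acreage estimate B, 20% lower
print(yield_a, yield_b)
```

With production held fixed, a 20 percent drop in the harvested acreage estimate raises the computed yield by 25 percent, which is the kind of inconsistency the slide refers to.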

8
Model-Based County Estimation Methods
  • Based on linear or non-linear models relating
    true yields to survey reported values
  • Generally fit using an iterative algorithm
  • Convergence not always guaranteed
  • Estimates can be adjusted for consistency with
    published state figures

9
Stasny-Goel (SG) Method
  • Developed at Ohio State University under
    cooperative agreement with NASS
  • Assumes mixed effects model with farm size group
    as fixed effect and county as random effect
  • Random effect assumed multivariate normal with
    covariance matrix reflecting spatial correlation
    among neighboring counties:
      $\operatorname{corr}(t_i, t_j) = \rho$ if county i borders county j,
      $\operatorname{corr}(t_i, t_j) = 0$ otherwise
  • EM algorithm used to fit model
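
As a rough illustration of the neighbor-based correlation structure described above, the sketch below builds the county-by-county correlation matrix from a border list. The function name, the three-county layout and the value 0.4 are assumptions made for the example; the slides do not give the actual SG implementation or the fitted correlation.

```python
def spatial_correlation(neighbors, rho):
    """Return (county names, correlation matrix as nested lists).

    neighbors: dict mapping each county to the set of counties it borders.
    rho: assumed common correlation between bordering counties.
    """
    counties = sorted(neighbors)
    matrix = []
    for i in counties:
        row = []
        for j in counties:
            if i == j:
                row.append(1.0)   # each county is perfectly correlated with itself
            elif j in neighbors[i] or i in neighbors[j]:
                row.append(rho)   # bordering counties
            else:
                row.append(0.0)   # non-bordering counties
        matrix.append(row)
    return counties, matrix


# Hypothetical layout: A borders B, B borders C, A and C do not touch.
borders = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
names, corr = spatial_correlation(borders, rho=0.4)
for name, row in zip(names, corr):
    print(name, row)
```

In practice the correlation value has to keep this matrix positive definite for it to define a valid multivariate normal distribution.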

10
Stasny-Goel Method (cont.)
  • Previous year county yields used to derive
    initial estimates of county and size group
    effects
  • Processing continues until at least one of the
    following two conditions is satisfied:
      - relative group and log-likelihood distances fall
        below preset limits
      - maximum allowable number of iterations reached
  • County yield estimates computed as weighted
    averages of individual farm level estimates
    (weights derived from Census of Agriculture
    data)
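
The stopping rule and the final weighted average can be sketched as follows. This is only a schematic reading of the slide: the function names and the two tolerance values are hypothetical, the 5000-iteration cap is the limit quoted later in the deck, and the example averages two size-group estimates rather than individual farm estimates to keep it short.

```python
def should_stop(group_distance, loglik_distance, iteration,
                group_limit=1e-4, loglik_limit=1e-6, max_iterations=5000):
    """Return True when either stopping condition holds.

    The two limits are placeholder values, not the ones used by NASS.
    """
    converged = group_distance < group_limit and loglik_distance < loglik_limit
    return converged or iteration >= max_iterations


def county_yield(estimates, weights):
    """Weighted average of estimates; weights are normalized to sum to 1."""
    total = sum(weights.values())
    return sum(estimates[key] * weights[key] for key in weights) / total


# Hypothetical size-group estimates (bushels per acre) and Census acreage shares.
estimates = {"small": 100.0, "large": 120.0}
weights = {"small": 25.0, "large": 75.0}
print(county_yield(estimates, weights))
print(should_stop(1e-5, 1e-7, iteration=40))   # both distances below limits
print(should_stop(0.2, 0.1, iteration=5000))   # iteration cap reached
```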

11
Griffith (G) Method
  • Developed by Dr. Dan Griffith at Syracuse
    University under cooperative agreement with NASS
  • Predicts yield values using published number of
    farms producing crop of interest
  • Assumes autoregressive model
  • Employs Box-Cox and Box-Tidwell transformations
  • Spatial imputation routine can compute estimates
    for counties with missing survey data

12
Previous Research on Model-Based Methods
  • Stasny, Goel and Rumsey (1991) - early version of
    SG method tested on Kansas wheat production data
  • Stasny et al. (1995) - improved version of SG
    tested on Ohio corn yield data
  • Crouse (2000) - SG evaluated for Michigan corn
    and barley yield
  • Griffith (2000) - Griffith method tested on
    Michigan corn yield data
  • Bellow (2004) - SG and Griffith methods compared
    for North Dakota oats and barley yield (presented
    at FCSM Research Conference)

13
Ten-State Research Study
  • Compare performance of Stasny-Goel, Griffith and
    ratio methods for various crops in ten
    geographically dispersed states
  • NY, OH, MI, TN, MS, FL, ND, OK, CO, WA
  • Criteria for comparison: bias, variance, MSE,
    outlier properties, convergence percentage

14
States In Study Area
  • [Figure: states in study area]

15
Post-Stratification Size Groups
  • NASS statewide survey data post-stratified by
    county and farm size based on Census of
    Agriculture (COA) data (two or three size groups
    defined)
  • Percentages of Census farm acres by size group
    used as weights for SG algorithm
  • Equal total land in farms criterion used to
    form groups

16
Data Sources For Research Study
  • 2002-03 Quarterly Agricultural Survey
  • 2001-03 County Estimates Survey
  • 2001-02 official crop yield estimates
    (previous year data)
  • 2002 Census of Agriculture (number of
    farms, land in farms)

17
Simulation Procedure
  • Multiple regression performed on survey reported
    yield vs. official county yields, weighted
    average neighbor yields, size group membership
    variables
  • Artificial population of 10,000 simulated survey
    data sets used to compute true population
    parameter values
  • 250 sample data sets selected at random from
    population

18
Simulation Procedure (cont.)
  • Moran's I computed to test whether simulated
    data sets reflect spatial correlation of real
    survey data
  • SG, G and R methods applied to each of the
    250 sampled data sets
  • Average simulated parameter values compared with
    corresponding population values for each
    estimation method
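
For readers unfamiliar with Moran's I, the sketch below computes the standard statistic, I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where S0 is the sum of all weights. The binary border weights and the four-county chain are assumptions for the example; the slides do not say which weighting scheme the study used.

```python
def morans_i(values, weights):
    """Moran's I for values observed at n sites with an n-by-n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)            # total weight
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))  # weighted cross-products
    denom = sum(d * d for d in dev)
    return (n / s0) * cross / denom


# Hypothetical chain of four counties: 0-1, 1-2 and 2-3 share borders.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = morans_i([1.0, 2.0, 3.0, 4.0], chain)       # neighbors are similar
alternating = morans_i([1.0, 4.0, 1.0, 4.0], chain)  # neighbors are dissimilar
print(smooth, alternating)
```

Positive values indicate that neighboring counties tend to have similar yields, and negative values indicate the opposite.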

19
Measures of Estimator Performance
  • Absolute Bias - average absolute difference
    between simulated yield estimates and true
    (population) yield
  • Variance - sample variance of simulated yield
    estimates
  • Mean Square Error (MSE) - average squared
    deviation between simulated estimates and true
    yield (SG program also computes analytic MSE)
  • Lower (Upper) Tail Proximity, LTP (UTP) - average
    absolute difference between 5th (95th) percentile
    of simulated yield estimates and true yield
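
A minimal sketch of these measures for a single county is given here. It takes the slide wording literally (mean of absolute differences for absolute bias) and uses the "inclusive" percentile rule from Python's statistics module; both choices are assumptions, since the slides do not spell out the exact formulas, and the four estimates are made up.

```python
import statistics


def performance_measures(estimates, true_yield):
    """Summarize simulated yield estimates for one county against the true yield."""
    n = len(estimates)
    cuts = statistics.quantiles(estimates, n=20, method="inclusive")
    p5, p95 = cuts[0], cuts[-1]  # 5th and 95th percentiles
    return {
        "absolute_bias": sum(abs(e - true_yield) for e in estimates) / n,
        "variance": statistics.variance(estimates),  # sample variance
        "mse": sum((e - true_yield) ** 2 for e in estimates) / n,
        "ltp": abs(p5 - true_yield),
        "utp": abs(p95 - true_yield),
    }


# Hypothetical simulated estimates for one county whose true yield is 100.
measures = performance_measures([98.0, 100.0, 102.0, 104.0], true_yield=100.0)
for name, value in measures.items():
    print(name, round(value, 3))
```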

20
Pairwise Estimator Comparison for Absolute Bias
(higher percentage = better method)
                    Percent of Counties Favoring
                    SG vs. Ratio      SG vs. Griffith
Crop                SG      R         SG      G
Barley              90      10        82      18
Corn                92      8         66      34
Cotton (upland)     86      14        58      42
Dry Beans           93      7         73      27
Oats                88      12        63      37
Rye                 83      17        47      53
Sorghum             84      16        59      41
Soybeans            88      12        62      38
Sunflower           94      6         69      31
Tobacco (burley)    98      2         56      44
Wheat (spring)      83      17        78      22
Wheat (winter)      83      17        66      34
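
One way to read a "percent of counties favoring" entry is sketched here: for each county, the method with the smaller value of the measure is counted as favored. The function name, the county labels and the numbers are hypothetical, and the slides do not say how ties were handled, so ties are simply left uncounted.

```python
def percent_favoring(metric_a, metric_b):
    """Percent of shared counties where each method has the smaller metric value."""
    counties = metric_a.keys() & metric_b.keys()
    a_wins = sum(1 for c in counties if metric_a[c] < metric_b[c])
    b_wins = sum(1 for c in counties if metric_b[c] < metric_a[c])
    n = len(counties)
    return 100.0 * a_wins / n, 100.0 * b_wins / n


# Hypothetical per-county absolute bias for two methods.
sg_bias = {"c1": 1.0, "c2": 2.5, "c3": 0.8, "c4": 3.0}
ratio_bias = {"c1": 2.0, "c2": 2.0, "c3": 1.9, "c4": 4.1}
pct_sg, pct_ratio = percent_favoring(sg_bias, ratio_bias)
print(pct_sg, pct_ratio)
```
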
21
Pairwise Estimator Comparison for Variance
                    Percent of Counties Favoring
                    SG vs. Ratio      SG vs. Griffith
Crop                SG      R         SG      G
Barley              100     0         51      49
Corn                99.9    0.1       33      67
Cotton (upland)     100     0         13      87
Dry Beans           100     0         20      80
Oats                100     0         36      64
Rye                 97      3         77      23
Sorghum             98      2         25      75
Soybeans            100     0         40      60
Sunflower           100     0         56      44
Tobacco (burley)    100     0         49      51
Wheat (spring)      100     0         62      38
Wheat (winter)      100     0         43      57

22
Pairwise Estimator Comparison for MSE
                    Percent of Counties Favoring
                    SG vs. Ratio      SG vs. Griffith
Crop                SG      R         SG      G
Barley              92      8         77      23
Corn                94      6         62.5    37.5
Cotton (upland)     89      11        55      45
Dry Beans           96      4         75      25
Oats                90      10        61      39
Rye                 87      13        40      60
Sorghum             84      16        51      49
Soybeans            89      11        57      43
Sunflower           95.5    4.5       65      35
Tobacco (burley)    100     0         53      47
Wheat (spring)      85      15        80      20
Wheat (winter)      86      14        64      36

23
Pairwise Estimator Comparison for LTP
                    Percent of Counties Favoring
                    SG vs. Ratio      SG vs. Griffith
Crop                SG      R         SG      G
Barley              92      8         55      45
Corn                93      7         41      59
Cotton (upland)     84      16        41      59
Dry Beans           96      4         64      36
Oats                94      6         52      48
Rye                 90      10        40      60
Sorghum             97      3         59      41
Soybeans            85      15        38      62
Sunflower           96      4         56      44
Tobacco (burley)    100     0         31      69
Wheat (spring)      99      1         69      31
Wheat (winter)      89      11        50      50

24
Pairwise Estimator Comparison for UTP
                    Percent of Counties Favoring
                    SG vs. Ratio      SG vs. Griffith
Crop                SG      R         SG      G
Barley              93      7         61      39
Corn                98      2         56      44
Cotton (upland)     97      3         53      47
Dry Beans           98      2         49      51
Oats                92      8         43      57
Rye                 97      3         33      67
Sorghum             84      16        32      68
Soybeans            99      1         53      47
Sunflower           91      9         43      57
Tobacco (burley)    98      2         69      31
Wheat (spring)      85      15        47      53
Wheat (winter)      90      10        53      47

25
Additional Bias Evaluation
  • Wilcoxon Rank Sum Test - compare median absolute
    error (over simulation runs) of SG vs. R, SG vs.
    G for each county
  • Wilcoxon Signed Rank Test - assess whether median
    error of SG, G, R is negative, positive or zero
    (two one-sided tests performed for each county)
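
The rank sum comparison for a single county can be sketched with the textbook normal approximation. This is an illustrative stand-in rather than the procedure used in the study: it applies no tie or continuity correction, the function names are hypothetical, and the two samples of absolute errors are made up.

```python
from statistics import NormalDist


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = shared
        start = end + 1
    return ranks


def rank_sum_test(sample_a, sample_b):
    """Two-sided Wilcoxon rank sum test; returns (z, p) by normal approximation."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    w = sum(ranks[:n_a])                          # rank sum of the first sample
    mean_w = n_a * (n_a + n_b + 1) / 2
    sd_w = (n_a * n_b * (n_a + n_b + 1) / 12) ** 0.5
    z = (w - mean_w) / sd_w
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical absolute errors over six simulation runs for one county.
sg_errors = [1.0, 1.2, 0.8, 1.1, 0.9, 1.3]
ratio_errors = [2.0, 2.4, 1.9, 2.2, 2.1, 2.5]
z, p = rank_sum_test(sg_errors, ratio_errors)
print(round(z, 3), round(p, 4))
```

A negative z with a small p-value would count this county as favoring the first method; a large p-value would place it in the "Neither" column.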

26
Results of Rank Sum Tests on Absolute Bias
                    Percent of Counties Favoring
                    SG vs. Ratio                SG vs. Griffith
Crop                SG      R       Neither     SG      G       Neither
Barley              82      9       10          74      13      13
Corn                85      7       8           62      27      11
Cotton (upland)     78      13      9           54      33      13
Dry Beans           84      7       9           67      22      11
Oats                76      11      13          61      30      10
Rye                 63      10      27          40      40      20
Sorghum             65      13      22          56      35      10
Soybeans            80      12      9           60      32      7
Sunflower           85      5       10          66      25      8
Tobacco (burley)    95      2       3           45      38      16
Wheat (spring)      78      15      7           75      17      9
Wheat (winter)      72      16      12          61      27      12
All                 79      11      10          62      27      11

27
Summary of Signed Rank Test Results (All Crops
Combined)
               Test Result
               Bias < 0             Bias > 0             Bias = 0
Method         No. Counties   %     No. Counties   %     No. Counties   %
Stasny-Goel    1607           59    887            32    243            9
Griffith       1456           54    1174           43    82             3
Ratio          292            11    245            9     2200           80

28
Percent of Counties With Average Underestimate
Less Than 10% of True Yield (highest percentage = best method)
                    Method
Crop                Stasny-Goel   Griffith   Ratio
Barley              81            62         46
Corn                83            71         42
Cotton (upland)     79            78         64.5
Dry Beans           95            74         62.5
Oats                70.5          54         21
Rye                 41            52         13
Sorghum             52            41         11
Soybeans            84            76         62
Sunflower           80            63.5       50
Tobacco (burley)    93            98         27
Wheat (spring)      94            55         54
Wheat (winter)      86            75         51.5

29
Convergence Issues
  • SG algorithm not guaranteed to converge within
    fixed limit on number of iterations
  • Non-convergence associated with numerical
    instability conditions
  • Yield estimates produced for non-convergent runs
    may be suspect
  • Convergence generally most reliable for highly
    prevalent crops, least reliable for rare crops

30
Algorithm Convergence Percentage By Crop (Limit
of 5000 Iterations)
                    Method
Crop                Stasny-Goel   Griffith
Barley              93            68
Corn                87            77
Cotton (upland)     81            89
Dry Beans           89            75
Oats                80            71
Rye                 74            83
Sorghum             85            66
Soybeans            93            73
Sunflower           90.5          80
Tobacco (burley)    41            52
Wheat (spring)      63            52.5
Wheat (winter)      88            65

31
Two Approaches to Dealing With SG
Non-Convergence
  • SG(1) - use estimate generated at final allowable
    iteration (N0)
  • SG(2) - keep track of which iteration (i)
    maximized the log-likelihood
      - if i < N0, rerun algorithm to iteration i and
        use that estimate
      - if i = N0, resume processing at iteration
        (N0 + 1) and continue until either
          o convergence occurs (use that estimate), OR
          o log-likelihood decreases from one iteration
            to the next (use estimate at next-to-last
            iteration)
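
The SG(2) decision rule can be sketched as a function that returns the iteration whose estimate would be reported. The callables `extend` and `converged` are hypothetical stand-ins for the real SG program, and the `max_extra` cap is a safety guard added for this example only; it is not part of the rule on the slide.

```python
def sg2_choose_iteration(logliks, extend, converged, max_extra=1000):
    """Pick the iteration whose estimate SG(2) would report.

    logliks: log-likelihoods for iterations 1..N0 of a non-convergent run.
    extend: hypothetical callable, extend(k) -> log-likelihood at iteration k > N0.
    converged: hypothetical callable, converged(k) -> True if converged at k.
    """
    n0 = len(logliks)
    best = max(range(1, n0 + 1), key=lambda k: logliks[k - 1])
    if best < n0:
        return best                    # rerun to this iteration and use its estimate
    previous = logliks[-1]
    for k in range(n0 + 1, n0 + max_extra + 1):
        current = extend(k)
        if current < previous:
            return k - 1               # log-likelihood fell: use next-to-last iteration
        if converged(k):
            return k                   # convergence: use this estimate
        previous = current
    return n0 + max_extra              # guard reached (assumption, not in the slides)


# Hypothetical traces with N0 = 3.
peak_early = sg2_choose_iteration([-10.0, -5.0, -7.0], extend=None, converged=None)
later = {4: -4.0, 5: -4.5}
peak_late = sg2_choose_iteration([-10.0, -7.0, -5.0], later.get, lambda k: False)
print(peak_early, peak_late)
```

In the first trace the log-likelihood peaks at iteration 2, so that estimate is used. In the second it is still rising at N0, so processing resumes until the drop at iteration 5 and the estimate from iteration 4 is kept.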

32
Non-Convergence Study
  • Does SG(1) or SG(2) outperform ratio estimator in
    cases where SG failed to converge?
  • Six cases with high non-convergence percentage
    selected for comparison of SG(1), SG(2) and R
      - 2002 CO barley (37 simulation runs)
      - 2002 MS soybeans (105)
      - 2002 NY winter wheat (39)
      - 2002 ND dry beans (38)
      - 2002 OH oats (50)
      - 2003 OK rye (59)

33
Combined Pairwise Estimator Comparison
for Non-Convergence Test Cases
                 Percent of Counties Favoring
                 SG(1) vs. Ratio     SG(2) vs. Ratio     SG(1) vs. SG(2)
Measure          SG(1)    R          SG(2)    R          SG(1)    SG(2)
Absolute Bias    78       22         80       20         23       77
Variance         95       5          99       1          0        100
MSE              81       19         83       17         15       85
LTP              74       26         88       12         13       87
UTP              84       16         90       10         15       85

34
Summary
  • SG yield estimation method outperforms R in all
    efficiency categories and G in most categories (G
    outperforms R)
  • Convergence problems can be alleviated using
    enhanced SG approach
  • SG method recommended for integration into NASS
    County Estimates System