Improving Estimates of Mean Welfare and Uncertainty in Developing Countries

Reliable estimates of economic welfare for small areas are valuable inputs into the design and evaluation of development policies. This paper compares the accuracy of point estimates and confidence intervals for small area estimates of wealth and poverty derived from four different prediction methods: linear mixed models, Cubist regression, extreme gradient boosting, and boosted regression forests. The evaluation draws samples from unit-level household census data from four developing countries, combines them with publicly and globally available geospatial indicators to generate small area estimates, and evaluates these estimates against aggregates calculated using the full census. Predictions of wealth are evaluated in four countries and poverty in one. All three machine learning methods outperform the traditional linear mixed model, with extreme gradient boosting and boosted regression forests generally outperforming the other alternatives. The proposed residual bootstrap procedure reliably estimates confidence intervals for the machine learning estimators, with estimated coverage rates across simulations falling between 94 and 97 percent. These results demonstrate that predictions obtained using tree-based gradient boosting with a random effect block bootstrap generate more accurate point and uncertainty estimates than prevailing methods for generating small area welfare estimates.
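
The evaluation design described above (draw a sample from a census, estimate each small area's mean, and check how often the confidence interval contains the full-census value) can be illustrated with a minimal sketch. It is not the authors' procedure: the census is synthetic, the estimator is a plain sample mean rather than any of the four prediction methods, and the interval is an ordinary percentile bootstrap rather than the paper's residual bootstrap. All function names and parameters (`make_census`, `bootstrap_ci`, `coverage_rate`, `sample_size`, `n_boot`) are hypothetical.

```python
import random
import statistics


def make_census(n_areas=30, households_per_area=200, seed=0):
    """Build a synthetic 'census': each area has its own effect plus household noise.
    Assumption: this stands in for real unit-level census data."""
    rng = random.Random(seed)
    census = {}
    for area in range(n_areas):
        area_effect = rng.gauss(0.0, 1.0)
        census[area] = [area_effect + rng.gauss(0.0, 1.0)
                        for _ in range(households_per_area)]
    return census


def bootstrap_ci(sample, rng, n_boot=300, level=0.95):
    """Percentile bootstrap interval for the sample mean (a simple stand-in,
    not the residual bootstrap named in the abstract)."""
    means = sorted(
        statistics.fmean(rng.choices(sample, k=len(sample)))
        for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return means[lo_idx], means[hi_idx]


def coverage_rate(census, sample_size=40, seed=1):
    """Share of areas whose interval contains the true full-census mean."""
    rng = random.Random(seed)
    covered = 0
    for households in census.values():
        truth = statistics.fmean(households)          # aggregate from the full census
        sample = rng.sample(households, sample_size)  # survey-style sample
        lo, hi = bootstrap_ci(sample, rng)
        covered += lo <= truth <= hi
    return covered / len(census)


if __name__ == "__main__":
    print(f"coverage rate: {coverage_rate(make_census()):.2f}")
```

A coverage rate close to the nominal level (here 95 percent) indicates that the intervals are well calibrated, which is the criterion the abstract applies when it reports coverage between 94 and 97 percent.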

Bibliographic Details
Main Authors: Merfeld, Joshua D., Newhouse, David
Format: Working Paper
Language: English
Published: World Bank, Washington, DC 2023-03
Subjects: POVERTY, WELFARE, PREDICTION OF WEALTH, MACHINE LEARNING, GEOSPATIAL DATA, DEVELOPMENT POLICY, HOUSEHOLD CENSUS DATA, PREDICTION OF POVERTY
Online Access: http://documents.worldbank.org/curated/en/099413503082334933/IDU0ddd4b90a0930204352095e1087657f2c9ec9
https://openknowledge.worldbank.org/handle/10986/39530