Using Survey-to-Survey Imputation to Fill Poverty Data Gaps at a Low Cost
Survey data on household consumption are often unavailable or incomparable over time in many low- and middle-income countries. Based on a unique randomized survey experiment implemented in Tanzania, this study offers new and rigorous evidence that survey-to-survey imputation can fill consumption data gaps and provide low-cost, reliable poverty estimates. Basic imputation models featuring utility expenditures, together with a modest set of predictors on demographics, employment, household assets, and housing, yield accurate predictions. Imputation accuracy is robust to varying the survey questionnaire length, the choice of base surveys for estimating the imputation model, different poverty lines, and alternative (quarterly or monthly) Consumer Price Index deflators. The proposed approach to imputation also performs better than multiple imputation and a range of machine learning techniques. When the target survey has modified (shortened or aggregated) food or non-food consumption modules, imputation models that include food or non-food consumption as predictors do well only if the distributions of the predictors are standardized vis-à-vis the base survey. For the best-performing models to reach acceptable levels of accuracy, both the base and target surveys need a sample size of at least 1,000. The discussion expands on the implications of the findings for the design of future surveys.
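The general idea of survey-to-survey imputation described in the abstract can be illustrated with a minimal sketch. This is not the paper's specification: the data are synthetic, the three predictors stand in for variables such as utility expenditures or household assets, and the function names (`fit_ols`, `impute_poverty_rate`, `simulate`) are hypothetical. The sketch assumes a linear model of log consumption fitted on a base survey, then applied to a target survey that lacks consumption data, with base-survey residuals resampled so the imputed distribution keeps a realistic spread.

```python
import numpy as np


def fit_ols(X, y):
    """Fit ordinary least squares with an intercept; return coefficients and residuals."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return beta, resid


def impute_poverty_rate(beta, resid, X_target, log_poverty_line, n_draws=200, seed=0):
    """Average the headcount poverty rate over simulated draws of log consumption."""
    rng = np.random.default_rng(seed)
    A = np.column_stack([np.ones(len(X_target)), X_target])
    fitted = A @ beta
    rates = []
    for _ in range(n_draws):
        # Add resampled base-survey residuals to the fitted values.
        draws = fitted + rng.choice(resid, size=len(fitted), replace=True)
        rates.append(np.mean(draws < log_poverty_line))
    return float(np.mean(rates))


# Synthetic base and target surveys (assumption: both follow the same model).
rng = np.random.default_rng(42)


def simulate(n):
    X = rng.normal(size=(n, 3))  # three illustrative predictors
    y = 10.0 + X @ np.array([0.5, 0.3, 0.2]) + rng.normal(scale=0.4, size=n)
    return X, y


X_base, y_base = simulate(1000)      # base survey: consumption observed
X_target, y_target = simulate(1000)  # target survey: consumption treated as missing

# Illustrative poverty line: the 30th percentile of base-survey log consumption.
log_line = np.quantile(y_base, 0.3)

beta, resid = fit_ols(X_base, y_base)
imputed_rate = impute_poverty_rate(beta, resid, X_target, log_line)

# In this simulation the target's true consumption is known, so the imputed
# rate can be checked against the rate computed directly.
true_rate = float(np.mean(y_target < log_line))
print(f"imputed poverty rate: {imputed_rate:.3f}")
print(f"direct poverty rate:  {true_rate:.3f}")
```

In practice the target survey would supply only the predictors, and the comparison against a directly measured rate would be possible only in a validation setting such as the randomized experiment the abstract describes.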
Main Authors: | , , , , |
---|---|
Format: | Working Paper |
Language: | English en_US |
Published: | Washington, DC: World Bank, 2024-03-26 |
Subjects: | CONSUMPTION, POVERTY, SURVEY-TO-SURVEY IMPUTATION, HOUSEHOLD SURVEYS, TANZANIA, NO POVERTY, SDG 1 |
Online Access: | http://documents.worldbank.org/curated/en/099851203262433827/IDU1dce7aeb018793142f91aae715d889956cee9 https://hdl.handle.net/10986/41291 |