Outlier Detection for Welfare Analysis
Extreme values are common in survey data and represent a recurring threat to the reliability of both poverty and inequality estimates. The adoption of a consistent criterion for outlier detection is useful in many practical applications, particularly when international and intertemporal comparisons are involved. This paper discusses a simple, univariate detection procedure to flag outliers in the distribution of any variable of interest. It presents outdetect, a Stata command that implements the procedure and provides useful diagnostic tools. The output of outdetect compares statistics—with focus on inequality and poverty measures—obtained before and after the exclusion of outliers. Finally, the paper carries out an extensive sensitivity exercise, where the same outlier detection method is applied consistently to per capita expenditure across more than 30 household budget surveys. The results are clear-cut and provide a sense of the influence of extreme values on poverty and inequality estimates.
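This record does not spell out the detection rule or the syntax of outdetect, so the following is only a generic sketch of the workflow the abstract describes: flag univariate outliers, then compare an inequality and a poverty measure before and after excluding them. The median/MAD rule on log values, the threshold of 3, the function names, the poverty line, and the toy expenditure figures are all assumptions made for illustration. They are not taken from the paper or from the Stata command.

```python
import math
import statistics


def gini(values):
    """Unweighted Gini coefficient of non-negative values."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-based formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def headcount(values, poverty_line):
    """Share of observations strictly below the poverty line."""
    return sum(1 for v in values if v < poverty_line) / len(values)


def flag_outliers(values, threshold=3.0):
    """Flag values whose log lies more than `threshold` robust standard
    deviations from the median log (assumed rule; requires positive values)."""
    logs = [math.log(v) for v in values]
    med = statistics.median(logs)
    mad = statistics.median([abs(z - med) for z in logs])
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    if scale == 0:
        return [False] * len(values)
    return [abs(z - med) / scale > threshold for z in logs]


# Hypothetical per capita expenditure values with one extreme observation.
expenditure = [80, 95, 100, 110, 120, 130, 150, 160, 200, 50000]
poverty_line = 100  # hypothetical poverty line

flags = flag_outliers(expenditure)
trimmed = [v for v, is_out in zip(expenditure, flags) if not is_out]

print("flagged:", [v for v, is_out in zip(expenditure, flags) if is_out])
print("Gini before / after:", gini(expenditure), gini(trimmed))
print("Headcount before / after:",
      headcount(expenditure, poverty_line),
      headcount(trimmed, poverty_line))
```

Working on logs with a median-based scale keeps the extreme value itself from inflating the yardstick used to detect it. A real application would also need to handle survey weights and non-positive values, which this sketch ignores.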
Main Authors: | , , |
---|---|
Format: | Working Paper |
Language: | English |
Published: | World Bank, Washington, DC, 2022-11 |
Subjects: | OUTLIERS, EXTREME VALUES, INEQUALITY, POVERTY, INCREMENTAL TRIMMING CURVE, SURVEY DATA OUTLIER CRITERION, OUTLIER DETECTION, STATA, INEQUALITY MEASURE, POVERTY MEASURE, HOUSEHOLD BUDGET SURVEYS, INFLUENCE OF EXTREME SURVEY DATA |
Online Access: | http://documents.worldbank.org/curated/en/099536211152218834/IDU0d8c0f49d0042704e31095c7006964c6e8ce5 http://hdl.handle.net/10986/38318 |