When Is There Enough Data to Create a Global Statistic?

Monitoring progress toward global goals such as the Sustainable Development Goals requires global statistics. Yet cross-country data sets are rarely truly global, which creates a trade-off for producers of global statistics: the lower the data coverage threshold for disseminating global statistics, the more statistics can be made available, but the less accurate those statistics are. This paper quantifies the availability-accuracy trade-off by running more than 10 million simulations on the World Development Indicators. It shows that if data are lacking for a fraction x of the world’s population, then the global value will in expectation be off by 0.37x standard deviations, and it could be off by as much as x standard deviations. The paper shows the robustness of this result to various assumptions and provides recommendations on when there is enough data to create global statistics. Although the decision will be context specific, the paper suggests that, in a baseline scenario, global statistics should not be created when data are available for less than half of the world’s population.
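As a rough illustration of the rule of thumb stated in the abstract, the sketch below converts a missing population share x into the expected error (0.37x standard deviations) and the upper bound (x standard deviations). The function name `global_error_bounds` and its interface are hypothetical and not taken from the paper; only the 0.37 coefficient and the upper bound of x come from the abstract.

```python
def global_error_bounds(missing_share: float) -> tuple[float, float]:
    """Return (expected, worst-case) error of a global value, in standard deviations.

    missing_share is the fraction of the world's population without data (x).
    The 0.37 coefficient and the upper bound of x are the figures quoted in the
    abstract; everything else here is an illustrative assumption.
    """
    if not 0.0 <= missing_share <= 1.0:
        raise ValueError("missing_share must be between 0 and 1")
    expected = 0.37 * missing_share  # expected error: 0.37 * x standard deviations
    worst_case = missing_share       # upper bound: x standard deviations
    return expected, worst_case


# Baseline case from the abstract: data for exactly half of the world's population.
expected, worst = global_error_bounds(0.5)
print(f"expected: {expected:.3f} SD, worst case: {worst:.3f} SD")
```

At the 50 percent coverage threshold suggested in the baseline scenario, this gives an expected error of about 0.185 standard deviations and a worst case of 0.5 standard deviations.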

Bibliographic Details
Main Authors: Mahler, Daniel Gerszon, Serajuddin, Umar, Maeda, Hiroko
Format: Working Paper
Language: English
Published: Washington, DC: World Bank 2022-05-05
Subjects: SUSTAINABLE DEVELOPMENT GOALS, SDGs, WORLD DEVELOPMENT INDICATORS, DATA COLLECTION AND ANALYSIS, DATA COVERAGE
Online Access: http://documents.worldbank.org/curated/en/099715305052234754/IDU05d45ca360cddd044880828c072587f9a9c4f
http://hdl.handle.net/10986/37413
id dig-okr-1098637413
record_format koha
spelling dig-okr-1098637413 2022-05-14T05:10:37Z 2022-05-13T15:33:00Z 2022-05-05 Working Paper English CC BY 3.0 IGO http://creativecommons.org/licenses/by/3.0/igo World Bank Washington, DC: World Bank Policy Research Working Paper Publications & Research
institution Banco Mundial
collection DSpace
country United States
countrycode US
component Bibliographic
access Online
databasecode dig-okr
tag biblioteca
region North America
libraryname Biblioteca del Banco Mundial
language English
topic SUSTAINABLE DEVELOPMENT GOALS
SDGs
WORLD DEVELOPMENT INDICATORS
DATA COLLECTION AND ANALYSIS
DATA COVERAGE
spellingShingle SUSTAINABLE DEVELOPMENT GOALS
SDGs
WORLD DEVELOPMENT INDICATORS
DATA COLLECTION AND ANALYSIS
DATA COVERAGE
Mahler, Daniel Gerszon
Serajuddin, Umar
Maeda, Hiroko
When Is There Enough Data to Create a Global Statistic?
format Working Paper
topic_facet SUSTAINABLE DEVELOPMENT GOALS
SDGs
WORLD DEVELOPMENT INDICATORS
DATA COLLECTION AND ANALYSIS
DATA COVERAGE
author Mahler, Daniel Gerszon
Serajuddin, Umar
Maeda, Hiroko
author_facet Mahler, Daniel Gerszon
Serajuddin, Umar
Maeda, Hiroko
author_sort Mahler, Daniel Gerszon
title When Is There Enough Data to Create a Global Statistic?
title_short When Is There Enough Data to Create a Global Statistic?
title_full When Is There Enough Data to Create a Global Statistic?
title_fullStr When Is There Enough Data to Create a Global Statistic?
title_full_unstemmed When Is There Enough Data to Create a Global Statistic?
title_sort when is there enough data to create a global statistic ?
publisher Washington, DC: World Bank
publishDate 2022-05-05
url http://documents.worldbank.org/curated/en/099715305052234754/IDU05d45ca360cddd044880828c072587f9a9c4f
http://hdl.handle.net/10986/37413
work_keys_str_mv AT mahlerdanielgerszon whenisthereenoughdatatocreateaglobalstatistic
AT serajuddinumar whenisthereenoughdatatocreateaglobalstatistic
AT maedahiroko whenisthereenoughdatatocreateaglobalstatistic
_version_ 1756576105856237568