Applying Machine Learning and Geolocation Techniques to Social Media Data (Twitter) to Develop a Resource for Urban Planning

With all the recent attention focused on big data, it is easy to overlook that basic vital statistics remain difficult to obtain in most of the world. This project set out to test whether an openly available dataset (Twitter) could be transformed into a resource for urban planning and development. The hypothesis is tested by creating road traffic crash location data, which are scarce in most resource-poor environments but essential for addressing the number one cause of mortality for children over age five and young adults. The research project scraped 874,588 traffic-related tweets in Nairobi, Kenya, applied a machine learning model to capture the occurrence of a crash, and developed an improved geoparsing algorithm to identify its location. The project geolocated 32,991 crash reports in Twitter for 2012-20 and clustered them into 22,872 unique crashes to produce one of the first crash maps for Nairobi. A motorcycle delivery service was dispatched in real-time to verify a subset of crashes, showing 92 percent accuracy. Using a spatial clustering algorithm, portions of the road network (less than 1 percent) were identified where 50 percent of the geolocated crashes occurred. Even with limitations in the representativeness of the data, the results can provide urban planners useful information to target road safety improvements where resources are limited.
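The abstract mentions that 32,991 geolocated crash reports were clustered into 22,872 unique crashes but does not say how. The sketch below shows one possible way to collapse reports that are close in space and time into a single crash. It is not the authors' method: the `Report` structure, the `cluster_reports` function, and the 0.5 km and 120 minute thresholds are all illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Report:
    """A geolocated crash report (hypothetical structure, not from the paper)."""
    lat: float
    lon: float
    minutes: float  # time of the tweet, in minutes from an arbitrary origin


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def cluster_reports(reports, max_km=0.5, max_minutes=120):
    """Greedily group reports that are close in both space and time.

    The thresholds are assumptions chosen for illustration only.
    Each report joins the first cluster whose most recent report is
    within max_km and max_minutes; otherwise it starts a new cluster.
    """
    clusters = []
    for rep in sorted(reports, key=lambda r: r.minutes):
        for cluster in clusters:
            last = cluster[-1]
            close_in_time = rep.minutes - last.minutes <= max_minutes
            close_in_space = haversine_km(rep.lat, rep.lon,
                                          last.lat, last.lon) <= max_km
            if close_in_time and close_in_space:
                cluster.append(rep)
                break
        else:
            # No existing cluster matched, so treat this as a new crash.
            clusters.append([rep])
    return clusters
```

Each resulting cluster stands for one candidate crash, so the number of clusters plays the role of the "unique crashes" count. A density-based method over the road network would be a natural alternative for the hot-spot step the abstract describes.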

Bibliographic Details
Main Authors: Milusheva, Sveta; Marty, Robert; Bedoya, Guadalupe; Williams, Sarah; Resor, Elizabeth; Legovini, Arianna
Format: Working Paper
Language: English
Published: World Bank, Washington, DC 2020-12
Subjects: MACHINE LEARNING, BIG DATA, URBAN PLANNING, ROAD SAFETY, SDGs, GEOGRAPHIC INFORMATION SYSTEM, SOCIAL MEDIA, GEOSPATIAL ANALYSIS, SPATIAL CLUSTERING
Online Access: http://documents.worldbank.org/curated/en/407261607111342557/Applying-Machine-Learning-and-Geolocation-Techniques-to-Social-Media-Data-Twitter-to-Develop-a-Resource-for-Urban-Planning
https://hdl.handle.net/10986/34910
Series: Policy Research Working Paper; No. 9488
License: CC BY 3.0 IGO (http://creativecommons.org/licenses/by/3.0/igo)
Institution: World Bank
Collection: DSpace
Country: United States (US)
Region: North America
Library: Biblioteca del Banco Mundial (World Bank Library)
Access: Online