Applying Machine Learning and Geolocation Techniques to Social Media Data (Twitter) to Develop a Resource for Urban Planning
With all the recent attention focused on big data, it is easy to overlook that basic vital statistics remain difficult to obtain in most of the world. This project set out to test whether an openly available dataset (Twitter) could be transformed into a resource for urban planning and development. The hypothesis is tested by creating road traffic crash location data, which are scarce in most resource-poor environments but essential for addressing the number one cause of mortality for children over age five and young adults. The research project scraped 874,588 traffic-related tweets in Nairobi, Kenya, applied a machine learning model to detect whether a tweet reported a crash, and developed an improved geoparsing algorithm to identify the crash location. The project geolocated 32,991 crash reports on Twitter for 2012-2020 and clustered them into 22,872 unique crashes to produce one of the first crash maps for Nairobi. A motorcycle delivery service was dispatched in real time to verify a subset of crashes, showing 92 percent accuracy. Using a spatial clustering algorithm, the project identified the portions of the road network (less than 1 percent) where 50 percent of the geolocated crashes occurred. Even with limitations in the representativeness of the data, the results can provide urban planners with useful information to target road safety improvements where resources are limited.
Main Authors: | Milusheva, Sveta; Marty, Robert; Bedoya, Guadalupe; Williams, Sarah; Resor, Elizabeth; Legovini, Arianna |
---|---|
Format: | Working Paper |
Language: | English |
Published: | World Bank, Washington, DC, 2020-12 |
Subjects: | MACHINE LEARNING, BIG DATA, URBAN PLANNING, ROAD SAFETY, SDGs, GEOGRAPHIC INFORMATION SYSTEM, SOCIAL MEDIA, GEOSPATIAL ANALYSIS, SPATIAL CLUSTERING |
Online Access: | http://documents.worldbank.org/curated/en/407261607111342557/Applying-Machine-Learning-and-Geolocation-Techniques-to-Social-Media-Data-Twitter-to-Develop-a-Resource-for-Urban-Planning https://hdl.handle.net/10986/34910 |
id |
dig-okr-1098634910 |
---|---|
record_format |
koha |
spelling |
dig-okr-1098634910 2024-10-17T08:30:22Z
2020-12-10T15:01:46Z 2020-12-10T15:01:46Z 2020-12 Working Paper Document de travail Documento de trabajo http://documents.worldbank.org/curated/en/407261607111342557/Applying-Machine-Learning-and-Geolocation-Techniques-to-Social-Media-Data-Twitter-to-Develop-a-Resource-for-Urban-Planning https://hdl.handle.net/10986/34910 English Policy Research Working Paper;No. 9488 CC BY 3.0 IGO http://creativecommons.org/licenses/by/3.0/igo World Bank application/pdf text/plain World Bank, Washington, DC |
institution |
World Bank |
collection |
DSpace |
country |
United States |
countrycode |
US |
component |
Bibliographic |
access |
Online |
databasecode |
dig-okr |
tag |
biblioteca |
region |
North America |
libraryname |
World Bank Library |
language |
English |
topic |
MACHINE LEARNING, BIG DATA, URBAN PLANNING, ROAD SAFETY, SDGs, GEOGRAPHIC INFORMATION SYSTEM, SOCIAL MEDIA, GEOSPATIAL ANALYSIS, SPATIAL CLUSTERING |
format |
Working Paper |
author |
Milusheva, Sveta; Marty, Robert; Bedoya, Guadalupe; Williams, Sarah; Resor, Elizabeth; Legovini, Arianna |
author_sort |
Milusheva, Sveta |
title |
Applying Machine Learning and Geolocation Techniques to Social Media Data (Twitter) to Develop a Resource for Urban Planning |
publisher |
World Bank, Washington, DC |
publishDate |
2020-12 |
url |
http://documents.worldbank.org/curated/en/407261607111342557/Applying-Machine-Learning-and-Geolocation-Techniques-to-Social-Media-Data-Twitter-to-Develop-a-Resource-for-Urban-Planning https://hdl.handle.net/10986/34910 |
_version_ |
1813416779942199296 |