Pover-T Tests: Predicting Poverty Hosted By DrivenData

competition
complete
$15,000

Woohoo! This competition has come to a close!

Many thanks to the participants for all of their hard work and commitment to using data for good!

Collecting household survey data on poverty is expensive and time-consuming, which means that policy makers are often making decisions based on old data. Machine learning could drastically change the game, making poverty measurement cheaper and much closer to real-time.

– Asli Demirguc-Kunt, Director of Research at the World Bank

Why

The World Bank aims to end extreme poverty by 2030. To achieve this goal, they need efficient pipelines for measuring, tracking, and predicting poverty. But measuring poverty is currently hard, time-consuming, and expensive. Estimates are typically collected through complex household consumption surveys with data on hundreds of different variables, each of which may be useful when assessing poverty levels.

The Solution

Machine learning offers new approaches for determining which variables are most predictive and how they can be most effectively combined. In this competition, data scientists from more than 130 countries around the world built algorithms to predict household-level poverty status using survey data from three developing countries, each with a different distribution of wealth.

The Results

The best algorithms pulled out all the stops, building ensembles of neural networks, XGBoost, LightGBM, and even CatBoost models (the last to leverage the mostly categorical nature of the survey data). These approaches fed into a research paper that publishes the winning solutions and contributes to the democratization of machine learning through resources for future application and training.

