Open Cities AI Challenge: Segmenting Buildings for Disaster Resilience Hosted By The Global Facility for Disaster Reduction and Recovery
Woohoo! This competition has come to a close!
Many thanks to the participants for all of their hard work and commitment to using data for good!
About the project
Disaster risk management
As urban populations grow, managing this growth in a way that fosters cities’ resilience to natural hazards and climate change becomes a greater challenge that requires detailed, up-to-date geographic data of the built environment. Buildings, roads, and critical infrastructure all need to be mapped frequently, accurately, and in enough detail to represent assets important to every community.
Knowing where and how assets are exposed and vulnerable to damage or disruption by natural hazards is key to disaster risk management (DRM). For example, new building construction in floodplains exposes inhabitants to flood risk, which may be increasing due to sea level rise and extreme weather events. A building’s location, shape, and construction materials may make it more vulnerable to earthquake or wind damage than nearby buildings. Managing disaster risk is most effective with an accurate, detailed, and current understanding of the building stock of a city.
What is disaster risk?
Disaster risk is a combination of three components: hazard, exposure, and vulnerability. Data from each of these categories can be used to paint a picture of risk in a certain location and over time.
- Hazard: a potentially destructive physical phenomenon (e.g., earthquake, windstorm, flood).
- Exposure: the location, attributes, and value of assets that are important to communities (people, buildings, factories, farmland, etc.) and that could be affected by a hazard.
- Vulnerability: the likelihood that assets will be damaged, destroyed, or otherwise affected when exposed to a hazard. For example, a building with multiple floors may be more vulnerable to shaking from an earthquake and more likely to collapse than a one-story building. As another example, an elderly person may be more vulnerable to the impacts of flooding because they have a harder time evacuating or moving quickly.
Types of risk data
Keeping data on buildings and their attributes up to date (see the GED4ALL exposure data schema for more on what is used in DRM) is a particularly difficult task across Africa, where populations are expected to double in the next 25 years. Cities are rapidly growing both denser and more spread out, with a diverse mix of formal and informal building construction. Addressing this challenge requires innovative, open, and dynamic data collection and mapping processes.
Open Cities Africa
The Open Cities Africa program, an initiative of the Global Facility for Disaster Reduction and Recovery (GFDRR), has: 1) established new information infrastructure for disaster risk management and urban resilience planning, 2) fostered local OpenStreetMap (OSM) communities, and 3) collected up-to-date open spatial data related to disaster risk in 11 cities across Africa.
Similar programs like Dar Ramani Huria engaged local communities and students in Dar es Salaam, Tanzania to create highly accurate maps of the buildings, roads, and drainage networks in the most flood-prone areas. These maps now serve as foundational tools for the city’s planning and growth beyond flood resilience. The Zanzibar Mapping Initiative was the world’s largest aerial mapping exercise and used consumer drones and local mappers to update the base map of Zanzibar. The data is now openly available for all purposes related to the island’s conservation and development.
Each project starts by assessing goals and existing resources, engaging government, community, and other partners, and scoping the mapping work to be done. Updated overhead images of the areas of interest are obtained from consumer drones and satellites. This high resolution imagery is manually inspected and features like buildings, roads, and drainage networks are digitally mapped in a participatory manner. Fieldwork is conducted to map other features and add detailed attributes that may not be clearly visible from overhead imagery.
The collected data is used to design tools and products that support decision-making by partners and stakeholders. These digitized maps are published to OpenStreetMap and the imagery to OpenAerialMap where they serve as data public goods that can be used and improved by all. Training, community engagement, and collaboration are emphasized throughout the process to foster local networks of talent in digital cartography, robotics, software development, and data science.
Machine learning for visual tasks could improve mapping quality, speed, and cost. Recent advancements in ML for mapping include Facebook’s AI-assisted road mapping tool for OSM, Microsoft’s country-scale automated building footprint extraction (in the USA, Canada, Tanzania, and Uganda), and competitions like SpaceNet (for better road and building mapping solutions) and xView2 (for post-disaster building damage assessment).
These applications all feature the computer vision task of semantic segmentation: classifying every pixel in an image into categories like building, road, tree, background. Semantic segmentation is useful for mapping because its pixel-level outputs are relatively easy to visually interpret, verify, and use as-is (e.g. calculation of built-up surface area) or as inputs to downstream steps (e.g. segment building footprints first and then classify building attributes in finer detail).
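As a rough illustration of using segmentation output as-is, the sketch below estimates built-up surface area from a binary building mask. The function name `built_up_area_m2`, the toy mask, and the 10 cm/pixel resolution are assumptions made for this example and are not part of the competition materials. Real masks would typically be held in an array library rather than nested lists.

```python
def built_up_area_m2(mask, cm_per_pixel):
    """Estimate built-up area in square metres from a binary mask.

    mask: rows of 0/1 values, where 1 marks a pixel classified as building.
    cm_per_pixel: ground resolution of the imagery (assumed square pixels).
    """
    if cm_per_pixel <= 0:
        raise ValueError("cm_per_pixel must be positive")
    # Count every pixel labeled as building.
    building_pixels = sum(1 for row in mask for value in row if value)
    # Each pixel covers (resolution in metres) squared.
    pixel_area_m2 = (cm_per_pixel / 100.0) ** 2
    return building_pixels * pixel_area_m2


# Hypothetical 3x4 mask containing one 2x2-pixel building.
toy_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

# Four building pixels at 10 cm/pixel cover roughly 0.04 square metres.
area = built_up_area_m2(toy_mask, cm_per_pixel=10)
print(f"Built-up area: {area:.2f} m^2")
```

The same pixel count scales directly with resolution, so the ground resolution of each image has to be known before areas from different surveys can be compared.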
More and more OSM data is being labeled on high resolution drone imagery through participatory mapping in diverse urban environments across Africa. How might we use this data to develop better open-source building segmentation models, so that our understanding keeps pace with rapidly growing cities? How might we create and apply machine learning systems in the most responsible ways for disaster risk management?
The unique challenges and opportunities of this competition include:
- Better mapping of diverse urban environments: Machine learning models for building segmentation have mostly been trained on satellite imagery at spatial resolutions of 30 cm/pixel or coarser, covering geographies outside of Africa. With new training data composed of drone imagery routinely collected at much higher resolutions (3-20 cm/pixel) and buildings labeled by local OSM communities across many African cities, we have the potential to develop models that can better map these diverse and densely built-up urban environments.
- Making the most of imperfect training data for more pixel-perfect mapping: The training data imagery represents a diverse range of geographies, spatial resolutions, sensors, and aerial surveying conditions. The labels (OSM building footprint tracings) are inconsistent: some images have pixel-perfect tracings of every building while others have many missing or misaligned labels. New techniques to make better use of this diverse, noisy data could unlock the potential of OpenStreetMap as ML training data for many new geographies and imagery sources.
- Testing model robustness and generalizability to new data: Test imagery may come from areas that are not present in the training set. Participants will need to develop models that perform best on new, unseen data. Doing so will increase the usefulness of ML for mapping on imagery created through similar processes but in new geographies and diverse conditions.
- Integrating ML into participatory mapping and open data efforts: Building on efforts like ML-Enabler by the Humanitarian OpenStreetMap Team, what are innovative ways to integrate high-performing open source ML solutions to enhance mapper experience and OSM data quality? On the flip side, what are novel data pre-processing and clean-up techniques that could increase the value of OSM data for geospatial machine learning?
- Responsibly using ML to support disaster risk management and urban resilience planning: See the Responsible AI track for more information.
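Because test imagery may come from areas absent from the training set, one way to estimate generalization locally is to hold out entire regions for validation instead of splitting tiles at random. The sketch below shows that idea under stated assumptions: the tile records, the `region` field, and the region names are hypothetical and do not reflect the competition's actual data layout.

```python
def split_by_region(tiles, held_out_regions):
    """Split tiles so that held-out regions appear only in validation.

    tiles: list of dicts, each with a "region" key (hypothetical schema).
    held_out_regions: iterable of region names reserved for validation.
    """
    held_out = set(held_out_regions)
    train = [tile for tile in tiles if tile["region"] not in held_out]
    validation = [tile for tile in tiles if tile["region"] in held_out]
    return train, validation


# Hypothetical tile index; identifiers and region names are placeholders.
tiles = [
    {"id": "tile_001", "region": "city_a"},
    {"id": "tile_002", "region": "city_a"},
    {"id": "tile_003", "region": "city_b"},
    {"id": "tile_004", "region": "city_c"},
]

train_tiles, validation_tiles = split_by_region(tiles, held_out_regions=["city_c"])

# No region should contribute tiles to both sides of the split.
train_regions = {tile["region"] for tile in train_tiles}
validation_regions = {tile["region"] for tile in validation_tiles}
print(train_regions.isdisjoint(validation_regions))
```

A random tile-level split would place neighbouring tiles from the same survey on both sides, which tends to overstate how well a model performs on an unseen geography.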
Sample drone imagery and OSM building footprint labels from 10 cities
The Global Facility for Disaster Reduction and Recovery (GFDRR) is a partnership of the World Bank, United Nations, major donors and recipient countries under the International Strategy for Disaster Reduction (ISDR) system to support the implementation of the Hyogo Framework for Action (HFA). Launched in September 2006, GFDRR provides technical and financial assistance to help disaster-prone countries decrease their vulnerability and adapt to climate change. GFDRR works closely with UN agencies, client governments, World Bank regional offices, and other partners.
To meet the needs of a rapidly changing world, GFDRR Labs supports the use of innovative approaches to science, technology, communication and design in promoting new ideas and the development of original tools to empower decision-makers in vulnerable countries to strengthen their resilience. Recent innovations in the field have enabled better access to disaster and climate risk information and a greater capacity to create, manage, and use this information. Lab activities are designed and implemented in partnership with government institutions and key international and local partners, ensuring that all activities add value in planning, operational, and recovery activities.
In 2011, GFDRR launched the Open Data for Resilience Initiative (OpenDRI) to apply the concepts of the global open data movement to the challenges of reducing vulnerability to natural hazards and the impacts of climate change. OpenDRI supports efforts to build capacity and long-term ownership of open data projects that are tailored to meet specific needs. OpenDRI is guided by nine core principles, and engages with client governments in three main areas:
- Sharing data through open data platforms
- Collecting data through community mapping and crowdsourcing
- Using data through risk visualization and communication
Azavea is a software company that focuses on products and professional services for turning geospatial data into actionable insights. Azavea is a B corporation that operates with a mission to advance the state of the art in geospatial technology and apply it for civic, social, and environmental impact.
DrivenData runs online machine learning competitions where data scientists and quantitative experts from all around the world compete to build the best algorithms for social good. The DrivenData Labs team also works on data science projects directly with mission-driven organizations.