STAC Overflow: Map Floodwater from Radar Imagery

Help Microsoft AI for Earth and Cloud to Street detect floodwater through cloud coverage using Sentinel-1 synthetic-aperture radar (SAR) imagery. Accurate flood mapping can save lives by strengthening early warning systems and directing emergency relief. #disasters

$20,000 in prizes · Sep 2021 · 655 joined

Code submission format

This is a code submission challenge! Rather than submitting your predicted labels, you'll package everything needed to do inference and submit that for containerized execution. The runtime repository contains the complete specification for the runtime.

Note: This repository is designed to be compatible with Microsoft's Planetary Computer containers. The Planetary Computer Hub provides a convenient way to compute on data from the Planetary Computer. In this competition, you can train your model in the Planetary Computer Hub and test it using this repo. To request beta access to the Planetary Computer Hub, fill out this form and include "DrivenData" in your area of study.

What to submit

Your final submission should be a zip archive named with the extension .zip (for example, submission.zip). The root level of the submission.zip file must contain a main.py which performs inference on the test images and writes predictions in the form of single-band 512x512 pixel .tifs into the submission folder. You can see an example of this submission setup in the runtime repository.

Here's an example:

codeexecution       # Submission working directory (your zip is extracted here)
├── assets          # Example of configuration and weights for the trained model
│   ├── model.json
│   └── weights.h5
└── main.py         # Inference script

Note: main.py must sit at the root level of the zip archive, so that it appears directly in the directory where your submission is unzipped. It must not be inside any containing folder.
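
If you build the archive programmatically, a minimal sketch using Python's zipfile module might look like the following (the assets folder here comes from the example layout above; adjust the paths to your own project):

import zipfile
from pathlib import Path

with zipfile.ZipFile("submission.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    # main.py must land at the root of the archive, not inside a subfolder
    zf.write("main.py", arcname="main.py")
    # include model assets, preserving their paths relative to the root
    for path in Path("assets").rglob("*"):
        if path.is_file():
            zf.write(path, arcname=str(path))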

Directory access

During code execution, your submission will be unzipped and run in our cloud compute cluster. The script will have access to the following directory structure:

codeexecution
├── data
│   ├── test_features
│   │      ├── ...
│   │      ├── <all of the test image polarization bands as GeoTIFFs>
│   │      ├── ...
│   │      ├── abc00_vv.tif
│   │      └── abc00_vh.tif
│   ├── jrc_change
│   │      ├── ...
│   │      ├── <jrc change bands for each test chip as GeoTIFFs>
│   │      ├── ...
│   │      ├── abc00.tif
│   │      └── abc01.tif
│   ├── jrc_extent
│   │      ├── ...
│   │      ├── <jrc extent bands for each test chip as GeoTIFFs>
│   │      ├── ...
│   │      ├── abc00.tif
│   │      └── abc01.tif
│   ├── jrc_occurrence
│   │      ├── ...
│   │      ├── <jrc occurrence bands for each test chip as GeoTIFFs>
│   │      ├── ...
│   │      ├── abc00.tif
│   │      └── abc01.tif
│   ├── jrc_recurrence
│   │      ├── ...
│   │      ├── <jrc recurrence bands for each test chip as GeoTIFFs>
│   │      ├── ...
│   │      ├── abc00.tif
│   │      └── abc01.tif
│   ├── jrc_seasonality
│   │      ├── ...
│   │      ├── <jrc seasonality bands for each test chip as GeoTIFFs>
│   │      ├── ...
│   │      ├── abc00.tif
│   │      └── abc01.tif
│   ├── jrc_transitions
│   │      ├── ...
│   │      ├── <jrc transition bands for each test chip as GeoTIFFs>
│   │      ├── ...
│   │      ├── abc00.tif
│   │      └── abc01.tif
│   └── nasadem
│          ├── ...
│          ├── <nasadem bands for each test chip as GeoTIFFs>
│          ├── ...
│          ├── abc00.tif
│          └── abc01.tif
├── submission
│   └── <empty folder where you will save your test predictions as tifs>
├── main.py
└── <additional assets you put in your submission archive>

Test data

The test images will be available in data/test_features. All images for inference are in the root level of the test_features folder and are identified by image_id. Your script can load one or both polarization bands associated with a chip based on the filename suffix, i.e. _vv and _vh.
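
For example, one way to combine the two bands into a single model input (a sketch; abc00 is just an illustrative chip_id):

import numpy as np
from tifffile import imread

# Read both polarization bands for one chip and stack them into a
# (512, 512, 2) array that a two-channel model could consume
vv = imread("data/test_features/abc00_vv.tif")
vh = imread("data/test_features/abc00_vh.tif")
chip_input = np.stack([vv, vh], axis=-1)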

Supplementary data

Supplementary elevation (NASADEM) and permanent water (JRC) data made available through the Planetary Computer STAC API can be used for feature engineering and model training. For each test chip geography, supplementary permanent water and elevation images are available as static resources in the data folder. Supplementary image files are identified by chip_id. You have access to the following 7 subdirectories, which each contain a single image per test chip:

  • data/jrc_change
  • data/jrc_extent
  • data/jrc_occurrence
  • data/jrc_recurrence
  • data/jrc_seasonality
  • data/jrc_transitions
  • data/nasadem

For example, to load polarization bands along with the nasadem supplementary image for each test chip, you might run:

from pathlib import Path

from tifffile import imread

INPUT_IMAGES_DIR = Path("data/test_features")
NASADEM_DIR = Path("data/nasadem")

# Derive chip IDs from the test feature filenames (e.g. "abc00_vv.tif" -> "abc00")
chip_ids = sorted(set(path.stem.split("_")[0] for path in INPUT_IMAGES_DIR.glob("*.tif")))

for chip_id in chip_ids:
    arr_vh = imread(INPUT_IMAGES_DIR / f"{chip_id}_vh.tif")
    arr_vv = imread(INPUT_IMAGES_DIR / f"{chip_id}_vv.tif")
    nasadem_img = imread(NASADEM_DIR / f"{chip_id}.tif")
    # generate predictions

The use of supplementary input for training and/or inference is optional.

Submission checklist

  • Submission includes main.py in the root directory of the zip. Extra files with additional code that main.py calls are allowed (see the assets folder in the example).
  • Submission contains any model weights that need to be loaded. There will be no network access.
  • Script loads the data for inference from the read-only data/test_features folder and optionally from the supplementary data folders. All images for inference are in the root level of test_features and are identified by image_id. Supplementary image files are identified by chip_id. Your script can load one or both polarization bands associated with a chip as input based on the filename.
  • Script writes chip-level predictions as 512x512 pixel .tifs that consist of 1 (water) and 0 (no water) pixel values (e.g. abc00.tif) to the submission folder during inference. File names must match the input .tifs without the _vv or _vh polarization band suffix.

For example, if test_features contains abc00_vv.tif and abc00_vh.tif, you must create a single submission/abc00.tif. We will tar your prediction files for you; an archive that you create yourself will not be accepted as a valid prediction.

The following code saves an example prediction to submission:

from pathlib import Path

import numpy as np
from tifffile import imread, imsave

SUBMISSION_DIR = Path("submission")
INPUT_IMAGES_DIR = Path("data/test_features")
NASADEM_DIR = Path("data/nasadem")

# Collect the unique chip IDs from the test feature filenames
input_paths = INPUT_IMAGES_DIR.glob("*.tif")
chip_ids = sorted(set(path.stem.split("_")[0] for path in input_paths))

for chip_id in chip_ids:
    arr_vh = imread(INPUT_IMAGES_DIR / f"{chip_id}_vh.tif")
    arr_vv = imread(INPUT_IMAGES_DIR / f"{chip_id}_vv.tif")
    nasadem_img = imread(NASADEM_DIR / f"{chip_id}.tif")
    # could do something useful here with arr_vh and arr_vv ;-)
    prediction = np.zeros((512, 512), dtype="uint8")
    output_path = SUBMISSION_DIR / f"{chip_id}.tif"
    imsave(output_path, prediction)

It's important that your script derives its outputs from the files that are present rather than hardcoding anything. We need to be able to run your submission on other images by swapping these files out. See the problem description for more details.

Testing your submission locally

If you'd like to replicate how your submission will run online, you can test it locally before submitting. This is a great way to work out bugs and to make sure your model runs quickly enough.
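
The runtime repository documents the officially supported local-testing workflow. As a rough approximation, you can also emulate the execution layout yourself; a sketch, assuming you have sample inputs in a local data/ folder:

import shutil
import subprocess
import zipfile
from pathlib import Path

# Recreate the codeexecution layout in a scratch directory
workdir = Path("local_codeexecution")
workdir.mkdir(exist_ok=True)
with zipfile.ZipFile("submission.zip") as zf:
    zf.extractall(workdir)
shutil.copytree("data", workdir / "data", dirs_exist_ok=True)
(workdir / "submission").mkdir(exist_ok=True)

# Run the inference script from the working directory, as the runtime would
subprocess.run(["python", "main.py"], cwd=workdir, check=True)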

Prediction format

Your predictions will be 512x512 pixel masks, saved as .tifs, for each chip in the test set. Remember that each set of two polarization bands (VV and VH) corresponds to a single prediction. Each single-band mask should consist of 1 (water) and 0 (no water) pixel values. Pixels with missing data in the radar imagery will be excluded during scoring. Each mask must have the same name as the chip_id of its corresponding input images, excluding the _vv or _vh suffix. For example, a prediction made for input image abc00_vv.tif would be named abc00.tif. The order of the files does not matter. An example prediction would look like the following:

[Image: example prediction label]

The first few files in the submission folder might look like:

abc00.tif
abc01.tif
abc02.tif
abc03.tif
abc04.tif
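
Before submitting, a quick sanity check can catch malformed masks; a minimal sketch, using an illustrative chip_id:

import numpy as np
from tifffile import imread

mask = imread("submission/abc00.tif")
assert mask.shape == (512, 512), "masks must be 512x512 pixels"
assert set(np.unique(mask)) <= {0, 1}, "masks must contain only 0 and 1"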

Runtime

Your code is executed within a container that is defined in our runtime repository. The limits are as follows:

  • Your submission must be written in Python using the packages defined in the runtime repository.
  • The submission must complete execution in 8 hours or less. We expect most submissions to complete much more quickly; computation time per participant will be monitored to prevent abuse.
  • The container runtime has access to a single GPU. All of your code should run within the GPU environments in the container, even if actual computation happens on the CPU. (CPU environments are provided within the container for local debugging only.) See the device-selection sketch after this list.
  • The container has access to 5 vCPUs powered by an Intel Xeon E5-2690 chip and 48GB RAM.
  • The container has 1 Tesla K80 GPU with 12GB of memory.
  • The container will not have network access. All necessary files (code and model assets) must be included in your submission.
  • The container execution will not have root access to the filesystem.
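
For instance, if you use PyTorch (check the runtime repository for the full list of available packages), a common device-selection pattern is:

import torch

# Fall back to the CPU when no GPU is available (e.g. during local debugging)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")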

The GPUs for executing your inference code are a shared resource across competitors, so we ask that you be conscientious in your use of them. Please add progress information to your logs and cancel jobs that will exceed the time limit. Canceled jobs won't count against your submission limit, and canceling them frees up resources to score submissions that will complete on time.
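
One lightweight way to surface progress is to log a line every so often from your inference loop; a sketch with placeholder chip IDs:

import sys
import time

chip_ids = [f"chip{i:03d}" for i in range(500)]  # placeholders for illustration
start = time.time()
for i, chip_id in enumerate(chip_ids, start=1):
    # ... run inference for this chip here ...
    if i % 100 == 0 or i == len(chip_ids):
        elapsed = time.time() - start
        print(f"Processed {i}/{len(chip_ids)} chips ({elapsed:.0f}s elapsed)",
              file=sys.stderr)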

Requesting package installations

Since the Docker container will not have network access, all packages must be pre-installed. We are happy to add packages as long as they do not conflict and can build successfully.

To request that an additional package be added to the Docker image, follow the instructions in the runtime repository and make sure to edit both the CPU and GPU versions. You can test your changes by building the Docker images locally before opening your pull request.