Kelp Wanted: Segmenting Kelp Forests

Help researchers estimate the extent of Giant Kelp Forests by segmenting Landsat imagery. #climate

$15,000 in prizes
feb 2024
671 joined

Problem description

In this challenge, your goal is to create an algorithm that predicts the presence or absence of kelp canopy using satellite imagery. This is a binary semantic segmentation task. Our data partners at UMass Boston and Woods Hole Oceanographic Institution have provided the two primary datasets for this challenge:

  1. Feature data: Satellite imagery from the Landsat satellite missions, in addition to elevation information and cloud masks.
  2. Label data: Binary masks indicating the presence or absence of kelp created by citizen scientists via the Floating Forests platform.

Feature data

The feature data is a geospatial dataset composed primarily of remote sensing observations collected by Landsat mission satellites. It is provided in the form of unreferenced GeoTIFFs, meaning that the geospatial information has been intentionally removed. These GeoTIFFs consist of seven distinct bands, all of which have been coreferenced (spatially aligned) and have a spatial resolution of 30 meters.

The satellite images have been cropped to square 350 x 350 pixel "tiles". Each tile has been assigned a unique tile_id and corresponds to patches of the coastal waters surrounding the Falkland Islands. These tiles span multiple decades, offering a comprehensive view of the region's kelp forests over time.

Note that while tiles are correlated, each observation or tile_id must be processed independently during inference. That is, other pixels in the same tile_id can be used for generating predictions, but not those in other tile_ids.

The filename for each tile follows the schema {tile_id}_satellite.tif. The corresponding label is named {tile_id}_kelp.tif. These files can be found on the data download page.

Satellite Imagery

The first five bands of each tile are spectral bands extracted from the Level 2 Landsat product. These are observations of surface reflectance, i.e., the fraction of incoming solar radiation that is reflected from Earth's surface, after adjusting for atmospheric and geometric effects. The provided spectra include shortwave infrared (SWIR1), near-infrared (NIR), red, green, and blue bands obtained using Landsat satellites 5, 7, and 8. Together, these bands capture a broad spectrum of electromagnetic radiation and have been selected for their usefulness for monitoring coastal environments.

Surface reflectance values have been re-scaled to 16-bit integers (so-called digital numbers) to optimize computational efficiency. Valid values are non-negative integers in the range from 0 to 65,535, with the integer -32,768 indicating a missing value. Specific information about the band wavelengths can be found in this USGS resource.
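As a minimal sketch of handling the missing-value sentinel, the snippet below assumes a tile has already been read into a NumPy array of shape (bands, height, width). The function name mask_missing and the tiny synthetic array are illustrative assumptions, not part of the competition materials.

```python
import numpy as np

MISSING = -32768  # sentinel for missing values, as stated above


def mask_missing(bands: np.ndarray) -> np.ndarray:
    """Return a float32 copy of the bands with the sentinel replaced by NaN."""
    out = bands.astype(np.float32)
    out[bands == MISSING] = np.nan
    return out


# Tiny synthetic 1-band, 2 x 2 example (not real data).
demo = np.array([[[100, MISSING], [2500, 0]]], dtype=np.int32)
cleaned = mask_missing(demo)
```

Converting to a floating-point type is one option among several; a separate boolean validity mask would work equally well if integer storage is preferred.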

TIP: Competitors are not required to account for wavelength differences between sensors, as these corrections are already applied during USGS data processing. However, previous modeling efforts have shown that value normalization can be beneficial. We encourage solvers to experiment with different preprocessing methods.
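One possible form of value normalization is per-band standardization (z-scoring) computed within a single tile. The sketch below is only an illustration: the choice of z-scoring, the function name standardize_bands, and the synthetic input are all assumptions, and other schemes such as min-max scaling or dataset-wide statistics may work better.

```python
import numpy as np

MISSING = -32768  # sentinel for missing values, as stated above


def standardize_bands(bands: np.ndarray) -> np.ndarray:
    """Z-score each band of a (bands, H, W) array using only valid pixels."""
    data = bands.astype(np.float64)
    data[bands == MISSING] = np.nan
    mean = np.nanmean(data, axis=(1, 2), keepdims=True)
    std = np.nanstd(data, axis=(1, 2), keepdims=True)
    std[std == 0] = 1.0  # avoid dividing by zero for constant bands
    return (data - mean) / std


# Synthetic 5-band, 4 x 4 tile with one missing pixel (not real data).
rng = np.random.default_rng(0)
demo = rng.integers(0, 10000, size=(5, 4, 4)).astype(np.int32)
demo[0, 0, 0] = MISSING
z = standardize_bands(demo)
```

Computing the statistics per tile keeps every tile independent of the others; missing pixels stay NaN in the output and need separate handling before they reach a model.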

The remaining two bands can be used to exclude irrelevant pixels. The sixth band is a cloud mask indicating the presence (1) or absence (0) of clouds. The seventh and final band is a Digital Elevation Model (DEM) containing elevation in meters above sea level, derived from Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) data, which can be used to generate a land mask.

Band Order and Detail

  1. SWIR (Shortwave Infrared) (int): The SWIR band is useful for distinguishing between different types of vegetation, as well as for detecting moisture content in soil and vegetation. When combined with the green band, it can be used to calculate the Modified Normalized Difference Water Index (MNDWI). The MNDWI is especially useful for identifying the intertidal zone, which can be mistaken for kelp canopy at low tides.

  2. NIR (Near-Infrared) (int): The NIR band is essential for vegetation studies, as healthy vegetation reflects a significant amount of NIR light. It is often used to calculate the NDVI (Normalized Difference Vegetation Index), which has been used in past efforts to estimate kelp canopy.

  3. Red (int): The red band captures red light from the visible spectrum.

  4. Green (int): The green band captures green light from the visible spectrum.

  5. Blue (int): The blue band captures blue light from the visible spectrum.

  6. Cloud Mask (int): A binary mask identifying the presence (1) or absence (0) of clouds.

  7. Digital Elevation Model (DEM) (int): The DEM is generated from ASTER data and can be used to generate a land mask. Values represent meters above sea level and start at 0.
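The indices and masks mentioned in the list above can be sketched as follows. This assumes a tile loaded as a NumPy array of shape (7, height, width) in the listed band order, uses the standard definitions NDVI = (NIR - Red) / (NIR + Red) and MNDWI = (Green - SWIR) / (Green + SWIR), and treats any elevation above 0 m as land. The function names and the land threshold are illustrative assumptions, and missing values are not handled here.

```python
import numpy as np

# Zero-based band indices following the "Band Order and Detail" list above.
SWIR, NIR, RED, GREEN, BLUE, CLOUD, DEM = range(7)


def normalized_difference(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Compute (a - b) / (a + b), giving NaN where the denominator is zero."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    denom = a + b
    out = np.full(denom.shape, np.nan)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


def derived_layers(tile: np.ndarray) -> dict:
    """Derive NDVI, MNDWI, and a usable-pixel mask from a (7, H, W) tile."""
    ndvi = normalized_difference(tile[NIR], tile[RED])
    mndwi = normalized_difference(tile[GREEN], tile[SWIR])
    land = tile[DEM] > 0      # assumption: elevation above 0 m means land
    cloud = tile[CLOUD] == 1  # 1 marks cloudy pixels
    return {"ndvi": ndvi, "mndwi": mndwi, "usable": ~land & ~cloud}


# Synthetic 2 x 2 tile (not real data).
tile = np.zeros((7, 2, 2), dtype=np.int32)
tile[NIR], tile[RED] = 3000, 1000
tile[GREEN], tile[SWIR] = 1000, 3000
tile[DEM, 0, 0] = 12
tile[CLOUD, 1, 1] = 1
layers = derived_layers(tile)
```

The "usable" mask marks pixels that are neither land nor cloud; whether to drop such pixels, zero them out, or pass the masks to the model as extra channels is a modeling choice.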


Information about each satellite image is recorded in metadata.csv. The metadata is available on the data download page.

The fields in metadata.csv are:

  • tile_id (str): A unique identifier for a single patch of coastal water
  • filename (str): The filename of the corresponding image, which follows the naming convention {tile_id}_satellite.tif for an input satellite image or {tile_id}_kelp.tif for the binary kelp mask
  • md5_hash (str): The md5 hash value to make sure the data was transmitted correctly
  • filesize_bytes (int): The size of the file in bytes
  • type (str): Whether the GeoTIFF is a satellite or kelp image
  • in_train (bool): TRUE if the image is a part of the training data or FALSE if it is in the test data

Example of first row from metadata.csv (transposed):

tile_id          JW725114
filename         JW725114_satellite.tif
md5_hash         97b19f0747260df89e23f33caced3632
filesize_bytes   1105392
type             satellite
in_train         True
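A minimal sketch of working with metadata.csv using only the Python standard library is shown below. It assumes the CSV header names match the fields listed above and that in_train is written as True/False (in any letter case). The function names are hypothetical helpers, not part of the competition materials.

```python
import csv
import hashlib


def load_metadata(csv_path):
    """Read metadata.csv into a list of dicts with typed in_train and size."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    for row in rows:
        row["in_train"] = row["in_train"].strip().lower() == "true"
        row["filesize_bytes"] = int(row["filesize_bytes"])
    return rows


def md5_matches(file_path, expected_hash):
    """Return True if the file's MD5 digest equals the md5_hash value."""
    digest = hashlib.md5()
    with open(file_path, "rb") as f:
        # Read in 1 MiB chunks so large files are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hash.strip().lower()


def training_pairs(rows):
    """Map each training tile_id to its satellite and kelp filenames."""
    pairs = {}
    for row in rows:
        if row["in_train"]:
            pairs.setdefault(row["tile_id"], {})[row["type"]] = row["filename"]
    return pairs
```

For example, training_pairs(load_metadata("metadata.csv")) would return a dictionary keyed by tile_id, with "satellite" and "kelp" entries for each training tile.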

Known issues in the data

Satellite data is collected by sensors aboard satellites in space. The Level 2 product is post-processed so that it is ready to use. However, there are data collection and processing issues. Some known issues are catalogued here. Solvers should expect some noise in the input data, including missing values, overcorrected values, and patterned artifacts.

Ground truth data

The ground truth values for this competition are binary masks indicating the presence (1) or absence (0) of kelp. Labels for each tile were generated by citizen scientists on the Floating Forests platform. Note that the labels may contain some false negatives, meaning pixels with actual kelp canopy may be labeled as 0 instead of 1. Some images may contain no kelp at all.

Like the satellite imagery, the labels are provided as single-band 350 x 350 pixel TIFF images. For the same tile ID, each pixel in the satellite data corresponds to the pixel in the same position in the labels. The kelp TIFFs follow the naming convention {tile_id}_kelp.tif.

Figure: An example tile shown as a color image, a false color image, and a masked image. Left: A true color image of an example tile using the RGB bands. Center: A false color image using the SWIR, NIR, and Red bands. Right: The false color image with the labeled kelp mask overlaid in cyan.

Update December 20, 2023: Per the competition rules, external data is not allowed in this competition. However, participants can use pre-trained computer vision models as long as they were 1) available freely and openly in that form at the start of the competition and 2) not trained on any data associated with the ground truth data for this challenge.

Performance metric

To evaluate the performance of your model in this binary semantic segmentation task, we will use the Dice Coefficient (also known as the Sørensen-Dice Index) as the performance metric. The Dice Coefficient quantifies the similarity between the predicted and ground-truth binary masks. A higher Dice Coefficient indicates better segmentation accuracy.

The Dice Coefficient is calculated on a per-pixel basis, where each pixel in your submitted TIFF image for a patch is compared to the corresponding pixel in the ground-truth TIFF image for the same patch. This calculation is performed for each image in the test set, and the resulting Dice Coefficients are then averaged to provide an overall assessment of your model's performance.

The Dice Coefficient is defined as:

$$ \text{Dice Coefficient} = \frac{2 \cdot |A \cap B|}{|A| + |B|} $$


where $A$ and $B$ are the sets of kelp pixels in the predicted and ground-truth masks, and:

  • $|A|$ represents the size (cardinality) of set A.
  • $|B|$ represents the size (cardinality) of set B.
  • $|A \cap B|$ represents the size of the intersection of sets A and B.
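The definition above can be sketched in NumPy as follows, with per-tile scores averaged as described. This is not the official scoring code: the function names are illustrative, and treating two empty masks as a perfect match is an assumption, since the description does not say how that case is scored.

```python
import numpy as np


def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice coefficient between two binary masks of identical shape."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Assumption: two empty masks count as a perfect match. The
        # description does not specify how the scorer treats this case.
        return 1.0
    return float(2.0 * np.logical_and(pred, truth).sum() / total)


def mean_dice(preds, truths) -> float:
    """Average the per-tile Dice coefficients over paired masks."""
    scores = [dice_coefficient(p, t) for p, t in zip(preds, truths)]
    return float(np.mean(scores))


# Synthetic 2 x 3 masks (not real data).
pred = np.array([[1, 1, 0], [0, 0, 0]])
truth = np.array([[1, 0, 0], [0, 1, 0]])
score = dice_coefficient(pred, truth)  # 2 * 1 / (2 + 2) = 0.5
```

Here the masks share one kelp pixel and each contains two, so the coefficient is 2 · 1 / (2 + 2) = 0.5.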

Submission format

You must submit your predictions as single-band 350 x 350 TIFF images, placed into a single archive. The single band should contain your predictions for whether a pixel contains kelp (1) or not (0).

The format for the submission is a .tar, .tgz, or .zip file containing your binary predictions for each tile ID in the test set. Each TIFF must be named according to the convention {tile_id}_kelp.tif. The order of the files does not matter. Every file in Test Features should have a corresponding file in your prediction.

For example, once your submission is uncompressed, it should contain one file named {tile_id}_kelp.tif for each tile_id in Test Features.

Your submission cannot exceed 1GB and the maximum supported precision for the TIFFs is float32. Check out the Data Download page for an example submission.
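A minimal sketch of the packaging step, using only the Python standard library, is given below. It assumes the prediction TIFFs have already been written to a directory and that files should sit at the root of the archive, which the description does not state explicitly. The function name build_submission and its parameters are hypothetical.

```python
import tarfile
from pathlib import Path


def build_submission(pred_dir, test_tile_ids, archive_path):
    """Bundle {tile_id}_kelp.tif files from pred_dir into a gzipped tar.

    Raises FileNotFoundError if any test tile lacks a prediction file.
    """
    pred_dir = Path(pred_dir)
    expected = [pred_dir / f"{tile_id}_kelp.tif" for tile_id in test_tile_ids]
    missing = [p.name for p in expected if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"Missing predictions: {missing[:5]}")
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in expected:
            # arcname drops the directory so files sit at the archive root
            # (an assumption about the expected layout).
            archive.add(path, arcname=path.name)
    return Path(archive_path)
```

Checking for missing files before writing the archive catches the case where a test tile has no corresponding prediction; the 1GB size limit would need a separate check on the resulting file.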

Good luck

If you're wondering how to get started, check out the MATLAB benchmark blog post.

Good luck and enjoy this problem! If you have any questions you can always visit the user forum.