From Fog Nets to Neural Nets

Model the water output from water-collecting fog nets in Southwest Morocco. Accurate predictions can improve collection efforts and enable greater access to fresh water throughout the year. #development

$15,000 in prizes
May 2016
543 joined

Algorithm challenge description

Your goal is to predict the yield of fog nets in the Anti-Atlas mountains in Southwestern Morocco. We provide you with measurements from weather stations both at the site of the fog nets and for the larger geographical region.

Available data

For this competition, we have both weather data from a sensor co-located with the fog nets (microclimate data) and data from weather stations in three nearby cities in Morocco (macroclimate data). Various meteorological measures are recorded at these different weather stations. Your goal is to use these data sources to predict the yield of a collection of fog nets. The yield of these nets is measured every two hours. For the test set, windows of 4 days have been removed from the training data at regular intervals, and competitors will try to predict the yield during these intervals as accurately as possible. For some of these intervals, concurrent microclimate data is provided. For others, only macroclimate data is available.

Map of locations

Microclimate weather data

Data is collected by a sensor array co-located with the fog nets. Meteorological variables such as temperature, humidity and wind are measured at 5-minute intervals. We provide this data at both the native 5-minute time scale and an aggregated 2-hour time scale. Yield predictions are made for every 2-hour interval during the test periods.

weather station



  • percip_mm - Precipitation (mm)
  • humidity - A measure of the humidity in the air
  • temp - The temperature
  • leafwet450_min - Leaf wetness (a measure of the presence of dew) sensor 1
  • leafwet460_min - Leaf wetness (a measure of the presence of dew) sensor 2
  • leafwet_lwscnt - Leaf wetness (a measure of the presence of dew) sensor 3
  • gusts_ms - A measure of the highest gust during the reporting interval
  • wind_dir - The dominant direction the wind is blowing in
  • wind_ms - A measure of the current wind speed

Dates and times available

We provide microclimate data on two time scales. The sensor array natively records measurements every 5 minutes. This is too noisy to predict the yield effectively, so we are asking you to predict the yield every 2 hours. For convenience, we provide both the native (5-minute interval) data and resampled (2-hour interval) microclimate measurements. It should be simpler to use the microclimate data that matches the prediction interval (every 2 hours), but you may be able to extract more information from the finer-grained 5-minute data.
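If you want to build your own 2-hour features from the 5-minute readings, the sketch below shows one way to do it. It assumes each 2-hour window is labelled by its start time and summarised by the mean; how the provided 2-hour file was actually aggregated is not stated here, and a sum may suit a variable like percip_mm better. The function names and the readings are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def floor_to_2h(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 2-hour window."""
    return ts.replace(hour=ts.hour - ts.hour % 2, minute=0, second=0, microsecond=0)


def resample_2h(readings):
    """Average (timestamp, value) readings within each 2-hour window.

    Readings whose value is None (missing) are skipped.
    Assumption: the window mean is the summary we want.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        if value is not None:
            buckets[floor_to_2h(ts)].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Synthetic 5-minute humidity readings covering four hours (not real data).
start = datetime(2013, 11, 24, 0, 0)
readings = [(start + timedelta(minutes=5 * i), 50.0 + i) for i in range(48)]
two_hourly = resample_2h(readings)

for window_start, value in two_hourly.items():
    print(window_start, value)
```

Swapping mean for max, min or a standard deviation gives other summaries of the same window, which is one way to get at the extra information in the finer-grained data.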

Note: not all of the measurements are available for the entire time period. Data is unavailable for November 2014 and for March to July 2015. Individual measurements may also be missing from the microclimate data at various times. For some time windows in the test set, the microclimate weather data has been intentionally withheld. Competitors may want to interpolate these missing values using either the microclimate or the macroclimate data.
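As a rough illustration of interpolating short gaps, here is a minimal sketch of linear interpolation in time between the nearest known readings. The function name, the 6-hour cap on gap length and the sample values are assumptions for the example, not part of the competition data. Note that this kind of interpolation uses the reading that comes after the gap, and that it is unlikely to be adequate for a full 4-day window where microclimate data has been withheld.

```python
from datetime import datetime, timedelta


def interpolate_gaps(series, max_gap=timedelta(hours=6)):
    """Linearly fill None values between known neighbours.

    series: list of (timestamp, value or None), sorted by timestamp.
    Gaps spanning more than max_gap, and gaps at either end of the
    series, are left as None. max_gap is an illustrative choice.
    """
    filled = list(series)
    known = [i for i, (_, v) in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        (t0, v0), (t1, v1) = series[left], series[right]
        if right - left < 2 or t1 - t0 > max_gap:
            continue  # nothing to fill, or gap too long to trust
        for i in range(left + 1, right):
            ts = series[i][0]
            frac = (ts - t0) / (t1 - t0)  # position within the gap, 0..1
            filled[i] = (ts, v0 + frac * (v1 - v0))
    return filled


# Synthetic 2-hour humidity values with a gap (not real data).
start = datetime(2013, 11, 24, 0, 0)
humidity = [
    (start, 10.0),
    (start + timedelta(hours=2), None),
    (start + timedelta(hours=4), None),
    (start + timedelta(hours=6), 16.0),
    (start + timedelta(hours=8), None),  # trailing gap: no later neighbour
]
filled = interpolate_gaps(humidity)
```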

Macroclimate data

The macroclimate data consists of measurements from one weather station on the coast of Morocco in Sidi Ifni and two airport weather stations in the cities of Agadir and Guelmim. These weather stations collect many meteorological measurements. The variables recorded differ between the Sidi Ifni weather station and the two airports (Agadir and Guelmim). A description of these variables is below.

Weather Station Variables (Sidi Ifni)

  • Nh - Amount of all the CL cloud present or, if no CL cloud is present, the amount of all the CM cloud present
  • Tx - Maximum air temperature (degrees Celsius) during the past period (not exceeding 12 hours)
  • DD - Mean wind direction (compass points) at a height of 10-12 metres above the earth's surface over the 10-minute period immediately preceding the observation
  • tR - The period of time during which the specified amount of precipitation was accumulated
  • Tn - Minimum air temperature (degrees Celsius) during the past period (not exceeding 12 hours)
  • ff10 - Maximum gust value at a height of 10-12 metres above the earth's surface over the 10-minute period immediately preceding the observation (meters per second)
  • Tg - The minimum soil surface temperature at night. (degrees Celsius)
  • Td - Dewpoint temperature at a height of 2 metres above the earth's surface (degrees Celsius)
  • Date / Local time - Local time in this location. Summer time (Daylight Saving Time) is taken into consideration
  • Po - Atmospheric pressure at weather station level (millimeters of mercury)
  • E' - State of the ground with snow or measurable ice cover.
  • Ff - Mean wind speed at a height of 10-12 metres above the earth's surface over the 10-minute period immediately preceding the observation (meters per second)
  • RRR - Amount of precipitation (millimeters)
  • E - State of the ground without snow or measurable ice cover
  • H - Height of the base of the lowest clouds (m)
  • ff3 - Maximum gust value at a height of 10-12 metres above the earth's surface between the periods of observations (meters per second)
  • sss - Snow depth (cm)
  • N - Total cloud cover
  • P - Atmospheric pressure reduced to mean sea level (millimeters of mercury)
  • U - Relative humidity (%) at a height of 2 metres above the earth
  • T - Air temperature (degrees Celsius) at 2 metre height above the earth's surface
  • VV - Horizontal visibility (km)
  • WW - Present weather reported from a weather station.
  • Ch - Clouds of the genera Cirrus, Cirrocumulus and Cirrostratus
  • Cm - Clouds of the genera Altocumulus, Altostratus and Nimbostratus
  • Cl - Clouds of the genera Stratocumulus, Stratus, Cumulus and Cumulonimbus
  • Pa - Pressure tendency: changes in atmospheric pressure over the last three hours (millimeters of mercury).
  • W2 - Past weather (weather between the periods of observation) 2
  • W1 - Past weather (weather between the periods of observation) 1

Airport Weather Variables (Agadir, Guelmim)

  • W'W' - Recent weather phenomena of operational significance
  • c - Total cloud cover
  • VV - Horizontal visibility (km)
  • DD - Mean wind direction (compass points) at a height of 10-12 metres above the earth's surface over the 10-minute period immediately preceding the observation
  • WW - Special present weather phenomena observed at or near the aerodrome
  • P - Atmospheric pressure reduced to mean sea level (millimeters of mercury)
  • ff10 - Maximum gust value at a height of 10-12 metres above the earth's surface over the 10-minute period immediately preceding the observation (meters per second)
  • U - Relative humidity (%) at a height of 2 metres above the earth
  • T - Air temperature (degrees Celsius) at 2 metre height above the earth's surface
  • Ff - Mean wind speed at a height of 10-12 metres above the earth's surface over the 10-minute period immediately preceding the observation (meters per second)
  • Td - Dewpoint temperature at a height of 2 metres above the earth's surface (degrees Celsius)
  • Date / Local time - Local time in this location. Summer time (Daylight Saving Time) is taken into consideration
  • Po - Atmospheric pressure at weather station level (millimeters of mercury)

Dates and times available

For the macroclimate data, historical measurements are provided for the length of the period of interest. The interval depends on the weather station and there are some brief missing time periods in the dataset.

Note: The macroclimate data has not been divided into test/training sets. The data is provided thanks to rp5.ru.
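Because the macroclimate observation interval differs by station, the observations have to be lined up with the 2-hour prediction times before they can be used as features. One possible approach, sketched below, is to take for each prediction time the most recent station observation at or before it and to ignore observations that are too old. The helper name, the 6-hour staleness limit and the 3-hourly sample observations are assumptions for illustration. The macroclimate timestamps are given in local time with daylight saving applied, so you may first need to put both data sources on a common time base.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def latest_at_or_before(obs_times, obs_values, target, max_age=timedelta(hours=6)):
    """Return the most recent observation at or before target, or None.

    obs_times must be sorted ascending. Observations older than
    max_age (an illustrative limit) are treated as unavailable.
    """
    i = bisect_right(obs_times, target) - 1
    if i < 0 or target - obs_times[i] > max_age:
        return None
    return obs_values[i]


# Synthetic air temperature (T) observations every 3 hours (not real data).
station_times = [
    datetime(2013, 11, 24, 0),
    datetime(2013, 11, 24, 3),
    datetime(2013, 11, 24, 6),
]
station_temp = [14.0, 13.2, 12.5]

# The 2-hour prediction grid: 00:00, 02:00, 04:00, 06:00.
grid = [datetime(2013, 11, 24, 0) + timedelta(hours=2 * i) for i in range(4)]
aligned = [latest_at_or_before(station_times, station_temp, t) for t in grid]
```

Looking only backwards in time means a feature never contains an observation made after the moment being predicted.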

Target variable

The target variable for this competition is the yield of the fog net array. This measures how much water the nets collect. Water that condenses on the net runs down a gutter to a tipping bucket. Every time the bucket fills, a counter increases. The yield measures the number of tippings across the net system for each 2-hour period.

  • yield - the (rescaled) amount of water that the system yielded as measured at a particular time

Submission format

The submission format matches the target variable: you must predict a float value of yield for each of the time periods in the submission format file.

yield
2013-11-24 00:00:00 0.0
2013-11-24 02:00:00 0.0
2013-11-24 04:00:00 0.0
2013-11-24 06:00:00 0.0
2013-11-24 08:00:00 0.0
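A minimal sketch of writing predictions in this shape follows. It assumes the file is comma-separated with an unnamed timestamp column followed by a yield column; that layout is a guess from the sample above, so treat the provided submission format file as the authority and keep its rows and timestamps exactly as given. The function name and the predictions are placeholders.

```python
import csv
import io
from datetime import datetime, timedelta


def write_submission(out, timestamps, predictions):
    """Write one float prediction per timestamp to a file-like object.

    Assumed layout: an unnamed timestamp column, then a 'yield' column.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["", "yield"])
    for ts, pred in zip(timestamps, predictions):
        writer.writerow([ts.strftime("%Y-%m-%d %H:%M:%S"), float(pred)])


# Placeholder timestamps and all-zero predictions, mirroring the sample rows.
start = datetime(2013, 11, 24, 0, 0)
timestamps = [start + timedelta(hours=2 * i) for i in range(5)]
predictions = [0.0] * len(timestamps)

buffer = io.StringIO()  # use open("submission.csv", "w", newline="") for a real file
write_submission(buffer, timestamps, predictions)
print(buffer.getvalue())
```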

One last word

Remember, there is a time-based component to this problem. The algorithms that generalize best will not contaminate their predictions with data from the future that would not be available at prediction time.
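One way to check a model under this constraint is to validate on a held-out window while training only on rows that come strictly before it. The sketch below is an assumption about how you might set that up locally, not a description of how the test set is scored. The 4-day window length mirrors the test windows described earlier, and the function name and dates are illustrative.

```python
from datetime import datetime, timedelta


def past_only_split(timestamps, val_start, val_length=timedelta(days=4)):
    """Split row indices so that training rows strictly precede validation.

    Returns (train_indices, val_indices). Rows after the validation
    window are deliberately left out of both sets.
    """
    val_end = val_start + val_length
    train = [i for i, ts in enumerate(timestamps) if ts < val_start]
    val = [i for i, ts in enumerate(timestamps) if val_start <= ts < val_end]
    return train, val


# Ten days of synthetic 2-hour timestamps (12 rows per day).
start = datetime(2013, 11, 24, 0, 0)
timestamps = [start + timedelta(hours=2 * i) for i in range(120)]

train_idx, val_idx = past_only_split(timestamps, val_start=datetime(2013, 11, 30))
```

Repeating the split with several different val_start values gives a series of past-only folds to average over.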

Good luck and enjoy this problem! If you have any questions you can always visit the user forum!