MagNet: Model the Geomagnetic Field Hosted By NOAA

Competition status: complete
Prize pool: $30,000

Woohoo! This competition has come to a close!

Many thanks to the participants for all of their hard work and commitment to using data for good!

Code submission format

In a typical competition, you would craft your algorithms and generate outputs for the evaluation dataset on your local machine. Then you would submit the output to the competition for scoring.

For this competition, you'll submit your model files with the code to make predictions, and we will generate outputs for the evaluation dataset in a containerized runtime in our cloud environment. The runtime repository contains the complete specification for the runtime.

What to submit

Your final submission should be a zip archive named with the extension .zip (for example, submission.zip). The root level of submission.zip must contain a predict.py that implements a function predict_dst, which takes up to seven days' worth of data and makes predictions for the current hour t and the following hour t+1, as follows:

import pandas as pd
from typing import Tuple


def predict_dst(
    solar_wind_7d: pd.DataFrame,
    satellite_positions_7d: pd.DataFrame,
    latest_sunspot_number: float,
) -> Tuple[float, float]:
    """
    Take all of the data up until time t-1, and then make predictions for
    times t and t+1.

    Parameters
    ----------
    solar_wind_7d: pd.DataFrame
        The last 7 days of solar wind data up until (t - 1) minutes [exclusive of t]
    satellite_positions_7d: pd.DataFrame
        The last 7 days of satellite position data up until the present time [inclusive of t]
    latest_sunspot_number: float
        The latest monthly sunspot number (SSN) to be available

    Returns
    -------
    predictions : Tuple[float, float]
        A tuple of two predictions, for (t and t + 1 hour) respectively; these should
        be between -2,000 and 500.
    """

    ########################################################################
    #                         YOUR CODE HERE!                              #
    ########################################################################

    # this is a naive baseline where we just guess the training data mean every time
    prediction_at_t0 = -12
    prediction_at_t1 = -12

    return prediction_at_t0, prediction_at_t1

This function will be called by our main loop (see main.py) many times with seven days of data at a time. We do not guarantee any particular order and we expect that you will not try to maintain state between calls. Making fast predictions on a small subset of past data is an explicit design goal of this challenge.

Note: Your predictions must be in the physically plausible region between -2,000 and 500. Predictions outside this region will cause your submission to be rejected.
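Since out-of-range (or NaN) predictions cause a submission to be rejected, it can be worth clamping values just before returning them. A minimal sketch; the `clamp_dst` helper and its -12.0 fallback are illustrative, not part of the required API:

```python
def clamp_dst(value: float, lo: float = -2000.0, hi: float = 500.0,
              fallback: float = -12.0) -> float:
    """Clamp a Dst prediction into the accepted [-2000, 500] range.

    NaN (which compares unequal to itself) falls back to a safe in-range
    constant; -12.0 here is just the naive baseline value from above.
    """
    if value != value:  # NaN check without extra imports
        return fallback
    return max(lo, min(hi, value))
```

Calling this on both elements of the returned tuple is a cheap last line of defense against a model occasionally producing an extreme value.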

Your code will not have network access, so you should package any necessary resources into your submission. Your function may load model artifacts, call into other Python files, and use other resources you have packaged into the zipped submission. You may not load the data files in /data directly.
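Because there is no network access, model weights have to travel inside submission.zip and be loaded from disk at prediction time. One way to do this robustly is to resolve paths relative to predict.py itself, so loading works regardless of the container's working directory. A sketch; the model.pkl filename is a hypothetical example:

```python
import pickle
from pathlib import Path

# Resolve paths relative to this file, not the current working directory.
ASSETS_DIR = Path(__file__).resolve().parent


def load_model(filename: str = "model.pkl"):
    """Load a pickled model shipped alongside predict.py in submission.zip."""
    with open(ASSETS_DIR / filename, "rb") as f:
        return pickle.load(f)
```

Loading the model once at module import time (rather than inside predict_dst) keeps each prediction call fast, which matters given the per-prediction time limit.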

The data that gets passed to your predict_dst function is identical to the data described in the problem description, but limited to the seven days leading up to a prediction time. Here is what these look like assuming that you are at timedelta 44 days 00:00:00 and making a Dst prediction for t=44 days 00:00:00 (now) and t+1=44 days 01:00:00 (one hour from now):

solar_wind_7d

The solar wind data is provided per minute, so each seven-day dataframe will have 10,080 rows (7 × 24 × 60), like this:

                  bx_gse  by_gse  bz_gse  theta_gse  phi_gse  bx_gsm  by_gsm  bz_gsm  theta_gsm  phi_gsm    bt  density   speed  temperature  source
timedelta
37 days 00:00:00   -5.26    2.45    1.62      15.46   155.34   -5.26    2.45    1.62      15.46   155.34  6.12     3.65  353.56     119329.0      ac
37 days 00:01:00   -5.38    2.23    1.90      17.82   157.79   -5.38    2.23    1.90      17.82   157.79  6.21     3.92  354.05     103905.0      ac
37 days 00:02:00   -5.31    1.85    1.94      18.73   161.07   -5.31    1.85    1.94      18.73   161.07  6.05     4.18  353.87     102326.0      ac
37 days 00:03:00   -5.25    1.64    1.90      18.86   162.75   -5.25    1.64    1.90      18.86   162.75  5.88     4.15  350.32     109681.0      ac
...                  ...     ...     ...        ...      ...     ...     ...     ...        ...      ...   ...      ...     ...          ...     ...
43 days 23:57:00   -6.43    0.41    2.27      19.13   176.29   -6.43    0.41    2.27      19.13   176.29  6.93     2.78  393.93      30021.0      ac
43 days 23:58:00   -6.49    0.37    2.29      19.06   176.73   -6.49    0.37    2.29      19.06   176.73  6.98     3.00  394.35      27075.0      ac
43 days 23:59:00     NaN     NaN     NaN        NaN      NaN     NaN     NaN     NaN        NaN      NaN   NaN      NaN     NaN          NaN     NaN

As you can see, some of these may be missing (as seen in the last row). You will have to choose a sensible way to handle missing (NaN) values.
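One common approach, shown here as a sketch rather than a required method, is to linearly interpolate short gaps in the numeric columns and then forward/back-fill anything left at the edges of the window:

```python
import pandas as pd


def impute_solar_wind(df: pd.DataFrame) -> pd.DataFrame:
    """Fill NaNs in the numeric solar wind columns.

    Linear interpolation handles isolated missing minutes; ffill/bfill
    cover gaps at the start or end of the 7-day window. The non-numeric
    'source' column is left untouched.
    """
    out = df.copy()
    numeric_cols = out.select_dtypes(include="number").columns
    out[numeric_cols] = out[numeric_cols].interpolate().ffill().bfill()
    return out
```

Whether interpolation, a sentinel value, or a model that tolerates missing inputs works best is a modeling decision; the point is only that the NaNs must be handled deliberately before prediction.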

satellite_positions_7d (pandas DataFrame)

This is daily data, so you will get seven rows:

gse_x_ace gse_y_ace gse_z_ace gse_x_dscovr gse_y_dscovr gse_z_dscovr
timedelta
38 days 1544159.2 -162085.4 86051.1 NaN NaN NaN
39 days 1543593.2 -169941.3 75850.4 NaN NaN NaN
40 days 1542170.1 -175305.5 71280.5 NaN NaN NaN
41 days 1540515.5 -180435.9 66649.7 NaN NaN NaN
42 days 1538486.4 -185278.1 61941.0 NaN NaN NaN
43 days 1536138.6 -189933.8 57176.8 NaN NaN NaN
44 days 1533530.2 -194457.0 52370.3 NaN NaN NaN

latest_sunspot_number (float)

Since these SSNs come only once per month, you will simply get the latest one, e.g. 76.9.
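Putting the three inputs together, a typical predict_dst implementation first collapses them into a fixed-length feature vector. The channels and statistics below are illustrative choices, not a prescribed feature set:

```python
import numpy as np
import pandas as pd


def build_features(
    solar_wind_7d: pd.DataFrame,
    satellite_positions_7d: pd.DataFrame,
    latest_sunspot_number: float,
) -> np.ndarray:
    """Collapse the three inputs into one flat feature vector.

    Uses mean/std of a few solar wind channels over the full window and
    over the most recent hour, plus the latest ACE position and the SSN.
    """
    channels = ["bt", "density", "speed", "bz_gsm"]
    sw = solar_wind_7d[channels]
    last_hour = sw.tail(60)  # 60 one-minute rows ~= the most recent hour
    stats = [sw.mean(), sw.std(), last_hour.mean(), last_hour.std()]
    pos = satellite_positions_7d[["gse_x_ace", "gse_y_ace", "gse_z_ace"]].iloc[-1]
    vec = pd.concat(stats + [pos]).to_numpy(dtype=float)
    return np.append(vec, latest_sunspot_number)
```

A vector like this (here 4 channels × 4 statistics + 3 position values + 1 SSN = 20 features) can then feed any regressor that predicts the two Dst values.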

For more detail on how to create and test your submission, visit the runtime repository.
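When assembling the archive locally, remember that predict.py must sit at the root of the zip, not inside a subdirectory. The standard library's zipfile module makes this easy to get right; a sketch, assuming your files live in the current directory:

```python
import zipfile
from pathlib import Path


def make_submission(files, out_path="submission.zip"):
    """Zip the given files at the archive root, as the checker expects."""
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            f = Path(f)
            # arcname strips any directory prefix so entries land at the root
            zf.write(f, arcname=f.name)
    return out_path
```

For example, make_submission(["predict.py", "model.pkl"]) produces a submission.zip whose entries are predict.py and model.pkl at the top level (the file names here are hypothetical).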

Runtime

Your code is executed within a container that is defined in our runtime repository. The limits are as follows:

  • Your submission must be written in Python (3.8.5) and use the packages defined in the runtime repository.
  • Your code may not read the files in /data directly. Doing so is grounds for disqualification. Instead, you will implement a function as described above. Using I/O or global variables to pass information between calls, or other attempts to circumvent the setup of this prediction challenge are grounds for disqualification. If in doubt whether something like this is okay, you may email us or post on the forum.
  • The submission must complete execution in 8 hours or less, and no single prediction can take more than 30 seconds (we expect each prediction to take far less time).
  • The container has access to 4 vCPUs and 14GB RAM. There are no GPUs available.
  • The container will not have network access. All necessary files (code and model assets) must be included in your submission.
  • The container execution will not have root access to the filesystem.

The cluster that executes your code is a resource shared across participants, so we ask that you be conscientious in your use of it. Please add progress information to your logs and cancel jobs that will run longer than the time limit. Canceled jobs won't count against your submission limit, which leaves more resources available to score submissions that will complete on time.

Requesting package installations

Since the Docker container will not have network access, all packages must be pre-installed. We are happy to consider additional packages as long as they are approved by the challenge organizers under operational constraints, do not conflict with the existing environment, and build successfully. Packages must be available through conda for Python 3.8.5. To request that an additional package be added to the Docker image, follow the instructions in the runtime repository.

Happy building! Once again, if you have any questions or issues you can always head on over to the user forum!