Hakuna Ma-data: Identify Wildlife on the Serengeti with AI for Earth Hosted By DrivenData

$20,000

Submission format

Rather than submitting your predicted labels, you'll package everything needed to do inference and submit that for containerized execution on Azure.

What to submit

Your final submission should be a zip archive with the .zip extension (for example, submission.zip). The root level of the archive must contain a main.py or main.R file that performs inference on the test images and writes the predictions to a file named submission.csv in the same directory as the main file. You can see an example of this submission setup here for Python and for R.

Here's an example.

├── inference           # Example directory
│   ├── assets          # An example folder where we include the trained model
│   │   ├── model.json
│   │   └── weights.h5
│   ├── main.R          # One of these files is required
│   └── main.py         # One of these files is required

Note: when the submission is unzipped, main.py or main.R must sit directly in the folder where it is unzipped. One of these files must be present at the root level of the zip archive, with no enclosing folder around it.
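The root-level requirement above can be sketched in Python with the standard zipfile module. This is a minimal illustration, not the organizers' tooling: build_submission and has_entrypoint_at_root are hypothetical helper names, and the sketch assumes your files live in a local folder such as the inference directory shown in the example.

```python
import zipfile
from pathlib import Path


def build_submission(src_dir, zip_path="submission.zip"):
    """Zip the contents of src_dir so its files sit at the archive root."""
    src = Path(src_dir)
    if not ((src / "main.py").is_file() or (src / "main.R").is_file()):
        raise FileNotFoundError(f"main.py or main.R must exist directly in {src}")
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            # Skip directories and the archive itself if it lives inside src_dir.
            if path.is_file() and path.resolve() != zip_path.resolve():
                # arcname is relative to src_dir, so no wrapping folder is stored.
                zf.write(path, arcname=path.relative_to(src).as_posix())
    return zip_path


def has_entrypoint_at_root(zip_path):
    """Return True if main.py or main.R is stored at the top level of the zip."""
    with zipfile.ZipFile(zip_path) as zf:
        names = set(zf.namelist())
    return "main.py" in names or "main.R" in names
```

Calling build_submission("inference") and then has_entrypoint_at_root("submission.zip") is one way to catch an accidental enclosing folder before uploading.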

Submission checklist

  • Submission includes main.py or main.R in the root directory of the zip. There can be extra files with more code that is called (see assets folder in the example above).
  • Submission contains any model weights that need to be loaded. There will be no network access.
  • Script loads the data for inference from the data folder in the root directory. All images for inference are in the root level of the data folder. This folder is read-only.
  • Script writes submission.csv to the root directory when inference is finished. This file must match the submission format exactly.
  • Use the versions of submission_format.csv and test_metadata.csv from the data folder provided to the container. Do not include these files with your submission. Do not read them from other locations.
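The checklist above can be sketched as a skeleton for main.py. This is a rough outline under stated assumptions, not the benchmark code: it assumes submission_format.csv has a seq_id column followed by one column per category (as in the example further down), that the data folder path is passed in by the caller, and predict_sequence is a hypothetical placeholder that returns uniform probabilities instead of running a real model.

```python
import csv
from pathlib import Path


def predict_sequence(seq_id, categories):
    """Hypothetical placeholder, NOT a real model: uniform probabilities."""
    p = 1.0 / len(categories)
    return {category: p for category in categories}


def run_inference(data_dir, out_path):
    """Read the provided submission format and write predictions to out_path."""
    data_dir = Path(data_dir)
    # Take the header and sequence IDs from the copy provided to the container.
    with open(data_dir / "submission_format.csv", newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        seq_ids = [row[0] for row in reader if row]
    categories = header[1:]

    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)  # same columns, same order as the format file
        for seq_id in seq_ids:
            probs = predict_sequence(seq_id, categories)
            writer.writerow([seq_id] + [probs[c] for c in categories])
    return len(seq_ids)


# In main.py this might be called as (paths are assumptions):
#   root = Path(__file__).resolve().parent
#   run_inference(root / "data", root / "submission.csv")
```

Reusing the header and row order from the provided format file, rather than rebuilding them, is a simple way to keep the output aligned with the required format.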

If you'd like to replicate how your submission will run online, you can run run.sh, which builds the container and executes the submission.

Submission format

The submission file with predictions contains the sequence ID, followed by one column per category holding a floating point probability that the category is present in the sequence. Probabilities must be floats between 0.0 and 1.0. Your submission.csv must match this format exactly.

For example, if you predicted there's a 99% chance of an aardvark for the first five sequences:

seq_id  aardvark  aardwolf  baboon  bat  batearedfox  buffalo  ...  zorilla
ABCEY   0.99      0.0       0.0     0.0  0.0          0.0      ...  0.0
ABCFN   0.99      0.0       0.0     0.0  0.0          0.0      ...  0.0
ABCLU   0.99      0.0       0.0     0.0  0.0          0.0      ...  0.0
ABCOW   0.99      0.0       0.0     0.0  0.0          0.0      ...  0.0
ABCPO   0.99      0.0       0.0     0.0  0.0          0.0      ...  0.0

The submission.csv file that you write would look like:

seq_id,aardvark,aardwolf,baboon,bat,batearedfox,buffalo,bushbuck,caracal,cattle,cheetah,civet,dikdik,duiker,eland,elephant,empty,gazellegrants,gazellethomsons,genet,giraffe,guineafowl,hare,hartebeest,hippopotamus,honeybadger,hyenaspotted,hyenastriped,impala,insectspider,jackal,koribustard,leopard,lionfemale,lionmale,mongoose,monkeyvervet,ostrich,otherbird,porcupine,reedbuck,reptiles,rhinoceros,rodents,secretarybird,serval,steenbok,topi,vulture,warthog,waterbuck,wildcat,wildebeest,zebra,zorilla
ABCEY,0.99,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
ABCFN,0.99,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
ABCLU,0.99,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
ABCOW,0.99,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
ABCPO,0.99,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
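Because the file must match the format exactly, a local check before packaging can help. The sketch below is an illustration only and is not the platform's own validation: validate_submission is a hypothetical helper, and it assumes you have a local copy of submission_format.csv to compare against.

```python
import csv


def validate_submission(submission_path, format_path):
    """Return a list of problems found; an empty list means the checks passed."""
    with open(format_path, newline="") as f:
        expected = [row for row in csv.reader(f) if row]
    with open(submission_path, newline="") as f:
        actual = [row for row in csv.reader(f) if row]

    if not expected or not actual:
        return ["one of the files is empty"]

    problems = []
    if actual[0] != expected[0]:
        problems.append("header does not match the submission format")
    if [r[0] for r in actual[1:]] != [r[0] for r in expected[1:]]:
        problems.append("seq_id column does not match the submission format")

    for line_no, row in enumerate(actual[1:], start=2):
        if len(row) != len(expected[0]):
            problems.append(f"row {line_no}: wrong number of columns")
            continue
        for value in row[1:]:
            try:
                p = float(value)
            except ValueError:
                problems.append(f"row {line_no}: {value!r} is not a float")
                break
            # NaN fails this comparison as well, so it is reported too.
            if not 0.0 <= p <= 1.0:
                problems.append(f"row {line_no}: {value} is outside [0.0, 1.0]")
                break
    return problems
```

An empty return value only means these particular checks passed; the official scoring may apply further checks that this sketch does not cover.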

Runtime

Your code is executed within a container that is defined in our runtime repository. The limits are as follows:

  • The container must complete execution within a timeout of 8 hours. We expect most submissions will complete much more quickly (the benchmark runs in 20 minutes) and computation time per participant will be monitored to prevent abuse.
  • The container runtime has access to a single GPU. All of your code should run within the GPU environments in the container, even if actual computation happens on the CPU. (CPU environments are provided within the container for local debugging only.)
  • The container has access to 5 vCPUs powered by an Intel Xeon E5-2690 chip and 80GB RAM.
  • The container has 1 Tesla V100 GPU with 16GB of memory.
  • The container will not have network access. All necessary files (code and model assets) must be included in your submission.
  • The container execution will not have root access to the filesystem.

Requesting package installations

Since the Docker container will not have network access, all packages must be pre-installed. We are happy to add packages as long as they do not conflict and can build successfully.

To request that an additional package be added to the Docker image, follow the instructions in the runtime repository where the container is defined.