Hakuna Ma-data: Identify Wildlife on the Serengeti with AI for Earth Hosted By DrivenData
Rather than submitting your predicted labels, you'll package everything needed to do inference and submit that for containerized execution on Azure.
What to submit
Your final submission should be a zip archive named with the extension .zip (for example, submission.zip). The root level of the submission.zip file must contain a main.py or main.R which performs inference on the test images and writes the predictions to a file named submission.csv in the same directory as the main file. Example submission setups are available for both Python and R.
Here's an example.
├── inference              # Example directory
│   ├── assets             # An example folder where we include the trained model
│   │   ├── model.json
│   │   └── weights.h5
│   ├── main.R             # One of these files is required
│   └── main.py            # One of these files is required
Note: be sure that when you unzip the submission, either main.py or main.R exists in the folder where you unzip. One of these files needs to be present at the root level of the zip archive. There should be no folder that contains them.
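The root-level requirement is easy to get wrong when zipping a folder from a file manager, because the enclosing folder often ends up inside the archive. Below is a minimal sketch of one way to package a working directory so that its contents, not the directory itself, land at the archive root. The function names `build_submission` and `has_root_entrypoint` are hypothetical helpers written for this illustration and are not part of the competition tooling.

```python
import zipfile
from pathlib import Path


def build_submission(src_dir, zip_path="submission.zip"):
    """Zip the *contents* of src_dir so main.py / main.R sit at the archive root."""
    src = Path(src_dir)
    out = Path(zip_path).resolve()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            # Skip directories and the archive itself if it lives inside src_dir.
            if path.is_file() and path.resolve() != out:
                # A path relative to src_dir drops the enclosing folder name.
                zf.write(path, arcname=path.relative_to(src).as_posix())
    return out


def has_root_entrypoint(zip_path):
    """Return True if main.py or main.R is at the top level of the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        names = set(zf.namelist())
    return bool({"main.py", "main.R"} & names)
```

Calling `build_submission("inference")` on the example layout above and then `has_root_entrypoint("submission.zip")` is a quick local check before uploading.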
- Submission includes main.py or main.R in the root directory of the zip. There can be extra files with more code that is called (see the assets folder in the example above).
- Submission contains any model weights that need to be loaded. There will be no network access.
- Script loads the data for inference from the data folder in the root directory. All images for inference are in the root level of the data folder. This folder is read-only.
- Script writes submission.csv to the root directory when inference is finished. This file must match the submission format exactly.
- Use the versions of the files in the data folder provided to the container. Do not include these files with your submission. Do not read them from other locations.
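The checklist above can be sketched as the skeleton of a main.py. This is only an illustration under stated assumptions: the image extensions are guesses, how images are grouped into sequence IDs is not specified here and is left to the caller, and model loading is omitted entirely. The helper names `list_images` and `write_submission` are hypothetical.

```python
import csv
from pathlib import Path


def list_images(data_dir):
    """Return image files found directly in data_dir (root level only, no recursion)."""
    exts = {".jpg", ".jpeg", ".png"}  # assumption: actual extensions may differ
    return sorted(
        p for p in Path(data_dir).iterdir()
        if p.is_file() and p.suffix.lower() in exts
    )


def write_submission(rows, categories, out_path="submission.csv"):
    """Write predictions as CSV.

    rows: iterable of (seq_id, {category: probability}) pairs.
    categories: column names in the exact order required by the submission format.
    """
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["seq_id", *categories])
        for seq_id, probs in rows:
            # Missing categories default to 0.0; values are forced to float.
            writer.writerow([seq_id, *(float(probs.get(c, 0.0)) for c in categories)])
```

Because the data folder is read-only, the sketch writes submission.csv to a separate path (by default the current working directory) rather than next to the images.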
The format for the submission file with predictions is the sequence ID, followed by all categories, with a floating point representation of the probability that each category is present in the sequence. Probabilities must be floats and can range between 0.0 and 1.0. Your submission.csv must match this format exactly.
For example, a submission.csv file that you submit would look like:
seq_id,aardvark,aardwolf,baboon,bat,batearedfox,buffalo,bushbuck,caracal,cattle,cheetah,civet,dikdik,duiker,eland,elephant,empty,gazellegrants,gazellethomsons,genet,giraffe,guineafowl,hare,hartebeest,hippopotamus,honeybadger,hyenaspotted,hyenastriped,impala,insectspider,jackal,koribustard,leopard,lionfemale,lionmale,mongoose,monkeyvervet,ostrich,otherbird,porcupine,reedbuck,reptiles,rhinoceros,rodents,secretarybird,serval,steenbok,topi,vulture,warthog,waterbuck,wildcat,wildebeest,zebra,zorilla
ABCEY,0.99,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
ABCFN,0.99,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
ABCLU,0.99,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
ABCOW,0.99,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
ABCPO,0.99,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
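Since the file must match the format exactly, it may help to check it locally before submitting. The following is a minimal sketch, not the platform's actual validator: `validate_submission` is a hypothetical helper, the caller is expected to pass the category names in the same order as the example header, and the accepted probability range of 0.0 to 1.0 inclusive is an assumption.

```python
import csv


def validate_submission(csv_path, categories):
    """Return a list of problems found; an empty list means no issue was detected."""
    problems = []
    expected = ["seq_id", *categories]
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header != expected:
            # Column names and their order must both match.
            return ["header does not match the expected columns"]
        for line_no, row in enumerate(reader, start=2):
            if len(row) != len(expected):
                problems.append(
                    f"line {line_no}: expected {len(expected)} fields, got {len(row)}"
                )
                continue
            for name, value in zip(categories, row[1:]):
                try:
                    p = float(value)
                except ValueError:
                    problems.append(f"line {line_no}: {name} is not a float: {value!r}")
                    continue
                # Assumed range; NaN also fails this comparison and is reported.
                if not 0.0 <= p <= 1.0:
                    problems.append(f"line {line_no}: {name} out of range: {p}")
    return problems
```

A call such as `validate_submission("submission.csv", categories)` returns an empty list when the header and every row pass these checks.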
Your code is executed within a container that is defined in our runtime repository. The limits are as follows:
- The container must complete execution within a timeout of 8 hours. We expect most submissions will complete much more quickly (the benchmark runs in 20 minutes), and computation time per participant will be monitored to prevent abuse.
- The container runtime has access to a single GPU. All of your code should run within the GPU environments in the container, even if actual computation happens on the CPU. (CPU environments are provided within the container for local debugging only.)
- The container has access to 5 vCPUs powered by an Intel Xeon E5-2690 chip and 80GB RAM.
- The container has 1 Tesla V100 GPU with 16GB of memory.
- The container will not have network access. All necessary files (code and model assets) must be included in your submission.
- The container execution will not have root access to the filesystem.
Requesting package installations
Since the Docker container will not have network access, all packages must be pre-installed. We are happy to add packages as long as they do not conflict with existing packages and can build successfully.
To request an additional package be added to the docker image, follow the instructions in the runtime repository where the container is defined.