Pri-matrix Factorization Hosted By Max Planck Institute for Evolutionary Anthropology
Your model should identify the animals (or lack thereof) in a given Chimp&See video. There are 24 categories in total: 23 animal categories plus 1 category corresponding to no animal. Each video is identified by a 10-character alphanumeric string followed by a .mp4 extension, e.g. abcde12345.mp4. This index is referred to as the video's filename. Given a video file and its filename as input, your trained model should output a list of 24 probabilities corresponding to the model's confidence that each respective category is present in the video.
We have used the crowd-sourced annotations from Chimp&See to generate ground truth labels for each video in the dataset. Some videos have no animals in them, in which case the blank category of the video's labels will be 1 and all other columns will be 0. Otherwise, if a species is present, its entry will be 1. Multiple species may be present!
Wisdom of the masses: a note on crowdsourcing the truth. We have taken many steps to go from raw annotations to a well-labeled dataset. This includes enforcing certain thresholds on how many user annotations are required to accept a label as well as thresholds related to percentages of user agreement. That said, this technique for leveraging crowdsourced data is uncharted territory and there is bound to be some noise!
The features in this dataset
The only features in this challenge are the videos themselves, named as subject_id.mp4. Each video is 15 seconds long, but it's unlikely that you'll need all 15 seconds of frames to make a good prediction. Whether or not you downsample the videos is up to you!
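One simple way to use fewer frames is to pick a handful of evenly spaced frame indices per clip and decode only those. The sketch below is an illustration, not part of the challenge materials: the function name sample_frame_indices is hypothetical, and the frame count used in the usage note (450, i.e. 15 seconds at an assumed 30 frames per second) is an assumption, since the frame rate is not stated here.

```python
from typing import List


def sample_frame_indices(total_frames: int, n_samples: int) -> List[int]:
    """Return up to n_samples frame indices spread evenly across a clip.

    The clip is split into equal-width segments and the midpoint frame
    of each segment is chosen, so the very first and last frames are avoided.
    """
    if total_frames <= 0 or n_samples <= 0:
        return []
    # Never ask for more frames than the clip contains.
    n = min(n_samples, total_frames)
    return [int((i + 0.5) * total_frames / n) for i in range(n)]
```

For a clip with 450 frames, sample_frame_indices(450, 5) selects one frame from the middle of each 3-second segment. The returned indices can then be passed to whichever video decoding library you prefer.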
To help facilitate faster model prototyping, we've created two downsampled versions of the dataset, referred to as "Micro" and "Nano." See the table below for details about each version.
|Dataset Version|Size|Resolution (px)|Audio Channel|
|Raw|1 TB|960 × 540 (typically, but not all uniform)|yes|
|Micro|3.46 GB|64 × 64|no|
|Nano|1.4 GB|16 × 16|no|
There are 24 categories, each of which may be present or absent in a given video. If the blank label is present, all other categories will be absent. Multiple non-blank categories may be present at once.
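The label rules above can be expressed as a small consistency check, which may be handy when loading the training labels. This is a minimal sketch under stated assumptions: the function name is_valid_label is hypothetical, a label is assumed to be a dict mapping category names to 0 or 1, and the rule that a non-blank video must contain at least one animal is inferred from the description rather than stated explicitly.

```python
from typing import Dict


def is_valid_label(label: Dict[str, int], blank_key: str = "blank") -> bool:
    """Check one video's label against the blank/non-blank rules."""
    # Every entry must be a binary indicator.
    if not all(value in (0, 1) for value in label.values()):
        return False
    animals_present = sum(v for k, v in label.items() if k != blank_key)
    if label.get(blank_key, 0) == 1:
        # A blank video must not contain any animal category.
        return animals_present == 0
    # Assumption: a non-blank video shows at least one animal category.
    return animals_present >= 1
```

For instance, a label with blank set to 1 and bird set to 1 is rejected, while a label with both bird and human set to 1 and blank set to 0 is accepted.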
Video label example
For example, a single label in the dataset may have these values, indicating the presence or absence of categories in video abc0000123.mp4:
Performance is evaluated according to a mean aggregated binary log loss. For each possible category in a video, the binary log loss is computed, and the results are then summed (this accounts for the potential presence of multiple species in a single video). The sum of the binary losses represents the total loss for the video. The competitor that minimizes the mean value of this loss over all test cases will top the leaderboard.
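To make the metric concrete, here is a minimal sketch of how it could be computed locally. It follows the description above (sum the per-category binary log losses within a video, then average over videos), but it is not the official scoring code: the function names are hypothetical, and clipping probabilities with a small eps to avoid log(0) is an assumption about how extreme predictions are handled.

```python
import math
from typing import Sequence


def video_log_loss(y_true: Sequence[int], y_pred: Sequence[float],
                   eps: float = 1e-15) -> float:
    """Sum of binary log losses over all categories of one video."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Assumption: clip predictions so that log(0) never occurs.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total


def mean_aggregated_log_loss(labels: Sequence[Sequence[int]],
                             predictions: Sequence[Sequence[float]],
                             eps: float = 1e-15) -> float:
    """Mean over videos of the per-video summed binary log loss."""
    losses = [video_log_loss(t, p, eps) for t, p in zip(labels, predictions)]
    return sum(losses) / len(losses)
```

Because each category contributes its own term, a confident wrong prediction in any single column is penalized heavily, even if every other column is correct.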
The format for the submission file is the video's filename (the subject_id column), followed by one column per category containing a floating point representation of the probability that category X is present in the video. In the extreme case, every non-blank category could be, say, 0.99, which would indicate strong confidence that all 23 animal categories are present in the video.
The .csv file that you submit would look like:
subject_id,bird,blank,cattle,chimpanzee,elephant,forest buffalo,gorilla,hippopotamus,human,hyena,large ungulate,leopard,lion,other (non-primate),other (primate),pangolin,porcupine,reptile,rodent,small antelope,small cat,wild dog,duiker,hog
abc0000001.mp4,0.7777777777777778,0.0,0.0,0.0,0.0,0.1111111111111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1111111111111111,0.0,0.0,0.0
abc0000002.mp4,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
abc0000003.mp4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.875,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0
abc0000004.mp4,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
abc0000005.mp4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
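A submission in this layout can be produced with Python's standard csv module. The sketch below is one possible approach, not an official tool: the function name write_submission is hypothetical, the column order is copied from the example header above, and the check that every row carries exactly one probability per category is an added safeguard.

```python
import csv
import io
from typing import Iterable, Sequence, TextIO, Tuple

# Column order copied from the example submission header.
CATEGORIES = [
    "bird", "blank", "cattle", "chimpanzee", "elephant", "forest buffalo",
    "gorilla", "hippopotamus", "human", "hyena", "large ungulate", "leopard",
    "lion", "other (non-primate)", "other (primate)", "pangolin", "porcupine",
    "reptile", "rodent", "small antelope", "small cat", "wild dog", "duiker",
    "hog",
]


def write_submission(rows: Iterable[Tuple[str, Sequence[float]]],
                     out: TextIO) -> None:
    """Write (filename, probabilities) pairs as a submission CSV."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["subject_id"] + CATEGORIES)
    for name, probs in rows:
        # Guard against rows with too few or too many probabilities.
        if len(probs) != len(CATEGORIES):
            raise ValueError(
                f"{name}: expected {len(CATEGORIES)} values, got {len(probs)}"
            )
        writer.writerow([name] + [float(p) for p in probs])


# Usage: write one all-zero row to an in-memory buffer.
buffer = io.StringIO()
write_submission([("abc0000001.mp4", [0.0] * len(CATEGORIES))], buffer)
```

To write to disk instead, pass a file opened with open(path, "w", newline="") in place of the in-memory buffer.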
If you're wondering how to get started, check out our benchmark blog post!
Good luck and enjoy this problem! If you have any questions you can always visit the user forum!