Pose Bowl: Detection Track

Develop algorithms for use on inspector spacecraft that take and process photos of other ships. In the Detection Track, solutions will identify the boundaries of generic spacecraft in photos.

$12,000 in prizes
May 2024
651 joined

In this challenge, you will help to develop new methods for conducting spacecraft inspections by identifying the boundaries of a target spacecraft in an image.

The challenge consists of a spacecraft detection track and a pose estimation track. In the spacecraft detection track (that's this one!) your task will be to draw bounding boxes around the spacecraft in an image. In the pose estimation track, you will identify changes in the position and orientation of the chaser spacecraft camera across a sequence of images.


Spacecraft inspection, in the context of this challenge, is the process of closely examining a spacecraft in orbit to assess its condition and functionality. Currently, the most common methods for spacecraft inspection either rely on astronauts, which can be time-consuming and dangerous, or use expensive and heavy equipment such as LIDAR sensors and robotic arms. Instead, the methods being explored in this challenge involve a small inspector vehicle (or "chaser") that is deployed from a host spacecraft in order to inspect that same host spacecraft. Compared with existing methods, the inspector vehicle is relatively inexpensive, making use of affordable, lightweight cameras and off-the-shelf hardware.

This challenge is specifically geared toward addressing two of the main operational hurdles in spacecraft inspection. First, the images in the challenge dataset require that successful solutions work on a variety of different, unknown, and potentially damaged spacecraft types. Second, solutions must run in our code execution platform in an environment that simulates the small, off-the-shelf computer board on a NASA R5 spacecraft that demonstrates inspection technologies.



The data for this challenge consists of simulated images of spacecraft taken from a nearby location in space, as if from the perspective of a chaser spacecraft. Images were created in the open-source 3D software Blender, using models of representative host spacecraft against simulated backgrounds. A limited number of models and backgrounds were used to generate the images; models and backgrounds appear in multiple images.

Distortions were applied to some images in post-processing to realistically simulate image imperfections that result from camera defects or field conditions. The distortions include blur (e.g., from a moving chaser vehicle camera), hot pixels (a common defect in which some pixels are stuck at an intensity value of 0 or 255), and random noise (a typical distortion applied to encourage generalizability and robustness).
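
To get a feel for how such distortions affect a detector, here is a dependency-light sketch of similar augmentations. The exact blur kernel, noise level, and hot-pixel rate used in the challenge pipeline are not published, so the parameters below are illustrative guesses:

```python
import numpy as np

def distort(img, rng):
    """Apply challenge-style distortions to a uint8 grayscale image (sketch).

    Order and parameters are assumptions: 3x3 box blur, Gaussian noise
    (sigma=5), and ~0.1% of pixels forced to 0 or 255 ("hot pixels").
    """
    h, w = img.shape
    out = img.astype(np.float32)
    # 3x3 box blur via shifted sums (dependency-free stand-in for motion blur)
    p = np.pad(out, 1, mode="edge")
    out = sum(p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)) / 9.0
    # additive Gaussian noise
    out += rng.normal(0.0, 5.0, size=out.shape)
    # hot/dead pixels: a small random fraction stuck at 0 or 255
    n = max(1, int(0.001 * out.size))
    ys = rng.integers(0, h, n)
    xs = rng.integers(0, w, n)
    out[ys, xs] = rng.choice([0.0, 255.0], size=n)
    return np.clip(out, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
img = np.full((64, 64), 128, dtype=np.uint8)
aug = distort(img, rng)
```

Applying augmentations like these during training can make a model more robust to the distorted test images.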

Below is an example of a typical image. In this example, the host spacecraft can be seen slightly below the horizon. A bounding box has been drawn around the host spacecraft using the ground truth dataset.

Supplemental no-background images

Solutions in this challenge should generalize to unseen backgrounds. An additional set of images without backgrounds is provided as optional, supplemental training data. You may use these data during training and selection to improve your final solution's generalizability beyond the small number of backgrounds represented in the training set.

This data is available as no_backgrounds.tar.gz on the Data Download page. In this archive, you will find an images/ folder (just as with the other training data) as well as a no_backgrounds.csv containing bounding boxes plus a column identifying which spacecraft model is present, which may be helpful when making your own train/test splits.


The ground truth data for the public set is available in the train_labels.csv on the Data Download page. This CSV file contains a row for each image in the public set, along with the bounding box coordinates for the spacecraft in that image. Here's what the first few rows look like:

image_id xmin ymin xmax ymax
0001f9a1bf2e194920cbae522a254160 119 596 234 705
00025a7fad8d9f15a7287733efcfb722 1124 63 1255 171
0002a91dcaeefad79e6113378d7e091f 140 545 200 587
0003f4c496f43a558dd8fb9a6c29d544 909 615 1078 1024
000610d7b252ffe5ee988bf04c0424c5 721 616 792 683
... ... ... ... ...
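
Reading these labels with pandas is straightforward. The snippet below uses rows from the table above and derives box widths and heights, which are handy for sanity checks and for tuning detector anchor sizes:

```python
import io
import pandas as pd

# A few rows in the train_labels.csv format shown above.
csv_text = """image_id,xmin,ymin,xmax,ymax
0001f9a1bf2e194920cbae522a254160,119,596,234,705
00025a7fad8d9f15a7287733efcfb722,1124,63,1255,171
0002a91dcaeefad79e6113378d7e091f,140,545,200,587
"""

labels = pd.read_csv(io.StringIO(csv_text), index_col="image_id")
# Derived box dimensions for quick sanity checks.
labels["width"] = labels["xmax"] - labels["xmin"]
labels["height"] = labels["ymax"] - labels["ymin"]
```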

Your task on this track of the challenge is to take a set of test images as inputs to your solution and produce bounding boxes identifying the spacecraft in each image. Your bounding boxes should encompass the entirety of the spacecraft, including all appendages (antennae, solar panels, instrumentation, etc.), not just the main body of the spacecraft.


On the data download page, you will find a train_metadata.csv containing categorical indicators for which spacecraft and background are used in each of the images. This may be a helpful resource for setting up your train, test, and validation splits to ensure your model generalizes well to new spacecraft and backgrounds.
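
One way to use such metadata is to hold out entire spacecraft models for validation, mirroring how the private test sets contain unseen models. The sketch below uses a toy frame with assumed column names (spacecraft_id, background_id); the real train_metadata.csv columns may differ:

```python
import numpy as np
import pandas as pd

# Toy stand-in for train_metadata.csv; column names are assumptions.
meta = pd.DataFrame({
    "image_id": [f"img{i:03d}" for i in range(12)],
    "spacecraft_id": [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    "background_id": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
})

def holdout_by_group(meta, group_col, frac=0.3, seed=0):
    """Hold out whole groups (e.g., spacecraft models) so that validation
    measures generalization to spacecraft never seen during training."""
    rng = np.random.default_rng(seed)
    groups = meta[group_col].unique()
    n_val = max(1, int(round(frac * len(groups))))
    val_groups = set(rng.choice(groups, size=n_val, replace=False))
    val_mask = meta[group_col].isin(val_groups)
    return meta[~val_mask], meta[val_mask]

train_df, val_df = holdout_by_group(meta, "spacecraft_id")
```

The same idea applies to background_id, or to both columns jointly.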

Camera intrinsic parameters

Camera intrinsic parameters are the internal characteristics of a camera that define how it projects a 3D scene onto a 2D image. They can help to correct for distortions and to infer the scale and depth of objects. For this challenge's dataset, the intrinsic matrix is:

[[5.2125371e+03 0.0000000e+00 6.4000000e+02]
 [0.0000000e+00 6.2550444e+03 5.1200000e+02]
 [0.0000000e+00 0.0000000e+00 1.0000000e+00]]
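
Under the standard pinhole model, this matrix maps camera-frame 3D points to pixel coordinates via [u, v, 1]ᵀ ∝ K [X, Y, Z]ᵀ. A quick sketch of that projection (note that any point on the optical axis lands on the principal point, here (640, 512)):

```python
import numpy as np

# Intrinsic matrix from the challenge description.
K = np.array([
    [5.2125371e+03, 0.0,           6.4000000e+02],
    [0.0,           6.2550444e+03, 5.1200000e+02],
    [0.0,           0.0,           1.0],
])

def project(point_cam, K):
    """Project a 3D point in camera coordinates to pixel coordinates
    using the pinhole model (divide by depth after applying K)."""
    uvw = K @ np.asarray(point_cam, dtype=float)
    return uvw[:2] / uvw[2]

# A point on the optical axis projects to the principal point (cx, cy).
u, v = project([0.0, 0.0, 50.0], K)
```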

Public vs private datasets

Data generated for this challenge have been divided into different datasets, some of which you will have direct access to for the purpose of training your solution.

Public training data consisting of images and ground truth labels will be made available for all solvers to download. Training data labels and instructions for downloading the training data images are available on the data download page.

Private test data for evaluating your solution's performance will be mounted to the code execution runtime. It will not be directly available to challenge solvers. Performance will be evaluated in two ways, using different splits or subsets of this private data. One subset of the private data will be used to calculate your public leaderboard score, and another will be used to calculate a private leaderboard score. The private leaderboard score will determine final prize rankings.

The public training data and private data used for evaluation are identical in format and were all drawn from the same data generation and post-processing pipeline. However, their contents differ slightly: the training and private datasets each contain different subsets of spacecraft models and background images. Some spacecraft models are represented in all datasets; some are present only in the training data; others appear only in the test set used to calculate public leaderboard scores; and still others appear only in the test set used to calculate private leaderboard scores. Similarly, some background types may appear for the first time in the private test set. Importantly, the two private test datasets contain the same ratio of unique spacecraft models and background images relative to the public training set, and there is nothing notable about the unique models that appear in the test sets. The two test datasets were designed to resemble the training dataset equally.

Note: As a result of the above process for determining your public and private leaderboard scores, there is no benefit to "overfitting" the public leaderboard. Using sound ML practices such as cross-validation to produce robust solutions that generalize well to new feature conditions (e.g., beyond the limited number of backgrounds included in the training set) is the best way to ensure that your private score is comparable to your public score.

Accessing the data

Check out the Data Download page to download training labels, a submission format file, and instructions on downloading the training data on AWS S3. The training dataset is quite large and is made available in smaller (several GB) chunks, each containing a random subset of images. You may want to initially develop your model with a subset of data, and then incorporate more training data after establishing your modeling approach.

The benchmark blog post is a great place to start to get familiar with the data and the challenge setup. This will walk you through how to download and explore the spacecraft imagery dataset, and then show you how to create a valid submission.

Performance metric

To measure your solution’s performance, we’ll use the Jaccard index, also known as Intersection over Union (IoU). The Jaccard index is a similarity measure between two label sets. In this case, it is defined as the size of the intersection divided by the size of the union of the two pixel sets. Because it is an accuracy metric, a higher value is better. The Jaccard index is calculated as follows:

$$J(A, B) = \frac{\left|A\cap B\right|}{\left|A\cup B\right|} = \frac{\left|A\cap B\right|}{\left|A\right|+\left|B\right|-\left|A\cap B\right|}$$

where $A$ is the set of true pixels and $B$ is the set of predicted pixels. Your overall score will be the average Jaccard index over all the images for which you make predictions.
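
For axis-aligned bounding boxes, the Jaccard index reduces to a simple rectangle-overlap computation. A sketch for local scoring (edge and degenerate-box handling in the official scorer may differ):

```python
def box_jaccard(a, b):
    """Jaccard index (IoU) for two axis-aligned boxes (xmin, ymin, xmax, ymax).

    Intersection is the overlap rectangle's area; union follows
    |A| + |B| - |A ∩ B| from the formula above.
    """
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0
```

Averaging this value over your validation images approximates the leaderboard metric.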

Submission format

This is a code execution challenge, which means that you'll be submitting a script that will run inference on an unseen test set in our code execution platform. This setup allows us to simulate the computational constraints a CubeSat would actually have in orbit.

If you've never participated in a code execution challenge before, don't be intimidated! We make it as easy as possible for you.

See the code submission format page for details on how to make your submission.

Runtime constraints

Our sponsors at NASA hope to adapt the winning solutions for use on future flights, which means your code could one day be launched into orbit. With that goal in mind, the code execution platform is configured to simulate the constraints of the ultimate production environment... SPACE! 🚀

This means that the resources available to you in running your solution will be comparable to the off-the-shelf hardware available to an actual R5 spacecraft, which may be somewhat less powerful than you're used to, especially if you typically use a GPU for these types of tasks.

Your submissions will be run on an A4 v2 virtual machine with an Intel processor and the following constraints:

  • No GPU
  • Limited to 3 cores
  • Limited to 4 GB of RAM
  • Your submission must complete execution in 1.75 hours or less.

Additional rules

In order to ensure that solutions are useful for our competition sponsors, the following rules are also in place:

  • Your solution must identify spacecraft based on the visual content of the images (i.e., the pixel data). You should not attempt to use any other image data or metadata to make predictions.
  • Images must be processed one at a time. Parallelization of multiple images is not permitted.
  • Your solution must complete execution within a time limit. For the detection track, your run time is limited to 105 minutes (1.75 hours), which allows for about 1 second per test image, plus a buffer for overhead and image loading.
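
The one-image-at-a-time loop with a time budget can be sketched as follows. Here predict_one is a hypothetical placeholder for a real detector, and the early-exit policy is a defensive assumption, not part of the official harness:

```python
import time

def predict_one(image):
    """Placeholder detector: returns a full-frame box for a 2D list image."""
    h, w = len(image), len(image[0])
    return (0, 0, w, h)

def run_inference(images, budget_s=105 * 60):
    """Process images strictly one at a time, tracking the time budget.

    `budget_s` mirrors the 105-minute limit; stopping early leaves time
    to write out whatever predictions were completed.
    """
    results = {}
    start = time.perf_counter()
    for image_id, image in images:
        results[image_id] = predict_one(image)
        if time.perf_counter() - start > budget_s:
            break  # out of time; submit what we have
    return results

demo = [("a", [[0] * 8 for _ in range(4)]),
        ("b", [[0] * 6 for _ in range(3)])]
preds = run_inference(demo)
```

Timing your per-image inference this way during development also tells you whether you have headroom to compete for the speed bonus.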

All submissions must complete execution within the maximum allowed time, but the most useful solutions will run even faster. For this reason, top performing solutions are eligible for a speed bonus prize.

Note: Finalist solutions that don't comply with these rules may be disqualified. Please ask questions on the discussion forum if anything is unclear.

Good luck! If you have any questions or issues you can always head over to the user forum!