Mars Spectrometry 2: Gas Chromatography

Help NASA scientists identify the chemical composition of rock and soil samples for Mars planetary science.

$30,000 in prizes
Oct 2022
538 joined



Did Mars ever have environmental conditions that could have supported life? This is one of the key questions in the field of planetary science. NASA missions like the Curiosity and Perseverance rovers have a rich array of capabilities that can help answer this question, one of which is collecting rock and soil samples and taking measurements that can be used to determine their chemical makeup. These samples can be analyzed for chemical signatures that indicate the environment's habitability, or potentially even signs of microbial life directly.

In the previous DrivenData competition, Mars Spectrometry: Detect Evidence for Past Habitability, competitors built models to predict the presence of ten potential compounds in soil and rock samples from data collected with a chemical technique called evolved gas analysis (EGA). In this competition, participants will work with data from another method of chemical analysis used by the Curiosity rover's SAM instrument suite: gas chromatography–mass spectrometry (GCMS).


In this challenge, your goal is to build a model that automatically analyzes mass spectrometry data from geological samples that are of scientific interest for understanding the present and past habitability of Mars.

Specifically, the model should detect the presence of certain families of chemical compounds in data collected from performing gas chromatography–mass spectrometry (GCMS) on a set of geological material samples. The winning techniques may be used to help analyze data from Mars, and potentially even inform future designs for planetary mission instruments performing in-situ analysis. In other words, one day your model might literally be out-of-this-world!
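Because a single sample can contain several compound families at once, this task is naturally framed as a multi-label problem: each family gets its own present/absent target and its own predicted probability. The sketch below illustrates only that framing. The family names and the helpers `encode_labels` and `decode_predictions` are hypothetical placeholders, not the competition's actual label set or submission format, which are defined on the Problem Description page.

```python
from typing import Dict, List, Sequence

# Hypothetical family names, for illustration only. The real label set is
# defined on the Problem Description page.
FAMILIES: List[str] = ["family_a", "family_b", "family_c"]


def encode_labels(present: Sequence[str],
                  families: Sequence[str] = FAMILIES) -> List[int]:
    """Turn the families found in one sample into a 0/1 indicator vector."""
    unknown = set(present) - set(families)
    if unknown:
        raise ValueError(f"unknown families: {sorted(unknown)}")
    found = set(present)
    return [1 if name in found else 0 for name in families]


def decode_predictions(probs: Sequence[float],
                       families: Sequence[str] = FAMILIES,
                       threshold: float = 0.5) -> Dict[str, bool]:
    """Map one probability per family to a present/absent decision."""
    if len(probs) != len(families):
        raise ValueError("expected one probability per family")
    return {name: p >= threshold for name, p in zip(families, probs)}


if __name__ == "__main__":
    # A sample containing two of the three (hypothetical) families.
    print(encode_labels(["family_a", "family_c"]))
    # Independent per-family probabilities from some model.
    print(decode_predictions([0.9, 0.2, 0.5]))
```

Treating each family as an independent binary target is the simplest option; whether correlations between families are worth modeling is left open here.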

Some additional notes:

  • Understanding the data. The mass spectrometry data used in this competition can require specialist knowledge to interpret. See the Problem Description page for discussion of the data that may inform your data processing and feature engineering. If you have questions, please feel welcome to ask on the community forum.

  • External data use. As noted in the Challenge Rules, external data and pre-trained models are allowed in this competition as long as they are freely and publicly available. At the end of the challenge, top-performing participants will need to publicly share or document any external data used in order to be eligible for a prize.

  • Research nature. A focus of this challenge is to feature a new dataset for research and to engage planetary geologists, analytical chemists, and data scientists in working with it. As with any research dataset like this one, initial algorithms may pick up on correlations that are incidental to the task. Solutions in this challenge are intended to serve as a starting point for continued research and development. The challenge organizers intend to make the data available online after the competition for ongoing improvement.
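As one possible starting point for the data processing and feature engineering mentioned in the notes above, a GCMS run can be collapsed into a fixed-size grid so that every sample yields a feature vector of the same length. This is a minimal sketch that assumes each sample arrives as rows of (retention time, mass-to-charge ratio, intensity), which is typical for GCMS but is not confirmed on this page. The function names, bin counts, and ranges are arbitrary illustrations, not the organizers' method.

```python
from typing import Iterable, List, Tuple

# Assumed row layout: (retention time, m/z, intensity).
Reading = Tuple[float, float, float]


def bin_trace(readings: Iterable[Reading],
              time_range: Tuple[float, float],
              mass_range: Tuple[float, float],
              n_time_bins: int = 4,
              n_mass_bins: int = 4) -> List[List[float]]:
    """Keep the maximum intensity seen in each (time bin, mass bin) cell."""
    t_min, t_max = time_range
    m_min, m_max = mass_range
    if t_max <= t_min or m_max <= m_min:
        raise ValueError("ranges must have positive width")
    grid = [[0.0] * n_mass_bins for _ in range(n_time_bins)]
    for t, m, intensity in readings:
        # Skip readings that fall outside the chosen window.
        if not (t_min <= t <= t_max and m_min <= m <= m_max):
            continue
        # Clamp so a reading exactly at the upper edge lands in the last bin.
        i = min(int((t - t_min) / (t_max - t_min) * n_time_bins), n_time_bins - 1)
        j = min(int((m - m_min) / (m_max - m_min) * n_mass_bins), n_mass_bins - 1)
        grid[i][j] = max(grid[i][j], intensity)
    return grid


def normalize(grid: List[List[float]]) -> List[List[float]]:
    """Scale a grid so its largest cell is 1.0 (all-zero grids are unchanged)."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        return [row[:] for row in grid]
    return [[value / peak for value in row] for row in grid]


if __name__ == "__main__":
    sample = [(0.0, 10.0, 5.0), (9.9, 99.0, 20.0), (0.5, 12.0, 8.0)]
    features = normalize(bin_trace(sample, (0.0, 10.0), (0.0, 100.0), 2, 2))
    print(features)
```

Scaling each sample by its own peak removes differences in absolute signal strength between runs; whether that helps or discards useful information is something to check against the data itself.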

Timeline and Prizes

Competition End Date:

Oct. 31, 2022, 11:59 p.m. UTC

Place   Prize Amount
1st     $15,000
2nd     $7,500
3rd     $5,000
Bonus   $2,500

Bonus Prize: Best write-up of methods

The five top-scoring participants in this competition will be invited to submit a brief write-up of their modeling methodology. A bonus prize will be awarded to the best write-up, as selected by a judging panel of subject-matter experts based on factors including interpretability, innovativeness, and potential for future implementation with real Mars data, such as data from SAM.


The competition will have two phases with a timed release of additional labels that can be used for training. See the Problem Description page for more details about the dataset splits.

  • Phase 1: Development – August 31–September 29, 2022
  • Phase 2: Final Training – September 30–October 31, 2022 (Validation set labels released)

How to compete

  1. Click the “Compete” button in the sidebar to enroll in the competition.
  2. Get familiar with the problem through the overview and problem description. You might also want to reference some of the additional resources from the about page.
  3. Download the data from the data tab.
  4. Create and train your own model. The benchmark blog post is a good place to start.
  5. Use your model to generate predictions that match the submission format.
  6. Click “Submit” in the sidebar, then “Make new submission”. You’re in!

This challenge is in collaboration with NASA.

Image courtesy of NASA/JPL-Caltech/MSSS.