On Top of Pasketti: Children’s Speech Recognition Challenge - Phonetic Track

Develop automatic speech recognition models that produce phone-level transcriptions of children’s speech in IPA. #education

$50,000 in prizes
6 weeks left
195 joined

Code submission format

This is a code execution challenge! Rather than submitting your predicted labels, you'll package everything needed to perform inference and submit that for containerized execution. The runtime repository contains the complete specification for the runtime and gives you tools for local testing. If you want to learn more about how our code execution competitions work, check out our blog post for a peek behind the scenes.

The general process for making a submission is:

  1. Create a properly formatted submission.zip containing your model and code for performing inference
  2. Debug and verify your submission works correctly by:
    • Testing your submission locally using Docker
    • Submitting as a smoke test to test against a small dataset
  3. Submit as a normal submission to predict on the full test set

Results of your submissions are tracked on the following pages via the left-hand navigation menu:

  • The Code jobs page: this tracks the execution status of your submissions. You can view logs from running your code here. Note: Log output is limited to 500 lines, with up to 300 characters per line.
  • The Submissions page: this tracks the scores of your predictions. Errors that happen during scoring will also appear here.

What to submit

Your submission will be a ZIP archive (e.g., submission.zip). The root level of the submission.zip file must contain a main.py Python script which performs inference in the competition execution environment and writes your predictions to the required output file.

Here's an example of what a submission might look like. The only file that must exist in your submission.zip is main.py, and it must sit at the root level of the archive. Be careful if you zip up a folder that contains main.py, since that often leaves main.py nested inside the folder in the archive. A packaging sketch follows the example layout below.

submission.zip
├── main.py                 # Required: entrypoint script (executed by the evaluator)
├── my_module/
│   ├── preprocessing.py
│   ├── model.py
│   └── ...
└── my_model_weights.bin
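
If you build the archive programmatically, a minimal packaging sketch is shown below. It assumes your files live in a local folder named my_submission/ (an illustrative name) and zips that folder's contents so main.py lands at the root of the archive rather than inside a nested directory.

import zipfile
from pathlib import Path

# Hypothetical local folder containing main.py, my_module/, and model weights.
src_dir = Path("my_submission")

with zipfile.ZipFile("submission.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    for path in sorted(src_dir.rglob("*")):
        if path.is_file():
            # arcname is relative to my_submission/, so main.py ends up at the archive root.
            zf.write(path, arcname=path.relative_to(src_dir))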

To help develop and test your solution, we provide you with the runtime repository. You can find a few working example submissions there.

What happens during execution

During code execution, your submission will be unzipped and run in our cloud compute cluster. The container will run your main.py script. The script will have access to the following directory structure:

/code_execution/
├── data/
│   ├── audio/
│   │   └── ...
│   ├── submission_format.jsonl
│   └── utterance_metadata.jsonl
├── src/
│   └── ...your files are unzipped into here...
└── submission/
    └── ...you should write your predictions here...

The working directory will be /code_execution. Your code will be unzipped into ./src/ and the test data will be available in ./data/.
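
To make that layout concrete, here is a minimal, hypothetical main.py skeleton. The transcribe() function and the "transcription" output field are placeholders for illustration only; check data/submission_format.jsonl for the exact field names. The manifest it reads is described in the next section.

import json
from pathlib import Path

DATA_DIR = Path("data")                                # read-only test data
OUT_PATH = Path("submission") / "submission.jsonl"     # where predictions must be written

def transcribe(audio_path: Path) -> str:
    # Placeholder for your model: load the audio file and return a phone-level IPA string.
    raise NotImplementedError

def main() -> None:
    with open(DATA_DIR / "utterance_metadata.jsonl") as manifest, open(OUT_PATH, "w") as out:
        for line in manifest:
            meta = json.loads(line)
            ipa = transcribe(DATA_DIR / meta["audio_path"])
            record = {"utterance_id": meta["utterance_id"], "transcription": ipa}
            # ensure_ascii=False keeps IPA characters as literal UTF-8 rather than \u escapes.
            out.write(json.dumps(record, ensure_ascii=False) + "\n")

if __name__ == "__main__":
    main()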

File manifest

Within the execution container, you will have access to a file manifest named utterance_metadata.jsonl. This manifest is intended to support efficient audio loading, validation, and processing. It provides metadata for each audio utterance in the test set and contains one JSON object per line with the following fields:

  • utterance_id (str) - unique identifier for each utterance
  • audio_path (str) - path to the corresponding .flac audio file relative to the data/ directory, following the pattern audio/{utterance_id}.flac
  • audio_duration_sec (float) - duration of the audio clip in seconds
  • md5_hash (str) - MD5 checksum of the audio file, used for integrity verification
  • filesize_bytes (int) - size of the audio file in bytes
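
As a sketch of how the manifest can be used, the snippet below verifies each audio file's checksum before inference, assuming audio_path is resolved relative to the data/ directory as described above.

import hashlib
import json
from pathlib import Path

DATA_DIR = Path("data")

def md5_of(path: Path) -> str:
    # Stream the file in 1 MiB chunks to avoid loading large audio files into memory at once.
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

with open(DATA_DIR / "utterance_metadata.jsonl") as manifest:
    for line in manifest:
        meta = json.loads(line)
        if md5_of(DATA_DIR / meta["audio_path"]) != meta["md5_hash"]:
            print(f"checksum mismatch for {meta['utterance_id']}", flush=True)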

Output format

Your code must write a JSON Lines (JSONL) file containing one prediction per utterance.

Each line must include the utterance_id of the utterance and your predicted phone-level IPA transcription for it; use data/submission_format.jsonl as a template for the exact field names. Be sure to include only characters defined in the scoring script's set of valid IPA characters.

The submission should be written to ./submission/submission.jsonl relative to the working directory.
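
One way to guarantee your file matches the expected format is to copy data/submission_format.jsonl row by row and fill in your predictions, as in the sketch below. The predict_ipa() helper and the "transcription" key are placeholders; use whatever field names the format file actually defines.

import json
from pathlib import Path

DATA_DIR = Path("data")
OUT_PATH = Path("submission") / "submission.jsonl"

def predict_ipa(utterance_id: str) -> str:
    # Placeholder for your model's inference on a single utterance.
    raise NotImplementedError

with open(DATA_DIR / "submission_format.jsonl") as template, open(OUT_PATH, "w") as out:
    for line in template:
        row = json.loads(line)
        # "transcription" is a hypothetical field name; keep whatever keys the template uses.
        row["transcription"] = predict_ipa(row["utterance_id"])
        out.write(json.dumps(row, ensure_ascii=False) + "\n")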

Testing your submission

Before you make a full submission, first test locally and then make a smoke test submission.

Testing your submission locally

You should first and foremost test your submission locally using Docker. This is a great way to work out any bugs and ensure that your model performs inference successfully. See the runtime repository's README for further instructions. The runtime repository includes a small demo dataset of 3 audio files.

Smoke tests

For additional debugging, we provide a "smoke test" environment that replicates the test inference runtime but runs only on a small set of audio files. In the smoke test runtime, data/ contains 3,000 audio files from the training set. Smoke tests are not considered for prize evaluation and are intended to let you test your code for correctness.

Submission checklist

  • Submission includes main.py in the root directory of the ZIP archive. There can be additional Python modules if needed—see example above.
  • Submission contains any model weights that need to be loaded. There will be no network access.
  • Script loads the data for inference from the data/ subdirectory of the working directory. This directory is read-only.
  • Script writes predictions to submission/submission.jsonl relative to the working directory. The format of this file must match the submission format exactly. You can use data/submission_format.jsonl as a template.

Please be aware that the machines executing your code are a shared resource across competitors, so please be conscientious in your use of them. Thoroughly test your submissions locally, add progress information to your logs, and cancel jobs that you expect to fail or run into the time limit.
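
For example, one way to keep progress visible without exceeding the log limits is to print a line only every N utterances; the interval below is illustrative.

def log_progress(done: int, total: int, every: int = 100) -> None:
    # Print roughly every `every` utterances, plus a final line when the loop finishes.
    if done % every == 0 or done == total:
        print(f"processed {done}/{total} utterances", flush=True)

# e.g. call log_progress(i + 1, num_utterances) inside your inference loop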

Runtime environment and constraints

Your code will be executed within a container whose image is defined in our runtime repository. The limits are as follows:

  • Your submission must be able to run using Python 3.11 and the Python dependencies defined in the runtime repository. You can see dependencies specified in the pyproject.toml and the uv.lock files.
  • The submission must complete execution in 2 hours or less.
  • The container will have access to the following hardware:
    • A single NVIDIA A100 GPU with 80 GiB of GPU VRAM
    • 24 vCPUs
    • 220 GiB RAM
  • The container will not have network access. All necessary files (code and model assets) must be included in your submission.
  • The container will not have root access to the filesystem.
  • Log output is limited to 500 lines, with up to 300 characters per line
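
Given those constraints, a typical PyTorch setup might look like the sketch below. MyModel and the weights filename are placeholders taken from the example archive layout above; loading weights from a path relative to this file means it works regardless of the working directory.

import torch
from pathlib import Path

from my_module.model import MyModel   # hypothetical module from the example archive above

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Your archive is unzipped into src/, so resolve the weights relative to this file.
weights_path = Path(__file__).parent / "my_model_weights.bin"

model = MyModel()
model.load_state_dict(torch.load(weights_path, map_location=device))
model.to(device).eval()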

Requesting additional dependencies

Since the container will not have network access, all packages must be pre-installed. We are happy to add packages as long as they do not conflict and can build successfully. Python packages should be available via PyPI. To request an additional package be added to the runtime environment, follow the instructions in the runtime repository.

Note: The runtime environment currently supports PyTorch but not TensorFlow. We strongly encourage you to work within the PyTorch ecosystem. There will be a high bar for adding TensorFlow support.

Happy building! Once again, if you have any questions or issues, you can always head on over to the user forum!