Youth Mental Health Narratives: Novel Variables

Discover novel trends from narratives about youth suicides using machine learning techniques. #health

$25,000 in prizes
Completed Nov 2024
372 joined

Overview

Suicide is one of the leading causes of death in the United States for 5-24 year-olds. To better understand the circumstances around youth suicides and inform potential interventions, researchers and policymakers rely on several narrative datasets. One of these datasets, the National Violent Death Reporting System (NVDRS), includes data abstracted at the state level from multiple sources, including law enforcement, coroner/medical examiners, toxicology, and death certificates.

The NVDRS currently contains over 600 variables extracted from these narratives, but unsupervised machine learning techniques have the potential to uncover more information that could be useful to researchers and policymakers.

The objective of this challenge is to apply machine learning techniques to narratives in the NVDRS dataset and extract novel variables that could be used to advance youth mental health research. The competition results will inform how youth mental health data is tracked, contributing to more effective research into protecting youth mental health. This competition presents a unique opportunity to work with a dataset that has never before been de-identified and made public. In this Novel Variables track, participants will propose new variables to extract from the narrative text, rather than automate the population of existing NVDRS variables.


Competition End Date:

Nov. 21, 2024, 11:59 p.m. UTC

Place   Prize Amount
1st     $10,000
2nd     $6,000
3rd     $4,000

Midpoint Submission Bonus Prizes

$1,000 will be awarded to each of the most promising midpoint submissions. Up to five midpoint submissions will be chosen. Midpoint submissions are due by October 10, 2024.

Midpoint submissions will be judged using the same criteria as the final submissions. See the Problem Description page for details.

Note on prize eligibility: Federal employees acting within the scope of their employment and federally-funded researchers acting within the scope of their funding are not eligible to win a prize in this challenge.


How to compete

  1. Click the "Compete!" button in the sidebar to enroll in the competition. To sign up, all competitors must sign a Data Sharing Agreement, which governs the protection and use of sensitive data in the National Violent Death Reporting System.
  2. Get familiar with the problem through the problem description. You might also want to reference additional resources available on the about page.
  3. Download the data from the data tab.
  4. Explore the data, and propose a new variable to extract from the narratives.
  5. Format your submission according to the submission format.
  6. Click "Submit" in the sidebar, and upload your submission PDF file. You're in! You can change your submission at any point before the final deadline by removing your previous submission and then submitting a new file.
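
As a rough illustration of step 4, the sketch below shows one way a first-pass exploration might look: flag each narrative that matches a candidate theme and check how often the theme appears. Everything in it is an assumption made for illustration. The placeholder texts are invented and are not NVDRS narratives, the "school change" theme and the names `narratives`, `SCHOOL_CHANGE`, and `flag_candidate` are hypothetical, and a keyword rule is a simple stand-in rather than a machine learning method. It uses only the Python standard library and runs locally.

```python
import re

# Hypothetical placeholder texts; these are NOT real NVDRS narratives.
narratives = [
    "Placeholder text that mentions a recent school transfer.",
    "Placeholder text with no mention of the candidate theme.",
    "Placeholder text noting a move, after which a School change followed.",
]

# Hypothetical candidate variable: the narrative mentions a school change.
SCHOOL_CHANGE = re.compile(r"\bschool (transfer|change)\b", re.IGNORECASE)


def flag_candidate(text: str) -> bool:
    """Return True if the text matches the candidate-variable pattern."""
    return SCHOOL_CHANGE.search(text) is not None


# One boolean per narrative, plus the share of narratives that were flagged.
flags = [flag_candidate(text) for text in narratives]
prevalence = sum(flags) / len(flags)

print(flags)
print(f"Share of narratives flagged: {prevalence:.2f}")
```

A candidate that is flagged in almost every narrative, or in almost none, is probably less informative than one with a moderate share, so a quick check like this can help decide which ideas deserve a more careful extraction method.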

Competition rules

The challenge rules are in place to promote fair competition and useful solutions. If you are ever unsure whether your solution meets the competition rules, ask the challenge organizers in the competition forum or send an email to info@drivendata.org. A few key rules are highlighted below. For more details, see the External Data and Models guidance and the full competition rules.

Use of competition data

When agreeing to the competition terms, all competitors must sign a Data Sharing Agreement, which governs the protection and use of sensitive data in the National Violent Death Reporting System. All competitors must abide by the Terms and Conditions of the Data Sharing Agreement to be eligible for data access and submission.

Although competition data is de-identified, it is still highly sensitive and confidential. Participants cannot use competition data for any purpose other than the competition, and should delete competition data after the competition has ended.

External model usage

Use of external models is allowed in this competition, provided they are freely and publicly available to all participants under a permissive open-source license.

However, competition data cannot be shared, duplicated, or published. This includes uploading competition data to any third-party service that will retain the data. For example, participants cannot upload competition data to ChatGPT or Gemini, but can download open-source model weights and run a model locally.


Support

This challenge is organized on behalf of CDC with support from NASA. The contents do not necessarily represent the official views of the CDC.

Statutory authority to conduct the challenge: 15 U.S.C. 3719, Section 1703(a) of the Public Health Service Act (PHSA), 42 U.S.C. 300u-2(a). DrivenData is designing and administering the challenge under contract with the NASA Tournament Lab, under Federal Acquisition Regulation (FAR) procurement regulations authority and in collaboration with CDC’s National Center for Injury Prevention and Control.


Image credit: Zoe on Unsplash, jannoon028 on Freepik