Suicide is one of the leading causes of death in the United States for people aged 5 to 24. To better understand the circumstances around youth suicides and inform potential interventions, researchers and policymakers rely on several datasets. One key dataset is the National Violent Death Reporting System (NVDRS), which has been tracking information about violent deaths since 2003. The NVDRS dataset is based on law enforcement reports, medical examiner and coroner reports, and death certificates. It includes both narrative descriptions of each incident and standardized factor variables, such as precipitating events. Generating consistent narratives and accurate factor variables is time-consuming and prone to error.
In this challenge, solvers will help CDC extract information from narratives in the NVDRS, improving both the quality and coverage of the NVDRS dataset. Higher-quality data can enable researchers across the country to better understand and prevent youth suicides on a national scale.
There are two competition tracks, each with its own associated prizes.
- In the Automated Abstraction track, solvers will apply machine learning techniques to automate the population of factor variables from NVDRS narrative text. The resulting algorithms will help streamline manual abstraction and data quality control.
- In the Novel Variables track, solvers will explore the NVDRS narratives and extract novel variables that could be used to advance youth mental health research.
The data for this competition includes detailed descriptions of youth suicides. These narratives can be upsetting and difficult to read. We encourage you to prioritize your own mental health when deciding whether to work on this challenge and while engaging with the data.
If you or someone you know is struggling with mental health, you can call or text the 988 Suicide & Crisis Lifeline for 24/7, free, and confidential support. The 988 Lifeline website has additional advice and links to specialized resources.
The competitions
Automated Abstraction
Apply machine learning techniques to automate the process of data abstraction from youth suicide narratives.
Novel Variables
Discover novel trends from narratives about youth suicides using machine learning techniques.