Unsupervised Wisdom: Explore Medical Narratives on Older Adult Falls

Use unsupervised machine learning approaches to extract insights from emergency department narratives about how, when, and why older adults (age 65+) fall. #health

$70,000 in prizes
Oct 2023
660 joined


The Centers for Disease Control and Prevention's National Center for Injury Prevention and Control (CDC/NCIPC) helps Americans stay safe and healthy by uncovering insights about the factors associated with the occurrence and severity of common injuries, including older adult falls.

One important source of data on injuries is the National Electronic Injury Surveillance System (NEISS). NEISS is operated by the Consumer Product Safety Commission and was designed to generate standardized data about injuries caused by or involving consumer products. NEISS works with a sample of hospitals in the United States and captures information about every patient who comes to the emergency department due to a non-fatal injury involving a product. NEISS data are captured with the help of medical abstractors who manually review all emergency department medical records and code standardized information about the patients, their injuries, and their treatment.

NEISS data are public, and anyone can download and view data that have been scrubbed of sensitive information, such as details that could be used to identify patients. The CDC/NCIPC has access to a private, expanded version of these data from the NEISS All Injury Program (NEISS-AIP). This expanded dataset covers all non-fatal injuries that present to in-sample emergency departments, not only those involving a product, and it includes more detailed injury and patient information.

The CDC/NCIPC has used the NEISS and NEISS-AIP datasets to produce research relevant to reducing older adult fall risk. This research has been limited to descriptive analyses conducted with standard epidemiologic and statistical methods from public health (e.g., stratification by demographic variables, comparison of confidence intervals). NEISS and NEISS-AIP also include narrative text data, which have been used in research, typically after a process of manual coding or dictionary-based automated coding.

Map showing the locations of NEISS emergency departments

Machine learning techniques for natural language processing (NLP) have the potential to greatly expand the CDC/NCIPC’s capacity for analyzing medical record narratives. Learning more about the circumstances of older adult falls can help inform prevention strategies. This challenge was designed to uncover creative applications of machine learning to the analysis of medical record narratives, including tokenizing raw narratives, generating word embeddings, and clustering narratives, and to surface meaningful insights about older adult falls along the way.
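
To make the steps named above more concrete, here is a minimal sketch of tokenizing narratives, turning them into vectors, and clustering them. It is only an illustration under stated assumptions: the four narratives are invented examples written in the style of emergency department records (not real NEISS data), TF-IDF vectors stand in for learned word embeddings, and a small spherical k-means routine stands in for whatever clustering method a participant might choose. All function names are hypothetical.

```python
import math
import re
from collections import Counter

# Invented narratives in an emergency-department style; NOT real NEISS records.
NARRATIVES = [
    "78YOF FELL OFF LADDER AT HOME, FRACTURED WRIST",
    "81YOM FELL FROM LADDER WHILE CLEANING GUTTERS, WRIST FRACTURE",
    "90YOF SLIPPED IN SHOWER AND HIT HEAD, HEAD CONTUSION",
    "85YOM SLIPPED IN BATHTUB, HIT HEAD ON FAUCET",
]


def tokenize(text):
    """Lowercase the narrative and keep alphabetic runs of two or more letters."""
    return re.findall(r"[a-z]{2,}", text.lower())


def tfidf_vectors(docs):
    """Return one unit-length TF-IDF vector per document (a stand-in for embeddings)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    n_docs = len(docs)
    vectors = []
    for toks in tokenized:
        term_freq = Counter(toks)
        # Smoothed inverse document frequency keeps every weight positive.
        vec = [
            term_freq[t] * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
            for t in vocab
        ]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def cosine(a, b):
    """Dot product; equals cosine similarity because the vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))


def cluster(vectors, k, iterations=10):
    """Spherical k-means with a deterministic farthest-first initialization."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Pick the vector least similar to every centroid chosen so far.
        farthest = min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        centroids.append(farthest)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                norm = math.sqrt(sum(x * x for x in mean)) or 1.0
                centroids[j] = [x / norm for x in mean]
    return labels


vectors = tfidf_vectors(NARRATIVES)
labels = cluster(vectors, k=2)
for label, narrative in zip(labels, NARRATIVES):
    print(label, narrative)
```

On this toy input, the ladder-related narratives and the bathroom-related narratives should land in separate clusters. Real narratives are far noisier, so abbreviation handling, richer embeddings, and a more careful choice of the number of clusters would all matter in practice.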

You can learn more here: