PREPARE Challenge: Data for Early Prediction (Phase 1)

Find, curate, or contribute data to help the National Institute on Aging (NIA), a component of the National Institutes of Health (NIH), create representative and open datasets that can be used for the early prediction of Alzheimer's disease and related dementias. #health

$200,000 in prizes
jan 2024
376 joined

PREPARE: Pioneering Research for Early Prediction of Alzheimer's and Related Dementias EUREKA Challenge

Phase 1 [Find IT!]: Data for Early Prediction


Alzheimer's disease and Alzheimer's disease related dementias (AD/ADRD) are a set of brain disorders affecting more than 6 million Americans. The main clinical features of AD/ADRD are progressive impairments of cognition and function and changes in behavior. Early intervention is important for successful disease modification, but detecting early signs of cognitive decline and AD/ADRD is challenging. Current clinical methods lack the sensitivity needed for early prediction. Alternative approaches (e.g., neuroimaging, fluid biomarkers, neuropsychological tasks, and digital and passive measures) have drawbacks, including cost, complexity, and limited accessibility. These drawbacks weigh most heavily on groups that are underrepresented in research.

The National Institute on Aging (NIA), a component of the National Institutes of Health (NIH), aims to diversify data resources, considering under-resourced communities disproportionately burdened by AD/ADRD. For example, factors other than the amyloid protein – which has long been considered a biomarker for AD – may play a bigger role in cognitive impairment for Asian, Black, or Hispanic older adults (Wilkins et al., 2022; Dark and Walker, 2022). Identifying new biomarkers and social determinants of health is crucial to improve early detection in these groups and to address racial and ethnic disparities in diagnoses.


The goal of the PREPARE Challenge (Pioneering Research for Early Prediction of Alzheimer's and Related Dementias EUREKA Challenge) is to inform novel approaches to early detection that might ultimately lead to more accurate tests, tools, and methodologies for clinical and research purposes. Advances in artificial intelligence (AI), machine learning (ML), and computing ecosystems increase possibilities of intelligent data collection and analysis, including better algorithms and methods that could be leveraged for the prediction of biological, psychological (cognitive), socio-behavioral, functional, and clinical changes related to AD/ADRD.

To make progress, the challenge aims to address the need for:

  • Data from a wider set of sources and types, including data relevant to low-resourced, underserved communities disproportionately impacted by AD/ADRD to better understand and address biases in existing data sources;
  • Open, shareable data, stored in trusted repositories to determine “distributional robustness” of predictive algorithms; and
  • Algorithms that meet "right to explanation" mandates (i.e., if an AI algorithm impacts people, people have a right to an explanation of how AI conclusions were reached).
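As a rough illustration of the "distributional robustness" idea in the list above, one simple way to probe it is to compare a model's accuracy across subgroups of the data rather than only in aggregate. The sketch below is an assumption made for illustration, not a metric prescribed by the challenge; the function names, the group labels "A" and "B", and the toy labels are all hypothetical.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for each subgroup label in `groups`."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


def worst_group_gap(per_group):
    """Gap between the best- and worst-served subgroup; 0 means equal accuracy."""
    return max(per_group.values()) - min(per_group.values())


# Toy, made-up labels purely for illustration (not challenge data).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = worst_group_gap(per_group)
print(per_group, gap)
```

A large gap between subgroups suggests that predictions may not hold up on populations that are sparse in the training data, which is the kind of bias that more representative datasets are meant to expose.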

Challenge structure and phases

The goal of this challenge is to spur and reward the development of solutions for accurate, innovative, and representative early prediction of AD/ADRD. To achieve this goal, the challenge will feature three phases that successively build on each other.

This first phase, Find IT!: Data for Early Prediction, is focused on finding, curating, or contributing data to create representative and open datasets that can be used for the early prediction of AD/ADRD. For more details on the different phases and associated prizes, visit the challenge home page.

Phase overview

Phase | Anticipated Date | Description
Phase 1 [Find IT!]: Data for Early Prediction (YOU ARE HERE) | September 2023 | Find, curate, or contribute data to create representative and open datasets that can be used for early prediction of AD/ADRD.
Phase 2 [Build IT!]: Algorithms and Approaches | September 2024 | Advance algorithms and analytic approaches for early prediction of AD/ADRD, with an emphasis on explainability of predictions.
Phase 3 [Put IT All Together!]: Proof of Principle Demonstration | March 2025 | Top solvers from Phase 2 demonstrate algorithmic approaches on diverse datasets and share their results at an innovation event.

Eligibility and Participation Guidelines

To be eligible to win a prize under this challenge, participants:

  • must be 18 years of age or older at the time of submission
  • must be citizens or permanent residents of the United States (in the case of a Team, the Team must identify a Team Captain who is a citizen or permanent resident of the United States; in the case of an Entity, the Entity shall be incorporated in and maintain a primary place of business in the US)
  • must not be a federal entity or federal employee acting within the scope of their employment
  • must not be an employee of the Department of Health and Human Services (HHS) or of any component of HHS acting in their personal capacity
  • if employed by a federal agency or entity other than HHS (or any component of HHS), should consult with an agency ethics official to determine prize eligibility

Participants who receive federal funds from a grant award or cooperative agreement must not use those funds to develop their Challenge submissions or to fund efforts in support of them. Alternatively, if such use is consistent with the purpose, terms, and conditions of the grant award or cooperative agreement, they must participate in the Challenge as an Entity on behalf of the awardee institution, organization, or entity.

For a full list of eligibility and participation rules, please refer to the rules page and the challenge announcement.

Phase 1 key dates

Challenge launch | September 1, 2023
(Optional) Midpoint review deadline for executive summary drafts (does not apply to data collection ideas) | November 15, 2023 at 11:59:59 PM UTC
Webinar event: presentation by the NIH team and challenge Q&A | December 13, 2023 at 3:00 PM ET
Executive summary drafts due (does not apply to data collection ideas) | January 17, 2024 at 11:59:59 PM UTC
Final submissions due | January 31, 2024 at 11:59:59 PM UTC
Finalists notified | March 18, 2024
Finalists provide data access for verification | June 1, 2024
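Because the submission deadlines above are given in UTC while the webinar is listed in ET, it can help to convert the deadlines to local time. The sketch below assumes US Eastern Standard Time (UTC-5), which is in effect on all three deadline dates. The dictionary keys are informal labels chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-5 offset (EST); daylight saving time is not in
# effect on any of the three deadline dates below.
EST = timezone(timedelta(hours=-5), name="EST")

deadlines_utc = {
    "midpoint review": datetime(2023, 11, 15, 23, 59, 59, tzinfo=timezone.utc),
    "executive summary drafts": datetime(2024, 1, 17, 23, 59, 59, tzinfo=timezone.utc),
    "final submissions": datetime(2024, 1, 31, 23, 59, 59, tzinfo=timezone.utc),
}

# Convert each UTC deadline to the assumed Eastern offset.
deadlines_est = {name: dt.astimezone(EST) for name, dt in deadlines_utc.items()}

for name, local in deadlines_est.items():
    print(f"{name}: {local:%B %d, %Y at %I:%M:%S %p} {local.tzname()}")
```

Under this assumption each 11:59:59 PM UTC deadline falls at 6:59:59 PM Eastern on the same calendar date.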

Phase 1 prizes

The total prize purse for Phase 1 is $200,000. A pool of $150,000 will reward the strongest submissions of representative, inclusive, open, and shareable datasets that can be used for a data science competition focused on the early prediction of AD/ADRD. A bonus prize of $25,000 may be awarded to the submission that best addresses populations disproportionately impacted by AD/ADRD. Solvers may also compete for one of five $5,000 prizes by submitting an idea for new data collection aligned with challenge aims.

Phase 1 Competition End Date: Jan. 31, 2024, 11:59 p.m. UTC

Award | Prize Amount
1st | $50,000
2nd | $40,000
3rd | $30,000
4th | $20,000
5th | $10,000
Disproportionate Impact Bonus Prize | $25,000
Data Collection Proposal Prizes | $25,000
Total | $200,000
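As a quick check of how the amounts in the table add up, the arithmetic can be sketched as follows. The variable names are illustrative only. The figures are taken from the table and the prize description above.

```python
# Placement prizes for dataset submissions (from the prize table).
placement_prizes = {
    "1st": 50_000,
    "2nd": 40_000,
    "3rd": 30_000,
    "4th": 20_000,
    "5th": 10_000,
}
bonus_prize = 25_000          # Disproportionate Impact Bonus Prize
proposal_prizes = 5 * 5_000   # up to five Data Collection Proposal Prizes

dataset_pool = sum(placement_prizes.values())
total = dataset_pool + bonus_prize + proposal_prizes

print(f"Dataset pool: ${dataset_pool:,}")
print(f"Total purse:  ${total:,}")
```

The five placement prizes make up the $150,000 dataset pool, and adding the bonus prize and the proposal prizes gives the $200,000 total.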

Disproportionate Impact Bonus Prize ($25,000)

Teams awarded prizes for their primary submissions in Phase 1 will be eligible for the Disproportionate Impact Bonus Prize. This prize recognizes the submission with the most potential to support algorithms that generalize to populations disproportionately impacted by AD/ADRD.

Prizes for Data Collection Ideas (subtotal $25,000)

Participants who submit ideas for data collection will be eligible for prizes recognizing proposals to collect new representative and open datasets that can be used for early prediction of AD/ADRD, with an emphasis on addressing biases in existing data sources. Up to five total Data Collection Proposal Prizes ($5,000 each) may be awarded from a prize pool of $25,000.

How to compete (Phase 1)

  1. Click the "Compete!" button in the sidebar to enroll in the competition.
  2. Click on "Team" if you need to create or join a team with other participants.
  3. Get familiar with the problem through the Problem description and About pages.
  4. Submit a draft of your executive summary by January 17, 2024 at the latest by clicking on Executive summary draft in the sidebar and filling out the form. To receive feedback based on your submission idea, submit your executive summary draft by the midpoint review deadline, November 15, 2023. Partway there!
  5. Post questions you have about the challenge to the challenge forum. Questions may be answered directly or at a Webinar event with DrivenData and NIH on December 13, 2023.
  6. Dive into the details of your data! Make sure to prepare your data and materials according to the submission format. You can refer to the evaluation criteria and submission template for guidance.
  7. Submit a detailed description of your data along with an updated executive summary by January 31, 2024. Submit by clicking on Final submission in the sidebar and filling out the form. You’re in! Note that the last submission before challenge close will be considered the final submission.

The primary goal of the challenge is to find an existing dataset to be used in later challenge phases; however, solvers may have ideas for collecting or building out datasets that align with the challenge aims but would not be ready to submit for this challenge. For more information about how to compete for a data collection proposal prize, see the ideas for data collection page.

The challenge rules are in place to promote fair competition and useful solutions. If you are ever unsure whether your solution meets the competition rules, ask the challenge organizers in the competition forum or send an email to

Challenge sponsor

This challenge is sponsored by the National Institute on Aging (NIA), an institute of the National Institutes of Health (NIH), with support from NASA.