Problem description
The challenge centers on developing better methods for predicting Alzheimer's disease and Alzheimer's disease related dementias (AD/ADRD) as early as possible. Phase 3 (Put It All Together!): Proof of Principle Demonstration focuses on refining winning Phase 2 models into useful proofs of concept for early prediction methods.
In Phase 3, the winning teams from Phase 2 will spend several months refining their solutions and sharing learnings from the process. Where Phase 2 focused primarily on model performance, Phase 3 focuses on making models fairer, more equitable, and more generalizable.
Key events:
- Refinement Period (April 23 – July 15, 2025): Participants improve their Phase 2 winning models
- Submission Deadline (July 15, 2025, 11:59 p.m. UTC): Participants submit their refined codebase and supporting materials
- Virtual Pitch Event (Late July or early August 2025): Participants present their refined solutions
- In-Person Event (Late September 2025): 1st and 2nd place Overall Winners share insights and engage with NIH stakeholders
Objectives
In PREPARE — Phase 3, solvers will explore new modeling approaches and data to provide proof of principle for the early prediction of AD/ADRD. In this work, solvers can help guide research and improve detection of AD/ADRD in a real-world population. For example, which methods should be investigated more? Which new datasets should be collected? How should existing data collection processes be changed?
The primary goal is not to optimize the performance of a single trained model on a specific test set, because a single test set would not be able to capture the full population of interest. Final Phase 3 models should serve as proofs of concept for early detection by demonstrating novel methods, data usage, or other approaches.
Participants are encouraged to explore any approaches that improve model performance on a real-world population, with a particular emphasis on equitable performance. Below is a non-comprehensive list of example model refinement activities. Participants do not need to complete all of them; identify which activities are most relevant to your model, whether from this list or something else.
- Incorporating new data, particularly new data from populations disproportionately impacted by AD/ADRD or data that enables earlier predictions
- Correcting for overfitting on the Model Arena competition test set, and improving or demonstrating generalizability across data sources or populations. For example, winners from the Acoustic Track can utilize the metadata provided in the Report Arena to account for correlations between language, corpus, and diagnosis in the competition data.
- Identifying and mitigating model bias, for example by disaggregating performance metrics by subgroup (see the sketch after this list)
- Testing novel/experimental or established biomarkers
- Testing existing theoretical models of AD/ADRD
- Improving explainability and better understanding the contribution of different model features
- Addressing gaps in team expertise. For example, soliciting input from experts or clinical users to inform model updates
- Improving data cleaning and processing, particularly for data from the Acoustic Track
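For the generalizability and bias items above, the minimal sketch below shows one possible starting point: grouped cross-validation that holds out entire data sources, with metrics disaggregated by subgroup. The file name and the `corpus`, `sex`, and `diagnosis` columns are hypothetical placeholders, not part of the competition data.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GroupKFold

df = pd.read_csv("features.csv")  # hypothetical feature table
X = df.drop(columns=["diagnosis", "corpus", "sex"])
y = df["diagnosis"]

# Hold out entire corpora so scores reflect transfer to unseen data
# sources rather than memorization of corpus-level artifacts.
cv = GroupKFold(n_splits=5)
for fold, (train_idx, test_idx) in enumerate(cv.split(X, y, groups=df["corpus"])):
    model = RandomForestClassifier(random_state=0)
    model.fit(X.iloc[train_idx], y.iloc[train_idx])
    proba = model.predict_proba(X.iloc[test_idx])[:, 1]
    print(f"fold {fold} overall AUC: {roc_auc_score(y.iloc[test_idx], proba):.3f}")

    # Disaggregate the same fold by a demographic column to surface
    # subgroups where performance lags.
    test = df.iloc[test_idx].assign(proba=proba)
    for group, sub in test.groupby("sex"):
        if sub["diagnosis"].nunique() > 1:  # AUC needs both classes present
            auc = roc_auc_score(sub["diagnosis"], sub["proba"])
            print(f"  {group}: AUC {auc:.3f} (n={len(sub)})")
```

Holding out whole corpora is one way to address the kind of corpus-diagnosis correlations noted above for the Acoustic Track; any grouping variable that separates data sources or populations can be used.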
A few examples of activities that would not be helpful as part of Phase 3:
- Directly collecting additional data
- Increasing model performance by increasing compute (e.g., the number of ensembled models with different random seeds) rather than by changing methods
- Making an interactive dashboard to share model predictions without changing the model itself
Submission format
To be eligible for prizes, participants are required to both make a competition submission and participate in a virtual pitch event.
Each submission should be a ZIP archive containing the following components:
- Key Insights: A concise, easily digestible summary of practical learnings targeted towards AD/ADRD researchers (key_insights.pdf)
- Codebase: A well-organized codebase for training the final model and performing inference (codebase.zip or codebase.pdf)
- Phase 3 Activity Summary: A summary of what participants worked on during Phase 3, and how the model was changed from Phase 2 (activity_summary.pdf)
The ZIP archive of your submission cannot be larger than 1GB. Only one submission may be active at a time; to make changes, you can delete and re-upload your submission as many times as you like. Only the last submission will be considered.
1. Key Insights
"Key Insights" provides a summary of takeaways for AD/ADRD research. Key Insights should be targeted to an audience of AD/ADRD researchers who have subject matter expertise, but may only have basic familiarity with machine learning. The goal is that an AD/ADRD researcher should easily be able to use the insights to improve or direct research.
The final model should be used as a proof-of-concept for early prediction. Learnings from the experimentation and development process are just as important as the final product. Insights can be derived from work done during both Phase 2 and Phase 3.
Suggested sections and prompts
- Overview
- Brief overview of the final model, including model architecture, datasets used, and key data processing and feature engineering steps
- Performance
- Describe model performance metrics on different datasets (if applicable) and patient groups. How well does the solution perform? How does this compare with existing approaches?
- For which groups does the solution struggle most with accurate early prediction, and how might performance be improved?
- Other key limitations and weaknesses relevant for interpretation and clinical use
- Methodology
- Takeaways about which modeling methods do or don’t improve model performance / generalizability and why
- Data
- Takeaways about which existing datasets do or don’t improve model performance / generalizability and why
- Any recommendations of top priority additional datasets to gather
- Contributing factors
- Takeaways about how different features influence model predictions and why (i.e., analysis of feature importance and the real-world meaning of the results); one common approach is sketched after this list
- Future directions
- Recommended next steps for advancing early prediction of AD/ADRD. What methodologies should be tested? What are key limiting factors in the field overall?
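One common way to generate the feature-contribution takeaways above is permutation importance, sketched below on a synthetic stand-in dataset; swap in your own fitted model and held-out data. This is only one option among many (e.g., SHAP values), not a required method.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; replace with your own features and labels.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Shuffle one feature at a time on held-out data and measure the drop in
# AUC; a large drop means the model leans heavily on that feature.
result = permutation_importance(
    model, X_test, y_test, scoring="roc_auc", n_repeats=20, random_state=0
)
for i in result.importances_mean.argsort()[::-1]:
    print(f"feature {i}: mean AUC drop {result.importances_mean[i]:.3f} "
          f"(+/- {result.importances_std[i]:.3f})")
```

The real-world interpretation of the top-ranked features (not the numbers themselves) is what belongs in the Key Insights document.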
Technical requirements
- 5 pages maximum including figures and tables. References may be added on additional pages
- A PDF file named key_insights.pdf
- On paper size 8.5 x 11 inch or A4, with minimum margins of 1 inch
- Minimum font size of 11
- Minimum single-line spacing
2. Codebase
Participants should submit a well-organized codebase for their final model, including all steps for model training and for inference on at least one dataset. The goal of the codebase is to lower the barrier for other researchers using similar methods. For example, other researchers should be able to easily find starter code for things like:
- Working with useful datasets (loading, cleaning, processing, etc.)
- Implementing useful modeling methods, feature engineering, and data preprocessing
Codebases should be written to maximize reproducibility in the spirit of open science. We recommend starting with DrivenData's Cookiecutter project structure. You may even win a prize for clean code!
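As a loose illustration of what easy-to-find starter code can look like, here is a minimal sketch of a self-documenting training entry point in the spirit of the Cookiecutter layout. All paths, file names, and the `label` column are hypothetical.

```python
import argparse

import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression


def main() -> None:
    # Explicit, documented arguments make the pipeline reproducible
    # without reading the source. Defaults follow a Cookiecutter-style
    # layout (data/processed/, models/).
    parser = argparse.ArgumentParser(description="Train the final model.")
    parser.add_argument("--features", default="data/processed/features.csv")
    parser.add_argument("--model-out", default="models/model.joblib")
    parser.add_argument("--seed", type=int, default=0)
    args = parser.parse_args()

    df = pd.read_csv(args.features)
    X, y = df.drop(columns=["label"]), df["label"]

    # Fix the random seed and persist the fitted model so that every run
    # is repeatable end to end.
    model = LogisticRegression(max_iter=1000, random_state=args.seed).fit(X, y)
    joblib.dump(model, args.model_out)


if __name__ == "__main__":
    main()
```

Pairing entry points like this with a pinned environment file (e.g., requirements.txt) and data setup instructions covers most of what reviewers need to reproduce a result.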
Technical requirements
- The codebase should be submitted as either:
  - codebase.pdf: A PDF file that includes a link to a public GitHub repository
  - codebase.zip: A ZIP file containing all code files, structured like a GitHub repository. Keep in mind that your whole submission ZIP cannot be larger than 1GB
- Model weights do not need to be included in the submitted code
- The submitted code should not contain any raw data, but should contain instructions for how to set up the raw data before running training and inference
3. Phase 3 Activity Summary
The "Phase 3 Activity Summary" should provide an overview of what work was done as part of Phase 3, and how solutions were updated from Phase 2 final models. The target audience is competition judges (i.e., a group familiar with both the subject matter and machine learning). "Key Insights" describes the final model and full experimentation process, while the "Phase 3 Activity Summary" describes only the changes made during Phase 3.
Suggested sections and prompts
- Activity Overview
- High-level summary of activity during Phase 3
- Summary of what changes you made to your final Phase 2 model
- Data
- Any additional datasets you experimented with and the results
- Generalizability
- How you explored making your model more generalizable outside of the competition dataset
- Experimentation to identify and mitigate bias
- Methodology
- Details about any other ways you experimented with improving model performance
- References
- Any ways that existing AD/ADRD research, theory, or practice influenced your work (e.g., literature references, user interviews to better understand clinical needs, or expert input)
Technical requirements
- 3 pages maximum including figures and tables. References may be added on additional pages
- A PDF file named activity_summary.pdf
- On paper size 8.5 x 11 inch or A4, with minimum margins of 1 inch
- Minimum font size of 11
- Minimum single-line spacing
Evaluation
Overall Prizes
Overall prize winners (1st, 2nd, and Runners-up) will be selected based on the following weighted criteria. Submissions will be judged by a panel of experts.
- Insights and Innovation (30%): What is the depth and relevance of insights gained? Does the submission include novel or innovative takeaways that can guide future directions of AD/ADRD research? This can be demonstrated through feature importance interpretation, assessment of model limitations and possible methods for improvement, demonstration of new modeling approaches, etc.
- Generalizability (25%): How likely is the model to generalize well to a real-world population of interest, in particular historically underserved segments of the population? This can be demonstrated through incorporation of new data, bias assessment and mitigation, and rigorous model performance assessment.
- Communication (25%): How clearly and effectively are findings communicated? In particular, how well are key insights communicated to an audience of AD/ADRD researchers who have subject matter expertise but may not be familiar with machine learning?
- Rigor (20%): Is the technical basis of the solution correct? How sound is the solution’s methodology?
Submissions are required to be in English, but will not be judged on English fluency. Judgment will be based on the content and ideas communicated. For example, participants may choose to write in a different language, and then use a tool like Google Translate to submit in English.
Clean Code Bonus Prizes
Clean Code Bonus Prizes will be awarded based on the "Codebase" component of the submission. Submissions will be evaluated based on how clear, reproducible, and usable the codebase is. That is: is it easy for others to understand and learn from the code? How effectively does the codebase lower the barrier for others to work with similar datasets or methods?
Below are some good resources for how to organize and write clean code in the spirit of open science:
- Cookiecutter Data Science: Logical, reasonably standard project structure for data science that reflects best practices (created by DrivenData)
- In particular, take a look at the Cookiecutter opinions about how data science should work
- Computational reproducibility course from Utrecht University
- Think we're missing a good resource? Let us know in the forum!
Data
The focus of Phase 3 is demonstrating the potential of algorithms and approaches developed in Phase 2, with key emphasis placed on solution equity, generalizability, and explainability. As part of this, participants should find and incorporate new data, data processing methods, and feature engineering.
External data is permitted in this competition provided participants have all rights, licenses, and permissions to use it as contemplated in the Competition Rules.
If you aren't sure whether a specific dataset is allowable, please reach out to us in the competition forum or send an email to info@drivendata.org.
Additional data resources
Below are some useful links and tools for finding additional data for experimentation.
General resources:
- Global Alzheimer’s Association Interactive Network (GAIN): GAIN has a data portal listing all datasets collected by partner organizations related to Alzheimer's disease. Different datasets include different types of data, including biomarkers, cognitive tests, etc.
Acoustic Track:
- DementiaBank corpus: The full DementiaBank corpus includes language samples and cases not used in the Phase 2 data as well as newly released content. Of particular note is the Delaware corpus, an ongoing study specifically designed to capture changes in speech and language abilities across the progression of dementia with a robust, updated version of the Pitt protocol. To access data, solvers must apply to join the DementiaBank consortium and use the data only for the competition’s purpose and duration; commercial use is prohibited.
- Past DementiaBank competition datasets: DementiaBank includes cleaned and curated datasets from four previous research challenges, each with distinct goals and processing pipelines. See the "Challenges" section of the DementiaBank page. To access data, solvers must apply to join the DementiaBank consortium and use the data only for the competition’s purpose and duration; commercial use is prohibited.
- Global Voice Datasets Repository Map: The NIH Bridge2AI program’s map of all publicly accessible speech datasets collected for neurological research may be a useful tool to find additional datasets. The data is organized geographically and contains licensing information.
- Yang et al. 2022 literature review: In a 2022 paper, "Speech-based AD Prediction Review", Yang et al. include a useful literature review table listing datasets by language, task / structure, and label availability.
Social Determinants of Health track:
- MHAS: MHAS shares data products publicly, including additional subjects, years, and features that were not part of the Phase 2 data. This also includes some restricted MHAS data that requires access approval, such as genetic data. To access data, follow the instructions and all terms of use on the MHAS website.
- HCAP Network: MHAS is just one of a group of studies based on the Harmonized Cognitive Assessment Protocol (HCAP). Participants can experiment with comparable HCAP-based surveys in other countries.
Additional tools
Participants may use any additional or external tools as long as they are publicly available and free to use. If you want to use a tool that is not clearly designated as open source, you must reach out to competition organizers for approval at info@drivendata.org.
Events
Virtual office hours
To support participants, there will be at least one virtual office hours session during the competition. At office hours, participants will be able to meet with and receive guidance from experts. Possible topics include bias mitigation, clean code, acoustic data, and social determinants of health data.
Details and instructions for attending will be shared on the Announcements Page.
Virtual pitches
To be eligible for prizes, participants are required to present a summary of their work in a virtual pitch event in addition to making a competition submission. The primary goal of the virtual pitch event is to share learnings with the broader AD/ADRD research community. Pitches should generally focus on the same content as the "Key Insights" submission component. Attendees may include competition organizers, competition judges, representatives from the NIH, and others in the AD/ADRD research community. Pitches should be targeted to an audience of AD/ADRD researchers with a basic knowledge of machine learning, but who may not be machine learning experts.
Technical requirements:
- Each pitch presentation should be a maximum of 5 minutes long
- There is no required format for pitch presentations. Pitches may include slides, screen sharing, a simple voiceover, or any other desired format
- If an individual or team is unable to attend the event, they may submit a pre-recorded pitch presentation instead
The virtual pitch event will be scheduled for July or August 2025. Exact timing and event logistics will be shared on the Announcements Page at a later date.
In-person winner showcase
Phase 3 will culminate in an in-person event where the winners (1st and 2nd Place) will have an opportunity to connect with the broader AD/ADRD research community. Additional participants may be invited to the event depending on capacity.
Support for travel and lodging will be provided for winners to attend (with limitations on the number of participants per team). The event will take place in the Washington, D.C. area in September 2025. Details will be shared with teams before they confirm attendance.
The in-person winner showcase will include:
- A very short presentation from each winner, followed by a longer question and answer session. Winners are required to either attend the showcase in person, or to submit a short pre-recorded presentation
- An award ceremony
- Opportunities to meet and exchange ideas with prominent professionals in the AD/ADRD research space
Good luck
Good luck and have fun engaging with this challenge! If you have any questions, send an email to the challenge organizers at info@drivendata.org or post on the forum!