The Challenge
Large data sets containing personally identifiable information (PII) are exceptionally valuable resources for research and policy analysis in a host of fields that support America's First Responders, such as emergency planning and epidemiology.
Temporal map data is of particular interest to the public safety community in applications such as optimizing response time and personnel placement, natural disaster response, epidemic tracking, demographic data and civic planning. Yet, the ability to track a person's location over a period of time presents particularly serious privacy concerns.
The Differential Privacy Temporal Map Challenge required participants to develop algorithms that preserve data utility as much as possible while guaranteeing individual privacy is protected. The challenge featured a series of coding sprints to apply differential privacy methods to temporal map data, where each record is tied to a location and each individual may contribute to a sequence of events.
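To make the problem setting concrete, here is a minimal sketch of one textbook way to release a differentially private temporal map: cap how many events each individual may contribute, count events per (location, time period) cell, and add Laplace noise scaled to that cap. It is not the challenge's reference solution or any competitor's approach. The function names, the record layout `(person_id, location, period)`, and the parameters `max_events` and `epsilon` are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_temporal_map(records, locations, periods, epsilon, max_events, seed=None):
    """Return noisy event counts for every (location, period) cell.

    records    : iterable of (person_id, location, period) tuples (assumed layout)
    locations  : public list of map segments
    periods    : public list of time periods
    epsilon    : privacy budget spent on this release
    max_events : cap on the number of records kept per individual
    """
    rng = random.Random(seed)

    # Step 1: bound each individual's contribution, so that adding or
    # removing one person changes the counts by at most `max_events` in total.
    kept = defaultdict(list)
    for person_id, location, period in records:
        if len(kept[person_id]) < max_events:
            kept[person_id].append((location, period))

    # Step 2: count the remaining events per (location, period) cell.
    counts = Counter(cell for cells in kept.values() for cell in cells)

    # Step 3: add Laplace noise to every cell of the public domain,
    # including empty cells, with scale = sensitivity / epsilon.
    scale = max_events / epsilon
    return {
        (loc, t): counts.get((loc, t), 0) + laplace_noise(scale, rng)
        for loc in locations
        for t in periods
    }


# Hypothetical toy data: person 1 has three events, person 2 has one.
toy_records = [(1, "A", 0), (1, "A", 0), (1, "A", 0), (2, "B", 1)]
noisy_map = private_temporal_map(
    toy_records, locations=["A", "B"], periods=[0, 1],
    epsilon=1.0, max_events=2, seed=0,
)
```

The cap illustrates the trade-off behind high-sensitivity temporal queries: a larger `max_events` discards fewer records but requires proportionally more noise for the same `epsilon`. This sketch uses ordinary floating-point sampling and is not hardened for production use.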
The Results
When the NIST synthetic data challenges started in 2018, there was skepticism as to whether synthetic data was feasible with differential privacy. Over the course of the six sprints across two challenges, we have seen the competitors rise to the occasion, discover unexpectedly powerful tricks and techniques, and address problems such as large, complex feature spaces, sparse data, high-sensitivity queries (temporal data), heterogeneous map segments, small epsilon values, edit constraints, and all the complexities of real-world data.
Competitors did more than perform well on these real-world problems. Top contestants of the final sprint also demonstrated algorithms that produce records with both more privacy and greater accuracy than the typical subsampling techniques used by many government agencies to release records. These results hold immediate promise for public safety communities and other groups interested in data sharing with the formal privacy guarantees offered by differential privacy.
Check out the challenge sprints below and learn more about the winners and their leaderboard-topping approaches!
SPRINT 1 (BALTIMORE 911 INCIDENTS): RESULTS AND WINNERS
SPRINT 2 (AMERICAN COMMUNITY SURVEY): RESULTS AND WINNERS
SPRINT 3 (CHICAGO TAXI RIDES): RESULTS AND WINNERS
CHALLENGE REPOSITORY AND DATA SETS
Sprint 1
Open Arena
START HERE! Help public safety agencies share data while protecting privacy. This is part of a series of contests to develop algorithms that preserve the usefulness of temporal map data while guaranteeing individual privacy is protected. #privacy
Prescreened Arena
CALLING PRESCREENED PARTICIPANTS! Help public safety agencies share data while protecting privacy. If you haven't been prescreened yet, head on over to the Open Arena to learn more and get started. #privacy
Sprint 2
Open Arena
START HERE! Help public safety agencies share data while protecting privacy. This is part of a series of contests to develop algorithms that preserve the usefulness of temporal map data while guaranteeing individual privacy is protected. #privacy
Prescreened Arena
CALLING PRESCREENED PARTICIPANTS! Help public safety agencies share data while protecting privacy. If you haven't been prescreened yet, head on over to the Open Arena to learn more and get started. #privacy
Sprint 3
Open Arena
START HERE! Help public safety agencies share data while protecting privacy. This is part of a series of contests to develop algorithms that preserve the usefulness of temporal map data while guaranteeing individual privacy is protected. #privacy
Prescreened Arena
CALLING PRESCREENED PARTICIPANTS! Help public safety agencies share data while protecting privacy. If you haven't been prescreened yet, head on over to the Open Arena to learn more and get started. #privacy