Navigation

Quick access

Data Track A: Financial Crime

Transforming Financial Crime Prevention

There are two data use case tracks for the PETs prize challenge. This is the Phase 2 data overview page for the financial crime prevention track. In this track, innovators will develop end-to-end privacy-preserving federated learning solutions to detect potentially anomalous payments, leveraging a combination of input- and output-privacy techniques.

Background
Background

Data
Overview
Threat Profile

Modeling
Anomaly Detection
Partitioning Data
Example Centralized Baseline

Background

The financial crime track is focused on enhancing cross-organization, cross-border data access to support efforts to combat fraud, money laundering and other financial crime. You are asked to develop innovative, privacy-preserving solutions to enable detection of potentially anomalous payments, utilizing synthetic datasets representing data held by the SWIFT payments network and datasets held by partner banks.

“Anomalous transactions” covers a range of payments that vary significantly from the norms seen in the dataset, and thus may be indicative of fraud, money laundering, or other financial crime. Examples include a transaction that is of an unexpected amount or currency, uses unusual corridors (senders/receivers), has unusual timestamps, or contains other unusual fields. In the scope of this challenge, the problem is framed as a classification task. The training datasets are labeled with anomalies, and therefore you do not need a detailed understanding of financial crime issues.

This is a high-impact and exciting use case for novel privacy-enhancing technologies. There are currently challenging trade-offs between enabling sufficient access to data to build tools to effectively detect illegal financial activity, and limiting the identifiability of innocent individuals or inference of their sensitive information within those data sets. The scale of the problem is vast: the UN estimates that US$800-2000bn is laundered each year, representing 2-5% of global GDP.

Though novel innovation for this use case alone could achieve significant real-world impact, the challenge is designed to incentivize development of privacy technologies that can be applied to other use cases where data is distributed across multiple organisations or jurisdictions, both in financial services and elsewhere. The best solutions will deliver meaningful innovation towards deployable solutions in this space, with consideration of how to evidence the privacy guarantees offered to data owners and regulators, but also have the potential to generalize to other situations.

Data Overview

Innovators will use synthetic datasets representing data held by the SWIFT global payments network and by its partner banks. In Phase 1, you were provided with two development datasets:

Dataset 1: A synthetic dataset representing transaction data from the SWIFT global payment network
Dataset 2: Synthetic customer / account metadata, including flags, from SWIFT's partner banks

There are approximately 4 million rows across the two development datasets.

Additional development data may be released for Phase 2.

Note: The challenges are based on synthetic data to minimize the security burden placed on competitors during the development phase; of course the intent of the challenge is that privacy solutions are developed that would be appropriate for use on real data sets with demonstrable privacy guarantees. However, competitors must adhere to a data use agreement (see competition rules for more details).

Dataset 1: Transaction data held by SWIFT

In Phase 1, you were provided a synthetic dataset derived from data from the SWIFT global payment network. Each row in this dataset is an individual transaction, representing a payment from one sending bank to one receiving bank. The dataset will:

Contain data elements as defined in the ISO20022 pacs.008 / MT103 message format
Comprise transactions between fictitious originators and beneficiaries, sender and receiving banks, payment corridor, amount and timestamps

Expertise in financial crime or ISO20022 messaging is not an expected prerequisite for entering the challenge, and the assessment process will not focus on detailed understanding of the use case itself. The details in the sections below should be sufficient for understanding the data within the scope of the challenge. However, participants unfamiliar with this space may find it informative to consult a general introduction to ISO20022. You may also find the ISO20022 message definitions informative.

The dataset reflects a snapshot of transactions sent by an ordering customer or institution to credit a beneficiary customer or institution. The dataset covers roughly a month’s worth of transactions involving 50 institutions.

The synthetic data is not generated based on any real traffic and will not contain any statistical properties of the real SWIFT transaction data (SWIFT has applied normal and uniform distributions).

Dataset 1 details

Dataset 1 contains the following fields:

MessageId - Globally unique identifier within this dataset for individual transactions
UETR - The Unique End-to-end Transaction Reference—a 36-character string enabling traceability of all individual transactions associated with a single end-to-end transaction
TransactionReference - Unique identifier for an individual transaction
Timestamp - Time at which the individual transaction was initiated
Sender - Institution (bank) initiating/sending the individual transaction
Receiver - Institution (bank) receiving the individual transaction
OrderingAccount - Account identifier for the originating ordering entity (individual or organization) for end-to-end transaction,
OrderingName - Name for the originating ordering entity
OrderingStreet - Street address for the originating ordering entity
OrderingCountryCityZip - Remaining address details for the originating ordering entity
BeneficiaryAccount - Account identifier for the final beneficiary entity (individual or organization) for end-to-end transaction
BeneficiaryName - Name for the final beneficiary entity
BeneficiaryStreet - Street address for the final beneficiary entity
BeneficiaryCountryCityZip - Remaining address details for the final beneficiary entity
SettlementDate - Date the individual transaction was settled
SettlementCurrency - Currency used for transaction
SettlementAmount - Value of the transaction net of fees/transfer charges/forex
InstructedCurrency - Currency of the individual transaction as instructed to be paid by the Sender
InstructedAmount - Value of the individual transaction as instructed to be paid by the Sender
Label - Boolean indicator of whether the transaction is anomalous or not. This is the target variable for the prediction task.

End-to-end transactions

Each row in this dataset is an individual transaction, representing a payment from a sender bank to a receiver bank. An end-to-end transaction is a transaction from an originating ordering entity (a.k.a. ultimate debtor) to a final beneficiary entity (a.k.a. ultimate creditor) and may involve one or more individual transactions. The end-to-end transaction is one individual transaction in the case where the originating orderer's bank sends payment directly to the final beneficiary's bank. However, it may be the case where the payment is not directly sent, but is instead routed through one or more intermediary banks. In such a case, there are multiple individual transactions belonging to the single end-to-end transaction, with each individual transaction representing a bank-to-bank payment. Each end-to-end transaction is uniquely identified by the UETR field. In the case of a sequence of multiple individual transactions for one end-to-end transaction, all individual transactions share a value for UETR, and the Sender and Receiver banks form a chain from the originating ordering bank through one or more intermediary banks to the final beneficiary bank.

Because each end-to-end transaction is defined by one originating orderer and one final beneficiary, this means the Ordering* columns for the orderer and Beneficiary* columns for the beneficiary have been included in this dataset in a denormalized fashion—the values are duplicated across all the individual transactions (rows) belonging to the same end-to-end transaction. Additionally, this means that the OrderingAccount and BeneficiaryAccount in a given row may not necessarily belong to the bank in that row's Sender and the bank in that row's Receiver, respectively. The correct way to associate an OrderingAccount to the correct bank is to identify the Sender bank in the originating (first) individual transaction in that end-to-end transaction, and the correct way to associate a BeneficiaryAccount to the correct bank is to identify the Receiver bank in the final (last) individual transaction in that end-to-end transaction.

MessageId	UETR	Sender	Receiver	OrderingAccount	BeneficiaryAccount	...
...	...	...	...	...	...	...
10	00012345-...	A	B	111	222	...
11	00012345-...	B	C	111	222
12	00012345-...	C	D	111	222
...	...	...	...	...	...	...

Illustrative example showing the how to associate the originating orderer and final beneficiary information with the correct banks for one end-to-end transaction made up of three individual transactions. The orderer and beneficiary account information is duplicated across all rows in this group, and the sender and receiver banks form a chain. The bank and account information of the originating orderer is highlighted in blue, and the bank and account information for the final beneficiary is highlighted in yellow.

Dataset 2: Account data held by banks

Participants were provided access to account-related data representative of that held by banks. This dataset contains account-level information, including flags signaling whether the account is valid, suspended, etc.

Data can be linked using the Account field in the bank data and the OrderingAccount or BeneficiaryAccount in the SWIFT transaction data. Please see the previous section on end-to-end transactions for details on how to identify which bank an OrderingAccount or BeneficiaryAccount should be linked to.

Note that bank nodes will not have access to data on the SWIFT node and vice-versa—a case of vertical data partitioning. It is up to you to determine how to exchange this information in a secure and private way.

Dataset 2 details

Dataset 2 will contain the following fields:

Bank - Identifier for the bank
Account - Identifier for the account
Name - Name of the account
Street - Street address associated with the account
CountryCityZip - Remaining address details associated with the account
Flags - Enumerated data type indicating potential issues or special features that have been associated with an account. Flag definitions are provided below:
- 00 - No flags
- 01 - Account closed
- 03 - Account recently opened
- 04 - Name mismatch
- 05 - Account under monitoring
- 06 - Account suspended
- 07 - Account frozen
- 08 - Non-transaction amount
- 09 - Beneficiary deceased
- 10 - Invalid company ID
- 11 - Invalid individual ID

Note that this dataset is provided unpartitioned, with all banks' data in one table.

Note that the flags may not be representative of real-world practices. For example, in the real world, banks may use different flags and may interpret or weight them differently based on appetite for risk.

Development and Evaluation Data

The datasets being provided are intended for local development use in both Phase 1 and Phase 2. The transaction dataset has been split in time–the bulk of the dataset is the training set, and the final week of the dataset is a test set. The prediction task, as detailed in a later section, is to predict a confidence score for each individual transaction in the test set as to whether it is an anomalous transaction. The ground truth is provided for both the training and set sets in the development dataset.

In Phase 2, a separate and held-out dataset will be used for solution evaluation. Some aspects of the Phase 2 evaluation data may be changed that should be learnable by your model. You will submit code for your solution to a code execution environment. The code execution runtime will run cold-start federated training on the new dataset's training split and then run inference to generate predictions for the new dataset's test split. Your solution's performance will be measured by evaluating its predictions against the ground truth for the new dataset's test split.

New Development Dataset

New development dataset published December 3, 2022.

A new development dataset has been provided by our data partners at SWIFT and is available on the data download page, indicated by the [NEW] tag. This new development dataset is a synthetic dataset that is largely similar to the previously-provided synthetic development dataset (also still available on the data download page), but has some differences that better represent what is observed in the real world. You should expect that the evaluation dataset (held out and will never be made available) will have similar distributions as this new dataset. Local development and self-reported results in the technical paper should primarily use the new development dataset.

Threat Profile

You will design and develop end-to-end solutions that preserve privacy across a range of possible threats and attack scenarios, through all stages of the machine learning model lifecycle. You should therefore carefully consider the overall privacy of your solution, focusing on the protection of sensitive information held by all parties involved in the federated learning scenario. The solutions you design and develop should include comprehensive measures to address the threat profiles described below. These measures will provide an appropriate degree of resilience to a wide range of potential attacks defined within the threat profile. For more information on threat profiles, please visit the privacy threat profile section of the problem description.

Scope of sensitive data

Your solution must prevent the unintended disclosure of

a) sensitive information in the SWIFT transaction dataset
b) sensitive information in the bank dataset, to any other party, including other insider stakeholders (for example, SWIFT and other financial institutions) and outsiders.

The sensitive information for the SWIFT dataset is all personally-identifiable information about the originating orderer (a.k.a. ultimate debtor) and final beneficiary (a.k.a. ultimate creditor) parties, including personal details like names and addresses, and group membership information. This includes but is not limited to the raw private data about the orderer/beneficiary stored directly in the account number, name, and address fields, and the transaction identifiers and timestamps.

The sensitive information for the bank datasets include all personally identifiable information about parties involved in the transactions, including names and addresses and group membership information. This includes but is not limited to the raw private / business data reflected in account numbers/names/addresses and flags.

Anomaly detection models

The analytical objective is to train a model that enables SWIFT to identify anomalous transactions. In the context of this challenge, this is a classification model to be trained on provided training data with ground truth labels. In real-world deployments, such transactions might be subject to additional verification actions or flagged for further investigation, dependent on context.

Prediction Target and Evaluation Metric

The target variable for the modeling task is a confidence score (between 0.0 and 1.0) for whether each individual transaction is anomalous. As discussed previously, anomalous is not precisely defined and should be learned by your model via supervised learning on provided training data.

The evaluation metric will be Area Under the Precision–Recall Curve (AUPRC), also known as average precision (AP), PR-AUC, or AUCPR. This is a commonly used metric for binary classification that summarizes model performance across all operating thresholds. This metric rewards models which can consistently assign anomalous transactions with a higher confidence score than negative non-anomalous transactions. AUPRC is computed as follows:

$$ \text{AUPRC} = \sum_n (R_n - R_{n-1}) P_n $$

where $P_{n}$ and $R_{n}$ are the precision and recall, respectively, when thresholding at the n^th individual transaction sorted in order of increasing recall.

Partitioned Datasets for Federated Learning

Partitioning Overview

A number of banks are working with SWIFT to collaboratively train a model. The parties are working jointly to do this, and can take a common approach to technical design, infrastructure, etc., but are not able to enable access to each other's data. In the real world there are a number of barriers that might prevent this; banks are subject to a variety of privacy, competition and financial industry regulations, may be operating in different jurisdictions, and have legitimate commercial and ethical reasons for not sharing customer data with competitors.

The key task of this challenge is to design a privacy solution so that SWIFT can safely train and deploy such a model without compromising the privacy requirements (more details on the requirements, and an associated threat model, are described in the problem description).

Diagram comparing a centralized model with a federated model

Above: Diagram comparing the centralized model (M_C) with a privacy-preserving federated learning model (M_PF).

Partitioning Details

This use case features both vertical and horizontal data partitioning. Data is vertically partitioned between Dataset 1 (SWIFT) and Dataset 2 (partner banks), and it is horizontally partitioned within Dataset 2 (between each partner bank).

For local development, you were provided a full, unpartitioned dataset. In Phase 2 evaluation, evaluation will occur with predetermined partitioning along institutional boundaries. The SWIFT data will always belong a single federation unit who represents the SWIFT Data Store and only has access to the SWIFT data. Banks will be split up among federation units such that one bank's account data entirely belongs within one partition. In cases where there are fewer bank partitions than the number of banks, each bank partition may contain data from more than one bank.

Any partitioning of the data that you might perform in your local development experiments should take this into account. Your solution should be able to handle any number of bank partitions, and in Phase 2, we may evaluate your solution with a number of bank partitions between 1 and 10.

Key Task

For the purposes of the challenge, you should demonstrate your solution by training two models:

$M_C$ = a centralized model trained on datasets 1 and 2 in a non-privacy preserving way
$M_{PF}$ = a privacy-preserving federated model trained using your privacy solution

Example Centralized Baseline

In Phase 1, SWIFT provided sample Python code for training a centralized anomaly detection model ($M_C$) on the ISO20022 training data. This code snippet used the dataset provided as input and trained a simple anomaly detection model using the XGBoost Classifier.

The core of the evaluation will be assessing the comparison between a centralized model $M_C$, and an alternative model $M_{PF}$ that combines a federated learning approach with innovative privacy-preserving techniques.

In the real world, SWIFT may wish to train a model collaboratively with a number of banks, in order to increase the volume and variety of data being used to train the model. You should therefore aim to develop scalable solutions that enable additional nodes to be integrated into the federated network whilst incurring an acceptable additional performance overhead.

The federated learning scenario thus consists of one node hosting the SWIFT dataset, and N nodes hosting bank data. We may evaluate solutions for values of N between 1 and 10, in order to assess how well solutions scale as more banks are added to the network. During solution development, you may have full autonomy over how you partition the bank dataset in order to understand the scalability of your solution.

Full details on evaluation criteria can be found in the problem description.

For additional reference, here is a technical brief for this use case track provided through the U.K. challenge. This brief has been assembled in collaboration by the U.K. and U.S. challenge organizers. This may help to give a sense for the use case and capabilities expected, though note that details in the brief may not match exactly how the U.S. challenge will operate.

Good luck

Good luck and enjoy this problem! For more details on submission and evaluation, visit the problem description page. If you have any questions, you can always ask the community by visiting the DrivenData user forum or the cross-U.S.–U.K. public Slack channel. You can request access to the Slack channel here.

Competition Rules

Hosted by National Institute of Standards and Technology, US Department of Commerce, Information Technology Laboratory; National Science Foundation

Original Version Approved July 18, 2022
Update Approved October 21, 2022
Update Approved November 7, 2022
Update Approved November 22, 2022
Update Approved January 24, 2023
Update Approved March 22, 2023

PETs Prize Challenge: Advancing Privacy-Preserving Federated Learning Official Rules

This document outlines the official rules for the PETs Prize Challenge: Advancing Privacy-Preserving Federated Learning (Challenge), co-sponsored by the Information Technology Laboratory (ITL) of the National Institute of Standards and Technology (NIST) and the National Science Foundation (NSF) (Organizers). Nothing within this document or in any supporting documents shall be construed as obligating the Department of Commerce, NIST, NSF, or any other Federal agency or instrumentality to any expenditure of appropriated funds, or any obligation or expenditure of funds in excess of or in advance of available appropriations.

This Challenge is being conducted in parallel with a similar prize challenge sponsored by the government of the United Kingdom. Participants in this Challenge, whether they are individuals, entities, or team members, are prohibited from participating in the UK prize challenge, and participants in the UK prize challenge are likewise prohibited from participating in this Challenge. Individuals, entities, or team members who are found to have entered both the US and the UK challenges will be disqualified from participating in either. Additional details regarding eligibility are included at Sections 1.9 Minimum Submission Requirements and 9.2 Eligibility Requirements.

1. Introduction

1.1 Overview:

This Challenge seeks innovators to help mature federated learning approaches and build trust in adoption by accelerating the development of efficient privacy-preserving federated learning (PPFL) solutions. The U.S. and UK collaboration to develop parallel prize challenges was announced during President Joe Biden’s Summit for Democracy as part of a grand challenge series focused on advancing democracy-affirming technologies.

The PETs Prize Challenge is a multi-track, multi-phase competition with the total amount of prize awards valued at up to $800,000. The Organizers are seeking solutions that leverage a combination of input and output privacy techniques to drive innovation in the technical development and application of privacy enhancing technologies (PETs).

This Challenge will consist of four phases, and members of the public who are eligible to compete (see additional details regarding eligibility herein), whether as an individual, a team, or a private entity (Participants), may participate as either Blue Team Participants or Red Team Participants in accordance with the timelines and instructions set forth herein. The Blue Team Participants and Red Team Participants will compete in different phases of the Challenge.

In Phases 1 and 2, Blue Team Participants will generate and build solutions to the problems presented by this Challenge. In Phase 3, the Red Team Participants will compete against one another while they test the top Blue Team Participants’ submissions, as chosen by the Judges Panel. The results of the tests performed by the Red Teams will be used to determine the winners of Phase 2 among the Blue Team Participants. In Phase 4, the top-performing Blue Team Participants will be invited to participate by contributing their Phase 2 solutions to an open-source website.

The Challenge will begin with an abstract and a concept paper in Phase 1, followed by a solution development contest in Phase 2, which will involve the development and implementation of PPFL solutions. For Phase 3 a red teaming contest will test the privacy protections of the top implemented solutions from Phases 1 and 2. There are no fees needed to enter any stage of the Challenge. Blue Team Participants must participate in Phase 1 in order to compete in Phase 2. Red Teams will be distinct from the Blue Teams that they are testing, and no overlap or cross-over between membership on Blue Teams and Red Teams will be permitted.

1.2 Background:

Artificial intelligence (AI) and other emerging analytics techniques are amplifying the power of data, making it easier to discover new patterns and insights, ranging from better models to predict the impacts of climate change to new methods for detecting financial crimes. However, while data is enabling innovation and insights across sectors, it can still be challenging to harness the full potential due to the imperative for adequate privacy protections and without proper privacy preserving mechanisms, use of such data may result in more harm to individuals and society.

This Challenge seeks to engage Participants in developing effective PPFL solutions to enable the use of valuable sensitive data without moving the data or compromising the privacy of individuals whose information is contained within the data set.

Federated learning, or more generally collaborative learning, shows huge promise for machine learning applications derived from sensitive data by enabling training on distributed data sets without sharing the data among the participating parties. This results in a machine learning model that aggregates local models obtained from distributed parties. As data from each participating party does not need to be shared, the approach already provides a general level of privacy. However, as the global federated model is trained, the parameters related to the local models could be used to learn about the sensitive information contained in the training data of each client. Similarly, the released global model could also be used to infer sensitive information about the training datasets used.

1.3 Participation Options for Blue Team Participants:

Blue Team Participants will develop solutions that enhance privacy protections across the lifecycle of federated learning. This Challenge offers two different data use cases – financial crime prevention, or Track A, and pandemic response and forecasting, or Track B. Blue Team Participants may develop solutions directed to either one or both tracks. If Blue Team Participants choose to develop and submit a more generalizable solution to run on both Track A and B, they can also qualify for additional prizes dedicated to Generalizable Solutions. See Section 2 for additional details.

Track A: Financial Crime Prevention. The United Nations estimates that up to $2 trillion of cross-border money laundering takes place each year, financing organized crime and undermining economic prosperity. Financial institutions such as banks and credit agencies, along with organizations that process transactions between institutions, such as the SWIFT global financial messaging provider, must protect personal and financial data, while also trying to report and deter illicit financial activities. Federated learning shows huge promise for machine learning applications derived from sensitive data by enabling training on distributed data sets without sharing the data among the participating parties. Using synthetic datasets provided by SWIFT, Participants will design and later develop innovative privacy-preserving federated learning solutions that facilitate cross-institution and cross-border anomaly detection and combat financial crime. Participants in Track A will have the opportunity to engage with subject matter experts and banking regulators as they develop their solutions.

Track B: Pandemic Response and Forecasting. The COVID-19 pandemic has taken an immense toll on human lives and has had an unprecedented level of socio-economic impact on individuals and societies around the globe. As we continue to deal with COVID-19, it has become apparent that better ways to harness the power of data through analytics are critical for preparing for and responding to public health crises. Federated learning approaches could allow for responsible use of sensitive data to develop cross-organization and cross-border data analysis that would result in more robust forecasting and pandemic response capabilities. Using synthetic population datasets, Participants will design and later develop PPFL solutions that can predict an individual’s risk for infection. If an individual knows that they are likely to be infected in the next week, they can take more prophylactic measures (masking, changing plans, etc.) than they usually might. Furthermore, this knowledge could help public health officials better predict interventions and staffing needs for specific regions.

Generalizable Solutions. Cross-organization, cross-border use cases are certainly not limited to the financial or public health domains. Developing out-of-the-box generalized models that can be adjusted for use with specific data or problem sets has great potential to advance the adoption and widespread use of PPFL for public and private sector organizations in multiple sectors. Participants will develop a solution using both the Track A and Track B datasets to be eligible for additional awards dedicated to generalizable solutions.

1.4 Goals and Objectives:

Organizers seek to mature federated learning approaches and build trust in adoption by accelerating the development of efficient PPFL solutions that leverage a combination of input and output privacy techniques to:

Drive innovation in the technological development and application of novel privacy enhancing technologies;
Deliver strong privacy guarantees against a set of common threats and privacy attacks; and
Generate effective models to accomplish a set of predictive or analytical tasks that support the use cases.

1.5 Summary of Phases:

The Challenge will occur across four phases. The Blue Team Participants will compete against one another in Phases 1 and 2, and the Red Team Participants will compete against each other in Phase 3. The results of the Red Team Participants’ competition in Phase 3 will be used to assist in evaluating the Blue Team Participants’ submissions in Phase 2, and the top Blue Team solutions from Phase 2 will be invited to participate in Phase 4 by making their winning solutions open and available to the public.

Phase 1: Concept Paper. Blue Team Participants will produce a technical white paper (“Concept Paper” or “White Paper”) setting out their proposed solution approach. Technical papers will be evaluated by a panel of judges across a set of weighted criteria. Participants will be eligible to win prizes awarded to the top technical papers, ranked by points awarded.

Phase 2: Solution Development. Blue Team Participants will develop working prototypes of their solutions. These solutions are expected to be functional, and capable of being used to train a model against the evaluation data set, with measurement of relevant performance and accuracy metrics. Solutions will be evaluated by a panel of judges across a set of weighted criteria. This evaluation will determine a pool of up to 9 finalist Blue Team Participants that may advance to testing by Red Teams. The ideal pool to advance will include the 3 top-scoring solutions from each of Track A, Track B, and generalizable solutions. If a top generalizable solution is also a top 3 performer in a data track, then the next highest scoring solutions for that track will also be selected to advance to the red teaming phase. If a top generalizable solution results in an overlap of a top 3 performer across data tracks, the judges may also select the next highest scoring solution to advance to the red teaming phase. The intent is to advance as many solutions, up to 9, as possible for red teaming, however, advancement will be at the sole discretion of the judges.

Phase 3: Red Teaming. Red Team Participants will develop and implement attack scenarios to test the concepts and solutions developed by Blue Team Participants in Phases 1 and 2. Up to 9 final Blue Team Participants’ solutions from Phase 2 judges' evaluations will advance to this red teaming, where the privacy guarantees will be tested against a set of common privacy attacks. Results of the Red Team attacks will be used to determine the final rankings among the Blue Team Participants for Phase 2. The three best performing Red Team Participants will be awarded prizes for Phase 3.

Phase 4: Open Source. Up to 7 of the top-scoring Blue Team Participants’ may be invited to participate in this Phase by contributing their solutions to an open source repository or website, as set forth in Section 1.7, below.

1.6 Timeline & Important Dates

Phase 1: Concept Paper

Event	Date
Launch	July 20, 2022
Abstracts Due, Blue Team Registration Closes	September 4, 2022
Concept Papers Due	September 19, 2022
Winners Announced	November 10, 2022

Phase 2: Solution Development

Event	Date
Launch	October 5, 2022
Solution Developments Due	January 26, 2023
Announcement of Finalists to be Tested in Phase 3 Red Teaming	February 13, 2023
Winners Announced	March 30, 2023

Phase 3: Red Teaming

Event	Date
Launch (Red Team Open Registration Period)	November 10, 2022 – December 2, 2022
Scenario Development Period	December 9, 2022 – February 13, 2023
Attack Period	February 13, 2023 - February 28, 2023
Winners Announced	March 30, 2023

Phase 4: Open Source

Event	Date
Open Source Submissions Due	April 20, 2023

NOTE: As set forth in these rules, the Organizers reserve to themselves the right, in their sole discretion, to revise the terms hereof governing any phases taking place after the effective date of any such revision, including the timeline and important dates.

1.7 Prizes and Awards

TOTAL POOL: $800,000

The anticipated number of and amount of the Awards that will be awarded for this Challenge is set forth below; however, the Organizers are not obligated to make all or any Awards, and they reserve the right to award fewer than the anticipated number of Awards in the event that an insufficient number of eligible Participants meet the enumerated Evaluation Criteria for the relevant Phase of the Challenge, based on the Organizers’ sole discretion. Organizers additionally reserve the right to reallocate any unawarded prize amounts from earlier phases to other prize categories in later Challenge phases.

Phase 1: Concept Paper Prizes (TOTAL $55,000)

Open to Blue Team Participants

1st Prize:	$30,000
2nd Prize:	$15,000
3rd Prize:	$10,000

Phase 2: Solution Development Prizes (TOTAL $575,000)

Open to Blue Team Participants

Track A: Financial Crime Prevention (SUBTOTAL $175,000)

1st Prize:	$100,000
2nd Prize:	$50,000
3rd Prize:	$25,000

Track B: Pandemic Response and Forecasting (SUBTOTAL $175,000)

1st Prize:	$100,000
2nd Prize:	$50,000
3rd Prize:	$25,000

Generalized Solutions (SUBTOTAL $175,000)

1st Prize:	$100,000
2nd Prize:	$50,000
3rd Prize:	$25,000

The Top 7 Solutions Overall: Up to 7 Blue Team Participants with the highest overall scores in Phase 2 will be awarded an invitation to participate in Phase 4. If any of the top Blue Team Participants decline the invitation to Phase 4, the Organizers may, at their discretion, offer such invitation(s) to the next highest scoring Blue Team(s). However, the Organizers are under no obligation to award all 7 invitations.

Special Recognition Prizes (SUBTOTAL $50,000)

Blue Team Participants that are not awarded a 1st, 2nd, or 3rd Prize in Phase 2 will be eligible for special recognition prizes from a pool of $50,000 for excellence in specific areas of privacy innovation. Up to 5 total Special Recognition Prizes may be awarded in one or more of the following categories: (1) novelty, (2) advancement in a specific privacy technology, (3) usability, and/or (4) efficiency. At the judges’ discretion, fewer than 5 total Special Recognition Prizes may be awarded and/or Special Recognition Prizes may not be awarded for all categories. Participant prize amounts will be based upon an even split of the number of prizes awarded.

Phase 3: Red Teaming Prizes (TOTAL $120,000)

Open to Red Team Participants

1st Prize:	$60,000
2nd Prize:	$40,000
3rd Prize:	$20,000

Phase 4: Open Source Prizes (TOTAL $140,000)

Open to Blue Team Participants

Up to 7 Blue Team Participants may be invited to release their solutions as open-source software. Each Blue Team that agrees to release their solution to an open source repository and submits proof of their participation in accordance with these rules will be awarded a prize of $20,000.

UPDATE March 22, 2023: Prize amounts have been increased to $20,000 per award for a total of up to $140,000 for Phase 4 (the increase is reallocated prize moneys that were not awarded in earlier phases). The total amount of prize awards for the Challenge remains $800,000.

1.8 Summary of Judging Criteria

See (https://petsprizechallenge.drivendata.org) for details.

Phase 1: Concept Paper. The Judges will evaluate the Blue Team Participants’ Concept Papers describing proposed solutions based on weighted qualitative factors which include, but are not limited to, understanding of the technical challenges, application of privacy techniques, and anticipated accuracy, usability, and efficiency, as further described in Section 2, below. Up to 3 submissions may receive an award, as provided in Section 1.7, above.
Phase 2: Solution Development. The Judges will evaluate the Blue Team Participants’ submissions by utilizing a combination of weighted factors including quantitative measures assessed by the test harness and qualitative expert judgements of the evaluation panel which will assess their vulnerability to privacy attacks. Weighted factors include accuracy, scalability and efficiency, vulnerability to Phase 3 Red Team privacy attacks, adaptability, usability, innovation, and explainability. See Sections 1.7 and 3 for further details regarding prizes and awards, including information on Special Recognition Prizes for specific elements such as novelty, advancement in a specific privacy technology, usability, and efficiency.
Phase 3: Red Teaming. Success of the attacks by Red Team Participants on the submissions of Blue Team Participants will be assessed by a panel of Judges based on factors such as their effectiveness, applicability, generality, and innovation, as further described in Section 4.
Phase 4: Open Source. Up to 7 Blue Team Participants from Phase 2 may be invited to participate in this portion of the Challenge. Each of the up to 7 Blue Team Participants that release their solutions to an open source repository and provide the Organizers with verification of such release in accordance with these rules will receive a prize of $20,000.

1.9 Minimum Submission Requirements

The Organizers are interested in efficient and usable federated learning solutions that provide end-to-end privacy and security protections to harness the potential of AI for overcoming significant global challenges while preserving privacy and security. The four-phase Challenge will be administered by DrivenData (https://www.drivendata.org/) on behalf of the Organizers.

To be eligible for prizes in the Challenge, Participants must:

Meet all eligibility requirements laid out herein.
Visit (https://www.drivendata.org/) to register for a DrivenData account then navigate to the Challenge Website (https://petsprizechallenge.drivendata.org) to sign up for either Phase 1 as a Blue Team Participant, or for Phase 3 as a Red Team Participant before the Phases’ respective registration deadlines. If you already have a DrivenData account, you can sign up directly as a Blue Team or Red Team Participant during the appropriate Phase registration period on the Challenge website (https://petsprizechallenge.drivendata.org).
Not participate using more than one DrivenData account. A Participant, whether an individual, a team, or a private entity, using more than one account is deemed cheating and, if discovered, will result in disqualification from the Challenge and any other affected challenges and may result in banning or deactivation of affected DrivenData accounts. DrivenData reserves the right to request information from Participants for the purpose of any investigation of suspected cheating. Failure by a Participant to respond to these requests (including failure to furnish the requested information) within 10 days is grounds for disqualification from the Challenge. Participants are prohibited from privately sharing source or executable code developed in connection with or based upon data received from the Challenge outside of their Team, and any such sharing is a breach of these Challenge Rules and may result in disqualification.
Electronically acknowledge acceptance of these Official Rules and satisfy all of the requirements described herein.
If participating as a Red Team participant, electronically acknowledge and sign the Non-Disclosure Agreement for the Challenge.
For Phase 1, submit via and according to the instructions on the Challenge website (https://petsprizechallenge.drivendata.org) an abstract and a Concept Paper that meet the criteria outlined in the Rules for Phase 1 on or in advance of the submission deadlines. Blue Team Participants’ abstracts must briefly describe a proposed track specific or a generalized federated learning (FL) solution and PETs that could be employed within that solution. Blue Team Participants’ Concept Papers must incorporate and expand on the abstract further detailing the technical approach, privacy guarantees, anticipated privacy issues, and privacy tradeoffs, as well as the anticipated performance of the model.
For Phase 2, upload the source code for a centralized baseline model (MC) and a privacy-preserving federated model (MPF) related to at least one use case track, along with supporting documentation and full license agreements for any licensed software embedded in the source code. DrivenData must be able to run both Blue Team Participants’ source code for both models in order for Blue Team Participants to compete in Phase 2 and 3 evaluations. Blue Team Participants must make their submission per the instructions on the Challenge website (https://petsprizechallenge.drivendata.org) on or before the submission deadline.
For Phase 3, conduct privacy attacks and audits against assigned Blue Team Participants’ PPFL solutions and submit, via the Challenge website (https://petsprizechallenge.drivendata.org), a single Attack Report which details flaws and vulnerabilities found in each assigned Blue Team PPFL model. Red Team Participants must submit their reports on or before the close of the Red Team Attack period.
For Phase 4, submit an email to PrivacyEng@nist.gov which provides proof of their open source upload and license per the directions in the Rules for Phase 4 - Open Source. Invited Blue Team Participants must complete the open source requirements and send the email on or before the Phase 4 deadline.

2. Rules for Phase 1 - Concept Paper

Phase 1 of the Challenge is intended to source detailed concepts for innovative and practical solutions. In this stage, Participants are asked to propose federated learning solutions which incorporate privacy-preserving techniques. Participants are free to determine the set of privacy technologies used in their solutions, with the exception of hardware enclaves (or similar approaches) and other specialized hardware-based solutions.

Participation

Phase 1 invites all eligible, registered Blue Team Participants to submit an abstract and concept paper detailing input and output privacy technique integration into a federated learning (FL), or more generally, a collaborative learning model.

Participants will have the opportunity to expand their knowledge in real world federated learning use cases by hearing from and possibly engaging directly with financial crime and public health data subject matter experts and active practitioners in these fields. In addition to the opportunity to win monetary prizes, eligible Blue Team Participant winners may also be invited to present their solutions at the next Presidential Summit for Democracy currently slated for Spring 2023.

Process

Blue Team Participants will follow the instructions set forth in the Minimum Submission Requirements section above to sign up for a DrivenData account and register for Phase 1. Participants may form teams using the DrivenData upon creation of an account and registration.

Upon registration, the Organizers intend to provide all participants with the following for Tracks A and B, as available:

Detailed concept briefs which further detail the technical use cases for each Track
Sample Data and data schema relevant to each Track; and
A centralized machine learning data model for each Track.

The Organizers also intend to schedule webinars or other similar engagements with use case track subject matter experts to enhance Blue Team Participants understanding of data sharing needs, applicable compliance needs, and threats to each specific use case. Further information regarding these events will be posted to the Challenge website or on the Challenge website’s forum. (https://petsprizechallenge.drivendata.org)

Blue Team Participants may write and submit their abstract and Concept Paper at any time up until the Challenge’s posted deadlines for each requirement.

There are two data use case tracks in the contest—Track A: Financial Crime Prevention and Track B: Pandemic Response and Forecasting. Blue Team Participants can elect to apply their solution to either or both tracks and are required to specify which. There are four possible cases:

Your solution is for Track A
Your solution is for Track B
The solution is a generalized solution for both Track A and Track B. A generalized solution is one that can be executed and evaluated on both data use case tracks where there is the same core privacy architecture and where adaptations to specific use cases are relatively minor and separable from a shared codebase.
There are two separate solutions: one for Track A and one for Track B. In such a case, you are required to make two sets of submissions that meet the submission requirements—one for each solution. This is required for both solutions, when further developed, to be eligible in the following Solution Development Phase. However, each team can win at most one prize in the Concept Paper Contest whether they have one or two sets of submissions.

As you propose your technical solutions, be prepared to clearly describe the technical approaches and sketch out proof of or justification for privacy guarantees. Participants should consider a broad range of privacy threats during the model training and model use phases and consider technical and process aspects including but not limited to cryptographic and non-cryptographic methods, and protection needed within the deployment environment.

There are additional privacy threats that may be specific to the technical approaches proposed by the participating teams. Teams will be asked to clearly articulate any other novel privacy attacks that are addressed through their solutions.

Successful technical approaches and proofs of privacy guarantees will include the design of any algorithms, protocols, etc. utilized, as well as formal or informal arguments of how the solution will provide privacy guarantees. Participants will clearly list any additional privacy issues specific to the technological approaches used and justify initial enhancements or novelties compared to the current state-of-the-art. Participant submissions must describe how the solution will cater to the types of data provided to participants and how generalizable the solution is to multiple domains. Expected efficiency/scalability of improvements, privacy vs. utility trade off should be articulated, if possible, at this conceptual stage.

The use and purpose of licensed software or proprietary hardware should also be anticipated and acknowledged as the conceptual stage evolves.

If so desired, Blue Team Participants may initiate solution development based upon the information and sample data received in Phase 1 and concurrently with the writing of the Phase 1 Concept Papers in Phase 1. Blue Team Participants who chose to do so, but fail to meet the minimum submission requirements or criteria in Phase 1, will be ineligible to advance to Phase 2 and must immediately delete all Phase 1 data.

Requirements

To be eligible for prizes, a successful submission for Phase 1 will include two separate elements: a one-page abstract and a Concept Paper.

Abstract: The one-page abstract must include a title and a brief description of the proposed solution, including the proposed privacy mechanisms and architecture of the federated model. The description should also describe the proposed machine learning model and expected results with regard to accuracy. Successful abstracts will outline how solutions will achieve privacy while minimizing loss to accuracy, a proposed solution, and the anticipated results, as more fully described on the Challenge Website. Abstracts must be submitted by following the instructions on the Challenge Website. Abstracts will be screened by the DrivenData and Organizers’ staff for contest eligibility and used to ensure the composition of the judging panel’s expertise aligns to proposed solutions that will be evaluated throughout the course of the Challenge. Feedback will not be provided.

Concept Paper: The Concept Paper should conceptualize solutions that describe the technical approaches and lay out the proof of privacy guarantees that solve a set of predictive or analytic tasks that support the use cases. Successful Concept Papers will incorporate the originally submitted abstract and be no more than ten pages in length. References will not count towards page length. Participant submissions shall:

Include a title and abstract for the solution
Clearly articulate the selected track(s) the solution addresses, understanding of the problem, and opportunities for privacy technology within the current state-of-the-art.
Clearly describe the technical approaches and proof of privacy guarantees based on their described threat model, including:
- The design of any algorithms, protocols, etc. utilized,
- The formal or informal arguments of how the solution will provide privacy guarantees.
Clearly list any additional privacy issues specific to the technological approaches used.
Justify initial enhancement or novelty compared to the state-of-the-art.
Articulate:
- The expected efficiency and scalability of the privacy solution,
- The expected accuracy and performance of the model,
- The expected tradeoffs between privacy and accuracy/utility,
- How the explainability of model outputs may be impacted by your privacy solution,
- The feasibility of implementing the solution within the competition timeframe.
Describe how the solution will cater to the types of data provided to participants and articulate what additional work may be needed to generalize the solution to other types of data.
Articulate the anticipated use and purpose of licensed software.
Be free from typographical and grammatical errors.

Participants should refer to Section 7 on general submission requirements for additional guidance and style guidelines.

Judges will score the Concept Papers against the weighted criteria outlined in the table below. Solutions will need to carefully consider trade-offs between criteria such as privacy, accuracy, and efficiency, and should take the weightings of the criteria into account when considering these trade-offs. Concept Papers must also demonstrate how acceptable levels of both privacy and accuracy will be achieved – one must not be completely traded off for the other (a fully privacy-preserving but totally inaccurate model is not of use to anyone). Proposals that do not sufficiently demonstrate how both privacy and accuracy will be achieved will not be eligible to score points in the remaining criteria.

Evaluation Criteria and Judging

Submissions to Phase 1 will undergo initial filtering to ensure they meet minimum criteria before they are reviewed and evaluated by not fewer than three members of the expert judging panel. These minimum criteria include:

Participant meets eligibility requirements;
All required points listed above are included; and
Proposed solutions are coherently presented and plausible.

Submissions that have passed the initial filtering step will be reviewed for technical merit by members of the expert judging panel and evaluated against the following evaluation criteria.

Topic	Specific Criteria	Weighting (/100)
Technical Understanding	Does the white paper demonstrate an understanding of the technical challenges that need to be overcome to deliver the solution?	10
Privacy	Has the white paper considered an appropriate range of potential privacy attacks, and how the solution will mitigate those?	25
Accuracy	Is it credible that the proposed solution could deliver a useful level of model accuracy?	10
Efficiency and scalability	Is it credible that the proposed solution can be run within a reasonable amount of computational resources (CPU, memory, communication, storage), when compared to a centralized approach for the same machine learning technique? Does the white paper propose an approach to scalability that is sufficiently convincing from a technical standpoint to justify further consideration, and reasonably likely to perform to an adequate standard when implemented? Solution scalability will be evaluated primarily for a) the number of connected banks / financial institutions involved, b) volume of transactions, and c) volume of customer data held by banks	15
Adaptability	Is the proposed solution potentially adaptable to different use cases and/or different machine learning techniques?	5
Feasibility	Is it likely that the solution can be meaningfully prototyped within the timeframe of the challenge?	10
Innovation	Does the white paper propose a solution with the potential to improve on the state of the art in privacy enhancing technology? Does the white paper demonstrate an understanding of any existing solutions or approaches and how their solution improves on or differs from those?	20
Usability and Explainability	Does the proposed solution show that it can be easily deployed and used in the real world and provide a means to preserve any explainability of model outputs?	5

Participant solutions may be grouped by similar technical approaches for ease of comparison. Reviewers’ scores will be aggregated for each submission. The specific scores will not be released publicly (unless required by law) or provided to the submitting participants. The submissions will be ranked based on their score and receive prize awards of 1st Prize, 2nd Prize, or 3rd Prize. Submissions that have the same scores will be presented to the entire panel for review and merit-based ranking. All Blue Team Participants whose concept papers meet the minimum criteria described above will win invitations to advance to Phase 2.

NOTE: As detailed herein, Phase 3 of the Challenge includes the disclosure of Blue Team Participants’ Phase 1 and 2 submissions, including Concept Papers, to Red Team Participants, and by submitting an entry to any phase of the Challenge, Blue Team Participants agree to such disclosure. Accordingly, Blue Team Participants may wish to take appropriate measures to protect any intellectual property contained within their Challenge submissions, and such protection should be sought prior to entering them into the Challenge.

3. Rules for Phase 2 - Solution Development

The objective of the Challenge is to develop a PPFL solution that is capable of training a machine learning model on the datasets, while providing a demonstrable level of privacy against potential threats.

This PPFL solution should aim to:

Provide robust privacy protection for the collaborating parties;
Minimize loss of accuracy in the model; and
Minimize additional computational resources (including CPU, memory, communication), as compared to a non-federated (centrally trained) approach, and a baseline federated learning approach.

In addition to this, the evaluation process will reward Participants who:

Show a high degree of novel innovation;
Demonstrate how their solution (or parts of it) could be applied or generalized to other use cases;
Effectively prove or demonstrate the privacy guarantees offered by their solution, in a form that is comprehensible to data owners or regulators; and
Consider how their solution, or a future version of it, could be applied in a production environment.

Participation

Phase 2 is open to Blue Team Participants who won invitations because their Concept Papers met the minimum criteria described in the Phase 1 Scoring Process. No further registration will be required for the invitees to advance to Phase 2; however, invited Blue Team Participants should make necessary updates to their registration via their DrivenData account to reflect any changes in team composition or contact information.

The Solution Development Phase will have two data use case tracks matching those previously from the Concept Paper Phase—Track A: Financial Crime Prevention and Track B: Pandemic Response and Forecasting. Teams can elect to participate in either or both tracks, with solutions that apply to one track individually or with a generalized solution. Each solution must correspond to a concept paper that met the requirements from the Concept Paper Phase.

Each data use case track has a prize category for top solutions. All solutions that apply to Track A or Track B are eligible for a prize within that respective track. Teams who submit solutions for Track A and Track B are eligible to win a prize from each track.

Additionally, participants can enter a third Generalized Solutions category of prizes for the best solutions that are shown to be generalized and applied to both use cases with minor adaptations. Teams submitting generalized solutions are eligible for Generalized Solution prizes in addition to being eligible for prizes from both Track A and Track B.

Process

Upon challenge launch, the Organizers intend to provide Blue Team Participant access to:

Detailed concept briefs and instructions for each Track; and
Robust development dataset(s) relevant to each Track that either comprises 3 nodes or may be partitioned into 3 or more nodes.

A range of support will be provided to Participants during the Challenge including opportunities to engage with data protection and public and private sector organizations operating in arenas relevant to the Challenge’s tracks, as well as ad-hoc technical support from the third-party data providers.

Participants are—subject to having detailed their solution in their concept paper proposal—free to determine the set of privacy technologies used in their solutions within the parameters of the challenge, with the exception of hardware enclaves and other specialized hardware-based solutions. For example, any de-identification techniques, differential privacy, cryptographic techniques, or any combination thereof may be used.

Participants will prepare and submit code for a centralized baseline model (MC) and a privacy-preserving federated model (MPF) along with documentation for how to train and make inferences from the models, including a list of dependencies (e.g., a requirements.txt), and their own experimental privacy, accuracy, and efficiency metrics for the two models via the DrivenData Challenge Website.

Additional Requirements

In order to be eligible for prizes, Participants must submit all source code used in their solution for review. If the solution includes licensed software (e.g., open-source software), Participants must include the full license agreements with the submission. Include licenses in a folder labeled “Licenses”. Within the same folder, include a text file labeled “README” that explains the purpose of each licensed software package as it is used in your solution. DrivenData/NIST are not required to contact Participants for additional instructions if the code does not run. If DrivenData is unable to run the solution, including any requirement to download a license, the submission may be rejected. Blue Team Participants shall contact DrivenData immediately with any concerns about this requirement.

Evaluation Criteria and Judging

Submitted solutions will be deployed and evaluated on a technical infrastructure hosted by DrivenData. This infrastructure will provide a common environment for the testing, evaluation, and benchmarking of the solutions.

NOTE: As set forth in these Rules, the Organizers reserve to themselves the right, in their sole discretion, to revise the terms governing any phases taking place after the effective date of any such revision. The Organizers anticipate that the diversity and complexity of solutions proposed in Phase 1 may influence the final design of Phase 2. The Organizers will utilize abstracts and monitor and assess Participant feedback and discussions on the Challenge Website forum and during the informational webinars with financial crime and pandemic forecasting use cases subject matter experts, the level of diversity of solutions, judges’ preliminary review of the Concept Papers and technical capabilities and resources available for evaluation in order to refine requirements. Any refinements will be made in consultation with the data subject matter experts, the judges panel, DrivenData, and the UK Challenge partners. Announcements regarding updated requirements, processes, and judging and evaluation criteria will be posted on the Competition Website’s forum not less than 10 days prior to the Phase launch and posted publicly to the Competition Website at the start of the Red Team Participant registration. Participants should follow discussions and updates on the website forum (https://petsprizechallenge.drivendata.org/) to stay current on all Challenge news and events. Initial evaluation of the developed solutions will be based on a combination of quantitative metrics, and qualitative assessments by judges according to the following criteria:

Topic	Factors	Weighting (/100)
Privacy	Information leakage possible from the PPFL model during training and inference, for a fixed level of model accuracy. Ability to clearly evidence privacy guarantees offered by solution in a form accessible to a regulator and/or data owner audience	35
Accuracy	Absolute accuracy of the PPFL model developed (e.g., F1 score). Comparative accuracy of PPFL model compared with a centralized model, for a fixed amount of information leakage	20
Efficiency and scalability	Time to train PPFL model and comparison with the centralized model. Network overhead of model training. Memory (and other temporary storage) overhead of model training. Ability to demonstrate scalability of the overall approach taken for additional nodes	20
Adaptability	Range of different use cases that the solution could potentially be applied to, beyond the scope of the current challenge	5
Usability and Explainability	Level of effort to translate the solution into one that could be successfully deployed in a real world environment. Extent and ease of which privacy parameters can be tuned. Ability to demonstrate that the solution implementation preserves any explainability of model outputs.	10
Innovation	Demonstrated advancement in the state-of-the-art of privacy technology, informed by above-described accuracy, privacy and efficiency factors	10

Phase 2 may include one round of interaction with the teams so that they can provide any clarification sought by the judges. Comparison bins may be created to compare similar solutions. Solutions should make a case for improvements against existing state-of-the-art solutions.

Track-specific prize finalists will be determined based on the factors above for solutions submitted to each data track. Generalizable solutions run on both Tracks will be evaluated by combining the factors above, along with dedicated judging of the generalizability of the solution as described on the Challenge Website. as follows:

Topic	Factors	Weighting (/100)
Performance on Track A	See table above	40
Performance on Track B	See table above	40
Generalizability	Assessment of the technical innovation contributing to generalizability of the solution to the two use cases, and potential for other use cases	20

As with Phase 1, the outcomes of trade-off considerations among criteria as made in the Concept Papers should be reflected in the developed solution. Solutions must meet a minimum threshold of privacy and accuracy, as assessed by judges and measured quantitatively, to be eligible to score points in the remaining criteria.

The top solutions ranked by points awarded for Track A, Track B, and Generalized Solutions will advance to Red Team evaluation as described below. The results of the Red Team evaluation will be used to finalize the scores above in order to determine final rankings.

Rules for Phase 3 - Red Teaming

The objective of Phase 3 is to test the strength of the privacy-preserving techniques of the developed PPFL modes through a series of privacy audits and attacks. Red Team Participants will plan and launch audits and attacks against the highest-scoring solutions developed during Phase 2.

Process

Red Team Participants will register through the DrivenData Challenge Website. Red Teams must be distinct from the Blue Teams they are testing. (https://petsprizechallenge.drivendata.org/) Red Team registration will remain open from November 10, 2022 to December 2, 2022.

After close of registration period, Red Team Participants will receive access to:

A selection of Concept Papers submitted during Phase 1; and
Evaluation environment specifications, development data, and documentation to allow Red Team Participants to prepare for attacks.

Red Team Participants may begin constructing their attack scenarios upon access to Concept Papers and the red-teaming materials. Red Team Participants shall develop attack scenarios based upon the range of solutions identified in the Concept Papers. The Red Team scenario development period will run concurrently with Blue Team solution development. The Organizers intend to provide informational sessions or webinars with subject matter data and practicing experts in the use case fields.

Following the testing and judges review of developed solutions, Red Team Participants will enter the Red Team Attack period. As many as 9 of the top ranked solutions by points awarded may advance. Ideally, this would include 3 of the top ranked solutions in each track and at least 3 generalizable solutions. The intent is to advance the most possible solutions while limiting overlap across the tracks and with the generalized solutions. Therefore, if a top generalized solution is also one of the top 3 in one of the data tracks, then the next best-scoring solution in that data track will also be included as a finalist in the red team evaluation. Likewise, if a generalized solution scores in the top three of both data tracks, the next highest ranking solution from both data tracks may be advanced at the judges’ discretion. However, if the judges determine that less than 9 solutions meet the minimum criteria for privacy and accuracy, then fewer than 9 solutions will be advanced to the red teaming phase.

The Organizers may assign any Red Team Participants any set of Blue Team finalists' solutions. The number of Blue Team solutions assigned to each Red Team will not exceed five. Red Team Participants may be assigned Blue Team solutions where different solutions use different techniques and/or be for different tracks (Financial Crime Prevention, Pandemic Forecasting, or Generalized). Therefore, Red Team Participants should be prepared to attack any and all Blue Team solutions regardless of those solutions' underlying privacy mechanisms or track designations.

Assigned Blue Team solutions will be made available to the assigned Red Team Participants at the start of the Attack Period in order to initiate their privacy attacks. Datasets for common baseline privacy attacks may also be provided to Red Teams. Using their attack scenarios Red Teams will attempt to exploit vulnerabilities in each of the Blue Team solutions’ privacy-preserving techniques. Red Team Participants will document exposed vulnerabilities found in a single cumulative attack report per solution and submit the reports no later than the end of the Red Team Attack period.

Submission Criteria

Red Team Participants must submit required evaluation materials, including attack reports, by the submission deadline providing evidence addressing the evaluation criteria below. Reports will include justification if the attacks were not successful (supporting the robustness of the solutions). Red Team Participants must evaluate all assigned solutions to be eligible for prizes.

See the DrivenData Challenge Website at the start of the contest for more information on submission criteria for the attack report. A complete submission (corresponding to one assigned blue team solution) should consist of the following items:

A technical report of no more than 4 pages, describing the the privacy claims tested by the attack(s), the attack(s) themselves, and experimental results showing their effectiveness
An appendix of no more than 8 pages, to provide additional details and proofs (note that reviewers will not be required to read the appendix)
The implementation used to generate the experimental results in the report
A code guide of no more than 1 page, describing how the code implements the attack(s) described in the report

A template or example of the attack report will also be provided for reference. (https://petsprizechallenge.drivendata.org/)

Red Team Contest Evaluation Criteria and Judging

Success of red team attacks will be assessed by a panel of judges using the factors below, in order to evaluate the empirical results reported, the approaches taken and the severity of the flaws the Red Teams are able to exploit. Red team results on the same solutions will be compared to provide individual scores. Each Red Team’s attacks across the solutions it was assigned to attack will be assessed for consistency, difficulty, novelty, rigor, and practicality to inform scores per the criteria below.

NOTE: As set forth in these Rules, the Organizers reserve to themselves the right, in their sole discretion, to revise the terms governing any phases taking place after the effective date of any such revision. The Organizers anticipate that the diversity and complexity of solutions proposed in Phase 1 will influence the final design of Phase 3. The Organizers will utilize Concept Papers, monitor and assess Participant feedback and discussions on the Challenge Website forum and during the informational webinars with financial crime and pandemic forecasting use cases subject matter experts, along with the level of diversity of solutions, and technical capabilities and resources available for evaluation in order to refine requirements. Any refinements will be made in consultation with the data subject matter experts, the judges panel, DrivenData, and the UK Challenge partners. Announcements regarding updated requirements, processes, and judging and evaluation criteria will be posted on the Competition Website’s forum not less than 10 days prior to the Phase launch and posted publicly to the Competition Website at the start of the Red Team Participant registration. Participants should follow discussions and updates on the website forum (https://petsprizechallenge.drivendata.org/) to stay current on all Challenge news and events.

Topic	Factors	Weighting (/100)
Effectiveness	How completely does the attack break or test the privacy claims made by the target solution? (e.g. what portion of user data is revealed, and how accurately is it reconstructed)?	40
Applicability / Threat Model	How realistic is the attack? How hard would it be to apply in a practical deployment?	30
Generality	Is the attack specific to the target solution, or does it generalize to other solutions?	20
Innovation	How significantly does the attack improve on the state-of-the-art?	10

5. Rules for Phase 4: Open Source

Up to 7 Blue Team Participants that have been awarded prizes in Phase 2 may be selected by the Organizers to move to Phase 4. Teams scoring highest at the end of the Solution Development Contest, that have been validated as privacy-preserving through Red Team testing, will be invited to upload their code to a NIST central repository or to otherwise make it available on a publicly accessible repository or website of their choice to provide unrestricted, irrevocable, royalty-free use of their solution to the community as a whole, with the goal of furthering the state of the art in privacy-enhancing technologies and federated learning. Qualifying Blue Team participants who choose to develop their submission in a manner allowing deposit in an open source repository and demonstrate that they have completed the upload to a repository within 21 days of award announcements of the Solution Development Contest, are eligible for an Open Source Prize. Up to 7 open source prizes in the amount of $20,000 will be awarded.

To participate following invitation, send an email to PrivacyEng@nist.gov with the following information:

“PETs Open Source Submission” as the subject line.
The name of your submission in the Solution Development Contest.
A URL to open source repository, which contains:
- Full solution source code,
- Complete documentation for the code,
- A complete written explanation of their solution, algorithm,
- Any additional data sources used other than the provided data set(s),
- A correct mathematical proof that their solution satisfies differential privacy,
- An open source license.

Open Source awards and cash prizes will be subject to verification that the submission has been posted in full.

6. Judging Panel

The submissions will be judged by a qualified panel of expert(s) selected by the Organizers and appointed by the Director of NIST. The panel consists of Department of Commerce, National Institute of Standards and Technology and non- Department of Commerce, National Institute Standards and Technology experts who will judge the submissions according to the judging criteria identified above in order to select winners. Judges will not (A) have personal or financial interests in, or be an employee, officer, director, or agent of any entity that is a registered participant in a contest; or (B) have a familial or financial relationship with an individual who is a registered participant.

The decisions of the Judging panel for the contest will be announced in accordance with the dates noted in these rules. The Organizers will not make participants’ evaluation results from the Judging panel available to participants or the public (unless required by law).

7. General Submission Requirements for All Contests

In order for submissions to be eligible for review, recognition, and award, Participants must meet the following requirements:

Deadline - The submission must be available for evaluation by the end date noted in these rules.
No Organizers’ logo – The submission must not use any of the Organizers’ logos or official seals and must not claim or imply the endorsement of any Organizer.
Each submission must be original, the work of the Participant, and must not infringe, misappropriate, or otherwise violate any intellectual property rights, privacy rights, or any other rights of any person or entity.
It is an express condition of submission and eligibility that each Participant warrants and represents that the Participant's submission is solely owned by the Participant, that the submission is wholly original with the participant, and that no other party has any ownership rights or ownership interest in the submission. The Participant must disclose if they are subject to any obligation to assign intellectual property rights to parties other than the Participant and/or must disclose if the Participant is licensing or, through any other legal instrument, utilizing intellectual property of another party.
Each Participant further represents and warrants to the Organizers’ that the submission, and any use thereof by the Organizers shall not: (i) be defamatory or libelous in any manner toward any person, (ii) constitute or result in any misappropriation or other violation of any person's publicity rights or right of privacy, or (iii) infringe, misappropriate, or otherwise violate any intellectual property rights, privacy rights, or any other rights of any person or entity.
Each submission must be in English.
Abstract, Concept Papers and Attack Papers must be submitted in PDF format using 11-point font for the main text with page sizes set to 8.5”x 11” with margins of 1 inch all around.

Submissions will not be accepted if they contain any matter that, in the sole discretion of the Organizers, is indecent, obscene, defamatory, libelous, in bad taste, or demonstrates a lack of respect for public morals or conduct, promotes discrimination in any form, or which adversely affects the reputation of the Organizers. The Organizers shall have the right to remove any content from the contest websites in its sole discretion at any time and for any reason, including, but not limited to, any online comment or posting related to the Challenge.

If the Organizers, in their discretion, find any submission to be unacceptable, then such submission shall be deemed disqualified.

8. No Endorsement

You understand and agree that nothing in these Rules grants you a right or license to use any names or logos of NIST, the Department of Commerce, NSF, or any other Federal department, agency, or entity, or to use any other intellectual property or proprietary rights of NIST, the Department of Commerce, NSF, any other Federal department, agency, or entity, or their employees or contractors.

Further, you understand and agree that nothing in these Rules grants you a right or license to use any names or logos of the Challenge’s third-party data providers, such as SWIFT and the University of Virginia (UVA). In no event is a Participant permitted to use the name(s), mark(s), or logo(s) of SWIFT or UVA in a manner that states or implies an endorsement from such entities or in the context of the advertisement or sale of any product or service.

9. Terms and Conditions

9.1 Verification of Winners

ALL POTENTIAL CHALLENGE WINNERS WILL BE SUBJECT TO VERIFICATION OF IDENTITY, QUALIFICATIONS AND ROLE IN THE CREATION OF THE SUBMISSION BY THE ORGANIZERS.

Participants must comply with all terms and conditions of the Official Rules. Winning a prize is contingent upon fulfilling all requirements contained herein. The potential winners will be notified by email, telephone, or mail after the date of winning results. Each potential winner of a monetary or non-monetary prize will be required to sign and return to the National Science Foundation, which will disburse the award on behalf of the Organizers, within ten (10) calendar days of the date the notice is sent, a Fast Start Direct Deposit Form (NSF Form 1379) and a Contestant Eligibility Verification form in order to claim the prize.

In the sole discretion of the Organizers, a potential winner will be deemed ineligible to win if: (i) the person/entity cannot be contacted; (ii) the person/entity fails to sign and return an Fast Start Direct Deposit Form (NSF Form 1379) and a Contestant Eligibility Verification form within the required time period; (iii) the prize or prize notification is returned as undeliverable; or (iv) the submission or person/entity is disqualified for any other reason. In the event that a potential or announced winner is found to be ineligible or is disqualified for any reason, the Organizers in their sole discretion, may award the prize to another Participant.

9.2 Eligibility Requirements:

A Participant (whether an individual, team, or private entity) must have registered to participate and complied with all of the requirements under Section 105 of the America COMPETES Reauthorization Act of 2010 (Pub. L. No. 111-358), as amended by Section 401 of the American Innovation and Competitiveness Act of 2016 (Pub. L. No. 114-329) and codified in 15 U.S.C. § 3719 (hereinafter “America COMPETES Act” or “15 U.S.C. § 3719”) as contained herein..

A Participant who registers or submits an entry (whether an individual, private entity, or team or anyone acting on behalf of a private entity or team) to participate in this Challenge represents that they have read, understood, and agree to all terms and conditions of these official rules.

To be eligible to win a cash prize, a Participant must register as an individual, private entity, or team, as defined below:

Individual: a person aged 18 or older at time of entry into Challenge and a U.S. citizen or permanent resident of the United States or its territories.
Private Entity: a company, institution, or other organization that is incorporated in and maintains a primary place of business in the United States or its territories.
Team: a group of individuals or a group of private entities with at least one member of the team meeting the definition for either Individual or Private Entity.
Participants not eligible for cash prizes: a Participant that enters the Challenge without the ability to claim a cash prize based on the eligibility requirements above. Participants not eligible for cash prizes must be 18 years or older at the time of entry into the Challenge and cannot be individuals on the denied persons list nor from entities or countries sanctioned by the U.S. Government.

For all Participants, general eligibility requirements include:

Participants may not be a Federal entity or Federal employee acting within the scope of their employment.
Participants may not be a NIST or NSF employee.
Non-NIST or non-NSF Federal employees acting in their personal capacities should consult with their respective agency ethics officials to determine whether their participation in this Challenge is permissible. A Participant shall not be deemed ineligible because the individual or entity used Federal facilities or consulted with Federal employees during this challenge if the Federal employees and facilities are made available to all contestants on an equitable basis.
Participants may not be a NIST or NSF contractor or associate, or private entity providing services to NIST or NSF acting within the scope of their contract, employment, or funding or acquisition agreement with NIST or NSF which would involve the use of NIST or NSF funding to support a Participant’s participation in the Challenge.
Participants may not be working with NIST or NSF as a CRADA collaborator if the statement of work of the CRADA includes the subject matter of the Challenge or if the CRADA provides the Participant with a competitive advantage.
Participants may not be individuals or private entities which provide program support services to NIST or NSF including strategic planning, project / program management, communications, reporting, program evaluation, or other similar services to NIST or NSF.
Participants may not be employees, contractors, or associates of the Challenge’s third-party data providers, including SWIFT and UVA. Individuals who are former NIST or NSF Federal Employees or NIST or NSF Associates are not eligible to enter as an individual or member of a team for 365 days from their last date of paid employment or association with NIST or NSF with the exception of individuals in a student internship, experiential learning, or similar temporary employment status.
Individuals who are former employees or associates of the Challenge’s third-party data providers, including SWIFT and UVA, are not eligible to enter as an individual or member of a team for 365 days from their last date of paid employment or association with such third-party data providers, with the exception of individuals in a student internship, experiential learning, or similar temporary employment status.
Any individuals (including an individual’s parent, spouse, or child) or private entities involved with the design, production, execution, distribution or evaluation of the Challenge are not eligible to enter as an individual or member of a team.
Employees of any official co-sponsoring entities are not eligible to enter.
A Participant (whether participating as an individual, private entity, or member of a team) must not have been convicted of a felony criminal violation under any Federal law within the preceding 24 months and must not have any unpaid Federal tax liability that has been assessed, for which all judicial and administrative remedies have been exhausted or have lapsed, and that is not being paid in a timely manner pursuant to an agreement with the authority responsible for collecting the tax liability.
Contestants must not be suspended, debarred, or otherwise excluded from doing business with the Federal Government.
Individuals currently receiving NIST or NSF funding through a grant or cooperative agreement are eligible to compete but may not utilize the NIST or NSF funding for competing in this Challenge.
Previous and current NIST or NSF prize challenge participants are eligible to enter.

Multiple individuals and/or legal entities may collaborate as a team to submit a single entry.

9.3 All Participants must designate an Official Representative

At the time of entry, all Participants must designate one individual to serve as their Official Representative, and one individual to serve as an alternate to assume the role and requirements of the Official Representative if, and only if, the first individual has resigned from their role as Official Representative or has failed to respond to Organizers’ communications for a period of 30 consecutive days. The Official Representative will be the only individual with the authority to officially interact and communicate with the Organizers regarding the contestant-created materials, completion of tasks as part of the Challenge, signing official documentation related to the Challenge, providing information to process prize payments, and any other administrative requests related to the Challenge.

The eligibility of a Participant is determined by the Participant’s registration status (individual, private entity or team) as defined above – the Official Representative does not determine the Participant’s eligibility.

For Individual Participants, by default the Official Representative must be the individual.
For Private Entity Participants, the Official Representative can be any individual designated by the Private Entity.
For Team Participants
- If the Team is comprised of Individuals, the Official Representative must be a team member who individually meets the eligibility requirements of an Individual Participants.
- If the Team is comprised of Private Entities, the Official Representative can be any individual designated by the Private Entity leading the team.
- If the Team is comprised of a mix of Individuals and Private Entities, the Official Representative, designated by the team, can be any qualified individual meeting the requirements of an Individual or member of a Private Entity.

The Official Representative will be authorized to interact with the Organizers and be responsible for meeting all entry, evaluation, and administrative requirements of the challenge.

If in the event a Participant decides to withdraw their submission from consideration, the Official Representative must notify the Organizers in writing of their decision.

If a Participant (whether an individual, private entity, or team) is selected as a prize winner, NSF will award a single dollar amount to the account named in the Fast Start Direct Deposit Form (NSF Form 1379) standard form by the Official Representative. The named account must belong to an individual or private entity as defined above in the eligibility requirements for Individual or Private Entity.

On behalf of the Team as defined above, the Official Representative shall be solely responsible for allocating any prize amount among the members of the Team. The Organizers will not arbitrate, intervene, advise on, or resolve any matters between team members.

9.4 Winners Not Eligible for Cash Prizes

Winners who are found to be ineligible for cash prizes may still be publicly recognized. In the event that the prize award normally allotted to the place or rank of an ineligible winner occurs, the cash prize will be awarded to the next eligible winner in the series or ranking. Throughout the challenge, winners who are ineligible for cash prizes will continue to have opportunities to have their work viewed and appreciated by stakeholders from industry, government and academic communities.

9.5 Submission & Intellectual Property Rights

Except as otherwise provided for herein, any applicable intellectual property rights to a submission will remain with the Participant. By participating in the Challenge, the Participant is not granting any rights in any patents, pending patent applications, or copyrights related to the technology described in the entry.

As detailed in these rules, Phase 3 of the Challenge includes the disclosure of Blue Team Participants’ Phase 1 and 2 submissions to Red Team Participants, and by submitting an entry to any phase of the Challenge, Blue Team Participants acknowledge and agree to such disclosure. Accordingly, Blue Team Participants may wish to take appropriate measures to protect any intellectual property contained within their submissions, and such protection should be sought prior to the entry of such submissions into the Challenge.

Rights Granted to Reviewing Parties

By submitting an entry to any phase of the Challenge, the Participant is granting to the Department of Commerce, the Organizers, the National Aeronautics and Space Administration (NASA), and any parties acting on their behalf, including DrivenData (Reviewing Parties) certain limited rights as set forth below.

The Participant grants to the Reviewing Parties the right to review the submission, to describe the submission in any materials created in connection with this Challenge, to screen and evaluate the submission, and to have the Judges, Challenge administrators, and the designees of any of them, review the submission. The Reviewing Parties will also have the right to publicize the Participant’s name and, as applicable, the names of Participant’s team members and/or organization that participated in the submission following the conclusion of the Challenge.
The Participant grants to the Reviewing Parties the right to name the Participant as a Participant and to use the Participant’s name and the name and/or logo of Participant’s company or institution (if the Challenge entry is from a company or institution) on the Challenge website and in materials from the Reviewing Parties that announce the winners, finalists, or Participants in the Challenge. Other than the uses set forth herein, the Participant does not grant to any of the Reviewing Parties any right to the Participant’s trademarks.
The Participant grants to the Reviewing Parties a royalty-free, non-exclusive, irrevocable, worldwide license to display publicly and use for promotional purposes the participant’s entry (“demonstration license”). This demonstration license includes posting or linking to the Participant’s entry on the Reviewing Parties’ websites, including the competition website and partner websites, and inclusion of the Participant’s submission in any other media, worldwide.

In the event of any conflict between the rights granted pursuant to these rules and the terms of use of any third-party administrator of the Challenge, such as DrivenData, these rules are controlling.

Rights in Data Generated by the Reviewing Parties

Any data generated in the evaluation of Participant’s submissions is the property of the Organizers. The Participants, reviewers, and Judges involved in the evaluation acknowledge and agree that the Organizers will own this evaluation data, and that the evaluation-created data can be used in future research and development activities. To the extent that the Organizers are able to do so, they will anonymize any such data for research purposes, whether it is being used internally or published, and will not include any Participant’s, reviewer’s, or Judge’s personally identifiable information. The Participant acknowledges and agrees that the data generated through the evaluation of submissions may be used by the Organizers for future research related to the Challenge.

Rights Granted to Red Team Participants

As detailed herein, Phase 3 of the Challenge includes the disclosure of Blue Team Participants’ Phase 1 and 2 submissions to Red Team Participants. By submitting an entry to the Challenge, a Blue Team Participant acknowledges and agrees to such disclosure and grants to the Red Team Participants, as applicable, certain limited rights as set forth below. Accordingly, a Blue Team Participant may wish to take appropriate measures to protect any intellectual property included within its submissions, and such protection should be sought prior to entry of such submissions into the Challenge.

The Blue Team Participant grants to Red Team Participants the right to review, screen, evaluate, test, exploit, and attack its submissions and otherwise use its submissions in the performance of the Challenge.

Rights in Materials Supplied by Third Parties

As described in the rules, the Challenge will include sample code and synthetic datasets provided by one or more third parties, such as SWIFT and the University of Virginia (UVA), for use by the Participants in the Challenge. By submitting an entry to any phase of the Challenge, the Participants agree that their use of such third-party materials will conform with requirements set forth below.

For those materials and data supplied by SWIFT, such as synthetic datasets, sample code on AI/machine learning (ML) models, methodology, and code, all intellectual rights contained therein shall remain vested in SWIFT exclusively. For those materials and data supplied by UVA, such as synthetic datasets, UVA shall retain ownership of any rights it may have therein.

To the extent that the Participant incorporates sample code for AI/ML model provided by SWIFT into Participant’s own software, models, solutions, or submissions, SWIFT grants to the Participant a free, perpetual, non-exclusive, non-transferable license to use such sample code for any purposes, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the software developed based on this code. The Participant must acknowledge SWIFT as the source of the sample code and the owner of any intellectual property rights contained therein, while the Participant will own all intellectual property rights contained in the Participant’s modifications, enhancements, or upgrades of the sample code.
The Participant’s use of the SWIFT synthetic datasets is limited to the Participant’s performance in the Challenge only, and any and all rights to such datasets remain vested in SWIFT. SWIFT grants to the Participants a free, non-transferable, non-exclusive license to use the datasets only for the purpose of the Challenge.
The Participant’s use of the UVA synthetic datasets is limited to the Participant’s performance in the Challenge only. Where use of the UVA synthetic datasets results in any written, visual, or oral academic publication or presentation by the Participant which concerns the Participant’s use of the UVA synthetic datasets in the Challenge, the Participant must acknowledge UVA as the source of the datasets and the owner of any intellectual property rights contained therein as follows: “This work is based on the dataset developed under NSF RAPID: COVID-19 Response Support: Building Synthetic Multi-scale Networks, NSF Expeditions in Computing and the University of Virginia Strategic Investment Fund (provided under the UK-US PETs Prize Challenges, available only to participants) by the University of Virginia Biocomplexity Institute and Initiative, Network Systems Science and Advanced Computing Division, and such dataset is used with express permission under the UK-US PETs Prize Challenge Rules.”
Where use of the UVA synthetic datasets results in any non-academic public announcement, disclosure, advertisement, or news release about the use of the UVA synthetic datasets in the Challenge, the Participant must acknowledge UVA as the source of the datasets and the owner of any intellectual property rights contained therein as follows: “Dataset provided by the University of Virginia Biocomplexity Institute and Initiative, Network Systems Science and Advanced Computing Division.”
At no time may a Participant in the Challenge share any of the third-party material, information, code, or datasets that have been provided for the purpose of the Challenge with any entity or individual outside of those working in the performance of the Challenge on the Participant’s behalf.
In any instance in which a Participant electronically transmits any third-party material, information, code, or datasets that have been provided for the purpose of the Challenge, the Participant must encrypt such material, information, code, or datasets during the transmission process.
Upon the conclusion of the Blue Team Participant’s participation in the Challenge, but no later than the date on which Phase 2 submission are due according to these rules, the Blue Team Participant shall ensure that all of the third-party material, information, code, and datasets that have been provided for the purpose of the Challenge are removed and deleted from all of Participant’s devices, networks, and storage services. Except as otherwise provided for herein, no Participant may retain any portion of the third-party material, information, code, or datasets that have been provided for the purpose of the Challenge beyond the Challenge’s conclusion.
Upon the conclusion of the Red Team Participant’s participation in the Challenge, but no later than the date on which the date on which the Red Team Participant’s Phase 3 attack reports are due, the Red Team Participant shall ensure that all of the third-party material, information, code, and datasets that have been provided for the purpose of the Challenge are removed and deleted from all of Participant’s devices, networks, and storage services. Except as otherwise provided for herein, no Participant may retain any portion of the third-party material, information, code, or datasets that have been provided for the purpose of the Challenge beyond the Challenge’s conclusion.
If notified by DrivenData and/or the Organizers at any point during the Challenge, the Participant may be required to remove and delete some or all of the third-party material, information, code, and datasets that have been provided for the purpose of Challenge from the Participant’s devices, networks, and storage services and to cease any and all use of the same. In such an instance, the Participant agrees to comply with this requirement within 10 business days of notification thereof.

Independent Creation of Intellectual Property

This Challenge in no way limits the ability of third-party providers of data, including SWIFT and UVA, to independently develop or acquire similar or competing machine learning or AI models, privacy-enhancing technologies, products, processes, or services.
The Participant acknowledges that such third-party providers of data, including SWIFT and UVA and their affiliates, may currently or in the future develop information internally or acquire from third parties certain intellectual property rights that are similar in substance to the solutions developed by the Participant during the Challenge.
SWIFT and UVA may use general knowledge acquired during the Challenge and other residual information for any purpose, including, without limitation, use in development, manufacture, promotion, sales, and maintenance of its products and services. Residual information means any information that is retained in the unaided memories of an employee, consultant, or contractor who receives access to the Challenge information. For the avoidance of doubt and except as otherwise provided for herein, the Organizers will not share any of the Participant’s submissions with either SWIFT or UVA without the prior, explicit consent of the Participant.

9.6 Warranties:

By submitting an entry, each Participant represents and warrants that: the Participant is the sole author and copyright owner of the submission, and the submission is an original work of the Participant; or, if the submission is not entirely owned by the participant, that the Participant has acquired through an appropriate license or other written agreement, sufficient rights to use and to authorize others, including the Reviewing Parties, to use the submission, as specified throughout the Official Rules. Each Participant further represents and warrants that the submission does not infringe upon any copyright or upon any other third-party rights of which the Participant is aware and that the submission is free of malware.

By submitting an entry, the Participant represents and warrants that all information submitted is true and complete to the best of the Participant’s knowledge, that the Participant has the right and authority to submit the entry on the Participant’s own behalf or on behalf of the persons and entities that the Participant specifies within the entry, and that the entry (both the information and materials submitted in the entry and the underlying technology/method/idea/treatment protocol/solution described in the entry):

is the Participant’s own original work, or is submitted by written permission with full and proper credit given within the entry;
does not contain confidential information or trade secrets (the Participant’s or anyone else’s);
does not knowingly violate or infringe upon the patent rights, industrial design rights, copyrights, trademarks, rights in technical data, rights of privacy, publicity or other intellectual property or other rights of any person or entity;
does not contain malicious code, such as viruses, malware, timebombs, cancelbots, worms, Trojan horses or other potentially harmful programs or other material or information;
does not and will not violate any applicable law, statute, ordinance, rule, or regulation, including, without limitation, United States export laws and regulations, including but not limited to, the International Traffic in Arms Regulations and the Department of Commerce Export Regulations; and
does not trigger any reporting or royalty or other obligation to any third party.

9.7 No Confidential Information

By making a submission to this Challenge, each Participant agrees that no part of its submission includes any trade secret information, ideas or products, including but not limited to information, ideas, or products within the scope of the Trade Secrets Act, 18 U.S.C. § 1905. All submissions to this Challenge are deemed non-confidential. Since the Organizers do not intend to receive or hold any submitted materials “in confidence” it is agreed that, with respect to the Participant’s entry, no confidential or fiduciary relationship or obligation of secrecy is established between NIST and the Participant, the Participant’s team, or the company or institution the Participant represents when submitting an entry, or any other person or entity associated with any part of the Participant’s entry.

9.8 No Obligation of Funds

This document outlines the Official Rules for the PETs Prize Challenge. Nothing within this document or in any documents supporting the Challenge shall be construed as obligating the Department of Commerce, NIST, NSF, or any other Federal agency or instrumentality to any expenditure of appropriated funds, or any obligation or expenditure of funds in excess of or in advance of available appropriations.

9.9 Challenge Subject to Applicable Law

All phases of the Challenge are subject to all applicable federal laws and regulations. Participation constitutes each Participant's full and unconditional agreement to these Official Rules and administrative decisions, which are final and binding in all matters related to the challenge. Eligibility for a prize award is contingent upon fulfilling all requirements set forth herein. This notice is not an obligation of funds; the final award of prizes is contingent upon the availability of appropriations.

Participation is subject to all U.S. federal, state and local laws and regulations, including, but not limited to, payment of tax obligations on the amount of monetary awards, if any. See section 9.12 below. Participants are responsible for checking applicable laws and regulations in their jurisdiction(s) before participating in the prize competition to ensure that their participation is legal. The Organizers shall not, by virtue of conducting this prize competition, be responsible for compliance by Participants in the prize competition with Federal Law including licensing, export control, and nonproliferation laws, and related regulations or with state laws. Individuals entering on behalf of or representing a company, institution or other legal entity are responsible for confirming that their entry does not violate any policies of that company, institution or legal entity.

9.10 Resolution of Disputes

The Organizers are solely responsible for administrative decisions, which are final and binding in all matters related to the challenge.

In the event of a dispute as to any registration, the authorized account holder of the email address used to register will be deemed to be the Participant. The "authorized account holder" is the natural person or legal entity assigned an email address by an Internet access provider, online service provider or other organization responsible for assigning email addresses for the domain associated with the submitted address. Participants and potential winners may be required to show proof of being the authorized account holder.

9.11 Publicity

The winners of these prizes (collectively, "Winners") will be featured on the Organizers', NASA's, and any other parties acting on their behalfs websites, newsletters, social media, and other outreach materials of the Organizers and any parties acting on their behalf.

Participation in the Challenge constitutes each winner's consent to the Organizers, its agents', and any Challenge Co-Sponsors’ use of each winner's name, likeness, photograph, voice, opinions, and/or hometown and state information for promotional purposes through any form of media, worldwide, without further permission, payment or consideration.

9.12 Payments

The prize competition winners will be paid prizes directly from the National Science Foundation. Prior to payment, winners will be required to verify eligibility. The verification process includes providing the full legal name, tax identification number or social security number, routing number and banking account to which the prize money can be deposited directly.

All cash prizes awarded to Participants are subject to tax liabilities, and NSF will not perform any tax withholding on behalf of the Participant when disbursing a case prize to that Participant.

9.13 Liability and Insurance

Any and all information provided by or obtained from the Federal Government and/or this Challenge’s third-party data providers, including SWIFT and UVA, is provided “AS-IS” and without any warranty or representation whatsoever, including but not limited to its suitability for any particular purpose. Upon registration, all Participants agree to assume and, thereby, have assumed any and all risks of injury or loss in connection with or in any way arising from participation in this challenge, development of any application or the use of any application by the Participants or any third-party. Upon registration, except in the case of willful misconduct, all Participants agree to and, thereby, do waive and release any and all claims or causes of action against the Federal Government and its officers, employees and agents for any and all injury and damage of any nature whatsoever (whether existing or thereafter arising, whether direct, indirect, or consequential and whether foreseeable or not), arising from their participation in the challenge, whether the claim or cause of action arises under contract or tort. Further, in no event shall SWIFT or UVA be held liable for any claim, damages, or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with the use of the provided sample code, datasets, or other materials in the Challenge. Upon registration, all Participants agree to and, thereby, shall indemnify and hold harmless the Federal Government and its officers, employees and agents for any and all injury and damage of any nature whatsoever (whether existing or thereafter arising, whether direct, indirect, or consequential and whether foreseeable or not), including but not limited to any damage that may result from a virus, malware, etc., to Government computer systems or data, or to the systems or data of end-users of the software and/or application(s) which results, in whole or in part, from the fault, negligence, or wrongful act or omission of the Participants or Participants' officers, employees or agents.

Participants are not required to obtain liability insurance for this Challenge.

9.14 Records Retention and FOIA

All materials submitted to the Organizers as part of a submission become official records and cannot be returned. As noted earlier (see section 9.7), by making a submission to this prize competition, each Participant agrees that no part of its submission includes any confidential or trade secret information, ideas or products, and all submissions shall be deemed non-proprietary. The Organizers reserve the right to reject and deem ineligible for consideration any information designated confidential, whether at the time of submission or otherwise. Submitters may be notified of any Freedom of Information Act (FOIA) requests for their submissions, to the extent the information is not already made public in this competition or otherwise, in accordance with the respective FOIA rules of the Organizers.

9.15 Privacy Advisory

The DrivenData, Inc website is hosted by a private entity. To review DrivenData’s privacy policy see https://www.drivendata.org/privacypolicy/. The solicitation and collection of Participants’ personal or individually identifiable information (e.g., registration) is subject to the hosts’ privacy and security policies and will not be disclosed or shared, except as provided in those policies and in these Challenge Rules. Challenge winners shall be required to provide additional personally identifiable information to NSF in order to collect an award.

9.16 508 Compliance

NIST and NSF consider universal accessibility to information a priority for all individuals, including individuals with disabilities. The Organizers are strongly committed to meeting their compliance obligations under Section 508 of the Rehabilitation Act of 1973, as amended, to ensure the accessibility of its programs and activities to individuals with disabilities. This obligation includes acquiring accessible electronic and information technology. When evaluating submissions for this challenge, the extent to which a submission complies with the requirements for accessible technology required by Section 508 will be considered.

9.17 General Conditions

This prize competition shall be performed in accordance with the America COMPETES Reauthorization Act of 2010, Pub. Law 111-358, title I, § 105(a), Jan. 4, 2011, codified at 15 U.S.C. § 3719 and amended by the American Innovation and Competitiveness Act of 2016 (Pub. L. No. 114-329) (hereinafter “America COMPETES Act”).

The Organizers reserve the right to cancel, suspend, and/or modify the challenge, or any part of it, if any fraud, technical failures, or any other factor beyond the Organizers’ reasonable control impairs the integrity or proper functioning of the challenge, as determined by the Organizers in their sole discretion. The Organizers shall not be responsible for, nor shall be required to count or consider, incomplete, late, misdirected, damaged, unlawful, or illicit submissions, including those secured through payment or achieved through automated means.

The Organizers reserve the right in their sole discretion to extend or modify the dates of the Challenge, and to change the terms set forth herein governing any phases taking place after the effective date of any such change. By entering, you agree to the terms set forth herein and to all decisions of the Organizers and/or all of their respective agents, which are final and binding in all respects.

ALL DECISIONS BY the Organizers (NIST and NSF) ARE FINAL AND BINDING IN ALL MATTERS RELATED TO THE CHALLENGE.

10. Point of Contact

For questions contact PrivacyEng@nist.gov

I understand that the DrivenData website is hosted by a private entity and is not a service of NIST or NSF. The solicitation and collection of personal or individually identifiable information is subject to the host’s privacy and security policies. Personal or individually identifiable information collected on this website may be shared with NSF for prize payment purposes only, and only for winners of the Challenge.

I acknowledge that I have read the DrivenData data privacy and terms of use policies and understand that all data, except that expressly marked for government use, is subject to these DrivenData policies. I further agree not to hold the Organizers or the U.S. Government liable for the protection, use, or retention of any information submitted through this website.

I acknowledge that any Contribution that I post to the site will be considered non-confidential and non-proprietary. By providing any User Contribution on the Website, I grant DrivenData and their affiliates and service providers, and each of my and their respective licensees, successors and assigns the right to use, reproduce, modify, perform, display, distribute and otherwise disclose to third parties any such material for any purpose, except as otherwise stated explicitly in these Challenge Rules.

By pressing I agree, I acknowledge that I have read, understood, and accept all of the PETs Prize Challenge Official Challenge Rules, terms and conditions expressed here.

U.S. PETs Prize Challenge: Phase 2 (Financial Crime–Federated)

Quick Facts

Participants

No. of Entries

Prize

Winner

Scarlet Pets

Navigation

Quick access

Data Track A: Financial Crime

Transforming Financial Crime Prevention

Background

Data Overview

Dataset 1: Transaction data held by SWIFT

Dataset 1 details

End-to-end transactions

Dataset 2: Account data held by banks

Dataset 2 details

Development and Evaluation Data

New Development Dataset

Threat Profile

Scope of sensitive data

Anomaly detection models

Prediction Target and Evaluation Metric

Partitioned Datasets for Federated Learning

Partitioning Overview

Partitioning Details

Key Task

Example Centralized Baseline

Good luck

On this page