Competition: Pushback to the Future: Predict Pushback Time at US Airports (Open Arena)

About the data

The nature of air travel is connecting far-flung places; the downside is that events in one location can have ripple effects throughout the system. Effective decision-making needs to consider a large number of factors, and the right decision might not always be the intuitive one. Furthermore, the drivers for these decisions are highly uncertain systems in their own right; weather is the major driver of delays in the National Airspace System (NAS), for example. Airlines and the Federal Aviation Administration (FAA) have many decision-support tools at their disposal to make the most informed decisions possible. The stakeholders for these tools are always seeking better algorithms, predictors, optimization approaches, and other elements to gain some predictability and save on costs.

The FAA works with airports, airlines, and other flight operators to collect raw data about all flights in the NAS and distribute that data via SWIM (System-Wide Information Management). As part of the Airspace Technology Demonstrations 2 (ATD-2) project, NASA developed Fuser to process this torrent of raw data and provide cleaned, real-time data on the status of individual flights nationwide, facilitating downstream air traffic management tools.

The next generation of these services will be housed in the Digital Information Platform (DIP). DIP will transform ATD-2 into a service-oriented architecture, where individual predictions are made by separate services, allowing them to be combined in new and innovative ways. Pushback prediction will be one of these services, providing a new prediction that could be used for ground traffic prediction, delay mitigation, flight trajectory optimization, and more.

Diagram of the Digital Information Platform: the transformation from Airspace Technology Demonstrations 2 (ATD-2), a monolithic service, to the Digital Information Platform (DIP), a service-oriented architecture that provides air traffic management predictions as microservices.

About Pushback

The ATD-2 project has already used Fuser to implement various data-driven machine learning systems to predict airport configuration, runway assignment, taxi time, and more. All of these predictions follow from where a flight begins: pushback. To predict everything else about a flight, a system must first predict its pushback time. However, pushback time depends on factors not directly observed by the FAA, such as passenger load, cargo load, and crew procedures. As a result, it is difficult to predict and contributes substantial uncertainty to all downstream predictions.

This data may hold the key to better predictability, but because it is proprietary to each individual flight operator, it is impossible to train a machine learning model using conventional methods. This is why, for this challenge, we are exploring the use of federated learning. In theory, federated learning could allow us to train better models, while preserving the privacy of flight operators’ data. This challenge will help us test that theory on real-world NAS data.

About Federated Learning

In federated learning, model training is decentralized and parties do not need to share any data. This process is generally broken into four steps:

1. A central server that coordinates the model training starts with an initial trained model.
2. The central server transmits the initial model to each of the data holders.
3. Each data holder trains the model locally on its own data.
4. The data holders send only their training results back to the central server, which securely aggregates these results into the final trained model.
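The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the challenge. It assumes a toy one-feature linear model, synthetic data, and federated averaging (a sample-weighted mean of client weights) as the aggregation rule. The text does not say which rule is used, and this sketch averages in the clear rather than aggregating securely. All function names are hypothetical.

```python
import random

def local_train(weights, data, lr=0.05, epochs=5):
    """Step 3: a data holder refines the shared model on its own data (plain SGD)."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    # Only the updated weights and a sample count leave the data holder.
    return (w, b), len(data)

def aggregate(results):
    """Step 4: sample-weighted average of client weights (assumed rule: FedAvg)."""
    total = sum(n for _, n in results)
    w = sum(wt[0] * n for wt, n in results) / total
    b = sum(wt[1] * n for wt, n in results) / total
    return (w, b)

def federated_round(global_weights, client_datasets):
    """Step 2 (send the model to every holder), then steps 3 and 4."""
    results = [local_train(global_weights, data) for data in client_datasets]
    return aggregate(results)

def make_client_data(rng, n, slope=2.0, intercept=1.0):
    """Synthetic private data for one holder: y = slope * x + intercept + noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        data.append((x, slope * x + intercept + rng.gauss(0, 0.01)))
    return data

rng = random.Random(0)
clients = [make_client_data(rng, n) for n in (20, 50, 30)]

weights = (0.0, 0.0)  # Step 1: the server's initial model
for _ in range(30):   # repeat the round until the model settles
    weights = federated_round(weights, clients)

print("learned slope and intercept:", weights)
```

The raw `(x, y)` pairs never leave `local_train`; the server sees only each holder's weights and sample count. In practice the rounds are usually repeated many times, as in the loop above, and a secure aggregation protocol would replace the plain average.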

Diagram of the basic training process under federated learning. Adapted from Wikimedia Commons under CC BY-SA.

About the NASA team

In recent years, the amount of data available in the NAS has exploded, as has the capability of data science algorithms to extract meaning and make decisions from large volumes of data. For example, NASA’s Airspace Technology Demonstrations project has made use of machine learning as part of the Integrated Arrival/Departure/Surface (IADS) system to optimize traffic at airports. Efforts such as this have shown that the main bottleneck is now the work of accessing, understanding, and consuming many disparate data feeds from many different sources. The goal of the Digital Information Platform is to provide a consistent, easy-to-use platform where a wide variety of NAS data is readily available. This will accelerate the transformation of the NAS by facilitating the development of state-of-the-art data-driven services for use by both traditional airlines and emergent operations like Unmanned Aerial Systems and Urban Air Mobility.

Additional resources

Introduction to Air Traffic Management, NASA Berkeley Aviation Data Science Seminars

Quick Facts

Winner: SKS_cube