Hateful Memes

The top submissions substantially exceed the performance of the baseline models we provided as part of the challenge. We hope others across the AI research community will continue to build on their work and be able to improve their own systems.
— Dr. Douwe Kiela, Facebook AI Research Scientist and author of Hateful Memes dataset

Why

At the massive scale of the internet, the task of detecting multimodal hate is both extremely important and particularly difficult. Relying on just text or just images to determine whether a meme is hateful is insufficient. Depending on the image, the text, or the way the two are combined, a meme can become a multimodal form of hate speech.

The Solution

The goal of this challenge was to develop multimodal machine learning models, which combine text and image feature information, to automatically classify memes as hateful or not. The team at Facebook AI created the Hateful Memes dataset to engage a broader community in the development of better multimodal models for problems like this. Participants were given a limited number of submissions to achieve the highest AUC ROC score (area under the receiver operating characteristic curve) when classifying an unseen test set of memes.
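To make the metric concrete, here is a minimal sketch of how AUC ROC and accuracy can be computed from a model's predicted scores. It is not the competition's scoring code: the function names, the 0.5 threshold, and the toy labels and scores are all illustrative assumptions.

```python
def auc_roc(labels, scores):
    """Chance that a random hateful example outscores a random non-hateful one.

    Ties count as half a win. This pairwise definition is equivalent to the
    area under the ROC curve.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC ROC needs at least one example of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels, scores, threshold=0.5):
    """Share of examples classified correctly after thresholding the scores."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Made-up data: 1 = hateful, 0 = not hateful (not taken from the dataset).
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.4, 0.1]

print(f"AUC ROC:  {auc_roc(labels, scores):.3f}")
print(f"Accuracy: {accuracy(labels, scores):.3f}")
```

AUC ROC depends only on how the scores rank the examples, so it does not require choosing a decision threshold, whereas accuracy does. That is one reason the two numbers can differ for the same model.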

The Results

Over the course of the competition, more than 3,000 participants entered the challenge, and the site drew visitors from more than 150 countries around the world! The top five submissions in Phase 2 achieved AUC ROC scores of 0.79–0.84 on the unseen test set, all significantly above the baselines reported in the paper accompanying the dataset release. These scores corresponded to accuracies of 70–75%, compared to an accuracy under 65% for the top benchmark from the paper.

All prize-winning solutions and accompanying academic write-ups from this competition are openly available for anyone to use and learn from. For more information see the links below.

RESULTS ANNOUNCEMENT + MEET THE WINNERS

WINNING MODELS ON GITHUB

NEURIPS COMPETITION REPORT
