Hateful Memes: Phase 1

Detecting hateful content presents a unique challenge in memes, where multiple data modalities need to be analyzed together. Facebook is calling on researchers around the world to help identify which memes contain hate speech. #society

Development arena
Completed Apr 2021
3,929 joined


Overview

Your goal is to create an algorithm that identifies multimodal hate speech in internet memes.

Take an image, add some text: you've got a meme. Internet memes are often harmless and sometimes hilarious. However, certain types of images, text, or combinations of the two can turn a seemingly non-hateful meme into a multimodal type of hate speech: a hateful meme.

At the massive scale of the internet, the task of detecting multimodal hate is both extremely important and particularly difficult. As the illustrative memes above show, relying on just text or just images to determine whether or not a meme is hateful is insufficient. Your algorithm needs to be multimodal to excel at this task!
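The point about unimodal models can be illustrated with a minimal sketch. The four-example dataset below is entirely hypothetical (the identifiers `img_a`, `txt_x`, and so on are placeholders, not challenge data), and it assumes each image and each caption appears once with each label. Under that assumption, even the best possible classifier that sees only one modality is right half the time, while one that sees the image-caption pair can separate every example.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (image_id, caption_id, label); 1 = hateful, 0 = not.
# Each image and each caption occurs once with each label, so neither
# modality alone carries any information about the label.
MEMES = [
    ("img_a", "txt_x", 1),
    ("img_a", "txt_y", 0),
    ("img_b", "txt_x", 0),
    ("img_b", "txt_y", 1),
]


def best_accuracy(key_fn, data):
    """Accuracy of the best classifier that only sees key_fn(image, text)."""
    groups = defaultdict(Counter)
    for image, text, label in data:
        groups[key_fn(image, text)][label] += 1
    # For each distinct input, the best possible guess is the majority label.
    correct = sum(max(counts.values()) for counts in groups.values())
    return correct / len(data)


image_only = best_accuracy(lambda image, text: image, MEMES)
text_only = best_accuracy(lambda image, text: text, MEMES)
multimodal = best_accuracy(lambda image, text: (image, text), MEMES)

print(image_only, text_only, multimodal)  # 0.5 0.5 1.0
```

Real memes are far noisier than this toy case, so the sketch shows only why combining modalities can help, not how well any particular model will score.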

The team at Facebook AI has decided to take the challenge head on, and they want your help! The team defines hate speech as:

A direct or indirect attack on people based on characteristics, including ethnicity, race, nationality, immigration status, religion, caste, sex, gender identity, sexual orientation, and disability or disease. We define attack as violent or dehumanizing (comparing people to non-human things, e.g. animals) speech, statements of inferiority, and calls for exclusion or segregation. Mocking hate crime is also considered hate speech.

This definition mirrors the community standards on hate speech employed by Facebook, and is intended to provide an actionable classification label: if something is hate speech according to this definition, it should be taken down; if not, even if it is distasteful or objectionable, it is allowed to stay.

Starter code from Facebook AI's MMF is also available on GitHub to help you get started.

This challenge provides an opportunity to help advance multimodal machine learning systems and push forward state-of-the-art performance on a novel data set. We can't wait to see what you come up with!

Note on Annotator Accuracy

As is to be expected with a dataset of this size and nature, some of the examples in the training set have been misclassified. We are not claiming that our dataset labels are completely accurate, or even that all annotators would agree on a particular label. Misclassifications, although possible, should be very rare in the dev and seen test sets, and we will take extra care with the unseen test set.

As a reminder, the annotations for this dataset were not collected using Facebook annotators, and we did not employ Facebook’s hate speech policy. As such, the dataset labels do not in any way reflect Facebook’s official stance on this matter.

Why detecting hateful memes requires a multimodal approach.


There will be two phases to this challenge:

  • Phase 1: Data exploration and model building (May - Oct 2020): Participants can get access to the research dataset. Submissions may be made to the public leaderboard. These scores will not determine final rankings for prizes.

  • Phase 2: Submissions open against final test set (12:00 am UTC on Oct. 1, 2020 - 11:59 pm UTC on Oct. 31, 2020): Participants will have the opportunity to make three submissions against a new, unseen test set. All teams must be formed prior to the beginning of Phase 2. Performance against this new test set will be used to determine prizes.

Note: Pre-trained models and external data are explicitly allowed in this competition as long as the participant has a valid license for use in accordance with the Competition Rules. Top-performing participants will be required to certify in writing that they have permission to use all external data used to develop their submissions, and may be required to provide documentation demonstrating such permission to the satisfaction of the competition sponsor.

For more information see the Problem Description. Upon entry you can also revisit the Competition Rules for full details.


Competition End Date:

April 30, 2021, 11:59 p.m. UTC

Place   Prize Amount
1st     $50,000
2nd     $25,000
3rd     $10,000
4th     $8,000
5th     $7,000


Prizes generously supplied by Facebook AI.


NO PURCHASE NECESSARY TO ENTER/WIN. A PURCHASE WILL NOT INCREASE YOUR CHANCES OF WINNING. The Competition consists of two (2) Phases, with winners determined based upon Submissions using the Phase II dataset. The start and end dates and times for each Phase will be set forth on this Competition Website. Open to legal residents of the Territory, 18+ & age of majority. "Territory" means any country, state, or province where the laws of the US or local law do not prohibit participating or receiving a prize in the Challenge and excludes any area or country designated by the United States Treasury's Office of Foreign Assets Control (e.g. Cuba, Sudan, Crimea, Iran, North Korea, Syria, Venezuela). Any Participant use of External Data must be pursuant to a valid license. Void outside the Territory and where prohibited by law. Participation subject to official Competition Rules. Prizes: $50,000 USD (1st), $25,000 (2nd), $10,000 USD (3rd), $8,000 USD (4th), $7,000 USD (5th). See Official Rules and Competition Website for submission requirements, evaluation metrics and full details. Sponsor: Facebook, Inc., 1 Hacker Way, Menlo Park, CA 94025 USA.