What's Up, Docs? Document Summarization with LLMs

Looking for a great way to start working with LLMs? See if you can summarize research papers from an open archive of the social sciences. #development #science

beginner practice

Overview

Welcome! We're happy you're here!

Can you write abstracts for academic social science papers using a large language model that runs on your computer? OK, but can you do it well?

Scientific papers describing research methodology and findings are the gold standard for sharing scientific work. These papers typically start with an abstract that summarizes the entire paper in a single paragraph. Your goal in this practice competition is to generate these abstracts from SocArXiv papers using large language models, or LLMs.

While this task isn't easy to do well, it's easy to start! This is a practice competition designed to be accessible to participants at all levels. That makes it a great place to dive into the world of data science competitions or explore LLMs for the first time.


Competition End Date:

April 1, 2026, 11:59 p.m. UTC

This competition is for learning and exploring, so the deadline may be extended in the future.

How to compete

  1. Click the big "Compete!" button at the top of the sidebar to enroll in the competition.
  2. Get familiar with the problem through the Overview (above) and the Problem Description. You might also want to reference additional resources available on the About page.
  3. Download the data from the Data download tab.
  4. Create your own model. The benchmark blog post is a good place to start.
  5. Use your model to generate predictions that match the submission format (available here).
  6. Click "Submit" in the sidebar, and then "Make new submission". You're in!
  7. Bonus: share your work! Click the "+" icon on the Submissions page and add a link to your approach.
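
As a rough illustration of steps 4 and 5, here is a minimal sketch of a placeholder "model" and a submission writer. It is not the benchmark from the blog post and it does not use an LLM: it takes the first few sentences of each paper as a naive stand-in summary, so you can check your pipeline end to end before swapping in a real model. The function names, the column names `paper_id` and `summary`, and the CSV layout are all assumptions made for illustration. Check the actual submission format before submitting.

```python
import csv
import io
import re
from typing import Mapping, TextIO


def lead_summary(text: str, max_sentences: int = 3) -> str:
    """Return the first few sentences of a paper as a naive stand-in summary."""
    # Collapse whitespace so stray line breaks do not split sentences.
    flat = " ".join(text.split())
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", flat)
    return " ".join(sentences[:max_sentences])


def write_submission(papers: Mapping[str, str], out: TextIO) -> None:
    """Write one row per paper. Column names here are hypothetical."""
    writer = csv.writer(out)
    writer.writerow(["paper_id", "summary"])
    for paper_id, text in papers.items():
        writer.writerow([paper_id, lead_summary(text)])


if __name__ == "__main__":
    # Toy input; real paper text would come from the downloaded data.
    papers = {"example-1": "We study X. We survey Y. Results show Z. More follows."}
    buffer = io.StringIO()
    write_submission(papers, buffer)
    print(buffer.getvalue())
```

To move from this placeholder to an actual entry, replace the body of `lead_summary` with a call to the language model of your choice and keep the writer unchanged.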

Data were derived from SocArXiv, a repository for scientific documents in the social sciences, hosted on the OSF Preprints platform.

Image courtesy of Mauricio Mendez, licensed under CC BY-SA 2.0.