SNOMED CT Entity Linking Benchmark

A benchmark for linking text in medical notes to entities in SNOMED Clinical Terms (SNOMED CT). #health

Data access instructions

Real-world patient data is highly sensitive and difficult to share safely. The data for this benchmark has been de-identified, but it still describes the care of vulnerable, living people.

Accessing SNOMED CT

To access SNOMED CT, you will need to create an account with SNOMED International and agree to the SNOMED CT license terms. SNOMED CT is available for free in many countries, but in some countries a license fee may apply.

  1. Go to the SNOMED International website
  2. Click on the "Get SNOMED CT" button
  3. Select your country from the list to see the specific instructions for obtaining SNOMED CT in your region
  4. Follow the instructions to create an account and agree to the license terms
  5. Once you have completed the registration process, you will be able to download the SNOMED CT files needed for the benchmark.

The version of SNOMED CT used in this benchmark is the International Edition released in November 2025. Participants must independently download this release; a copy will not be provided. See the official release notes for the SNOMED CT November 2025 International Edition.
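
As a rough way to confirm that the downloaded files come from the expected release, the sketch below scans an unpacked release folder and reads the date embedded in each file name. It assumes the files follow the usual RF2 naming pattern, in which a name ends in an eight-digit YYYYMMDD release date (for example `sct2_Concept_Snapshot_INT_20251101.txt`, where the exact day is a guess). The function names `release_dates` and `is_expected_release` are hypothetical helpers, not part of any official tooling.

```python
import re
from pathlib import Path

# Assumption: RF2 file names end in _YYYYMMDD.txt, where the date is the
# release date. The benchmark text only says "November 2025", so we match
# on the year and month and do not assume a specific day.
RF2_DATE = re.compile(r"_(\d{8})\.txt$")


def release_dates(root):
    """Collect the YYYYMMDD dates found in RF2 file names under root."""
    dates = set()
    for path in Path(root).rglob("*.txt"):
        match = RF2_DATE.search(path.name)
        if match:
            dates.add(match.group(1))
    return dates


def is_expected_release(root, year_month="202511"):
    """True if at least one dated file exists and all dates match year_month."""
    dates = release_dates(root)
    return bool(dates) and all(d.startswith(year_month) for d in dates)
```

Calling `is_expected_release("path/to/unpacked/release")` returns `False` both when the folder contains files from another release and when no dated files are found, so a `False` result is a prompt to inspect the download rather than proof of a wrong version.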

Accessing training data from PhysioNet

Training data for the benchmark is available via PhysioNet. To access it, each participant must register with PhysioNet under MIT’s agreement and complete an online training course. You will then have access to the world’s largest publicly available repository of patient data!

Detailed steps to register for data access are below:

  1. Create an account on the PhysioNet platform by visiting https://physionet.org/register/
  2. The online training course is provided by the CITI Program. Go to https://about.citiprogram.org/ and create an account there using the following steps:
  3. Click on “Register” in the upper right hand corner
  4. On the registration page, click the “Select your organization affiliation” button on the left. Enter your institution name or select “No Institution” if you are not affiliated with an institution
  5. Agree to the terms of service, the privacy policy, and affirm that you are an affiliate by checking the appropriate checkboxes
  6. In the CITI registration form, enter your information (name, email) in step 2, choose your username and password in step 3, and answer the questions in step 4
  7. On the subsequent “Your CE Credit Status” page, you may respond “NO” to the CE Credit Status prompt
  8. On the subsequent “Affiliate with an Institution” page, fill out all required information.
  9. Enter Department and Role as appropriate.
  10. On the “Select Curriculum” page, answer “Basic IRB Data or Specimens Only Research” to question 1 and fill out any other required fields.
  11. You should now see the “Data or Specimens Only Research” course in your active courses. Complete this course, which contains 9 segments. You need to achieve an overall score of 90% or higher.
  12. Once you have completed the course, click “View/Print” and save a copy of the Completion Report (not the certificate).
  13. Go to https://physionet.org/settings/training/ and upload the report, then click “Submit Training”.
  14. Go to https://physionet.org/credential-application/ and fill out the requested information to submit a credential application
  15. At this point the PhysioNet team will process your credentialing and training applications. The process is normally complete within 24-48 hours.
  16. Once you have received email notifications that both your “credentialing” and “training” applications have been accepted, there is one final step: complete the Data Use Agreement (DUA). To do this, log into your PhysioNet account and navigate to the project page for the benchmark data on PhysioNet.
  17. Scroll to the very bottom of the page and you will see a red box reading: “sign the data use agreement for the project”. Click that to agree.

At this point, you should be able to download the training notes and annotation files for this benchmark. If you encounter any issues, please post to the forum or send an email to benchmark@snomed.org.