Resources


Database Restricted Access

MIMIC-IV-Ext-DiReCT

Bowen Wang, Jiuyang Chang, Yiming Qian

A diagnostic reasoning dataset designed to evaluate the performance of large language models in aligning with human doctors when making diagnoses from clinical notes.

Published: Jan. 21, 2025. Version: 1.0.0


Database Credentialed Access

CAD-Chest: Comprehensive Annotation of Diseases based on MIMIC-CXR Radiology Report

Mengliang Zhang, Xinyue Hu, Lin Gu, Tatsuya Harada, Kazuma Kobayashi, Ronald Summers, Yingying Zhu

The CAD-Chest dataset provides comprehensive disease annotations, including severity, uncertainty, and location, based on the MIMIC-CXR radiology reports.

chest x-ray disease label

Published: Dec. 8, 2023. Version: 1.0


Challenge Credentialed Access

SNOMED CT Entity Linking Challenge

Will Hardman, Mark Banks, Rory Davidson, Donna Truran, Nindya Widita Ayuningtyas, Hoa Ngo, Alistair Johnson, Tom Pollard

272 discharge notes from the MIMIC-IV-Note dataset annotated with SNOMED CT concepts.

snomed entity linking clinical annotation

Published: July 22, 2025. Version: 1.1.0


Challenge Credentialed Access

ShARe/CLEF eHealth Evaluation Lab 2014 (Task 2): Disorder Attributes in Clinical Reports

Danielle Mowery

The ShARe/CLEF eHealth 2014 Challenge (Task 2) on Disorder Attributes in Clinical Reports

Published: Nov. 1, 2013. Version: 1.0


Database Open Access

PTB-XL, a large publicly available electrocardiography dataset

Patrick Wagner, Nils Strodthoff, Ralf-Dieter Bousseljot, Wojciech Samek, Tobias Schaeffter

The PTB-XL ECG dataset is a large dataset of 21801 clinical 12-lead ECGs, each 10 seconds long, from 18869 patients. The raw signal data has been annotated by up to two cardiologists with 71 different ECG statements and is supplemented by rich metadata.

ptb-xl ptb ecg electrocardiography

Published: Nov. 9, 2022. Version: 1.0.3


Database Credentialed Access

MedDec: Medical Decisions for Discharge Summaries in the MIMIC-III Database

Mohamed Elgaar, Jiali Cheng, Nidhi Vakil, Hadi Amiri, Leo Anthony Celi

Annotations of ten types of medical decisions from discharge summaries in the MIMIC-III database.

natural language processing medical decisions span classification discharge summary mimic

Published: Oct. 16, 2024. Version: 1.0.0


Database Credentialed Access

Annotated Social Determinants of Health Dataset for Adverse Pregnancy Outcomes

Nidhi Soley, MaKhaila Bentil, Jash Shah, Masoud Rouhizadeh, Casey Taylor

This project provides a manually annotated dataset of social determinants of health—social support, occupation, and substance use—linked to pregnancy outcomes, extracted from MIMIC-III and MIMIC-IV discharge summary notes.

Published: Aug. 4, 2025. Version: 1.0.0


Database Credentialed Access

Annotated MIMIC-IV discharge summaries for a study on deidentification of names

Shulammite Lim, Yuxin Xiao, Alistair Johnson, Dana Moukheiber, Lama Moukheiber, Mira Moukheiber, Marzyeh Ghassemi, Tom Pollard

Annotated MIMIC-IV discharge summaries used to explore deidentification of names

deidentification fairness

Published: July 5, 2023. Version: 1.0


Database Credentialed Access

RadGraph: Extracting Clinical Entities and Relations from Radiology Reports

Saahil Jain, Ashwin Agrawal, Adriel Saporta, Steven QH Truong, Du Nguyen Duong, Tan Bui, Pierre Chambon, Matthew Lungren, Andrew Ng, Curtis Langlotz, Pranav Rajpurkar

RadGraph is a dataset of entities and relations in full-text chest X-ray radiology reports, which are obtained using a novel information extraction (IE) schema to capture clinically relevant information in a radiology report.

entity and relation extraction graph multi-modal natural language processing radiology

Published: June 3, 2021. Version: 1.0.0

