Resources


Database Restricted Access

Hospitalized patients with heart failure: integrating electronic healthcare records and external outcome data

Zhongheng Zhang, Linghong Cao, Yan Zhao, Ziyin Xu, Rangui Chen, Lukai Lv, Ping Xu

The new version adds beta blockers to the dat_md.csv file. The dataset comprises hospital-level data on patients admitted with heart failure to Zigong Fourth People’s Hospital, Sichuan, China, between 2016 and 2019.

china heart failure electronic health record

Published: May 22, 2022. Version: 1.3


Database Credentialed Access

RadGraph: Extracting Clinical Entities and Relations from Radiology Reports

Saahil Jain, Ashwin Agrawal, Adriel Saporta, Steven QH Truong, Du Nguyen Duong, Tan Bui, Pierre Chambon, Matthew Lungren, Andrew Ng, Curtis Langlotz, Pranav Rajpurkar

RadGraph is a dataset of entities and relations in full-text chest X-ray radiology reports, which are obtained using a novel information extraction (IE) schema to capture clinically relevant information in a radiology report.

entity and relation extraction graph multi-modal natural language processing radiology

Published: June 3, 2021. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext clinical decision support for referral, triage and diagnosis

Farieda Gaber, Altuna Akalin

This MIMIC-IV extended dataset is designed to evaluate and improve LLMs' ability to assist with triage, specialist referral, and diagnosis, using critical patient information such as history of present illness, vital signs, and other relevant data.

Published: Oct. 8, 2025. Version: 1.0.2


Database Open Access

MIMIC-IV demo data in the Medical Event Data Standard (MEDS)

Robin Philippus van de Water, Ethan Steinberg, Michael Wornow, Patrick Rockenschaub, Matthew McDermott

MIMIC-IV Clinical Database Demo in MEDS (Medical Event Data Standard) format.

ehr critical care electronic health record mimic machine learning meds medical event data standard

Published: Sept. 29, 2025. Version: 0.0.1


Database Credentialed Access

MIMIC-Ext-DrugDetection

Fabrice Harel-Canada, Nanyun Peng, David Goodman, Ruby Romero, Allan Nguyen, Brandon Moghanian, Anabel Salimian

This project offers a multilabel annotated dataset of clinical note sentences from MIMIC-III/IV for substance use detection. It supports NLP research for identifying various co-occurring drug use mentions in patient records.

ehr mimic-iv substance use clinical notes methamphetamine multi-label cocaine drug detection polysubstance use prescription opioid misuse cannabis benzodiazepine misuse injection drug use heroin mimic-iii

Published: Sept. 25, 2025. Version: 1.0.0


Database Restricted Access

Community-Acquired Pneumonia, Endotypes and Phenotypes (NACef): Prospective, observational cohort study of Translational Medicine

Natalia Sanabria-Herrera, Esteban Garcia Gallo, Luis Felipe Reyes

Community-Acquired Pneumonia (CAP) poses a significant health risk, linked to high in-hospital morbidity and mortality rates. The dataset includes clinical details of 768 CAP patients at Clinica Universidad de La Sabana, Colombia.

Published: Aug. 21, 2025. Version: 2.0.1


Database Credentialed Access

CXR-Align: A Benchmark for CXR-Report Alignment with Negations

Hanbin Ko

CXR-Align is a benchmark dataset created to evaluate vision-language models' capability to interpret negations in chest X-ray (CXR) reports, featuring systematically modified reports from MIMIC-CXR.

Published: Aug. 21, 2025. Version: 1.0.0


Database Credentialed Access

Immunosuppressive Condition and Medication Annotations for Admission Notes in the MIMIC-III Database

Vijeeth Guggilla, Melissa Bak, Mengjia Kang, Theresa Walunas, Catherine A Gao

This database contains 200 MIMIC-III admission notes with adjudicated labels for histories of various immunosuppressive conditions and usage of various immunosuppressive medications.

Published: Aug. 4, 2025. Version: 1.0.0


Database Open Access

tOLIet: Single-lead Thigh-based Electrocardiography Using Polimeric Dry Electrodes

Aline Santos Silva, Hugo Plácido da Silva, Miguel Correia, Andreia Cristina Gonçalves da Costa, Sérgio Laranjo

We present tOLIet, the first thigh ECG dataset with real signals captured by a toilet seat with electrodes. There are 149 recordings from 86 people, useful for research into cardiovascular assessment using "invisible" ECG.

Published: June 24, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext Cardiac Disease

Jiawei Cao, Sendong Zhao

This subset of the MIMIC-IV dataset includes the examination results and diagnostic information of 4,761 cardiac disease patients. The examination results for each patient are listed separately as evidence for the final diagnosis.

Published: May 6, 2025. Version: 1.0.0