Resources


Database Credentialed Access

Embedding-Based Representations for BRSET and mBRSET

David Restrepo, Chenwei Wu, Michael Morley, et al.

Precomputed image embeddings for the BRSET and mBRSET Brazilian retinal datasets to support efficient, secure, and equitable ophthalmic AI research, enabling tasks such as classification, clustering, multimodal modeling, and fairness analysis.

computer vision, ophthalmology, vector embeddings

Published: March 30, 2026. Version: 1.0.0


Database Credentialed Access

MIMIC-CXR-Ext-ILS: Lesion Segmentation Masks and Instruction-Answer Pairs for Chest X-rays

Geon Choi, Hangyul Yoon, Hyunju Shin, et al.

Instruction-guided lesion segmentation data for chest X-rays, including 1.1M instruction-answer pairs and 91K segmentation masks covering seven major lesion types.

chest x-ray, segmentation, text-guided segmentation, lesion segmentation

Published: March 25, 2026. Version: 1.0.0


Database Credentialed Access

Clinical Time Series Datasets for Trajectory Flow Matching Evaluation: ICU Sepsis, ICU Cardiac Arrest, and ICU GIB Cohorts

Yuan Pu, Dennis Shung, Alexander Tong, et al.

This resource comprises three clinical time series datasets used in the paper "Trajectory Flow Matching with Applications to Clinical Time Series Modeling" to evaluate models that handle irregularly sampled data in critical care settings.

clinical time series

Published: March 23, 2026. Version: 1.0.0


Database Credentialed Access

MIMIC-III-Ext-PPG: A PPG Benchmark Dataset for Cardiorespiratory Analysis

Mohammad Moulaeifard, Peter H Charlton, Nils Strodthoff

A large-scale, quality-assessed PPG benchmark dataset for cardiovascular and respiratory signal analysis, based on MIMIC-III.

blood pressure, critical care, photoplethysmography, signal quality, heart rhythm, respiratory rate, electrocardiogram

Published: March 17, 2026. Version: 1.1.0


Challenge Credentialed Access

SNOMED CT Entity Linking Challenge

Will Hardman, Mark Banks, Rory Davidson, et al.

272 discharge notes from the MIMIC-IV-Note dataset annotated with SNOMED CT concepts.

snomed, entity linking, clinical annotation

Published: Feb. 17, 2026. Version: 1.2.1


Database Credentialed Access

CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays

Hyungyung Lee, Geon Choi, Jung Oh Lee, et al.

CheXStruct is an automated pipeline that derives structured diagnostic reasoning steps from chest X-rays. CXReasonBench builds on this to evaluate whether models perform clinically grounded, multi-step reasoning beyond final diagnoses.

evaluation, chest x-ray, benchmark, structured chest x-ray qa, intermediate reasoning steps, structured reasoning, grounded reasoning, diagnostic reasoning, structured diagnostic pipeline

Published: Oct. 23, 2025. Version: 1.0.1


Database Credentialed Access

MIMIC-Ext-DrugDetection

Fabrice Harel-Canada, Nanyun Peng, David Goodman, et al.

This project offers a multi-label annotated dataset of clinical note sentences from MIMIC-III/IV for substance use detection. It supports NLP research on identifying mentions of co-occurring drug use in patient records.

ehr, mimic-iv, substance use, clinical notes, mimic-iii, methamphetamine, multi-label, cocaine, drug detection, polysubstance use, prescription opioid misuse, cannabis, benzodiazepine misuse, injection drug use, heroin

Published: Sept. 25, 2025. Version: 1.0.0


Database Credentialed Access

RadGraph-XL: A Large-Scale Expert-Annotated Dataset for Entity and Relation Extraction from Radiology Reports

Jean-Benoit Delbrouck

RadGraph-XL is a large, expert-annotated dataset of 2,300 radiology reports covering multiple modalities and anatomies. It enables accurate extraction of clinical entities and relations for downstream medical AI tasks.

Published: Sept. 12, 2025. Version: 1.0.0


Database Restricted Access

HYAMD High-Resolution Fundus Image Dataset for Age-Related Macular Degeneration (AMD) Diagnosis

Meishar Meisel, Benjamin Alfred Cohen, Meital Baskin, et al.

The HYAMD dataset comprises 1,560 high-resolution fundus images from 325 patients, intended for validating machine learning models for age-related macular degeneration (AMD) diagnosis.

Published: Sept. 9, 2025. Version: 1.0.0


Database Open Access

MIMIC-IV Clinical Database Demo on FHIR

Alex Bennett, Hannes Ulrich, Joshua Wiedekopf, et al.

The MIMIC-IV Clinical Database Demo on FHIR is a 100-patient subset of the MIMIC-IV v2.2 and MIMIC-IV-ED v2.2 clinical databases converted into the Fast Healthcare Interoperability Resources (FHIR) format.

fhir, electronic health records, mimic

Published: Aug. 27, 2025. Version: 2.1.0