Resources


Database Restricted Access

HYAMD High-Resolution Fundus Image Dataset for age related macular degeneration (AMD) Diagnosis

Meishar Meisel, Benjamin Alfred Cohen, Meital Baskin, Beatrice Tiosano, Joachim Behar, Eran Berkowitz

The HYAMD dataset comprises 1,560 high-resolution fundus images from 325 patients, aimed at validating machine learning models for age-related macular degeneration (AMD) diagnosis.

Published: Sept. 9, 2025. Version: 1.0.0


Database Open Access

MIMIC-IV Clinical Database Demo on FHIR

Alex Bennett, Hannes Ulrich, Joshua Wiedekopf, Piotr Szul, John Grimes, Alistair Johnson

The MIMIC-IV Clinical Database Demo on FHIR is a 100-patient subset of the MIMIC-IV v2.2 and MIMIC-IV-ED v2.2 clinical databases, converted into the Fast Healthcare Interoperability Resources (FHIR) format.

fhir electronic health records mimic

Published: Aug. 27, 2025. Version: 2.1.0


Database Restricted Access

EchoNext: A Dataset for Detecting Echocardiogram-Confirmed Structural Heart Disease from ECGs

Pierre Elias, Joshua Finer

EchoNext is a curated dataset of electrocardiograms (ECGs) paired with echocardiogram-confirmed structural heart disease labels, designed to support the development and validation of machine learning models.

clinical decision support heart failure artificial intelligence ecg health equity machine learning electrocardiogram deep learning ai model deployment population health transthoracic echocardiogram left ventricular dysfunction structural heart disease aortic stenosis cardiovascular screening digital health ai in healthcare valvular heart disease

Published: Aug. 5, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-Ext-CXR-QBA: A Structured, Tagged, and Localized Visual Question Answering Dataset with Question-Box-Answer Triplets and Scene Graphs for Chest X-ray Images

Philip Müller, Friederike Jungmann, Georgios Kaissis, Daniel Rueckert

We present a large-scale chest X-ray (CXR) visual question answering (VQA) dataset derived from MIMIC-CXR with 42M QA pairs, featuring hierarchical answers, bounding boxes, and structured tags. We generated the QA pairs using LLM-based extraction from radiology reports and localization models.

chest x-rays vqa localization scene graphs

Published: July 22, 2025. Version: 1.0.0


Challenge Credentialed Access

SNOMED CT Entity Linking Challenge

Will Hardman, Mark Banks, Rory Davidson, Donna Truran, Nindya Widita Ayuningtyas, Hoa Ngo, Alistair Johnson, Tom Pollard

272 discharge notes from the MIMIC-IV-Note dataset annotated with SNOMED CT concepts.

snomed entity linking clinical annotation

Published: July 22, 2025. Version: 1.1.0


Database Open Access

bigP3BCI: An Open, Diverse and Machine Learning Ready P300-based Brain-Computer Interface Dataset

Boyla Mainsah, Chance Fleeting, Thomas Balmat, Eric Sellers, Leslie Collins

A collection of data from P300-based brain-computer interface studies.

brain-computer interface electroencephalography ieee p2731 working group standard amyotrophic lateral sclerosis p300 speller p300 event related potential oddball paradigm error-related potential

Published: May 19, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext Cardiac Disease

Jiawei Cao, Sendong Zhao

This subset of the MIMIC-IV dataset includes examination results and diagnostic information for 4,761 cardiac disease patients. Each patient's examination results are listed separately as evidence for the final diagnosis.

Published: May 6, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext-CEKG: A Process-Oriented Dataset Derived from MIMIC-IV for Enhanced Clinical Insights

Milad Naeimaei Aali, Felix Mannhardt, Pieter Jelle Toussaint

The MIMIC-IV-Ext-CEKG dataset is designed for object-centric process mining in healthcare, specifically for building clinical event knowledge graphs for patients with multimorbidity, and also supports data mining and machine learning tasks.

mimic process mining multi entity process mining object centric event log clinical event knowledge graph

Published: April 8, 2025. Version: 1.0.0


Database Open Access

ReXErr-v1: Clinically Meaningful Chest X-Ray Report Errors Derived from MIMIC-CXR

Vishwanatha Rao, Serena Zhang, Julian Acosta, Subathra Adithan, Pranav Rajpurkar

Chest X-ray reports derived from the MIMIC-CXR database and containing synthetic errors. Errors were injected using LLMs and sampled across common human and AI model errors.

Published: March 19, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext Triage Instruction Corpus

Qingyang Shen, Quan Guo

The MIMIC-IV-Ext Triage Instruction Corpus includes 9,629 emergency department (ED) triage cases organized by the five-level Emergency Severity Index (ESI), intended to help LLMs improve triage accuracy. It provides CSV data, generation prompts, expert validation samples, and SQL quality-control (QC) scripts.

clinical decision support nlp large language models machine learning emergency severity index emergency triage

Published: March 4, 2025. Version: 1.0.0