Resources


Database Credentialed Access

MIMIC-IV-Ext-MDS-ED: Multimodal Decision Support in the Emergency Department - a Benchmark Dataset for Diagnoses and Deterioration Prediction in Emergency Medicine

Juan Miguel Lopez Alcaraz, Nils Strodthoff

MIMIC-IV-ext-MDS-ED proposes a dataset to benchmark multimodal decision support in the emergency department. It features multimodal input (including ECG waveforms) and a comprehensive set of prediction targets (diagnoses and deterioration prediction)

emergency department ecg benchmark diagnoses prediction deterioration prediction multimodal

Published: Sept. 12, 2024. Version: 1.0.0


Database Contributor Review

COVID Data for Shared Learning (CDSL): A comprehensive, multimodal COVID-19 dataset from HM Hospitales

Álvaro Ritoré, Andreea M Oprescu, Alberto Estirado Bronchalo, Miguel Ángel Armengol de la Hoz

COVID Data for Shared Learning (CDSL) is a multimodal database comprising de-identified structured health data and radiological images from 4,479 patients with COVID-19, as a comprehensive toolkit for developing predictive models.

covid-19 multimodal database radiological images open data healthcare data machine learning and ai

Published: Oct. 25, 2024. Version: 1.0.0


Database Credentialed Access

RadVLM Instruction Dataset

Nicolas Deperrois, Hidetoshi Matsuo, Samuel Ruiperez-Campillo, Moritz Vandenhirtz, Sonia Laguna, Alain Ryser, Koji Fujimoto, Mizuho Nishio, Thomas Sutter, Julia Vogt, Jonas Kluckert, Thomas Frauenfelder, Christian Bluethgen, Farhad Nooralahzadeh, Michael Krauthammer

This dataset is designed to construct RadVLM, a vision–language model for chest X-ray interpretation. It includes instruction data for tasks such as report generation, abnormality detection, and region grounding, and multitask conversation.

chest x-rays vision-language models medical ai

Published: Sept. 25, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext-Instr: A Dataset of 450K+ EHR-Grounded Instruction-Following Examples

Zhenbang Wu, Anant Dadu, Mike Nalls, Faraz Faghri, Jimeng Sun

This dataset contains 450K open-ended instruction-following examples generated using GPT-3.5 based on the MIMIC-IV EHR database.

large language models medical question answering instruction tuning

Published: Sept. 9, 2025. Version: 1.0.0


Database Contributor Review

COVID Data for Shared Learning (CDSL): A comprehensive, multimodal COVID-19 dataset from HM Hospitales

Álvaro Ritoré, Andreea M Oprescu, Alberto Estirado Bronchalo, Miguel Ángel Armengol de la Hoz

COVID Data for Shared Learning (CDSL) is a multimodal database comprising de-identified structured health data and radiological images from 4,479 patients with COVID-19, as a comprehensive toolkit for developing predictive models.

covid-19 multimodal database radiological images open data healthcare data machine learning and ai

Published: Oct. 25, 2024. Version: 1.0.0