Resources


Database Credentialed Access

CovIdentify Dataset

Peter Cho, Md Mobashir Hasan Shandhi, Ali Roghanizad, et al.

This dataset contains wearable device data from Fitbit, Garmin, and Apple Watch users. The data span April 2nd, 2020 to March 21st, 2021 and have been date-shifted. Test dates for each user have also been shifted by an appropriate amount.

Published: Nov. 25, 2024. Version: 1.0.0


Database Credentialed Access

ReFiSco: Report Fix and Score Dataset for Radiology Report Generation

Katherine Tian, Sina J Hartung, Andrew A Li, et al.

Preliminary human expert evaluation study on 60 MIMIC-CXR radiology reports

Published: Aug. 23, 2023. Version: 1.0


Challenge Credentialed Access

MIT Critical Datathon 2023: a MIMIC-IV Derived Dataset for Pulse Oximetry Correction Models

João Matos, Tristan Struja, David S Restrepo, et al.

A SaO2-SpO2 Pairs Dataset derived from MIMIC-IV

pulse oximetry health equity machine learning

Published: May 8, 2023. Version: 1.0.0


Software Credentialed Access

Code for generating the HAIM multimodal dataset of MIMIC-IV clinical data and x-rays

Luis R Soenksen, Yu Ma, Cynthia Zeng, et al.

Code for generating the HAIM multimodal dataset of MIMIC-IV clinical data and x-rays

database code multimodality

Published: Aug. 23, 2022. Version: 1.0.1


Database Open Access

Icentia11k Single Lead Continuous Raw Electrocardiogram Dataset

Shawn Tan, Satya Ortiz-Gagné, Nicolas Beaudoin-Gagnon, et al.

This is a dataset of continuous raw electrocardiogram (ECG) signals for representation learning containing 11 thousand patients and 2 billion labelled beats.

representation learning ecg

Published: April 12, 2022. Version: 1.0


Database Credentialed Access

MedNLI - A Natural Language Inference Dataset For The Clinical Domain

Chaitanya Shivade

This is a resource for training machine learning models for language inference in the medical domain.

natural language inference recognizing textual entailment

Published: Oct. 1, 2019. Version: 1.0.0


Database Open Access

Term-Preterm EHG DataSet with Tocogram

Electrohysterogram signals accompanied by a simultaneously recorded external tocogram.

neuroelectric pregnancy electrohysterogram

Published: Aug. 29, 2018. Version: 1.0.0


Database Open Access

BIDMC PPG and Respiration Dataset

ECG signals extracted from the MIMIC-II Matched Waveform Database, with breath annotations added manually by annotators using the impedance respiratory signal.

multiparameter photoplethysmogram ecg

Published: June 20, 2018. Version: 1.0.0


Database Open Access

neuroQWERTY MIT-CSXPD Dataset

Keystroke logs collected from 85 subjects with and without Parkinson's disease.

parkinsons neuroelectric brain

Published: Dec. 20, 2016. Version: 1.0.0


Database Open Access

VitalDB Arrhythmia Database: An Anesthesiologist-Validated Large-Scale Intraoperative Arrhythmia Dataset with Beat and Rhythm Labels

Dain Eun, Kayoung Shim, Hyunsoo Lee, et al.

We present a comprehensive intraoperative arrhythmia dataset with 734,528 seconds of ECG recordings from 482 patients, featuring over 660,000 beats annotated and validated by five anesthesiologists.

ppg vitaldb arterial waveform intraoperative dataset ecg

Published: Feb. 26, 2026. Version: 1.0.0