Resources


Database Open Access

Icentia11k Single Lead Continuous Raw Electrocardiogram Dataset

Shawn Tan, Satya Ortiz-Gagné, Nicolas Beaudoin-Gagnon, Pierre Fecteau, Aaron Courville, Yoshua Bengio, Joseph Paul Cohen

This is a dataset of continuous raw electrocardiogram (ECG) signals for representation learning containing 11 thousand patients and 2 billion labelled beats.

representation learning ecg

Published: April 12, 2022. Version: 1.0

Visualize waveforms

Database Credentialed Access

MedNLI - A Natural Language Inference Dataset For The Clinical Domain

Chaitanya Shivade

This is a resource for training machine learning models for language inference in the medical domain.

natural language inference recognizing textual entailment

Published: Oct. 1, 2019. Version: 1.0.0


Database Open Access

Term-Preterm EHG DataSet with Tocogram

Electrohysterogram signals accompanied by a simultaneously recorded external tocogram.

neuroelectric pregnancy electrohysterogram

Published: Aug. 29, 2018. Version: 1.0.0

Visualize waveforms

Database Open Access

BIDMC PPG and Respiration Dataset

ECG signals extracted from the MIMIC-II Matched Waveform Database, with manual breath annotations added by annotators using impedance respiratory signal.

multiparameter photoplethysmogram ecg

Published: June 20, 2018. Version: 1.0.0

Visualize waveforms

Database Open Access

neuroQWERTY MIT-CSXPD Dataset

Keystroke logs collected from 85 subjects with and without Parkinson's disease.

parkinsons neuroelectric brain

Published: Dec. 20, 2016. Version: 1.0.0


Database Contributor Review

COVID Data for Shared Learning (CDSL): A comprehensive, multimodal COVID-19 dataset from HM Hospitales

Álvaro Ritoré, Andreea M Oprescu, Alberto Estirado Bronchalo, Miguel Ángel Armengol de la Hoz

COVID Data for Shared Learning (CDSL) is a multimodal database comprising de-identified structured health data and radiological images from 4,479 patients with COVID-19, as a comprehensive toolkit for developing predictive models.

covid-19 multimodal database radiological images open data healthcare data machine learning and ai

Published: Oct. 25, 2024. Version: 1.0.0


Database Credentialed Access

ReXPref-Prior: A MIMIC-CXR Preference Dataset for Reducing Hallucinated Prior Exams in Radiology Report Generation

Oishi Banerjee, Hong-Yu Zhou, Subathra Adithan, Stephen Kwak, Kay Wu, Pranav Rajpurkar

We propose ReXPref-Prior, an adapted version of MIMIC-CXR where GPT-4 has removed references to prior exams from both findings and impression sections of chest X-ray reports.

chest x-rays reinforcement learning hallucination

Published: Aug. 14, 2024. Version: 1.0.0


Database Open Access

Respiratory and heart rate monitoring dataset from aeration study

Ella Frances Sophia Guy, Isaac Flett, Jaimey Anne Clifton, Trudy Caljé-van der Klei, Rongqing Chen, Jennifer Knopp, Knut Moeller, James Geoffrey Chase

Respiratory and cardiovascular data collected from 20 subjects. Pressure, flow, aeration, and heart-rate data were collected during trials which included resting breathing, CPAP at varied PEEP settings, breath-holds, and forced expiratory manoeuvres.

Published: March 20, 2024. Version: 1.0.0


Database Credentialed Access

Annotation dataset of social determinants of health from MIMIC-III Clinical Care Database

Marco Guevara, Shan Chen, Spencer Thomas, Danielle Bitterman

Annotation dataset of social determinants of health from MIMC-III Clinical Care Database notes.

natural language processing social determinants of health

Published: Jan. 24, 2024. Version: 1.0.1


Database Credentialed Access

Generalized Image Embeddings for the MIMIC Chest X-Ray dataset

Andrew Sellergren, Atilla Kiraly, Tom Pollard, Wei-Hung Weng, Yun Liu, Akib Uddin, Christina Chen

This database contains compact information-rich embeddings of the MIMIC-CXR Database v2.0.0 using the CXR Foundation API v1.0.

Published: Feb. 22, 2023. Version: 1.0