Resources


Database Credentialed Access

EHR-DS-QA: A Synthetic QA Dataset Derived from Medical Discharge Summaries for Enhanced Medical Information Retrieval Systems

Konstantin Kotschenreuther

Dataset consisting of question and answer pairs synthetically generated from medical discharge summaries, designed to facilitate the training and development of large language models specifically tailored for healthcare applications

mimic-iv clinical question-answering medical discharge summaries large language models

Published: Jan. 11, 2024. Version: 1.0.0


Database Restricted Access

EchoNext: A Dataset for Detecting Echocardiogram-Confirmed Structural Heart Disease from ECGs

Pierre Elias, Joshua Finer

EchoNext is a curated dataset of electrocardiograms (ECGs) paired with echocardiogram-confirmed structural heart disease labels, designed to support the development and validation of machine learning models.

heart failure clinical decision support artificial intelligence health equity ecg machine learning deep learning electrocardiogram aortic stenosis cardiovascular screening valvular heart disease digital health ai model deployment left ventricular dysfunction ai in healthcare population health transthoracic echocardiogram structural heart disease

Published: Sept. 16, 2025. Version: 1.1.0


Database Contributor Review

COVID Data for Shared Learning (CDSL): A comprehensive, multimodal COVID-19 dataset from HM Hospitales

Álvaro Ritoré, Andreea M Oprescu, Alberto Estirado Bronchalo, et al.

COVID Data for Shared Learning (CDSL) is a multimodal database comprising de-identified structured health data and radiological images from 4,479 patients with COVID-19, as a comprehensive toolkit for developing predictive models.

covid-19 multimodal database radiological images open data healthcare data machine learning and ai

Published: Oct. 25, 2024. Version: 1.0.0


Database Credentialed Access

Comprehensive Polysomnography (CPS) Dataset: A Resource for Sleep-Related Arousal Research

Stefan Kraft, Andreas Theissler, Vera Wienhausen-Wilke, et al.

This dataset includes polysomnographic sleep recordings from a study on sleep-related arousal diagnostics, featuring raw and derived data channels, annotated event types, and questionnaire data.

polysomnography sleep disorders machine learning in healthcare sleep arousal diagnostics pulse wave analysis

Published: Sept. 18, 2024. Version: 1.0.0