Resources


Database Credentialed Access

NCH Sleep DataBank: A Large Collection of Real-world Pediatric Sleep Studies with Longitudinal Clinical Data

Harlin Lee, Boyue Li, Yungui Huang, et al.

The NCH Sleep DataBank includes 3,984 pediatric sleep studies on 3,673 unique patients conducted at Nationwide Children's Hospital between 2017 and 2019. It contains polysomnography (PSG), clinical annotations, and longitudinal clinical data.

eeg ehr pediatrics polysomnography clinical decision support sleep study ecg electronic health records sleep disorders

Published: Oct. 27, 2021. Version: 3.1.0


Database Credentialed Access

Annotation dataset of problematic opioid use and related contexts from MIMIC-III Critical Care Database discharge summaries

Melissa Poulsen, Vanessa Troiani, Philip Freda, et al.

This database contains an annotated corpus drawn from the MIMIC-III Critical Care Database, produced by a study that aimed to develop and apply an annotation schema to characterize opioid use disorder and related contextual factors.

opioid use disorder substance use natural language processing clinical notes

Published: Feb. 8, 2023. Version: 1.0.0


Database Credentialed Access

MIMIC-III and eICU-CRD: Feature Representation by FIDDLE Preprocessing

Shengpu Tang, Parmida Davarmanesh, Yanmeng Song, et al.

Features and labels from MIMIC-III and eICU-CRD produced by FIDDLE, an EHR preprocessing pipeline.

preprocessing electronic health record machine learning

Published: April 28, 2021. Version: 1.0.0


Database Credentialed Access

Annotation dataset of social determinants of health from MIMIC-III Clinical Care Database

Marco Guevara, Shan Chen, Spencer Thomas, et al.

Annotation dataset of social determinants of health from MIMIC-III Clinical Care Database notes.

natural language processing social determinants of health

Published: Jan. 24, 2024. Version: 1.0.1


Database Credentialed Access

National Institutes of Health Stroke Scale (NIHSS) Annotations for the MIMIC-III Database

Jiayang Wang, Xiaoshuo Huang, Lin Yang, et al.

A dataset of annotated NIHSS scale items and corresponding scores from stroke patients' discharge summaries in MIMIC-III.

Published: Jan. 25, 2021. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext-CLIF: MIMIC-IV in the Common Longitudinal ICU data Format (CLIF)

Zewei Liao, Shan Guleria, Kevin Smith, et al.

A transformation of the MIMIC-IV 3.1 database into the Common Longitudinal ICU data Format (CLIF).

critical care mimic clif the common longitudinal icu data format

Published: March 23, 2026. Version: 1.1.0


Database Credentialed Access

MIMIC-IV-Ext-MedicalBench: Evaluating Large Language Models Towards Improved Medical Concept Extraction

Zhichao Yang, Gregory Lyng, Sanjit Batra, et al.

This dataset is an evidence-grounded benchmark built on MIMIC-IV discharge summaries that evaluates how well large language models can verify ICD-10 medical concepts, including implicitly documented diagnoses, by identifying supporting text evidence.

Published: March 23, 2026. Version: 1.0.0


Challenge Credentialed Access

ArchEHR-QA: A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization

Sarvesh Soni, Dina Demner-Fushman

A dataset for grounded question answering (QA) from electronic health records (EHRs).

question answering electronic health record patient portals clinicians

Published: Jan. 1, 2026. Version: 1.3


Database Credentialed Access

Antibiotic Resistance Microbiology Dataset Mass General Brigham (ARMD-MGB)

Ziming Wei, Sanjat Kanjilal

ARMD-MGB contains detailed microbiology and clinical metadata for more than 225,000 patients and more than 970,000 cultures collected over 10 years.

medical informatics antimicrobial resistance electronic health records

Published: Dec. 5, 2025. Version: 1.0.0


Database Credentialed Access

Multimodal Clinical Monitoring in the Emergency Department (MC-MED)

Aman Kansal, Emma Chen, Tom Jin, et al.

A multimodal dataset of deidentified clinical and physiological data from emergency department visits, supporting research on patient outcomes, care processes, and the effects of continuous monitoring during and after the COVID-19 pandemic.

Published: Sept. 25, 2025. Version: 1.0.1