Resources


Database Credentialed Access

Curated Data for Describing Blood Glucose Management in the Intensive Care Unit

Aldo Robles Arévalo, Roselyn Mateo-Collado, Leo Anthony Celi

The data subsets consist of time series files that includes all the curated entries of glucose readings and insulin inputs from MIMIC-III database.

insulin replacement therapy glycemic control critical care

Published: April 19, 2021. Version: 1.0.1


Database Open Access

MIMIC-IV demo data in the Medical Event Data Standard (MEDS)

Robin Philippus van de Water, Ethan Steinberg, Michael Wornow, et al.

MIMIC-IV Clinical Database Demo in MEDS (Medical Event Data Standard) format.

ehr critical care electronic health record machine learning mimic meds medical event data standard

Published: Sept. 29, 2025. Version: 0.0.1


Database Credentialed Access

NCH Sleep DataBank: A Large Collection of Real-world Pediatric Sleep Studies with Longitudinal Clinical Data

Harlin Lee, Boyue Li, Yungui Huang, et al.

The NCH Sleep DataBank includes 3,984 pediatric sleep studies on 3,673 unique patients conducted at Nationwide Children's Hospital between 2017 and 2019. It contains polysomnography (PSG), clinical annotations, and longitudinal clinical data.

eeg ehr pediatrics polysomnography clinical decision support sleep study ecg electronic health records sleep disorders

Published: Oct. 27, 2021. Version: 3.1.0


Database Credentialed Access

MIMIC-III and eICU-CRD: Feature Representation by FIDDLE Preprocessing

Shengpu Tang, Parmida Davarmanesh, Yanmeng Song, et al.

Features and labels from MIMIC-III and eICU-CRD produced by FIDDLE, an EHR preprocessing pipeline.

preprocessing electronic health record machine learning

Published: April 28, 2021. Version: 1.0.0


Database Credentialed Access

BOLD, a blood-gas and oximetry linked dataset

João Matos, Tristan Struja, Jack Gallifant, et al.

An open-source pulse oximetry and arterial blood gas dataset, derived from MIMIC-III, MIMIC-IV, and eICU-CRD

pulse oximetry intensive care unit health equity electronic health records

Published: Nov. 8, 2023. Version: 1.0


Database Credentialed Access

NCH Sleep DataBank: A Large Collection of Real-world Pediatric Sleep Studies with Longitudinal Clinical Data

Harlin Lee, Boyue Li, Yungui Huang, et al.

The NCH Sleep DataBank includes 3,984 pediatric sleep studies on 3,673 unique patients conducted at Nationwide Children's Hospital between 2017 and 2019. It contains polysomnography (PSG), clinical annotations, and longitudinal clinical data.

eeg ehr pediatrics polysomnography clinical decision support sleep study ecg electronic health records sleep disorders

Published: Oct. 27, 2021. Version: 3.1.0


Challenge Credentialed Access

ArchEHR-QA: A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization

Sarvesh Soni, Dina Demner-Fushman

A dataset for grounded question answering (QA) from electronic health records (EHRs).

question answering electronic health record patient portals clinicians

Published: Jan. 1, 2026. Version: 1.3


Database Credentialed Access

Antibiotic Resistance Microbiology Dataset Mass General Brigham (ARMD-MGB)

Ziming Wei, Sanjat Kanjilal

ARMD-MGB contains detailed microbiology and clinical metadata for >225,000 patients and >970,000 cultures collected over 10 years

medical informatics antimicrobial resistance electronic health records

Published: Dec. 5, 2025. Version: 1.0.0


Database Credentialed Access

TherLid: A Thermometry Linked Dataset

Jeremy Tan, Inês Martins, João Matos, et al.

TherLiD is an open-source dataset of 13,251 paired temperature readings (contact and infrared) from MIMIC-IV and eICU databases. With added demographics and derived data, it supports research on racial and ethnic disparities in infrared thermometry.

thermometry intensive care unit health equity electronic health records

Published: Jan. 21, 2025. Version: 1.0.0


Challenge Credentialed Access

BioNLP Workshop 2023 Shared Task 1A: Problem List Summarization

Yanjun Gao, Dmitriy Dligach, Timothy Miller, et al.

This is the data storage for BioNLP Workshop Shared Task 1A: Problem List Summarization.

bionlp clinical natural language processing electronic health record summarization

Published: Nov. 12, 2023. Version: 2.0.0