Resources


Database Credentialed Access

TherLiD: A Thermometry Linked Dataset

Jeremy Tan, Inês Martins, João Matos, et al.

TherLiD is an open-source dataset of 13,251 paired temperature readings (contact and infrared) from the MIMIC-IV and eICU databases. With added demographics and derived data, it supports research on racial and ethnic disparities in infrared thermometry.

thermometry intensive care unit health equity electronic health records

Published: Jan. 21, 2025. Version: 1.0.0


Challenge Credentialed Access

BioNLP Workshop 2023 Shared Task 1A: Problem List Summarization

Yanjun Gao, Dmitriy Dligach, Timothy Miller, et al.

This is the data storage for BioNLP Workshop Shared Task 1A: Problem List Summarization.

bionlp clinical natural language processing electronic health record summarization

Published: Nov. 12, 2023. Version: 2.0.0


Database Open Access

MIMIC-IV Clinical Database Demo

Alistair Johnson, Lucas Bulgarelli, Tom Pollard, et al.

An openly available subset of patients in the MIMIC-IV database.

critical care electronic health record mimic

Published: Jan. 31, 2023. Version: 2.2


Database Credentialed Access

Annotation dataset of problematic opioid use and related contexts from MIMIC-III Critical Care Database discharge summaries

Melissa Poulsen, Vanessa Troiani, Philip Freda, et al.

The database contains a corpus of annotated data from the MIMIC-III Critical Care Database from a study that aimed to develop and apply an annotation schema to characterize opioid use disorder and related contextual factors.

opioid use disorder substance use natural language processing clinical notes

Published: Feb. 8, 2023. Version: 1.0.0


Database Credentialed Access

MIMIC-III and eICU-CRD: Feature Representation by FIDDLE Preprocessing

Shengpu Tang, Parmida Davarmanesh, Yanmeng Song, et al.

Features and labels from MIMIC-III and eICU-CRD produced by FIDDLE, an EHR preprocessing pipeline.

preprocessing electronic health record machine learning

Published: April 28, 2021. Version: 1.0.0


Database Credentialed Access

Annotation dataset of social determinants of health from MIMIC-III Clinical Care Database

Marco Guevara, Shan Chen, Spencer Thomas, et al.

Annotation dataset of social determinants of health from MIMIC-III Clinical Care Database notes.

natural language processing social determinants of health

Published: Jan. 24, 2024. Version: 1.0.1


Database Credentialed Access

National Institutes of Health Stroke Scale (NIHSS) Annotations for the MIMIC-III Database

Jiayang Wang, Xiaoshuo Huang, Lin Yang, et al.

A dataset of annotated NIHSS scale items and corresponding scores from stroke patients' discharge summaries in MIMIC-III.

Published: Jan. 25, 2021. Version: 1.0.0


Challenge Credentialed Access

ArchEHR-QA: A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization

Sarvesh Soni, Dina Demner-Fushman

A dataset for grounded question answering (QA) from electronic health records (EHRs).

question answering electronic health record patient portals clinicians

Published: Jan. 1, 2026. Version: 1.3


Database Credentialed Access

Antibiotic Resistance Microbiology Dataset Mass General Brigham (ARMD-MGB)

Ziming Wei, Sanjat Kanjilal

ARMD-MGB contains detailed microbiology and clinical metadata for >225,000 patients and >970,000 cultures collected over 10 years.

medical informatics antimicrobial resistance electronic health records

Published: Dec. 5, 2025. Version: 1.0.0


Database Credentialed Access

Multimodal Clinical Monitoring in the Emergency Department (MC-MED)

Aman Kansal, Emma Chen, Tom Jin, et al.

A multimodal dataset of deidentified clinical and physiological data from emergency department visits, supporting research on patient outcomes, care processes, and the effects of continuous monitoring during and after the COVID-19 pandemic.

Published: Sept. 25, 2025. Version: 1.0.1