Resources


Database Credentialed Access

CLIP: A Dataset for Extracting Action Items for Physicians from Hospital Discharge Notes

James Mullenbach, Yada Pruksachatkun, Sean Adler, Jennifer Seale, Jordan Swartz, T Greg McKelvey, Yi Yang, David Sontag

Clinical action items annotated over MIMIC-III. 718 discharge summaries are labeled at the sentence and character level with multiple action labels, including Appointment, Lab, Procedure, Medication, Imaging, Patient Instructions, and Other.

Published: June 21, 2021. Version: 1.0.0


Database Credentialed Access

BOLD, a blood-gas and oximetry linked dataset

João Matos, Tristan Struja, Jack Gallifant, Luis Filipe Nakayama, Marie Charpignon, Xiaoli Liu, Jaime dos Santos Cardoso, Leo Anthony Celi, An Kwok Wong

An open-source pulse oximetry and arterial blood gas dataset, derived from MIMIC-III, MIMIC-IV, and eICU-CRD

pulse oximetry, intensive care unit, health equity, electronic health records

Published: Nov. 8, 2023. Version: 1.0


Database Credentialed Access

Annotated Social Determinants of Health Dataset for Adverse Pregnancy Outcomes

Nidhi Soley, MaKhaila Bentil, Jash Shah, Masoud Rouhizadeh, Casey Taylor

This project provides a manually annotated dataset of social determinants of health—social support, occupation, and substance use—linked to pregnancy outcomes, extracted from MIMIC-III and MIMIC-IV discharge summary notes.

Published: Aug. 4, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-Ext-DrugDetection

Fabrice Harel-Canada, Nanyun Peng, David Goodman, Ruby Romero, Allan Nguyen, Brandon Moghanian, Anabel Salimian

This project offers a multilabel annotated dataset of clinical note sentences from MIMIC-III/IV for substance use detection. It supports NLP research for identifying various co-occurring drug use mentions in patient records.

ehr, mimic-iv, substance use, clinical notes, methamphetamine, multi-label, cocaine, drug detection, polysubstance use, prescription opioid misuse, cannabis, benzodiazepine misuse, injection drug use, heroin, mimic-iii

Published: Sept. 25, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV

Alistair Johnson, Lucas Bulgarelli, Tom Pollard, Brian Gow, Benjamin Moody, Steven Horng, Leo Anthony Celi, Roger Mark

Large database of de-identified health information from patients admitted to Beth Israel Deaconess Medical Center

critical care, intensive care unit, mimic, machine learning

Published: Oct. 11, 2024. Version: 3.1


Database Restricted Access

MIMIC-IV-Ext-Apixaban-Trial-Criteria-Questions

Elizabeth Woo, Michael Craig Burkhart, Emily Alsentzer, Brett Beaulieu-Jones

We created 23 questions resembling eligibility criteria from the apixaban clinical trial and evaluated them on a random sample of 100 patient notes from MIMIC-IV. We release the 2300 total question-answer pairs as a dataset here.

clinical q and a, evaluation set, clinical trial eligibility

Published: April 30, 2025. Version: 1.0.0


Database Open Access

MIMIC-IV demo data in the OMOP Common Data Model

Michael Kallfelz, Anna Tsvetkova, Tom Pollard, Manlik Kwong, Gigi Lipori, Vojtech Huser, Jeffrey Osborn, Sicheng Hao, Andrew Williams

Preliminary work to transform a MIMIC-IV demo dataset to the OMOP Common Data Model

omop common data model

Published: June 21, 2021. Version: 0.9


Challenge Open Access

2018 IEEE BHI and BSN Data Challenge

Tom Pollard, Alistair Johnson, Jesse Raffa

A challenge to explore real clinical questions in critically ill patients using the MIMIC database. A collaboration with the IEEE Conference on Biomedical and Health Informatics 2018 and the IEEE Conference on Body Sensor Networks.

weekend effect ieee challenge mimic

Published: Feb. 5, 2018. Version: 1.0