Resources


Database Credentialed Access

Annotation dataset of problematic opioid use and related contexts from MIMIC-III Critical Care Database discharge summaries

Melissa Poulsen, Vanessa Troiani, Philip Freda, et al.

The database contains a corpus of annotated data from the MIMIC-III Critical Care Database from a study that aimed to develop and apply an annotation schema to characterize opioid use disorder and related contextual factors.

opioid use disorder substance use natural language processing clinical notes

Published: Feb. 8, 2023. Version: 1.0.0


Database Credentialed Access

AMR-UTI: Antimicrobial Resistance in Urinary Tract Infections

Michael Oberst, Soorajnath Boominathan, Helen Zhou, et al.

AMR-UTI is a freely accessible dataset, derived from electronic health record (EHR) information on over 100,000 urinary tract infections (UTI) treated at Massachusetts General Hospital and Brigham & Women's Hospital in Boston, MA, USA.

antibiotic resistance causal inference policy learning urinary tract infection antimicrobial resistance clinical decision support machine learning

Published: Nov. 4, 2020. Version: 1.0.0


Challenge Credentialed Access

SNOMED CT Entity Linking Challenge

Will Hardman, Mark Banks, Rory Davidson, et al.

272 discharge notes from the MIMIC-IV-Note dataset annotated with SNOMED CT concepts.

snomed clinical annotation entity linking

Published: Jan. 12, 2026. Version: 1.2.0


Database Contributor Review

InReDD-Dataset-PAN924

Caio Uehara Martins, Camila Tirapelli, Hugo Gaêta-Araujo, et al.

InReDD‑Dataset-V1 is a collection of 924 anonymised panoramic dental radiographs curated by the Interdisciplinary Research Group in Digital Dentistry (InReDD) at the University of São Paulo.

Published: Nov. 22, 2025. Version: 1.0.0


Database Credentialed Access

Predictors of Hospital Onset Infection: A Matched Retrospective Cohort Dataset

Ziming Wei, Luke Sagers, Caroline McKenna, et al.

NPA-CP is a freely accessible dataset derived from electronic health record (EHR) information at MGB between 2015 and 2024. The dataset includes 11 different pathogens and can be used to predict hospital-onset infections for these pathogens.

electronic health records infection control clinical machine learning infectious diseases hospital onset infection colonization pressure

Published: Nov. 4, 2025. Version: 1.0.0


Model Credentialed Access

RadVLM model

Nicolas Deperrois, Hidetoshi Matsuo, Samuel Ruiperez-Campillo, et al.

RadVLM is a 7B-parameter vision-language model fine-tuned on public chest-X-ray data that drafts reports, lists abnormalities, grounds findings, and chats about a CXR through a single image-to-text interface.

Published: Oct. 8, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-ECHO-Ext-MIMICEchoQA: A Benchmark Dataset for Echocardiogram-Based Visual Question Answering

Rahul Thapa, Andrew Li, Qingyang Wu, et al.

We present MIMICEchoQA, a benchmark dataset for echocardiogram-based question answering, built from the publicly available MIMIC-IV-ECHO database.

Published: Oct. 7, 2025. Version: 1.0.0


Database Restricted Access

Organ Retrieval and Collection of Health Information for Donation (ORCHID)

Hammaad Adam, Vinith Suriyakumar, Tom Pollard, et al.

Multi-center dataset on organ procurement in the United States

organ procurement organizations organ transplantation

Published: Sept. 29, 2025. Version: 2.1.1


Database Open Access

MIMIC-IV demo data in the Medical Event Data Standard (MEDS)

Robin Philippus van de Water, Ethan Steinberg, Michael Wornow, et al.

MIMIC-IV Clinical Database Demo in MEDS (Medical Event Data Standard) format.

ehr critical care electronic health record machine learning mimic meds medical event data standard

Published: Sept. 29, 2025. Version: 0.0.1


Database Credentialed Access

MIMIC-Ext-DrugDetection

Fabrice Harel-Canada, Nanyun Peng, David Goodman, et al.

This project offers a multilabel annotated dataset of clinical note sentences from MIMIC-III/IV for substance use detection. It supports NLP research for identifying various co-occurring drug use mentions in patient records.

ehr mimic-iv substance use clinical notes methamphetamine multi-label cocaine drug detection polysubstance use prescription opioid misuse cannabis benzodiazepine misuse injection drug use heroin mimic-iii

Published: Sept. 25, 2025. Version: 1.0.0