Resources
Model Credentialed Access
Asclepius-R: Clinical Large Language Model Built On MIMIC-III Discharge Summaries
Sunjun Kweon, Junu Kim, Jiyoun Kim, Sujeong Im, Eunbyeol Cho, Seongsu Bae, Jungwoo Oh, Gyubok Lee, Jong Hak Moon, Seng Chan You, Seungjin Baek, Chang Hoon Han, Yoon Bin Jung, Yohan Jo, Edward Choi
clinical notes, synthetic clinical notes, synthetic notes, asclepius, open-source llm, clinical llm, large language model
Published: March 25, 2024. Version: 1.1.0
Database Credentialed Access
Annotation dataset of problematic opioid use and related contexts from MIMIC-III Critical Care Database discharge summaries
Melissa Poulsen, Vanessa Troiani, Philip Freda, Danielle Mowery, Anahita Davoudi
opioid use disorder, substance use, natural language processing, clinical notes
Published: Feb. 8, 2023. Version: 1.0.0
Database Credentialed Access
Nosocomial Risk Datasets from MIMIC-III
Travis Goodwin
pressure injury, risk prediction, acute kidney injury, anemia, forecasting, natural language processing, deep learning
Published: Sept. 15, 2022. Version: 1.0
Database Credentialed Access
Synthetic Acute Hypotension and Sepsis Datasets Based on MIMIC-III and Published as Part of the Health Gym Project
Nicholas Kuo, Simon Finfer, Louisa Jorm, Sebastiano Barbieri
sepsis, acute hypotension, synthetic dataset, generative modelling, wasserstein generative adversarial network, reinforcement learning, machine learning
Published: Feb. 23, 2022. Version: 1.0.0
Model Credentialed Access
Transformer models trained on MIMIC-III to generate synthetic patient notes
Ali Amin-Nejad, Julia Ive, Sumithra Velupillai
Published: May 27, 2020. Version: 1.0.0
Database Credentialed Access
Phenotype Annotations for Patient Notes in the MIMIC-III Database
Edward Moseley, Leo Anthony Celi, Joy Wu, Franck Dernoncourt
patient classification, natural language processing
Published: March 5, 2020. Version: 1.20.03
Database Restricted Access
MIMIC-III-Ext-Synthetic-Clinical-Trial-Questions
Elizabeth Woo, Michael Craig Burkhart, Emily Alsentzer, Brett Beaulieu-Jones
large language models, synthetic data distillation, clinical trial eligibility
Published: April 22, 2025. Version: 1.0.0
Database Credentialed Access
DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries
Jayetri Bardhan, Anthony Colas, Kirk Roberts, Daisy Zhe Wang
Published: April 12, 2022. Version: 1.0.0
Database Credentialed Access
Northwestern ICU (NWICU) database
Dana Moukheiber, William Temps, Bhadrappa Molgi, Yikuan Li, Alice Lu, Prasanth Nannapaneni, Abdulrahman Chahin, Sicheng Hao, Felipe Torres Fabregas, Leo Anthony Celi, Adrian Wong, Maxwell Lloyd, Xavier Borrat Frigola, Hyung-Chul Lee, Daniel Schneider, Tom Pollard, Yuan Luo, Abel Kho, Roger Mark
Published: Nov. 19, 2024. Version: 0.1.0
Database Credentialed Access
EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records
Yeonsu Kwon, Jiho Kim, Gyubok Lee, Seongsu Bae, Daeun Kyung, Wonchul Cha, Tom Pollard, Alistair Johnson, Edward Choi
Published: March 19, 2025. Version: 1.0.1