Resources


Model Credentialed Access

Me-LLaMA: Foundation Large Language Models for Medical Applications

Qianqian Xie, Qingyu Chen, Aokun Chen, Cheng Peng, Yan Hu, Fongci Lin, Xueqing Peng, Jimin Huang, Jeffrey Zhang, Vipina Keloth, Xinyu Zhou, Huan He, Lucila Ohno-Machado, Yonghui Wu, Hua Xu, Jiang Bian

Me-LLaMA is a family of large language models for medical applications trained using clinical text with LLaMA2 models as the base. We release model weights for the foundation models as well as the chat-enhanced models.

large language models

Published: June 5, 2024. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext-Instr: A Dataset of 450K+ EHR-Grounded Instruction-Following Examples

Zhenbang Wu, Anant Dadu, Mike Nalls, Faraz Faghri, Jimeng Sun

This dataset contains 450K open-ended instruction-following examples generated using GPT-3.5 based on the MIMIC-IV EHR database.

large language models medical question answering instruction tuning

Published: Sept. 9, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-Ext-MIMIC-CXR-VQA: A Complex, Diverse, And Large-Scale Visual Question Answering Dataset for Chest X-ray Images

Seongsu Bae, Daeun Kyung, Jaehee Ryu, Eunbyeol Cho, Gyubok Lee, Sunjun Kweon, Jungwoo Oh, Lei JI, Eric Chang, Tackeun Kim, Edward Choi

We introduce MIMIC-Ext-MIMIC-CXR-VQA, a complex, diverse, and large-scale dataset designed for Visual Question Answering (VQA) tasks within the medical domain, focusing primarily on chest radiographs.

question answering electronic health records evaluation chest x-ray radiology benchmark machine learning multimodal deep learning visual question answering

Published: July 19, 2024. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext Triage Instruction Corpus

Qingyang Shen, Quan Guo

The MIMIC-IV-Ext Triage Instruction Corpus includes 9,629 emergency department (ED) triage cases organized by the five-level Emergency Severity Index (ESI), intended to help LLMs improve triage accuracy. It provides CSV data, generation prompts, expert validation samples, and SQL QC scripts.

nlp clinical decision support large language models machine learning emergency severity index emergency triage

Published: March 4, 2025. Version: 1.0.0


Model Credentialed Access

RadVLM model

Nicolas Deperrois, Hidetoshi Matsuo, Samuel Ruiperez-Campillo, Moritz Vandenhirtz, Sonia Laguna, Alain Ryser, Koji Fujimoto, Mizuho Nishio, Thomas Sutter, Julia Vogt, Jonas Kluckert, Thomas Frauenfelder, Christian Bluethgen, Farhad Nooralahzadeh, Michael Krauthammer

RadVLM is a 7B-parameter vision-language model fine-tuned on public chest-X-ray data that drafts reports, lists abnormalities, grounds findings, and chats about a CXR through a single image-to-text interface.

Published: Oct. 8, 2025. Version: 1.0.0


Database Restricted Access

MIMIC-IV-Ext-Apixaban-Trial-Criteria-Questions

Elizabeth Woo, Michael Craig Burkhart, Emily Alsentzer, Brett Beaulieu-Jones

We created 23 questions resembling eligibility criteria from the apixaban clinical trial and evaluated them on a random sample of 100 patient notes from MIMIC-IV. We release the resulting 2,300 question-answer pairs as a dataset here.

clinical q and a evaluation set clinical trial eligibility

Published: April 30, 2025. Version: 1.0.0


Database Restricted Access

MIMIC-III-Ext-Synthetic-Clinical-Trial-Questions

Elizabeth Woo, Michael Craig Burkhart, Emily Alsentzer, Brett Beaulieu-Jones

In our recent study, we used Llama-3.1-70B-Instruct to generate synthetic training examples resembling clinical trial eligibility criteria. We manually reviewed 1,000 of these examples and release them here.

large language models synthetic data distillation clinical trial eligibility

Published: April 22, 2025. Version: 1.0.0