Featured Resources


Challenge Credentialed Access

SNOMED CT Entity Linking Challenge

Will Hardman, Mark Banks, Rory Davidson, Donna Truran, Nindya Widita Ayuningtyas, Hoa Ngo, Alistair Johnson, Tom Pollard

272 discharge notes from the MIMIC-IV-Note dataset annotated with SNOMED CT concepts.

snomed entity linking clinical annotation

Published: Dec. 19, 2023. Version: 1.0.0


Database Open Access

VitalDB, a high-fidelity multi-parameter vital signs database in surgical patients

Hyung-Chul Lee, Chul-Woo Jung

VitalDB, a high-fidelity multi-parameter vital signs database in surgical patients

waveform anesthesia vitaldb intraoperative biosignal ecg

Published: Sept. 21, 2022. Version: 1.0.0


Database Credentialed Access

MIMIC-IV

Alistair Johnson, Lucas Bulgarelli, Tom Pollard, Steven Horng, Leo Anthony Celi, Roger Mark

Large database of de-identified health information from patients admitted to Beth Israel Deaconess Medical Center

mimic critical care machine learning intensive care unit

Published: Jan. 6, 2023. Version: 2.2


Database Credentialed Access

MIMIC-CXR Database

Alistair Johnson, Tom Pollard, Roger Mark, Seth Berkowitz, Steven Horng

Chest radiographs in DICOM format with associated free-text reports.

mimic computer vision machine learning chest x-rays radiology natural language processing

Published: Sept. 19, 2019. Version: 2.0.0


Database Credentialed Access

BRAX, a Brazilian labeled chest X-ray dataset

Eduardo Pontes Reis, Joselisa Paiva, Maria Carolina Bueno da Silva, Guilherme Alberto Sousa Ribeiro, Victor Fornasiero Paiva, Lucas Bulgarelli, Henrique Lee, Paulo Victor dos Santos, Vanessa Brito, Lucas Amaral, Gabriel Beraldo, Jorge Nebhan Haidar Filho, Gustavo Teles, Gilberto Szarf, Tom Pollard, Alistair Johnson, Leo Anthony Celi, Edson Amaro

BRAX contains 24,959 chest radiography exams and 40,967 images acquired in a large general Brazilian hospital. All images have been read by trained radiologists and 14 labels were derived from Brazilian Portuguese reports using NLP.

chest x-ray dataset artificial intelligence

Published: June 17, 2022. Version: 1.1.0


Database Credentialed Access

eICU Collaborative Research Database

Tom Pollard, Alistair Johnson, Jesse Raffa, Leo Anthony Celi, Omar Badawi, Roger Mark

Multi-center database comprising deidentified health data associated with over 200,000 admissions to ICUs across the United States between 2014 and 2015.

telemedicine icu critical care

Published: April 15, 2019. Version: 2.0


Latest Resources


Database Credentialed Access

mBRSET, a Mobile Brazilian Retinal Dataset

Luis Filipe Nakayama, Lucas Zago Ribeiro, David Restrepo, Nathan Santos Barboza, Raul Dias Fiterman, Maria Luiza Vieira Sousa, Alexandre Durao Alves Pereira, Caio Regatieri, Fernando Korn Malerbi, Rafael Andrade

mBRSET - a Mobile Brazilian Retinal Dataset

retina ophthalmology

Published: June 26, 2024. Version: 1.0


Database Credentialed Access

EHRNoteQA: An LLM Benchmark for Real-World Clinical Practice Using Discharge Summaries

Sunjun Kweon, Jiyoun Kim, Heeyoung Kwak, Dongchul Cha, Hangyul Yoon, Kwang Hyun Kim, Jeewon Yang, Seunghyun Won, Edward Choi

An LLM Benchmark for Real-World Clinical Practice Using Discharge Summaries

Published: June 26, 2024. Version: 1.0.1


Database Open Access

Radiology Report Generation Models Evaluation Dataset For Chest X-rays (RadEvalX)

Amos Rubin Calamida, Farhad Nooralahzadeh, Morteza Rohanian, Mizuho Nishio, Koji Fujimoto, Michael Krauthammer

RadEvalX is a publicly available dataset developed similarly to the ReXVal dataset. RadEvalX focuses on radiologist evaluations of errors found in automatically generated radiology reports.

Published: June 18, 2024. Version: 1.0.0


Database Open Access

Gesture Recognition and Biometrics ElectroMyogram (GRABMyo)

Ning Jiang, Ashirbad Pradhan, Jiayuan He

Open-access dataset of electromyogram (EMG) recordings collected from the wrist and forearm muscles of 43 people while they performed hand gestures.

Published: June 7, 2024. Version: 1.1.0


Model Credentialed Access

Me-LLaMA: Foundation Large Language Models for Medical Applications

Qianqian Xie, Qingyu Chen, Aokun Chen, Cheng Peng, Yan Hu, Fongci Lin, Xueqing Peng, Jimin Huang, Jeffrey Zhang, Vipina Keloth, Xinyu Zhou, Huan He, Lucila Ohno-Machado, Yonghui Wu, Hua Xu, Jiang Bian

Me-LLaMA is a family of large language models for medical applications trained using clinical text with LLaMA2 models as the base. We release model weights for the foundation models as well as the chat-enhanced models.

large language models

Published: June 5, 2024. Version: 1.0.0


Database Credentialed Access

A Temporal Dataset for Respiratory Support in Critically Ill Patients

Mira Moukheiber, Lama Moukheiber, Dana Moukheiber, Sicheng Hao, Leo Anthony Celi, Hyung-Chul Lee

A benchmark dataset offering hourly records over a 90-day period for 50,920 ICU subjects, including dynamic pulmonary function data and a spectrum of covariates for respiratory intervention analyses.

observational data time-series

Published: May 31, 2024. Version: 1.0.0