Resources
Database
Credentialed Access
Yael Bensoussan, Alexandros Sigaras, Anais Rameau, et al.
A dataset of questionnaire responses, spectrograms, and other information for pediatric participants collected for the Bridge2AI voice as a biomarker of health project.
voice
bridge2ai
Published: Dec. 17, 2025.
Version: 1.0.0
Database
Credentialed Access
Yael Bensoussan, Alexandros Sigaras, Anais Rameau, et al.
A dataset of features from voice recordings and metadata to enable the development, benchmarking, and validation of clinically applicable machine-learning models for diagnosing a wide range of health conditions.
voice
bridge2ai
Published: Dec. 16, 2025.
Version: 3.0.0
Database
Restricted Access
Natalia Sanabria-Herrera, Ingrid Gisell Bustos Moya, Luis Felipe Reyes
This study explores the respiratory microbiome's role in nosocomial lower respiratory tract infections in ICU patients. Conducted in Chía, Colombia, it reveals the microbiome's impact on disease progression.
Published: Dec. 5, 2025.
Version: 1.1.1
Database
Contributor Review
Mel Molina, Nikita Mehandru, Niloufar Golchini, et al.
The ER-REASON dataset is a longitudinal collection of 25,174 de-identified clinical notes for 3,437 patients admitted to the emergency room (ER) at a large academic medical center between March 1, 2022, and March 31, 2024.
Published: Oct. 23, 2025.
Version: 1.0.0
Database
Credentialed Access
Farieda Gaber, Altuna Akalin
This MIMIC-IV extended dataset is designed to evaluate and improve LLMs' ability to assist with triage, specialist referral, and diagnosis, using critical patient information such as history of present illness, vital signs, and other relevant data.
Published: Oct. 8, 2025.
Version: 1.0.2
Database
Credentialed Access
Aman Kansal, Emma Chen, Tom Jin, et al.
A multimodal dataset of deidentified clinical and physiological data from emergency department visits, supporting research on patient outcomes, care processes, and the effects of continuous monitoring during and after the COVID-19 pandemic.
Published: Sept. 25, 2025.
Version: 1.0.1
Database
Restricted Access
Blue Lin, Jin Yi Li, Kaavya Kalani, et al.
This initial version of the PHASES dataset includes multimodal menstrual health data—hormone levels, wearable sensor metrics, and self-reported symptoms—collected across two study intervals from 42 young adults.
wearables
hormones
menstrual health
multimodal health
health sensor data
womens health
Published: Sept. 9, 2025.
Version: 1.0.0
Database
Credentialed Access
Hanbin Ko
CXR-Align is a benchmark dataset created to evaluate vision-language models' capability to interpret negations in chest X-ray (CXR) reports, featuring systematically modified reports from MIMIC-CXR.
Published: Aug. 21, 2025.
Version: 1.0.0
Database
Credentialed Access
Amin Dada, Osman Alperen Koras, Marie Bauer, et al.
MeDiSumQA is a dataset of patient-oriented QA pairs from MIMIC-IV discharge summaries, designed to evaluate LLMs in generating safe, patient-friendly medical responses for clinical QA and healthcare communication.
Published: May 5, 2025.
Version: 1.0.0
Database
Credentialed Access
Lizhou Fan, Huizi Yu
This project integrates MIMIC-III and CORAL electronic health records into knowledge graphs to improve medical analysis and enhance decision-making capabilities. Resources include two knowledge graph snapshots and two question-answering datasets.
Published: April 15, 2025.
Version: 1.0.0