Resources


Database Credentialed Access

MIMIC-IV-Ext Clinical Decision Making: A MIMIC-IV Derived Dataset for Evaluation of Large Language Models on the Task of Clinical Decision Making for Abdominal Pathologies

Paul Hager, Friederike Jungmann, Daniel Rueckert

A curated set of emergency department (ED) clinical decision-making cases for four abdominal pathologies. Each case contains the information required for diagnosis, including the history of present illness (HPI), physical examination, laboratory tests, and imaging. Relevant treatment information is also included.

clinical decision making abdominal pathologies treatment plan emergency room diagnosis large language models

Published: July 8, 2024. Version: 1.1


Database Restricted Access

Visual Question Answering evaluation dataset for MIMIC CXR

Timo Kohlberger, Charles Lau, Tom Pollard, Andrew Sellergren, Atilla Kiraly, Fayaz Jamil

This dataset provides 224 visual question answering (VQA) items for 40 test set cases and 111 VQA items for 23 validation set cases of the MIMIC CXR dataset.

Published: Jan. 28, 2025. Version: 1.0.0


Database Open Access

Human Balance Evaluation Database

Force platform recordings from 163 subjects undergoing stabilography tests.

stability gait

Published: May 19, 2016. Version: 1.0.0


Database Credentialed Access

MedVAL-Bench: Expert-Annotated Medical Text Validation Benchmark

Asad Aali, Vasiliki Bikia, Maya Varma, Nicole Chiou, Sophie Ostmeier, Arnav Singhvi, Magdalini Paschali, Ashwin Kumar, Andrew Johnston, Karimar Amador Martinez, Eduardo Perez Guerrero, Paola Cruz Rivera, Sergios Gatidis, Christian Bluethgen, Eduardo Pontes Reis, Eddy Zandee van Rilland, Poonam Hosamani, Kevin Keet, Minjoung Go, Evelyn Ling, David Larson, Curtis Langlotz, Roxana Daneshjou, Jason Hom, Sanmi Koyejo, Emily Alsentzer, Akshay Chaudhari

MedVAL-Bench is the first large-scale physician-validated benchmark for medical text validation, spanning 6 diverse medical tasks and containing 840 language model-generated outputs annotated by 12 physicians with error assessments and risk grades.

Published: Nov. 14, 2025. Version: 1.0.1


Database Credentialed Access

CORAL: expert-Curated medical Oncology Reports to Advance Language model inference

Madhumita Sushil, Vanessa Kennedy, Divneet Mandair, Brenda Miao, Travis Zack, Atul Butte

Medical oncology progress notes annotated with advanced, comprehensive oncology-relevant concepts and relationships.

artificial intelligence information extraction oncology natural language processing large language models electronic health records

Published: Feb. 7, 2024. Version: 1.0


Challenge Credentialed Access

Analysis of Clinical Text: Task 14 of SemEval 2015

Guergana Savova

Dataset for the Analysis of Clinical Text tasks of SemEval 2014 and SemEval 2015.

semeval nlp

Published: Dec. 28, 2014. Version: 2.0


Database Credentialed Access

PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions

Daeun Kyung, Hyunseung Chung, Seongsu Bae, Jiho Kim, Jae Ho Sohn, Taerim Kim, Soo Kim, Edward Choi

PatientSim is a patient simulator that generates realistic and diverse personas for clinical scenarios, enabling robust training and evaluation of doctor-patient interactions in multi-turn dialogues.

electronic health records multi-turn dialogue llm simulation doctor-patient consultation

Published: Oct. 18, 2025. Version: 1.0.0


Database Credentialed Access

MeDiSumQA: Patient-Oriented Question-Answer Generation from Discharge Letters

Amin Dada, Osman Alperen Koras, Marie Bauer, Amanda Butler, Kaleb Smith, Jens Kleesiek, Julian Friedrich

MeDiSumQA is a dataset of patient-oriented QA pairs from MIMIC-IV discharge summaries, designed to evaluate LLMs in generating safe, patient-friendly medical responses for clinical QA and healthcare communication.

Published: May 5, 2025. Version: 1.0.0


Database Open Access

Facial and oral temperature data from a large set of human subject volunteers

Quanzeng Wang, Yangling Zhou, Pejman Ghassemi, Dwith Chenna, Michelle Chen, Jon Casamento, Joshua Pfefer, David Mcbride

Data for each subject include temperatures measured at 29 facial locations over four rounds with two infrared thermographs (IRTs), oral temperatures measured with a thermometer in two modes, subject demographics (gender, age, ethnicity), and environmental conditions, among other variables.

clinical accuracy receiver operating characteristic curve infectious disease epidemics thermography fever screening inner canthus elevated body temperature facial maximum temperatures infrared thermograph pearson correlation coefficients thermometry

Published: May 24, 2023. Version: 1.0.0


Database Open Access

Visceral adipose tissue measurements during pregnancy

Alexandre da Silva Rocha, Lisia von Diemen, Daniela Kretzer, Salete Matos, Juliana Rombaldi Bernardi, José Antônio Magalhães

Maternal visceral adipose tissue measurements collected as part of a cohort study of 154 pregnant women.

Published: March 23, 2020. Version: 1.0.0