Resources


Database Open Access

Facial and oral temperature data from a large set of human subject volunteers

Quanzeng Wang, Yangling Zhou, Pejman Ghassemi, et al.

Data for each subject include temperatures measured at 29 facial locations over four rounds with two infrared thermographs (IRTs), oral temperatures measured with a thermometer in two modes, subject demographics (gender, age, ethnicity), environmental conditions, etc.

clinical accuracy, receiver operating characteristic curve, infectious disease epidemics, thermography, fever screening, inner canthus, elevated body temperature, facial maximum temperatures, infrared thermograph, pearson correlation coefficients, thermometry

Published: May 24, 2023. Version: 1.0.0


Database Open Access

Visceral adipose tissue measurements during pregnancy

Alexandre da Silva Rocha, Lisia von Diemen, Daniela Kretzer, et al.

Maternal visceral adipose tissue measurements collected as part of a cohort study of 154 pregnant women.

Published: March 23, 2020. Version: 1.0.0


Challenge Credentialed Access

ShAReCLEF eHealth 2013: Natural Language Processing and Information Retrieval for Clinical Care

Danielle Mowery

2013 ShARe/CLEF eHealth Evaluation Lab: Natural Language Processing and Information Retrieval for Clinical Care (Tasks 1 and 2).

natural language processing

Published: Feb. 15, 2013. Version: 1.0


Database Restricted Access

MIMIC-IV-Ext-Apixaban-Trial-Criteria-Questions

Elizabeth Woo, Michael Craig Burkhart, Emily Alsentzer, et al.

We created 23 questions resembling eligibility criteria from the apixaban clinical trial and evaluated them on a random sample of 100 patient notes from MIMIC-IV. We release the resulting 2,300 question-answer pairs (23 questions for each of 100 notes) as a dataset here.

clinical q and a, evaluation set, clinical trial eligibility

Published: April 30, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-III-Ext-Notes

Darren Liu, Monique Bouvier, Delgersuren Bold, et al.

We evaluated general large language models' performance in clinical information extraction on MIMIC-III notes.

Published: Feb. 27, 2026. Version: 1.0.0


Database Credentialed Access

Lunguage: A Benchmark for Structured and Sequential Chest X-ray Interpretation

Jong Hak Moon, Geon Choi, Paloma Rabaey, et al.

A radiologist-annotated benchmark of structured chest X-ray reports at single and sequential levels, comprising 1,473 reports across 18 relation types and 80 longitudinal cases.

fine-grained structured reports, attribute-level clinical reasoning, medical text structuring, longitudinal clinical reasoning, chest x-ray report parsing, medical information structuring, benchmark dataset for radiology report, medical information extraction, structured radiology reports, temporal relation extraction, radiology report benchmarking, longitudinal clinical understanding

Published: Jan. 11, 2026. Version: 1.0.0


Challenge Credentialed Access

ArchEHR-QA: A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization

Sarvesh Soni, Dina Demner-Fushman

A dataset for grounded question answering (QA) from electronic health records (EHRs).

question answering, electronic health record, patient portals, clinicians

Published: Jan. 1, 2026. Version: 1.3


Database Contributor Review

ER-REASON: A Benchmark Dataset for LLM-Based Clinical Reasoning in the Emergency Room

Mel Molina, Nikita Mehandru, Niloufar Golchini, et al.

The ER-REASON dataset is a longitudinal collection of 25,174 de-identified clinical notes for 3,437 patients admitted to the emergency room (ER) at a large academic medical center between March 1, 2022, and March 31, 2024.

Published: Oct. 23, 2025. Version: 1.0.0


Database Restricted Access

CXRGraph: Using Information Extraction to Normalize the Training Data for Automatic Radiology Report Generation

Yuxiang Liao, Hoisang Heung, Hantao Liu, et al.

CXRGraph is a structured radiology report dataset built upon RadGraph and tailored for the Automatic Radiology Report Generation task. It can identify more task-relevant information such as abnormalities and hallucinated prior references.

relation extraction, information extraction, natural language processing, named entity recognition, structured radiology report

Published: Feb. 3, 2025. Version: 1.0.0