Resources
Database Credentialed Access
MIMIC-IV-Ext-Instr: A Dataset of 450K+ EHR-Grounded Instruction-Following Examples
large language models medical question answering instruction tuning
Published: Sept. 9, 2025. Version: 1.0.0
Database Credentialed Access
EHR-DS-QA: A Synthetic QA Dataset Derived from Medical Discharge Summaries for Enhanced Medical Information Retrieval Systems
mimic-iv clinical question-answering medical discharge summaries large language models
Published: Jan. 11, 2024. Version: 1.0.0
Database Credentialed Access
EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images
question answering machine learning electronic health records evaluation chest x-ray multi-modal question answering ehr question answering semantic parsing benchmark deep learning visual question answering
Published: July 23, 2024. Version: 1.0.0
Database Credentialed Access
EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records
Published: March 19, 2025. Version: 1.0.1
Database Credentialed Access
DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries
Published: April 12, 2022. Version: 1.0.0
Model Credentialed Access
Shareable Artificial Intelligence to Extract Cancer Outcomes from Electronic Health Records for Precision Oncology Research
Published: Oct. 24, 2024. Version: 1.0.0
Database Credentialed Access
Annotation dataset of social determinants of health from MIMIC-III Clinical Care Database
natural language processing social determinants of health
Published: Jan. 24, 2024. Version: 1.0.1
Database Credentialed Access
MIMIC-III-Ext-VeriFact-BHC: Labeled Propositions From Brief Hospital Course Summaries for Long-form Clinical Text Evaluation
artificial intelligence natural language processing clinical notes electronic health records large language models brief hospital course long-form text chart review text reranking atomic claim hybrid retrieval clinical informatics clinical medicine fact verification retrieval-augmented generation logical atomism text embedding formal logic llm-as-a-judge llm evaluation
Published: April 9, 2025. Version: 1.0.0
Database Credentialed Access
ENCoDE, mEasuring skiN Color to correct pulse Oximetry DisparitiEs: skin tone and clinical data from a prospective trial on acute care patients.
Published: Aug. 22, 2024. Version: 1.0.0