Resources
Challenge Credentialed Access
ArchEHR-QA: A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization
question answering electronic health record patient portals clinicians
Published: Jan. 1, 2026. Version: 1.3
Database Restricted Access
LATTE-CXR: Locally Aligned TexT and imagE, Explainable dataset for Chest X-Rays
eye-tracking chest x-ray dataset automatically generated dataset caption-guided object detection image captioning with region-level description grounded radiology report generation phrase grounding xai multi-modal learning local visual-language models localization
Published: Feb. 4, 2025. Version: 1.0.0
Database Credentialed Access
MIMIC-IV-Ext-Instr: A Dataset of 450K+ EHR-Grounded Instruction-Following Examples
large language models medical question answering instruction tuning
Published: Sept. 9, 2025. Version: 1.0.0
Database Credentialed Access
MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context
Published: Dec. 10, 2025. Version: 1.0.1
Database Restricted Access
CXRGraph: Using Information Extraction to Normalize the Training Data for Automatic Radiology Report Generation
relation extraction information extraction natural language processing named entity recognition structured radiology report
Published: Feb. 3, 2025. Version: 1.0.0
Database Credentialed Access
DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries
Published: April 12, 2022. Version: 1.0.0
Database Credentialed Access
MIMIC-IV-Ext-GPT-3_5-Generated-Discharge-Summaries-for-Low-Resource-Codes
icd coding large language model data augmentation
Published: Dec. 16, 2024. Version: 1.0.0
Database Credentialed Access
PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions
electronic health records multi-turn dialogue llm simulation doctor-patient consultation
Published: Oct. 18, 2025. Version: 1.0.0