Resources


Database Credentialed Access

RadGraph: Extracting Clinical Entities and Relations from Radiology Reports

Saahil Jain, Ashwin Agrawal, Adriel Saporta, et al.

RadGraph is a dataset of entities and relations in full-text chest X-ray radiology reports, obtained using a novel information extraction (IE) schema that captures the clinically relevant information in a radiology report.

entity and relation extraction graph multi-modal natural language processing radiology

Published: June 3, 2021. Version: 1.0.0


Database Restricted Access

Smartphone-Captured Chest X-Ray Photographs

Po-Chih Kuo, ChengChe Tsai, Diego M Lopez, et al.

Smartphone-captured CXR images including photographs taken from MIMIC-CXR and CheXpert, photographs taken by resident doctors, and photographs taken with different devices.

smartphone photograph cxr

Published: Sept. 27, 2020. Version: 1.0.0


Database Open Access

Wide-field calcium imaging sleep state database

Eric Landsness, Xiaohui Zhang, Wei Chen, et al.

Wide-field calcium imaging database consisting of annotated sleep recordings collected from transgenic mice at Washington University in St. Louis School of Medicine.

sleep wide-field calcium imaging sleep state classification sleep staging machine learning

Published: March 17, 2022. Version: 1.0.1


Database Credentialed Access

MIMIC-III - SequenceExamples for TensorFlow modeling

Jonas Kemp, Kun Zhang, Andrew Dai

MIMIC-III data converted into TensorFlow SequenceExample format, for use in modeling pipelines.

tensorflow sequence modeling machine learning deep learning

Published: Sept. 29, 2020. Version: 1.0.0


Database Credentialed Access

MedNLI - A Natural Language Inference Dataset For The Clinical Domain

Chaitanya Shivade

This is a resource for training machine learning models for language inference in the medical domain.

natural language inference recognizing textual entailment

Published: Oct. 1, 2019. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-ECHO-Ext-LVVOLUMES-A4C-ROI: Annotated Subset of Apical Four-Chamber Echocardiography for PoCUS-Style LV Volume and Function Analysis

Kamlin Ekambaram, Anurag Arnab, Philip Herbst, et al.

A curated subset of MIMIC-IV-ECHO providing apical four-chamber cine loops with manual ROI masks, volumetric labels, and ready-to-use MP4/NPZ derivatives for robust LV volume and ejection fraction research.

ultrasound deep learning echocardiography medical imaging dicom lvesv roi segmentation cardiac video analysis left ventricular volume mimic-iv-echo apical four-chamber quantitative cardiology biplane simpson transformer models lvef ejection fraction a4c pocus lvedv domain adaptation

Published: Feb. 26, 2026. Version: 1.0.0


Database Credentialed Access

EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images

Seongsu Bae, Daeun Kyung, Jaehee Ryu, et al.

We present EHRXQA, the first multi-modal EHR QA dataset combining structured patient records with aligned chest X-ray images. EHRXQA contains a comprehensive set of QA pairs covering image-related, table-related, and image+table-related questions.

question answering machine learning electronic health records evaluation chest x-ray multi-modal question answering ehr question answering semantic parsing deep learning benchmark visual question answering

Published: July 23, 2024. Version: 1.0.0


Database Credentialed Access

MIMIC-Ext-MIMIC-CXR-VQA: A Complex, Diverse, And Large-Scale Visual Question Answering Dataset for Chest X-ray Images

Seongsu Bae, Daeun Kyung, Jaehee Ryu, et al.

We introduce MIMIC-Ext-MIMIC-CXR-VQA, a complex, diverse, and large-scale dataset designed for Visual Question Answering (VQA) tasks within the medical domain, focusing primarily on chest radiographs.

question answering machine learning electronic health records evaluation chest x-ray radiology deep learning benchmark multimodal visual question answering

Published: July 19, 2024. Version: 1.0.0


Database Restricted Access

CheXchoNet: A Chest Radiograph Dataset with Gold Standard Echocardiography Labels

Pierre Elias, Shreyas Bhave

Early detection of heart failure is vital for improving outcomes. The dataset contains 71,589 CXRs paired with gold-standard labels from echocardiograms to enable the training of models to detect pathologies indicative of early-stage heart failure.

chest x-rays heart failure early detection cardiac structural abnormalities deep learning

Published: March 20, 2024. Version: 1.0.0


Database Credentialed Access

MIMIC-CXR-JPG - chest radiographs with structured labels

Alistair Johnson, Matthew Lungren, Yifan Peng, et al.

Chest x-rays in JPG format with structured labels derived from the associated radiology report.

computer vision chest x-ray radiology deep learning mimic

Published: March 12, 2024. Version: 2.1.0