Resources


Database Credentialed Access

CXR-Align: A Benchmark for CXR-Report Alignment with Negations

Hanbin Ko

CXR-Align is a benchmark dataset created to evaluate vision-language models' capability to interpret negations in chest X-ray (CXR) reports, featuring systematically modified reports from MIMIC-CXR.

Published: Aug. 21, 2025. Version: 1.0.0


Database Credentialed Access

CXR-PRO: MIMIC-CXR with Prior References Omitted

Vignav Ramesh, Nathan Chi, Pranav Rajpurkar

CXR-PRO is an adaptation of the MIMIC-CXR dataset (consisting of chest radiographs and their associated free-text radiology reports) with references to non-existent priors removed.

generation free-text radiology reports references to priors retrieval large language models

Published: Nov. 23, 2022. Version: 1.0.0


Challenge Credentialed Access

CXR-LT: Multi-Label Long-Tailed Classification on Chest X-Rays

Gregory Holste, Mingquan Lin, Song Wang, et al.

CXR-LT 2024 was a challenge for long-tailed, multi-label, and zero-shot thorax disease classification on chest X-rays, held at MICCAI 2024. This page contains long-tailed labels for 45 diseases from the CXR-LT 2024 and 2023 challenges.

disease classification artificial intelligence chest x-ray deep learning computer-aided diagnosis long-tailed learning cardiopulmonary disease zero-shot learning

Published: March 19, 2025. Version: 2.0.0


Database Credentialed Access

MIMIC-CXR-JPG - chest radiographs with structured labels

Alistair Johnson, Matthew Lungren, Yifan Peng, et al.

Chest x-rays in JPG format with structured labels derived from the associated radiology report.

computer vision chest x-ray radiology deep learning mimic

Published: March 12, 2024. Version: 2.1.0


Database Restricted Access

Application of Med-PaLM 2 in the refinement of MIMIC-CXR labels

Kendall Park, Rory Sayres, Andrew Sellergren, et al.

This work further refines the labels associated with CheXpert in MIMIC-CXR-JPG 2.0.0 by filtering with Med-PaLM 2 followed by verification by manual review by three US board-certified radiologists.

mimic-cxr labels

Published: Feb. 4, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-CXR-Ext-ILS: Lesion Segmentation Masks and Instruction-Answer Pairs for Chest X-rays

Geon Choi, Hangyul Yoon, Hyunju Shin, et al.

Instruction-guided lesion segmentation data for chest X-rays, including 1.1M instruction-answer pairs and 91K segmentation masks covering seven major lesion types.

chest x-ray segmentation text-guided segmentation lesion segmentation

Published: March 25, 2026. Version: 1.0.0


Database Credentialed Access

MIMIC-Ext-CXR-QBA: A Structured, Tagged, and Localized Visual Question Answering Dataset with Question-Box-Answer Triplets and Scene Graphs for Chest X-ray Images

Philip MĂĽller, Friederike Jungmann, Georgios Kaissis, et al.

We present a large-scale CXR VQA dataset derived from MIMIC-CXR with 42M QA pairs, featuring hierarchical answers, bounding boxes, and structured tags. We generated QA-pairs using LLM-based extraction from radiology reports and localization models.

chest x-rays vqa localization scene graphs

Published: July 22, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-Ext-MIMIC-CXR-VQA: A Complex, Diverse, And Large-Scale Visual Question Answering Dataset for Chest X-ray Images

Seongsu Bae, Daeun Kyung, Jaehee Ryu, et al.

We introduce MIMIC-Ext-MIMIC-CXR-VQA, a complex, diverse, and large-scale dataset designed for Visual Question Answering (VQA) tasks within the medical domain, focusing primarily on chest radiographs.

question answering machine learning electronic health records evaluation chest x-ray radiology deep learning benchmark multimodal visual question answering

Published: July 19, 2024. Version: 1.0.0


Database Restricted Access

Pulmonary Edema Severity Grades Based on MIMIC-CXR

Ruizhi Liao, Geeticka Chauhan, Polina Golland, et al.

Pulmonary edema metadata and labels for MIMIC-CXR

Published: Feb. 9, 2021. Version: 1.0.1


Database Restricted Access

LATTE-CXR: Locally Aligned TexT and imagE, Explainable dataset for Chest X-Rays

Elham Ghelichkhan, Tolga Tasdizen

This dataset includes bounding box-statement pairs for chest X-ray images, derived from radiologists’ eye-tracking data (for explainability) and annotations, for local visual-language models.

eye-tracking chest x-ray dataset automatically generated dataset caption-guided object detection image captioning with region-level description grounded radiology report generation phrase grounding xai multi-modal learning local visual-language models localization

Published: Feb. 4, 2025. Version: 1.0.0