Resources


Database Credentialed Access

CXR-Align: A Benchmark for CXR-Report Alignment with Negations

Hanbin Ko

CXR-Align is a benchmark dataset created to evaluate vision-language models' capability to interpret negations in chest X-ray (CXR) reports, featuring systematically modified reports from MIMIC-CXR.

Published: Aug. 21, 2025. Version: 1.0.0


Database Credentialed Access

CXR-PRO: MIMIC-CXR with Prior References Omitted

Vignav Ramesh, Nathan Chi, Pranav Rajpurkar

CXR-PRO is an adaptation of the MIMIC-CXR dataset (consisting of chest radiographs and their associated free-text radiology reports) with references to non-existent priors removed.

generation free-text radiology reports references to priors retrieval large language models

Published: Nov. 23, 2022. Version: 1.0.0


Challenge Credentialed Access

CXR-LT: Multi-Label Long-Tailed Classification on Chest X-Rays

Gregory Holste, Mingquan Lin, Song Wang, Yiliang Zhou, Yishu Wei, Hao Chen, Atlas Wang, Yifan Peng

CXR-LT 2024 was a challenge for long-tailed, multi-label, and zero-shot thorax disease classification on chest X-rays, held at MICCAI 2024. This page contains long-tailed labels for 45 diseases from the CXR-LT 2024 and 2023 challenges.

disease classification artificial intelligence chest x-ray deep learning computer-aided diagnosis long-tailed learning cardiopulmonary disease zero-shot learning

Published: March 19, 2025. Version: 2.0.0


Database Credentialed Access

MIMIC-CXR-JPG - chest radiographs with structured labels

Alistair Johnson, Matthew Lungren, Yifan Peng, Zhiyong Lu, Roger Mark, Seth Berkowitz, Steven Horng

Chest x-rays in JPG format with structured labels derived from the associated radiology report.

computer vision chest x-ray radiology deep learning mimic

Published: March 12, 2024. Version: 2.1.0


Database Restricted Access

Application of Med-PaLM 2 in the refinement of MIMIC-CXR labels

Kendall Park, Rory Sayres, Andrew Sellergren, Tom Pollard, Fayaz Jamil, Timo Kohlberger, Charles Lau, Atilla Kiraly

This work further refines the CheXpert labels in MIMIC-CXR-JPG 2.0.0 by filtering them with Med-PaLM 2 and then verifying them through manual review by three US board-certified radiologists.

mimic-cxr labels

Published: Feb. 4, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-Ext-CXR-QBA: A Structured, Tagged, and Localized Visual Question Answering Dataset with Question-Box-Answer Triplets and Scene Graphs for Chest X-ray Images

Philip Müller, Friederike Jungmann, Georgios Kaissis, Daniel Rueckert

We present a large-scale CXR VQA dataset derived from MIMIC-CXR with 42M QA pairs, featuring hierarchical answers, bounding boxes, and structured tags. We generated QA-pairs using LLM-based extraction from radiology reports and localization models.

chest x-rays vqa localization scene graphs

Published: July 22, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-Ext-MIMIC-CXR-VQA: A Complex, Diverse, And Large-Scale Visual Question Answering Dataset for Chest X-ray Images

Seongsu Bae, Daeun Kyung, Jaehee Ryu, Eunbyeol Cho, Gyubok Lee, Sunjun Kweon, Jungwoo Oh, Lei JI, Eric Chang, Tackeun Kim, Edward Choi

We introduce MIMIC-Ext-MIMIC-CXR-VQA, a complex, diverse, and large-scale dataset designed for Visual Question Answering (VQA) tasks within the medical domain, focusing primarily on chest radiographs.

question answering chest x-ray benchmark evaluation radiology machine learning electronic health records deep learning multimodal visual question answering

Published: July 19, 2024. Version: 1.0.0


Database Restricted Access

Pulmonary Edema Severity Grades Based on MIMIC-CXR

Ruizhi Liao, Geeticka Chauhan, Polina Golland, Seth Berkowitz, Steven Horng

Pulmonary edema metadata and labels for MIMIC-CXR.

Published: Feb. 9, 2021. Version: 1.0.1


Database Restricted Access

LATTE-CXR: Locally Aligned TexT and imagE, Explainable dataset for Chest X-Rays

Elham Ghelichkhan, Tolga Tasdizen

This dataset provides bounding box-statement pairs for chest X-ray images, derived from radiologists’ annotations and eye-tracking data (the latter for explainability), intended for local visual-language models.

eye-tracking chest x-ray dataset automatically generated dataset caption-guided object detection image captioning with region-level description grounded radiology report generation phrase grounding xai multi-modal learning local visual-language models localization

Published: Feb. 4, 2025. Version: 1.0.0


Database Restricted Access

Visual Question Answering evaluation dataset for MIMIC CXR

Timo Kohlberger, Charles Lau, Tom Pollard, Andrew Sellergren, Atilla Kiraly, Fayaz Jamil

This dataset provides 224 VQAs for 40 test set cases, and 111 VQAs for 23 validation set cases of the MIMIC CXR dataset.

Published: Jan. 28, 2025. Version: 1.0.0