Resources


Database Credentialed Access

Generalized Image Embeddings for the MIMIC Chest X-Ray dataset

Andrew Sellergren, Atilla Kiraly, Tom Pollard, Wei-Hung Weng, Yun Liu, Akib Uddin, Christina Chen

This database contains compact, information-rich embeddings of the MIMIC-CXR Database v2.0.0, generated using the CXR Foundation API v1.0.

Published: Feb. 22, 2023. Version: 1.0


Database Restricted Access

Smartphone-Captured Chest X-Ray Photographs

Po-Chih Kuo, ChengChe Tsai, Diego M Lopez, Alexandros Karargyris, Tom Pollard, Alistair Johnson, Leo Anthony Celi

Smartphone-captured CXR images including photographs taken from MIMIC-CXR and CheXpert, photographs taken by resident doctors, and photographs taken with different devices.

smartphone photograph cxr

Published: Sept. 27, 2020. Version: 1.0.0


Database Credentialed Access

MIMIC-Ext-CXR-QBA: A Structured, Tagged, and Localized Visual Question Answering Dataset with Question-Box-Answer Triplets and Scene Graphs for Chest X-ray Images

Philip Müller, Friederike Jungmann, Georgios Kaissis, Daniel Rueckert

We present a large-scale CXR VQA dataset derived from MIMIC-CXR with 42M QA pairs, featuring hierarchical answers, bounding boxes, and structured tags. We generated the QA pairs using LLM-based extraction from radiology reports and localization models.

chest x-rays vqa localization scene graphs

Published: July 22, 2025. Version: 1.0.0


Database Credentialed Access

Symile-MIMIC: a multimodal clinical dataset of chest X-rays, electrocardiograms, and blood labs from MIMIC-IV

Adriel Saporta, Aahlad Manas Puli, Mark Goldstein, Rajesh Ranganath

A multimodal clinical dataset consisting of CXRs, ECGs, and blood labs, designed to evaluate Symile, a simple contrastive loss that accommodates any number of modalities and allows any model to produce representations for each modality.

database cxr ecg chest x-ray electrocardiogram contrastive learning model multimodal mimic

Published: Jan. 28, 2025. Version: 1.0.0


Database Credentialed Access

Chest X-ray segmentation images based on MIMIC-CXR

Li-Ching Chen, Po-Chih Kuo, Ryan Wang, Judy Gichoya, Leo Anthony Celi

A chest x-ray segmentation dataset derived from MIMIC-CXR, produced by a deep learning algorithm and verified by human examination.

segmentation chest x-rays cxr

Published: Aug. 18, 2022. Version: 1.0.0


Challenge Credentialed Access

CXR-LT: Multi-Label Long-Tailed Classification on Chest X-Rays

Gregory Holste, Mingquan Lin, Song Wang, Yiliang Zhou, Yishu Wei, Hao Chen, Atlas Wang, Yifan Peng

CXR-LT 2024 was a challenge for long-tailed, multi-label, and zero-shot thorax disease classification on chest X-rays, held at MICCAI 2024. This page contains long-tailed labels for 45 diseases from the CXR-LT 2024 and 2023 challenges.

disease classification artificial intelligence chest x-ray deep learning computer-aided diagnosis long-tailed learning cardiopulmonary disease zero-shot learning

Published: March 19, 2025. Version: 2.0.0


Database Open Access

ReXErr-v1: Clinically Meaningful Chest X-Ray Report Errors Derived from MIMIC-CXR

Vishwanatha Rao, Serena Zhang, Julian Acosta, Subathra Adithan, Pranav Rajpurkar

Chest X-Ray reports containing synthetic errors based upon the MIMIC-CXR database. Errors were injected using LLMs and sampled across common human and AI model errors.

Published: March 19, 2025. Version: 1.0.0


Database Open Access

CheXmask Database: a large-scale dataset of anatomical segmentation masks for chest x-ray images

Nicolas Gaggion, Candelaria Mosquera, Martina Aineseder, Lucas Mansilla, Diego Milone, Enzo Ferrante

CheXmask Database is a dataset of 657,566 uniformly annotated chest radiographs with segmentation masks. Images were segmented using HybridGNet, with automatic quality control indicated by RCA scores.

automatic quality assessment chest x-ray segmentation medical image segmentation

Published: Jan. 22, 2025. Version: 1.0.0


Database Open Access

Image-derived cardiomegaly biomarker values for 96K chest X-rays in MIMIC-CXR/MIMIC-CXR-JPG

Benjamin Duvieusart, Felix Krones, Guy Parsons, Lionel Tarassenko, Bartlomiej W Papiez, Adam Mahdi

Automatically extracted cardiomegaly biomarkers - cardiothoracic ratio (CTR) and cardiopulmonary area ratio (CPAR) - for all posterior-anterior chest x-ray scans in MIMIC-CXR/MIMIC-CXR-JPG.

biomarkers mimic-cxr cpar ctr cardiomegaly

Published: Aug. 23, 2024. Version: 1.0.0


Database Credentialed Access

EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images

Seongsu Bae, Daeun Kyung, Jaehee Ryu, Eunbyeol Cho, Gyubok Lee, Sunjun Kweon, Jungwoo Oh, Lei JI, Eric Chang, Tackeun Kim, Edward Choi

We present EHRXQA, the first multi-modal EHR QA dataset combining structured patient records with aligned chest X-ray images. EHRXQA contains a comprehensive set of QA pairs covering image-related, table-related, and image+table-related questions.

question answering chest x-ray benchmark evaluation multi-modal question answering ehr question answering semantic parsing machine learning electronic health records deep learning visual question answering

Published: July 23, 2024. Version: 1.0.0