Resources


Database Credentialed Access

MS-CXR: Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing

Benedikt Boecking, Naoto Usuyama, Shruthi Bannur, et al.

MS-CXR is a new dataset containing 1162 chest X-ray bounding box labels paired with radiology text descriptions, annotated and verified by two board-certified radiologists.

vision-language processing chest x-ray phrase grounding localization

Published: Nov. 15, 2024. Version: 1.1.0


Database Credentialed Access

FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark

Mingjie Li, Wenjia Cai, Rui Liu, et al.

Benchmark dataset for report generation based on fundus fluorescein angiography images and reports.

fundus fluorescein angiography medical report generation vision and language explainable and reliable evaluation

Published: Jan. 21, 2025. Version: 1.1.0


Database Contributor Review

COVID Data for Shared Learning (CDSL): A comprehensive, multimodal COVID-19 dataset from HM Hospitales

Álvaro Ritoré, Andreea M Oprescu, Alberto Estirado Bronchalo, et al.

COVID Data for Shared Learning (CDSL) is a multimodal database comprising de-identified structured health data and radiological images from 4,479 patients with COVID-19, intended as a comprehensive toolkit for developing predictive models.

covid-19 multimodal database radiological images open data healthcare data machine learning and ai

Published: Oct. 25, 2024. Version: 1.0.0


Database Credentialed Access

ReXPref-Prior: A MIMIC-CXR Preference Dataset for Reducing Hallucinated Prior Exams in Radiology Report Generation

Oishi Banerjee, Hong-Yu Zhou, Subathra Adithan, et al.

We propose ReXPref-Prior, an adapted version of MIMIC-CXR where GPT-4 has removed references to prior exams from both findings and impression sections of chest X-ray reports.

chest x-rays reinforcement learning hallucination

Published: Aug. 14, 2024. Version: 1.0.0


Database Credentialed Access

RadGraph2: Tracking Findings Over Time in Radiology Reports

Adam Dejl, Sameer Khanna, Patricia Therese Pile, et al.

RadGraph2 is a dataset of 800 chest radiology reports annotated using a fine-grained entity-relationship schema, which captures key findings as well as mentions of changes relative to previous radiology studies.

chest x-rays relation extraction disease progression information extraction radiology reports named entity recognition

Published: Aug. 8, 2024. Version: 1.0.0


Database Credentialed Access

RadVLM Instruction Dataset

Nicolas Deperrois, Hidetoshi Matsuo, Samuel Ruiperez-Campillo, et al.

This dataset is designed to construct RadVLM, a vision–language model for chest X-ray interpretation. It includes instruction data for tasks such as report generation, abnormality detection, region grounding, and multitask conversation.

chest x-rays vision-language models medical ai

Published: Sept. 25, 2025. Version: 1.0.0

