Resources


Database Credentialed Access

CAD-Chest: Comprehensive Annotation of Diseases based on MIMIC-CXR Radiology Report

Mengliang Zhang, Xinyue Hu, Lin Gu, Tatsuya Harada, Kazuma Kobayashi, Ronald Summers, Yingying Zhu

The CAD-Chest dataset provides comprehensive annotations of disease, including disease severity, uncertainty, and location based on the MIMIC-CXR radiologist reports.

chesr x-ray disease label

Published: Dec. 8, 2023. Version: 1.0


Database Credentialed Access

A Brazilian Multilabel Ophthalmological Dataset (BRSET)

Luis Filipe Nakayama, Mariana Goncalves, Lucas Zago Ribeiro, Helen Santos, Daniel Ferraz, Fernando Malerbi, Leo Anthony Celi, Caio Regatieri

This is the first Brazilian Multilabel Ophthalmological Dataset with demographic information and retinal photos labeled images according to anatomical parameters, quality control, and presumed diagnosis.

dataset ophthalmology retina

Published: March 8, 2023. Version: 1.0.0


Database Credentialed Access

Eye Gaze Data for Chest X-rays

Alexandros Karargyris, Satyananda Kashyap, Ismini Lourentzou, Joy Wu, Matthew Tong, Arjun Sharma, Shafiq Abedin, David Beymer, Vandana Mukherjee, Elizabeth Krupinski, Mehdi Moradi

This dataset was a collected using an eye tracking system while a radiologist interpreted and read 1,083 public CXR images. The dataset contains the following aligned modalities: image, transcribed report text, dictation audio and eye gaze data.

audio convolutional network heatmap eye tracking multimodal machine learning chest x-ray radiology explainability chest cxr deep learning

Published: Sept. 12, 2020. Version: 1.0.0


Database Credentialed Access

Annotation dataset of problematic opioid use and related contexts from MIMIC-III Critical Care Database discharge summaries

Melissa Poulsen, Vanessa Troiani, Philip Freda, Danielle Mowery, Anahita Davoudi

The database contains a corpus of annotated data from the MIMIC-III Critical Care Database from a study that aimed to develop and apply an annotation schema to characterize opioid use disorder and related contextual factors.

natural language processing opioid use disorder clinical notes substance use

Published: Feb. 8, 2023. Version: 1.0.0


Database Credentialed Access

CAD-Chest: Comprehensive Annotation of Diseases based on MIMIC-CXR Radiology Report

Mengliang Zhang, Xinyue Hu, Lin Gu, Tatsuya Harada, Kazuma Kobayashi, Ronald Summers, Yingying Zhu

The CAD-Chest dataset provides comprehensive annotations of disease, including disease severity, uncertainty, and location based on the MIMIC-CXR radiologist reports.

chesr x-ray disease label

Published: Dec. 8, 2023. Version: 1.0


Database Credentialed Access

VinDr-CXR: An open dataset of chest X-rays with radiologist annotations

Ha Quy Nguyen, Hieu Huy Pham, le tuan linh, Minh Dao, lam khanh

VinDr-CXR: An open dataset of chest X-rays with radiologist's annotations

lesion detection disease classification chest x-ray interpretation computer vision deep learning

Published: June 22, 2021. Version: 1.0.0


Challenge Credentialed Access

CXR-LT: Multi-Label Long-Tailed Classification on Chest X-Rays

Gregory Holste, Song Wang, Ajay Jaiswal, Yuzhe Yang, Mingquan Lin, Yifan Peng, Atlas Wang

CXR-LT 2023 was a challenge for long-tailed, multi-label thorax disease classification on chest X-rays, held in conjunction with the ICCV 2023 workshop, CVAMD. This page contains extended long-tailed versions of the MIMIC-CXR-JPG v2.0.0 dataset.

Published: Sept. 28, 2023. Version: 1.1.0


Database Restricted Access

KURIAS-ECG: a 12-lead electrocardiogram database with standardized diagnosis ontology

Hakje Yoo, Yunjin Yum, Soowan Park, Jeong Moon Lee, Moonjoung Jang, Yoojoong Kim, Jong-Ho Kim, Hyun-Joon Park, Kap Su Han, Jae Hyoung Park, Hyung Joon Joo

The KURIAS-ECG database is a high-quality 12-lead ECG DB including standard vocabulary (SNOMED CT, OMOP-CDM), and ECG diagnoses of our DB are grouped into 10 diagnoses by applying the minnesota code.

snomed 12-lead minnesota ecg

Published: Nov. 8, 2021. Version: 1.0


Database Credentialed Access

MS-CXR-T: Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing

Shruthi Bannur, Stephanie Hyland, Qianchu Liu, Fernando Pérez-García, Max Ilse, Daniel Coelho de Castro, Benedikt Boecking, Harshita Sharma, Kenza Bouzid, Anton Schwaighofer, Maria Teodora Wetscherek, Hannah Richardson, Tristan Naumann, Javier Alvarez Valle, Ozan Oktay

The MS-CXR-T is a multimodal benchmark that enhances the MIMIC-CXR v2 dataset by including expert-verified annotations. Its goal is to evaluate biomedical visual-language processing models in terms of temporal semantics extracted from image and text.

multimodal chest x-ray radiology cxr disease progression vision-language processing

Published: March 17, 2023. Version: 1.0.0


Database Contributor Review

CARMEN-I: A resource of anonymized electronic health records in Spanish and Catalan for training and testing NLP tools

Eulalia Farre Maduell, Salvador Lima-Lopez, Santiago Andres Frid, Artur Conesa, Elisa Asensio, Antonio Lopez-Rueda, Helena Arino, Elena Calvo, Maria Jesús Bertran, Maria Angeles Marcos, Montserrat Nofre Maiz, Laura Tañá Velasco, Antonia Marti, Ricardo Farreres, Xavier Pastor, Xavier Borrat Frigola, Martin Krallinger

CARMEN-I is a Spanish corpus of 2,000 clinical records from Hospital Clínic, Barcelona. It covers COVID-19 patients and comorbidities, serving as a resource for training clinical NLP models and researchers in NLP applied to clinical documents.

de-identification clinical ner anonymization

Published: April 20, 2024. Version: 1.0.1