Resources


Challenge Credentialed Access

CXR-LT: Multi-Label Long-Tailed Classification on Chest X-Rays

Gregory Holste, Song Wang, Ajay Jaiswal, Yuzhe Yang, Mingquan Lin, Yifan Peng, Atlas Wang

CXR-LT 2023 was a challenge for long-tailed, multi-label thorax disease classification on chest X-rays, held in conjunction with the ICCV 2023 workshop, CVAMD. This page contains extended long-tailed versions of the MIMIC-CXR-JPG v2.0.0 dataset.

Published: Sept. 28, 2023. Version: 1.1.0


Database Credentialed Access

ReFiSco: Report Fix and Score Dataset for Radiology Report Generation

Katherine Tian, Sina J Hartung, Andrew A Li, Jaehwan Jeong, Fardad Behzadi, Juan Calle-Toro, Subathra Adithan, Michael Pohlen, David Osayande, Pranav Rajpurkar

Preliminary human expert evaluation study on 60 MIMIC-CXR radiology reports

Published: Aug. 23, 2023. Version: 0.0


Model Credentialed Access

Medical AI Research Foundations: A repository of medical foundation models

Shekoofeh Azizi, Jan Freyberg, Laura Culp, Patricia MacWilliams, Sara Mahdavi, Vivek Natarajan, Alan Karthikesalingam

Medical AI Research Foundations is a repository of medical foundation models.

Published: April 25, 2023. Version: 1.0.0


Database Credentialed Access

MS-CXR-T: Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing

Shruthi Bannur, Stephanie Hyland, Qianchu Liu, Fernando Pérez-García, Max Ilse, Daniel Coelho de Castro, Benedikt Boecking, Harshita Sharma, Kenza Bouzid, Anton Schwaighofer, Maria Teodora Wetscherek, Hannah Richardson, Tristan Naumann, Javier Alvarez Valle, Ozan Oktay

MS-CXR-T is a multimodal benchmark that extends the MIMIC-CXR v2 dataset with expert-verified annotations. Its goal is to evaluate how well biomedical vision-language processing models capture temporal semantics extracted from images and text.

cxr disease progression vision-language processing multimodal radiology chest x-ray

Published: March 17, 2023. Version: 1.0.0


Model Credentialed Access

Clinical-T5: Large Language Models Built Using MIMIC Clinical Text

Eric Lehman, Alistair Johnson

We train a T5-Base and T5-Large from scratch on MIMIC-III and MIMIC-IV. Additionally, we further pretrain T5-Base and SciFive on notes from MIMIC. We release these model weights on PhysioNet.

Published: Jan. 25, 2023. Version: 1.0.0


Database Open Access

PTB-XL, a large publicly available electrocardiography dataset

Patrick Wagner, Nils Strodthoff, Ralf-Dieter Bousseljot, Wojciech Samek, Tobias Schaeffter

The PTB-XL ECG dataset is a large dataset of 21801 clinical 12-lead ECGs, each 10 seconds long, from 18869 patients. The raw signal data has been annotated by up to two cardiologists with 71 different ECG statements and is supplemented by rich metadata.

ptb-xl ptb ecg electrocardiography

Published: Nov. 9, 2022. Version: 1.0.3


Database Credentialed Access

Chest X-ray segmentation images based on MIMIC-CXR

Li-Ching Chen, Po-Chih Kuo, Ryan Wang, Judy Gichoya, Leo Anthony Celi

A chest X-ray segmentation dataset derived from MIMIC-CXR, produced using a deep learning algorithm and human examination.

segmentation chest x-rays cxr

Published: Aug. 18, 2022. Version: 1.0.0


Database Open Access

MIMIC-IV Clinical Database Demo on FHIR

Alex Bennett, Joshua Wiedekopf, Hannes Ulrich, Alistair Johnson

MIMIC-IV-on-FHIR is a 100-patient demo of MIMIC-IV v2.0 in the Fast Healthcare Interoperability Resources (FHIR) format. It provides implementers with a real-world FHIR datastore to aid FHIR research and development.

fhir electronic health records mimic

Published: June 7, 2022. Version: 2.0


Database Credentialed Access

DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries

Jayetri Bardhan, Anthony Colas, Kirk Roberts, Daisy Zhe Wang

DrugEHRQA is a question answering (QA) dataset containing question-answer pairs drawn from MIMIC-III tables and discharge summaries.

question-answer qa

Published: April 12, 2022. Version: 1.0.0


Database Credentialed Access

RuMedNLI: A Russian Natural Language Inference Dataset For The Clinical Domain

Pavel Blinov, Aleksandr Nesterov, Galina Zubkova, Arina Reshetnikova, Vladimir Kokh, Chaitanya Shivade

RuMedNLI is the full Russian-language counterpart of the MedNLI dataset.

natural language inference recognizing textual entailment russian language

Published: April 1, 2022. Version: 1.0.0