Resources


Database Credentialed Access

Multimodal Clinical Monitoring in the Emergency Department (MC-MED)

Aman Kansal, Emma Chen, Tom Jin, Pranav Rajpurkar, David Kim

A multimodal dataset of deidentified clinical and physiological data from emergency department visits, aimed at enabling research on patient outcomes, care processes, and the impact of continuous monitoring on treatment during and after the COVID-19.

Published: March 3, 2025. Version: 1.0.0


Database Restricted Access

OpenOximetry Repository

Nicholas Fong, Michael Lipnick, Philip Bickler, John Feiner, Tyler Law

A repository of matched arterial oxygen and pulse oximeter readings obtained under controlled conditions, with high-frequency physiologic waveforms and skin color measurements.

Published: Feb. 28, 2025. Version: 1.1.1


Database Credentialed Access

PIFIR: PET-CT Invasive Fungal Infection Reports

Vlada Rozova, Anna Khanina, Jeremy Ong, Ramin Alipour, Leon Worth, Monica Slavin, Karin Thursky, Karin Verspoor

A corpus of PET-CT reports annotated for terminology relevant to fungal infections. Ideal for validation of named entity recognition and relation extraction methods.

nlp clinical documentation information extraction invasive fungal infections

Published: Feb. 27, 2025. Version: 1.0.0


Database Open Access

A Multimodal Dataset for Investigating Working Memory in Presence of Music

Saman Khazaei, Srinidhi Parshi, Samiul Alam, Md Rafiul Amin, Rose T Faghih

A multimodal dataset containing fNIRS data along with a wide range of physiological signals like EDA, HR, PPG, etc over the course of n-back experiments in presence of music.

Published: Feb. 26, 2025. Version: 1.0.0


Database Restricted Access

LATTE-CXR: Locally Aligned TexT and imagE, Explainable dataset for Chest X-Rays

Elham Ghelichkhan, Tolga Tasdizen

This dataset includes bounding box-statement pairs for chest X-ray images, derived from radiologists’ eye-tracking data (for explainability) and annotations, for local visual-language models.

eye-tracking chest x-ray dataset automatically generated dataset caption-guided object detection localization image captioning with region-level description grounded radiology report generation phrase grounding xai multi-modal learning local visual-language models

Published: Feb. 4, 2025. Version: 1.0.0


Database Restricted Access

Application of Med-PaLM 2 in the refinement of MIMIC-CXR labels

Kendall Park, Rory Sayres, Andrew Sellergren, Tom Pollard, Fayaz Jamil, Timo Kohlberger, Charles Lau, Atilla Kiraly

This work further refines the labels associated with CheXpert in MIMIC-CXR-JPG 2.0.0 by filtering with Med-PaLM 2 followed by verification by manual review by three US board-certified radiologists.

mimic-cxr labels

Published: Feb. 4, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext-BHC: Labeled Clinical Notes Dataset for Hospital Course Summarization

Asad Aali, Dave Van Veen, Yamin Arefeen, Jason Hom, Christian Bluethgen, Eduardo Pontes Reis, Sergios Gatidis, Namuun Clifford, Joseph Daws, Arash Tehrani, Jangwon Kim, Akshay Chaudhari

This dataset presents a collection of preprocessed and labeled clinical notes derived from "MIMIC-IV-Note", and aims to facilitate the development of ML models focused on summarizing brief hospital courses (BHC) from clinical notes.

clinical notes natural language processing machine learning brief hospital course text summarization

Published: Feb. 3, 2025. Version: 1.2.0


Database Open Access

Synthetic Mention Corpora for Disease Entity Recognition and Normalization

Kuleen Sasse, John David Osborne

We present the Synthetic Mention Corpora for Disease Entity Recognition and Normalization, containing 128000 disease mentions from the UMLS disorder group, generated by an LLM. This corpus aims to improve these tasks in biomedical and clinical texts.

nlp named entity recognition machine learning data augmentation entity normalization

Published: Feb. 3, 2025. Version: 1.0.0


Database Credentialed Access

Medical-Diff-VQA: A Large-Scale Medical Dataset for Difference Visual Question Answering on Chest X-Ray Images

Xinyue Hu, Lin Gu, Qiyuan An, Mengliang Zhang, liangchen liu, Kazuma Kobayashi, Tatsuya Harada, Ronald Summers, Yingying Zhu

MIMIC-Diff-VQA provides a large-scale dataset for Difference visual question answering in medical chest x-ray images.

vqa difference visual question answering difference vqa chest x-ray visual question answering

Published: Feb. 3, 2025. Version: 1.0.1


Database Restricted Access

CXRGraph: Using Information Extraction to Normalize the Training Data for Automatic Radiology Report Generation

Yuxiang Liao, Hoisang Heung, Hantao Liu, Irena Spasic

CXRGraph is a structured radiology report dataset built upon RadGraph and tailored for the Automatic Radiology Report Generation task. It can identify more task-relevant information such as abnormalities and hallucinated prior references.

relation extraction information extraction natural language processing named entity recognition structured radiology report

Published: Feb. 3, 2025. Version: 1.0.0