Resources


Database Restricted Access

Swiss-Mammo: A physician-written, synthetic dataset of German mammography reports

Daniel Reichenpfader, Sandro von Däniken, Harald Marcel Bonel

Swiss-Mammo: A physician-written, synthetic dataset of 28 German mammography reports. The dataset is stratified based on BI-RADS categories and available in German and English.

radiology mammography structured reporting bi-rads

Published: June 24, 2025. Version: 1.0.1


Database Restricted Access

Microbiological, Immunological and Biochemical Characteristics of the Development of Ventilator Associated Pneumonia

Natalia Sanabria-Herrera, Ingrid Gisell Bustos Moya, Luis Felipe Reyes

This study explores the respiratory microbiome's role in nosocomial lower respiratory tract infections in ICU patients. Conducted in Chía, Colombia, revealing the microbiome's impact on disease progression.

Published: June 24, 2025. Version: 1.1.0


Database Open Access

bigP3BCI: An Open, Diverse and Machine Learning Ready P300-based Brain-Computer Interface Dataset

Boyla Mainsah, Chance Fleeting, Thomas Balmat, Eric Sellers, Leslie Collins

A collection of data from P300-based brain-computer interface studies.

brain-computer interface electroencephalography ieee p2731 working group standard amyotrophic lateral sclerosis p300 speller p300 event related potential oddball paradigm error-related potential

Published: May 19, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext Cardiac Disease

Jiawei Cao, Sendong Zhao

The subset of the MIMIC-IV dataset includes the examination results and diagnostic information of 4,761 cardiac disease patients. The examination results for each patient are listed separately as evidence for the final diagnosis.

Published: May 6, 2025. Version: 1.0.0


Database Restricted Access

MIMIC-III-Ext-Synthetic-Clinical-Trial-Questions

Elizabeth Woo, Michael Craig Burkhart, Emily Alsentzer, Brett Beaulieu-Jones

In our recent study, we used Llama-3.1-70B-Instruct to generate synthetic training examples resembling clinical trial eligibility criteria. We manually reviewed 1000 of these examples and release them here.

large language models synthetic data distillation clinical trial eligibility

Published: April 22, 2025. Version: 1.0.0


Database Restricted Access

Electrocardiogram-Capable Smartwatches: Assessing Their Clinical Accuracy and Application

Joaquin Recas, Mauro Buelga Suárez, Sergio González-Cabeza, Mario Sanz-Guerrero, Marian Diaz-Vicente, Alfonso Rebolleda, Luis Piñuel Moreno, Gonzalo Luis Alonso Salinas

This database allows the study of the feasibility of using ECG-capable smartwatches as diagnostic tools, focusing on their compliance with clinical standards and their ability to measure critical ECG parameters beyond AF detection.

ischemia fitbit sense ambulatory samsung galaxy watch st-segment apple watch withings scanwatch smartwatch

Published: April 9, 2025. Version: 1.0.0


Database Credentialed Access

MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context

Zishan Gu, Jiayuan Chen, Fenglin Liu, Changchang Yin, Ping Zhang

MedVH provides a visual hallucination evaluation benchmark for large language models in the medical context. It formulates tests using chest X-ray images, including multi-choice question answering and long-text generation tasks.

Published: March 11, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext Triage Instruction Corpus

Qingyang Shen, Quan Guo

MIMIC-IV-Ext Triage Instruction Corpus includes 9,629 ED triage cases organized by the five-level ESI, enabling LLMs to improve triage accuracy. It provides CSV data, generation prompts, expert validation samples, and SQL QC scripts.

nlp clinical decision support machine learning large language models emergency severity index emergency triage

Published: March 4, 2025. Version: 1.0.0


Database Restricted Access

Dataset for Segmentation and Classification of Cardiac Implantable Electronic Devices in Chest X-Rays

Keno Bressem, Felix Busch, Andrei Zhukov, Lisa Adams

This dataset comprises 11,094 converted DICOM and smartphone images of Cardiac Implantable Electronic Devices (CIEDs), collected from 897 patients. It aims to facilitate the development of algorithms for CIED detection and classification.

chest x-ray radiology medical imaging cardiac implantable electronic devices

Published: March 4, 2025. Version: 1.0.0


Database Credentialed Access

PIFIR: PET-CT Invasive Fungal Infection Reports

Vlada Rozova, Anna Khanina, Jeremy Ong, Ramin Alipour, Leon Worth, Monica Slavin, Karin Thursky, Karin Verspoor

A corpus of PET-CT reports annotated for terminology relevant to fungal infections. Ideal for validation of named entity recognition and relation extraction methods.

nlp clinical documentation information extraction invasive fungal infections

Published: Feb. 27, 2025. Version: 1.0.0