Resources


Software Open Access

Transformer-DeID: Deidentification of free-text clinical notes with transformers

Callandra Moore, Lucas Bulgarelli, Tom Pollard, et al.

Fine tune transformer-based neural networks to deidentify clinical text data.

deidentification neural networks transformers

Published: Nov. 2, 2023. Version: 1.0.0


Software Open Access

Transformer-DeID: Deidentification of free-text clinical notes with transformers

Callandra Moore, Lucas Bulgarelli, Tom Pollard, et al.

Fine tune transformer-based neural networks to deidentify clinical text data.

deidentification neural networks transformers

Published: Nov. 2, 2023. Version: 1.0.0


Model Credentialed Access

Transformer models trained on MIMIC-III to generate synthetic patient notes

Ali Amin-Nejad, Julia Ive, Sumithra Velupillai

Machine learning models that have been trained using MIMIC-III to enable the creation of synthetic discharge summaries.

Published: May 27, 2020. Version: 1.0.0


Database Open Access

MIMIC-IV demo data in the OMOP Common Data Model

Michael Kallfelz, Anna Tsvetkova, Tom Pollard, et al.

Preliminary work to transform a MIMIC-IV demo dataset to the OMOP Common Data Model

omop common data model

Published: June 21, 2021. Version: 0.9


Model Credentialed Access

Fine-tuning foundational models to code diagnoses from veterinary health records

Adam Kiehl, Nadia Saklou, G Joseph Strecker, et al.

Fine-tuned GatorTron LLM for veterinary diagnosis coding to 7,739 SNOMED-CT codes based on clinical summary text from the Colorado State University Veterinary Teaching Hospital.

transformers natural language processing large language models foundational models one health diagnoses snomed-ct veterinary medicine omop cdm veterinary medical records clinical coding

Published: Jan. 25, 2026. Version: 1.0.0


Model Credentialed Access

Fine-tuning foundational models to code diagnoses from veterinary health records

Adam Kiehl, Nadia Saklou, G Joseph Strecker, et al.

Fine-tuned GatorTron LLM for veterinary diagnosis coding to 7,739 SNOMED-CT codes based on clinical summary text from the Colorado State University Veterinary Teaching Hospital.

transformers natural language processing large language models foundational models one health diagnoses snomed-ct veterinary medicine omop cdm veterinary medical records clinical coding

Published: Jan. 25, 2026. Version: 1.0.0


Database Credentialed Access

MedVAL-Bench: Expert-Annotated Medical Text Validation Benchmark

Asad Aali, Vasiliki Bikia, Maya Varma, et al.

MedVAL-Bench is the first large-scale physician-validated benchmark for medical text validation, spanning 6 diverse medical tasks and containing 840 language model-generated outputs annotated by 12 physicians with error assessments and risk grades.

Published: Nov. 14, 2025. Version: 1.0.1


Database Credentialed Access

RaDialog Instruct Dataset

Chantal Pellegrini, Ege Özsoy, Benjamin Busam, et al.

Image-based instruct data for Chest X-Ray understanding and analysis.

medical image understaning radiology chatbot radiology report generation radiology assistant large vision-language models

Published: July 12, 2024. Version: 1.1.0


Database Open Access

PADS - Parkinsons Disease Smartwatch dataset

Julian Varghese, Alexander Brenner, Lucas Plagwitz, et al.

The PADS dataset contains smartwatch-based records from interactive neurological assessments of Parkinsons disease patients, differential diagnoses and healthy controls. The data is complemented with non-motor symptoms and medical history information

wearables movement disorders parkinsons disease

Published: March 25, 2024. Version: 1.0.0


Database Credentialed Access

EchoNotes Structured Database derived from MIMIC-III (ECHO-NOTE2NUM)

Gloria Hyunjung Kwak, Dana Moukheiber, Mira Moukheiber, et al.

A structured echocardiogram database derived from 43,472 observational notes obtained during echocardiogram studies conducted in the intensive care unit at the Beth Israel Deaconess Medical Center between 2001 and 2012.

Published: Feb. 23, 2024. Version: 1.0.0