Resources


Software Open Access

Transformer-DeID: Deidentification of free-text clinical notes with transformers

Callandra Moore, Lucas Bulgarelli, Tom Pollard, Alistair Johnson

Fine-tune transformer-based neural networks to deidentify clinical text data.

deidentification neural networks transformers

Published: Nov. 2, 2023. Version: 1.0.0


Database Contributor Review

CARMEN-I: A resource of anonymized electronic health records in Spanish and Catalan for training and testing NLP tools

Eulalia Farre Maduell, Salvador Lima-Lopez, Santiago Andres Frid, Artur Conesa, Elisa Asensio, Antonio Lopez-Rueda, Helena Arino, Elena Calvo, Maria Jesús Bertran, Maria Angeles Marcos, Montserrat Nofre Maiz, Laura Tañá Velasco, Antonia Marti, Ricardo Farreres, Xavier Pastor, Xavier Borrat Frigola, Martin Krallinger

CARMEN-I is a Spanish corpus of 2,000 clinical records from Hospital Clínic, Barcelona. It covers COVID-19 patients and comorbidities, and serves as a resource for training clinical NLP models and for researchers applying NLP to clinical documents.

de-identification anonymization clinical ner

Published: Nov. 2, 2023. Version: 1.0


Database Open Access

Induced Cesarean EHG DataSet (ICEHG DS): An open dataset with electrohysterogram records of pregnancies ending in induced and cesarean section delivery

Franc Jager

The design and development of ICEHG DS was funded by the Slovenian Research Agency (ARRS) under the research project Metabolic and inborn factors of reproductive health, birth III.

neuroelectric pregnancy electrohysterogram cesarean-section delivery induced delivery

Published: Oct. 8, 2023. Version: 1.0.1


Challenge Open Access

Heart Murmur Detection from Phonocardiogram Recordings: The George B. Moody PhysioNet Challenge 2022

Matthew Reyna, Yashar Kiarashi, Andoni Elola, Jorge Oliveira, Francesco Renna, Annie Gu, Erick Andres Perez Alday, Nadi Sadr, Sandra Mattos, Miguel Coimbra, Reza Sameni, Ali Bahrami Rad, Zuzana Koscova, Gari Clifford

The 2022 PhysioNet Challenge is devoted to detecting the presence or absence of murmurs in multiple heart sound recordings taken at multiple auscultation locations, as well as detecting clinical outcomes.

challenge competition cardiac auscultation congenital heart diseases

Published: Sept. 28, 2023. Version: 1.0.0


Challenge Credentialed Access

CXR-LT: Multi-Label Long-Tailed Classification on Chest X-Rays

Gregory Holste, Song Wang, Ajay Jaiswal, Yuzhe Yang, Mingquan Lin, Yifan Peng, Atlas Wang

CXR-LT 2023 was a challenge for long-tailed, multi-label thorax disease classification on chest X-rays, held in conjunction with the ICCV 2023 workshop, CVAMD. This page contains extended long-tailed versions of the MIMIC-CXR-JPG v2.0.0 dataset.

Published: Sept. 28, 2023. Version: 1.1.0


Database Open Access

BIG IDEAs Lab Glycemic Variability and Wearable Device Data

Peter Cho, Juseong Kim, Brinnae Bent, Jessilyn Dunn

Glucose measurements and wrist-worn wearable sensor data from high-normoglycemic participants.

biomedical engineering pre-diabetes biomarkers

Published: Sept. 18, 2023. Version: 1.1.2


Database Credentialed Access

Medical-Diff-VQA: A Large-Scale Medical Dataset for Difference Visual Question Answering on Chest X-Ray Images

Xinyue Hu, Lin Gu, Qiyuan An, Mengliang Zhang, Liangchen Liu, Kazuma Kobayashi, Tatsuya Harada, Ronald Summers, Yingying Zhu

MIMIC-Diff-VQA provides a large-scale dataset for difference visual question answering on medical chest X-ray images.

chest x-ray visual question answering difference vqa vqa difference visual question answering

Published: Sept. 15, 2023. Version: 1.0.0


Database Open Access

MIMIC-IV-ECG: Diagnostic Electrocardiogram Matched Subset

Brian Gow, Tom Pollard, Larry A Nathanson, Alistair Johnson, Benjamin Moody, Chrystinne Fernandes, Nathaniel Greenbaum, Jonathan W Waks, Parastou Eslami, Tanner Carbonati, Ashish Chaudhari, Elizabeth Herbst, Dana Moukheiber, Seth Berkowitz, Roger Mark, Steven Horng

The MIMIC-IV ECG module contains approximately 800,000 diagnostic electrocardiograms across nearly 160,000 unique patients. These patients overlap with the patients from the MIMIC-IV Clinical Database.

Published: Sept. 15, 2023. Version: 1.0


Database Open Access

CheXmask Database: a large-scale dataset of anatomical segmentation masks for chest x-ray images

Nicolas Gaggion, Candelaria Mosquera, Martina Aineseder, Lucas Mansilla, Diego Milone, Enzo Ferrante

CheXmask Database contains 676,803 uniformly annotated chest radiographs with segmentation masks. Images were segmented using HybridGNet, with automatic quality control indicated by RCA scores.

chest x-ray segmentation medical image segmentation automatic quality assessment

Published: Sept. 6, 2023. Version: 0.2


Database Restricted Access

A multimodal dental dataset facilitating machine learning research and clinic services

Wenjing Liu, Yunyou Huang, Suqin Tang

A new dental dataset covering 389 patients, with three commonly used dental image modalities and images of various health conditions of the oral cavity.

Published: Sept. 6, 2023. Version: 1.0.0