Resources


Database Credentialed Access

VinDr-CXR: An open dataset of chest X-rays with radiologist annotations

Ha Quy Nguyen, Hieu Huy Pham, Le Tuan Linh, Minh Dao, Lam Khanh

VinDr-CXR: An open dataset of chest X-rays with radiologist's annotations

lesion detection disease classification chest x-ray interpretation computer vision deep learning

Published: June 22, 2021. Version: 1.0.0


Database Open Access

MIT-BIH Arrhythmia Database P-Wave Annotations

P-wave annotations for twelve signals from the MIT-BIH Arrhythmia Database.

arrhythmia ecg

Published: Jan. 9, 2018. Version: 1.0.0


Database Open Access

ScientISST MOVE: Annotated Wearable Multimodal Biosignals recorded during Everyday Life Activities in Naturalistic Environments

João Areias Saraiva, Mariana Abreu, Ana Sofia Carmo, Hugo Plácido da Silva, Ana Fred

Multimodal (ECG, EMG, EDA, PPG, TEMP, ACC) biosignal dataset of everyday activities. Created with 3 wearable devices based on ScientISST Sense and Empatica E4.

multimodal greet lift uncontrolled environments run jump gesticulate walk wearable

Published: March 25, 2024. Version: 1.0.1


Database Credentialed Access

Annotated MIMIC-IV discharge summaries for a study on deidentification of names

Shulammite Lim, Yuxin Xiao, Alistair Johnson, Dana Moukheiber, Lama Moukheiber, Mira Moukheiber, Marzyeh Ghassemi, Tom Pollard

Annotated MIMIC-IV discharge summaries used to explore deidentification of names

deidentification fairness

Published: July 5, 2023. Version: 1.0


Database Credentialed Access

RadCoref: Fine-tuning coreference resolution for different styles of clinical narratives

Yuxiang Liao, Hantao Liu, Irena Spasic

RadCoref is a small subset of MIMIC-CXR with manually annotated coreference mentions and clusters. Based on the annotated data, we fine-tuned a deep neural model and used it to annotate the whole MIMIC-CXR dataset. Both the manually annotated subset and the model-annotated dataset are available.

radiology natural language processing coreference resolution

Published: Jan. 30, 2024. Version: 1.0.0


Database Open Access

The CirCor DigiScope Phonocardiogram Dataset

Jorge Oliveira, Francesco Renna, Paulo Costa, Marcelo Nogueira, Ana Cristina Oliveira, Andoni Elola, Carlos Ferreira, Alipio Jorge, Ali Bahrami Rad, Matthew Reyna, Reza Sameni, Gari Clifford, Miguel Coimbra

A large collection of multi-location heart sound signals, with 5272 records collected from 1568 subjects. Heart murmurs have been annotated by a human annotator based on their time, shape, pitch, grading, quality, location and location intensity.

signal processing murmur pitch george b moody physionet challenge 2022 murmur grading murmur location murmur timing phonocardiogram pregnant murmur shape pediatric murmur detection murmur intensity murmur quality

Published: May 10, 2022. Version: 1.0.3


Database Credentialed Access

Tasks 1 and 3 from Progress Note Understanding Suite of Tasks: SOAP Note Tagging and Problem List Summarization

Yanjun Gao, John Caskey, Timothy Miller, Brihat Sharma, Matthew Churpek, Dmitriy Dligach, Majid Afshar

We introduce a hierarchical annotation suite of tasks addressing clinical text understanding, reasoning and abstraction over evidence, and diagnosis summarization. One task is tagging the major sections of a note, and the other is diagnosis generation.

Published: Sept. 30, 2022. Version: 1.0.0


Database Credentialed Access

CLIP: A Dataset for Extracting Action Items for Physicians from Hospital Discharge Notes

James Mullenbach, Yada Pruksachatkun, Sean Adler, Jennifer Seale, Jordan Swartz, T Greg McKelvey, Yi Yang, David Sontag

Clinical action items annotated over MIMIC-III. 718 discharge summaries are labeled at the sentence and character level with multiple action labels including Appointment, Lab, Procedure, Medication, Imaging, Patient Instructions, and Other.

Published: June 21, 2021. Version: 1.0.0


Database Credentialed Access

RadGraph: Extracting Clinical Entities and Relations from Radiology Reports

Saahil Jain, Ashwin Agrawal, Adriel Saporta, Steven QH Truong, Du Nguyen Duong, Tan Bui, Pierre Chambon, Matthew Lungren, Andrew Ng, Curtis Langlotz, Pranav Rajpurkar

RadGraph is a dataset of entities and relations in full-text chest X-ray radiology reports, which are obtained using a novel information extraction (IE) schema to capture clinically relevant information in a radiology report.

entity and relation extraction graph multi-modal radiology natural language processing

Published: June 3, 2021. Version: 1.0.0


Database Credentialed Access

CHIFIR: Cytology and Histopathology Invasive Fungal Infection Reports

Vlada Rozova, Anna Khanina, Jasmine Teng, Joanne Teh, Leon Worth, Monica Slavin, Karin Thursky, Karin Verspoor

A corpus of cytology and histopathology reports annotated for terminology relevant to fungal infections. Ideal for validation of named entity recognition and relation extraction methods.

nlp information extraction clinical documentation invasive fungal infections

Published: Feb. 20, 2024. Version: 1.0.2