Resources


Database Credentialed Access

MS-CXR: Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing

Benedikt Boecking, Naoto Usuyama, Shruthi Bannur, Daniel Coelho de Castro, Anton Schwaighofer, Stephanie Hyland, Maria Teodora Wetscherek, Tristan Naumann, Aditya Nori, Javier Alvarez Valle, Hoifung Poon, Ozan Oktay

MS-CXR is a new dataset containing 1,162 chest X-ray bounding box labels paired with radiology text descriptions, annotated and verified by two board-certified radiologists.

chest x-ray vision-language processing

Published: May 16, 2022. Version: 0.1


Database Open Access

The CirCor DigiScope Phonocardiogram Dataset

Jorge Oliveira, Francesco Renna, Paulo Costa, Marcelo Nogueira, Ana Cristina Oliveira, Andoni Elola, Carlos Ferreira, Alipio Jorge, Ali Bahrami Rad, Matthew Reyna, Reza Sameni, Gari Clifford, Miguel Coimbra

A large collection of multi-location heart sound signals, with 5,272 records collected from 1,568 subjects. Heart murmurs have been annotated by a human annotator according to their timing, shape, pitch, grading, quality, location and location intensity.

signal processing murmur pitch george b moody physionet challenge 2022 murmur grading murmur location murmur timing phonocardiogram pregnant murmur shape pediatric murmur detection murmur intensity murmur quality

Published: May 10, 2022. Version: 1.0.3


Database Credentialed Access

DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries

Jayetri Bardhan, Anthony Colas, Kirk Roberts, Daisy Zhe Wang

DrugEHRQA is a QA dataset containing question-answer pairs drawn from MIMIC-III tables and discharge summaries.

question-answer qa

Published: April 12, 2022. Version: 1.0.0


Database Open Access

Icentia11k Single Lead Continuous Raw Electrocardiogram Dataset

Shawn Tan, Satya Ortiz-Gagné, Nicolas Beaudoin-Gagnon, Pierre Fecteau, Aaron Courville, Yoshua Bengio, Joseph Paul Cohen

This is a dataset of continuous raw electrocardiogram (ECG) signals for representation learning containing 11 thousand patients and 2 billion labelled beats.

representation learning ecg

Published: April 12, 2022. Version: 1.0


Database Credentialed Access

RuMedNLI: A Russian Natural Language Inference Dataset For The Clinical Domain

Pavel Blinov, Aleksandr Nesterov, Galina Zubkova, Arina Reshetnikova, Vladimir Kokh, Chaitanya Shivade

RuMedNLI is the full Russian-language counterpart of the MedNLI dataset.

natural language inference recognizing textual entailment russian language

Published: April 1, 2022. Version: 1.0.0


Database Open Access

Wearable-based signals during physical exercises from patients with frailty after open-heart surgery

Daivaras Sokas, Monika Butkuvienė, Egle Tamulevičiūtė-Prascienė, Aurelija Beigienė, Raimondas Kubilius, Andrius Petrėnas, Birutė Paliakaitė

This data collection contains wearable-based electrocardiogram and triaxial acceleration signals from 80 elderly patients with frailty after open-heart surgery. The signals were collected while the patients performed a series of exercise tests.

heart rate response frailty posture veloergometry timed up and go stair climbing physical exercise heart rate reserve rehabilitation accelerometer aging heart rate gait electrocardiogram walk test balance wearable

Published: March 31, 2022. Version: 1.0.0


Database Open Access

CPAP Pressure and Flow Data from a Local Trial of 30 Adults at the University of Canterbury

Ella Guy, Jennifer Knopp, Geoff Chase

A pressure and flow dataset collected from a trial of 30 adults at the University of Canterbury undergoing CPAP therapy at a variety of instructed breath rates and at PEEP levels of 4 cmH2O and 7 cmH2O.

peep cpap respiratory mechanics pulmonary mechanics respiratory modelling biomedical engineering

Published: March 24, 2022. Version: 1.0.1


Database Open Access

Norwegian Endurance Athlete ECG Database

Bjørn-Jostein Singstad

This project contains 28 ECGs from 28 healthy elite athletes. The ECGs have been interpreted by the Marquette SL12 (version 23) algorithm and by a cardiologist using the International Criteria for ECG interpretation (2018).

marquettesl12 ge cardiologist athletes electrocardiogram ecg

Published: March 23, 2022. Version: 1.0.0


Database Restricted Access

VinDr-PCXR: An open, large-scale pediatric chest X-ray dataset for interpretation of common thoracic diseases

Hieu Huy Pham, Tien Thanh Tran, Ha Quy Nguyen

An open, large-scale pediatric chest X-ray dataset that contains both lesion-level labels and image-level labels for multiple findings and diseases for interpretation of common thoracic diseases.

Published: March 21, 2022. Version: 1.0.0


Database Restricted Access

VinDr-Mammo: A large-scale benchmark dataset for computer-aided detection and diagnosis in full-field digital mammography

Hieu Huy Pham, Hieu Nguyen Trung, Ha Quy Nguyen

A large-scale benchmark dataset for computer-aided detection and diagnosis in mammography.

Published: March 21, 2022. Version: 1.0.0