Resources


Database Credentialed Access

Phenotype Annotations for Patient Notes in the MIMIC-III Database

Edward Moseley, Leo Anthony Celi, Joy Wu, et al.

Clinical notes, annotated by at least two expert annotators for over ten patient phenotypes, including advanced cancer, substance abuse, and treatment non-adherence.

patient classification, natural language processing

Published: March 5, 2020. Version: 1.20.3


Database Credentialed Access

MedNLI for Shared Task at ACL BioNLP 2019

Chaitanya Shivade

Data for the MedNLI Shared Task at the ACL BioNLP 2019 Workshop on Biomedical Language Processing.

natural language inference, recognizing textual entailment, mimic

Published: Nov. 28, 2019. Version: 1.0.1


Database Credentialed Access

MedNLI - A Natural Language Inference Dataset For The Clinical Domain

Chaitanya Shivade

This is a resource for training machine learning models for language inference in the medical domain.

natural language inference, recognizing textual entailment

Published: Oct. 1, 2019. Version: 1.0.0


Database Open Access

CUILESS2016

A corpus of Concept Unique Identifier concepts taken from the SemEval2015 Task 14.

concept, umls, snomed

Published: Jan. 24, 2018. Version: 1.0.0


Database Open Access

STAFF III Database

The STAFF III database was acquired during 1995–96 at Charleston Area Medical Center (WV, USA) where single prolonged balloon inflation had been introduced to achieve optimal results of percutaneous transluminal coronary angioplasty (PTCA) pro…

angiography, ecg

Published: Jan. 31, 2017. Version: 1.0.0


Challenge Credentialed Access

ShAReCLEF eHealth 2013: Natural Language Processing and Information Retrieval for Clinical Care

Danielle Mowery

2013 ShARe/CLEF eHealth Evaluation Lab: Natural Language Processing and Information Retrieval for Clinical Care (Tasks 1 and 2).

natural language processing

Published: Feb. 15, 2013. Version: 1.0


Database Open Access

Examples of Electromyograms

An electromyogram (EMG) is a common clinical test used to assess function of muscles and the nerves that control them. EMG studies are used to help in the diagnosis and management of disorders such as the muscular dystrophies and neuropathies. Nerve…

neuropathy, electromyogram

Published: Sept. 5, 2009. Version: 1.0.0


Database Credentialed Access

INSPIRE, a publicly available research dataset for perioperative medicine

Leerang Lim, Hyung-Chul Lee

A public dataset that contains information related to surgery, anesthesia, laboratory results, medications, diagnosis, and outcomes from 50% of the patients who received surgery at Seoul National University Hospital between 2011 and 2020.

surgery, open dataset, perioperative medicine, multi-center

Published: June 9, 2026. Version: 1.4.2


Database Restricted Access

EchoNext: A Dataset for Detecting Echocardiogram-Confirmed Structural Heart Disease from ECGs

Pierre Elias, Joshua Finer

EchoNext is a curated dataset of electrocardiograms (ECGs) paired with echocardiogram-confirmed structural heart disease labels, designed to support the development and validation of machine learning models.

clinical decision support, artificial intelligence, digital health, structural heart disease, electrocardiogram, health equity, ecg, heart failure, transthoracic echocardiogram, ai model deployment, valvular heart disease, cardiovascular screening, ai in healthcare, left ventricular dysfunction, deep learning, population health, aortic stenosis, machine learning

Published: April 30, 2026. Version: 1.1.1


Database Restricted Access

KI EndoLIST: Endometriosis Longitudinal Individualized Symptoms Tracking Dataset

Tamar Zelovich, Vered Klaitman, Shaked Feiglin, et al.

This database contains daily symptom records from 34 endometriosis patients, each monitored for 1 to 10 months. It includes basic patient information, frequency and intensity of symptoms, and standard MedDRA symptom mapping for clinical interpretation.

Published: April 30, 2026. Version: 1.0.0