Resources


Model Credentialed Access

Asclepius-R: Clinical Large Language Model Built On MIMIC-III Discharge Summaries

Sunjun Kweon, Junu Kim, Jiyoun Kim, Sujeong Im, Eunbyeol Cho, Seongsu Bae, Jungwoo Oh, Gyubok Lee, Jong Hak Moon, Seng Chan You, Seungjin Baek, Chang Hoon Han, Yoon Bin Jung, Yohan Jo, Edward Choi

Asclepius: Publicly Available Clinical Large Language Models with Synthetic Clinical Notes. Asclepius-R: An instruction-finetuned large language model built on MIMIC-III clinical notes.

clinical notes synthetic clinical notes synthetic notes asclepius open-source llm clinical llm large language model

Published: March 25, 2024. Version: 1.1.0


Database Credentialed Access

EchoNotes Structured Database derived from MIMIC-III (ECHO-NOTE2NUM)

Gloria Hyunjung Kwak, Dana Moukheiber, Mira Moukheiber, Lama Moukheiber, Sulaiman Moukheiber, Neel Butala, Leo Anthony Celi, Christina Chen

A structured echocardiogram database derived from 43,472 observational notes obtained during echocardiogram studies conducted in the intensive care unit at the Beth Israel Deaconess Medical Center between 2001 and 2012.

Published: Feb. 23, 2024. Version: 1.0.0


Database Credentialed Access

CHIFIR: Cytology and Histopathology Invasive Fungal Infection Reports

Vlada Rozova, Anna Khanina, Jasmine Teng, Joanne Teh, Leon Worth, Monica Slavin, Karin Thursky, Karin Verspoor

A corpus of cytology and histopathology reports annotated for terminology relevant to fungal infections. Ideal for validation of named entity recognition and relation extraction methods.

nlp clinical documentation information extraction invasive fungal infections

Published: Feb. 20, 2024. Version: 1.0.2


Database Credentialed Access

ODD: A Benchmark Dataset for the NLP-based Opioid Related Aberrant Behavior Detection

Sunjae Kwon, Xun Wang, Weisong Liu, Emily Druhl, Minhee Sung, Joel Reisman, Wenjun Li, Robert Kerns, William Becker, Hong Yu

The Opioid-Related Aberrant Behaviors (ORABs) Detection Dataset (ODD) is a large, expert-annotated, multi-label classification benchmark dataset for ORAB detection.

substance use natural language processing opioid related aberrant behavior

Published: Jan. 11, 2024. Version: 1.0.0


Database Credentialed Access

CAD-Chest: Comprehensive Annotation of Diseases based on MIMIC-CXR Radiology Report

Mengliang Zhang, Xinyue Hu, Lin Gu, Tatsuya Harada, Kazuma Kobayashi, Ronald Summers, Yingying Zhu

The CAD-Chest dataset provides comprehensive disease annotations, including severity, uncertainty, and location, based on the MIMIC-CXR radiology reports.

chest x-ray disease label

Published: Dec. 8, 2023. Version: 1.0


Database Open Access

Induced Cesarean EHG DataSet (ICEHG DS): An open dataset with electrohysterogram records of pregnancies ending in induced and cesarean section delivery

Franc Jager

The design and development of ICEHG DS was funded by the Slovenian Research Agency (ARRS) under the research project Metabolic and inborn factors of reproductive health, birth III.

neuroelectric pregnancy electrohysterogram cesarean-section delivery induced delivery

Published: Oct. 8, 2023. Version: 1.0.1


Database Open Access

Heart and lung segmentations for MIMIC-CXR/MIMIC-CXR-JPG and Montgomery County TB databases

Benjamin Duvieusart, Felix Krones, Guy Parsons, Lionel Tarassenko, Bartlomiej W Papiez, Adam Mahdi

Heart and lung segmentations for 200 MIMIC-CXR/MIMIC-CXR-JPG chest X-rays and heart segmentations for 138 Montgomery County tuberculosis chest X-rays.

segmentation heart and lungs montgomery county tb mimic-cxr

Published: Aug. 14, 2023. Version: 1.0.0


Database Credentialed Access

Radiology Report Expert Evaluation (ReXVal) Dataset

Feiyang Yu, Mark Endo, Rayan Krishnan, Ian Pan, Andy Tsai, Eduardo Pontes Reis, Eduardo Kaiser Ururahy Nunes Fonseca, Henrique Lee, Zahra Shakeri, Andrew Ng, Curtis Langlotz, Vasantha Kumar Venugopal, Pranav Rajpurkar

The Radiology Report Expert Evaluation (ReXVal) Dataset is a publicly available dataset of radiologist evaluations of errors in automatically generated radiology reports.

Published: June 20, 2023. Version: 1.0.0


Challenge Credentialed Access

MIT Critical Datathon 2023: a MIMIC-IV Derived Dataset for Pulse Oximetry Correction Models

João Matos, Tristan Struja, David S Restrepo, Luis Filipe Nakayama, Jack Gallifant, Luca Weishaupt, Nikita Mullangi, Maria Loureiro, Skyler Shapiro, Adrien Carrel, Leo Anthony Celi

A SaO2-SpO2 Pairs Dataset derived from MIMIC-IV

pulse oximetry health equity machine learning

Published: May 8, 2023. Version: 1.0.0


Software Open Access

PhysioTag: An Open-Source Platform for Collaborative Annotation of Physiological Waveforms

Lucas McCullum, Benjamin Moody, Hasan Saeed, Tom Pollard, Xavier Borrat Frigola, Li-wei Lehman, Roger Mark

Platform for collaborative and interactive annotation of physiological waveform data.

annotation

Published: April 25, 2023. Version: 1.0.0