Resources


Database Credentialed Access

MIMIC-IV-Ext-CEKG: A Process-Oriented Dataset Derived from MIMIC-IV for Enhanced Clinical Insights

Milad Naeimaei Aali, Felix Mannhardt, Pieter Jelle Toussaint

The MIMIC-IV-Ext-CEKG dataset is crafted for object-centric process mining in healthcare, specifically to create clinical event knowledge graphs for patients with multimorbidity, as well as for data mining and machine learning tasks.

mimic process mining multi entity process mining object centric event log clinical event knowledge graph

Published: April 8, 2025. Version: 1.0.0


Challenge Open Access

2018 IEEE BHI and BSN Data Challenge

Tom Pollard, Alistair Johnson, Jesse Raffa

A challenge to explore real clinical questions in critically ill patients using the MIMIC database. A collaboration with the IEEE Conference on Biomedical and Health Informatics 2018 and the IEEE Conference on Body Sensor Networks.

weekend effect ieee challenge mimic

Published: Feb. 5, 2018. Version: 1.0


Database Restricted Access

CXRGraph: Using Information Extraction to Normalize the Training Data for Automatic Radiology Report Generation

Yuxiang Liao, Hoisang Heung, Hantao Liu, et al.

CXRGraph is a structured radiology report dataset built upon RadGraph and tailored for the Automatic Radiology Report Generation task. It can identify more task-relevant information such as abnormalities and hallucinated prior references.

relation extraction information extraction natural language processing named entity recognition structured radiology report

Published: Feb. 3, 2025. Version: 1.0.0


Database Open Access

Permittivity of Healthy and Diseased Skeletal Muscle

Benjamin Sanchez

Conductivity and relative permittivity of healthy and diseased skeletal muscle.

permittivity neuromuscular disorders skeletal muscle anisotropy

Published: Nov. 12, 2019. Version: 1.1


Database Contributor Review

COVID Data for Shared Learning (CDSL): A comprehensive, multimodal COVID-19 dataset from HM Hospitales

Álvaro Ritoré, Andreea M Oprescu, Alberto Estirado Bronchalo, et al.

COVID Data for Shared Learning (CDSL) is a multimodal database comprising de-identified structured health data and radiological images from 4,479 patients with COVID-19, as a comprehensive toolkit for developing predictive models.

covid-19 multimodal database radiological images open data healthcare data machine learning and ai

Published: Oct. 25, 2024. Version: 1.0.0


Database Credentialed Access

BOLD, a blood-gas and oximetry linked dataset

João Matos, Tristan Struja, Jack Gallifant, et al.

An open-source pulse oximetry and arterial blood gas dataset, derived from MIMIC-III, MIMIC-IV, and eICU-CRD.

pulse oximetry intensive care unit health equity electronic health records

Published: Nov. 8, 2023. Version: 1.0


Database Credentialed Access

EHR-DS-QA: A Synthetic QA Dataset Derived from Medical Discharge Summaries for Enhanced Medical Information Retrieval Systems

Konstantin Kotschenreuther

Dataset of question-and-answer pairs synthetically generated from medical discharge summaries, designed to support the training and development of large language models tailored for healthcare applications.

mimic-iv clinical question-answering medical discharge summaries large language models

Published: Jan. 11, 2024. Version: 1.0.0


Database Open Access

Pressure, flow, and dynamic thoraco-abdominal circumferences data for adults breathing under CPAP therapy

Ella Frances Sophia Guy, Jennifer Knopp, Theodore Lerios, et al.

Dataset of pressure, flow, and dynamic abdominal and chest circumference for healthy people breathing with CPAP. Data were collected with PEEP settings of 0 (ZEEP), 4, and 8 cmH2O at normal/resting, panting/short, and deep/long breath patterns/rates.

Published: Jan. 25, 2023. Version: 1.0.0


Challenge Credentialed Access

ShAReCLEF eHealth 2013: Natural Language Processing and Information Retrieval for Clinical Care

Danielle Mowery

2013 ShARe/CLEF eHealth Evaluation Lab: Natural Language Processing and Information Retrieval for Clinical Care (Tasks 1 and 2).

natural language processing

Published: Feb. 15, 2013. Version: 1.0