Resources


Database Open Access

Image-derived cardiomegaly biomarker values for 96K chest X-rays in MIMIC-CXR/MIMIC-CXR-JPG

Benjamin Duvieusart, Felix Krones, Guy Parsons, Lionel Tarassenko, Bartlomiej W Papiez, Adam Mahdi

Automatically extracted cardiomegaly biomarkers - cardiothoracic ratio (CTR) and cardiopulmonary area ratio (CPAR) - for all posterior-anterior chest x-ray scans in MIMIC-CXR/MIMIC-CXR-JPG.

biomarkers mimic-cxr cpar ctr cardiomegaly

Published: Aug. 23, 2024. Version: 1.0.0


Database Open Access

Heart and lung segmentations for MIMIC-CXR/MIMIC-CXR-JPG and Montgomery County TB databases

Benjamin Duvieusart, Felix Krones, Guy Parsons, Lionel Tarassenko, Bartlomiej W Papiez, Adam Mahdi

Heart and lung segmentations for 200 MIMIC-CXR/MIMIC-CXR-JPG chest x-rays and heart segmentations for 138 Montgomery County tuberculosis chest X-rays.

segmentation heart and lungs montgomery country tb mimic-cxr

Published: Aug. 14, 2023. Version: 1.0.0


Database Open Access

MIMIC Database

The MIMIC Database includes data recorded from over 90 ICU patients. The data in each case include signals and periodic measurements obtained from a bedside monitor as well as clinical data obtained from the patient's medical record. The recordi…

health record icu ehr critical care mimic

Published: March 15, 2000. Version: 1.0.0

Visualize waveforms

Database Credentialed Access

MIMIC-Ext-DrugDetection

Fabrice Harel-Canada, Nanyun Peng, David Goodman, Ruby Romero, Allan Nguyen, Brandon Moghanian, Anabel Salimian

This project offers a multilabel annotated dataset of clinical note sentences from MIMIC-III/IV for substance use detection. It supports NLP research for identifying various co-occurring drug use mentions in patient records.

ehr mimic-iv substance use clinical notes methamphetamine multi-label cocaine drug detection polysubstance use prescription opioid misuse cannabis benzodiazepine misuse injection drug use heroin mimic-iii

Published: Sept. 25, 2025. Version: 1.0.0


Database Credentialed Access

Immunosuppressive Condition and Medication Annotations for Admission Notes in the MIMIC-III Database

Vijeeth Guggilla, Melissa Bak, Mengjia Kang, Theresa Walunas, Catherine A Gao

This database contains 200 MIMIC-III admission notes with adjudicated labels for histories of various immunosuppressive conditions and usage of various immunosuppressive medications.

Published: Aug. 4, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-III-Ext-VeriFact-BHC: Labeled Propositions From Brief Hospital Course Summaries for Long-form Clinical Text Evaluation

Philip Chung, Akshay Swaminathan, Alex Goodell, Yeasul Kim, Momsen Reincke, Lichy Han, Ben Deverett, Mohammad Amin Sadeghi, Abdel badih El Ariss, Marc Ghanem, David Seong, Andrew Lee, Caitlin Coombes, Brad Bradshaw, Mahir Sufian, Hyo Jung Hong, Teresa Nguyen, Mohammad Rasouli, Komal Kamra, Mark Burbridge, James McAvoy, Roya Saffary, Stephen Parnell Ma, Dev Dash, James Xie, Ellen Wang, Cliff Schmiesing, Nigam Shah, Nima Aghaeepour

A clinician-labeled dataset for fact-checking long-form clinical text against patient EHRs. The dataset contains LLM-written and human-written Brief Hospital Course summaries decomposed to atomic claim and sentence propositions with annotations.

artificial intelligence natural language processing clinical notes large language models brief hospital course electronic health records long-form text chart review text reranking atomic claim hybrid retrieval clinical informatics clinical medicine fact verification retrieval-augmented generation logical atomism text embedding formal logic llm-as-a-judge llm evaluation

Published: April 9, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-III-Ext-tPatchGNN

Chenlong Yin, Weijia Zhang

The processed MIMIC-III dataset for the benchmark of Irregular Multivariate Time Series Forecasting: A Transformable Patching Graph Neural Networks Approach.

Published: April 9, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext-BHC: Labeled Clinical Notes Dataset for Hospital Course Summarization

Asad Aali, Dave Van Veen, Yamin Arefeen, Jason Hom, Christian Bluethgen, Eduardo Pontes Reis, Sergios Gatidis, Namuun Clifford, Joseph Daws, Arash Tehrani, Jangwon Kim, Akshay Chaudhari

This dataset presents a collection of preprocessed and labeled clinical notes derived from "MIMIC-IV-Note", and aims to facilitate the development of ML models focused on summarizing brief hospital courses (BHC) from clinical notes.

natural language processing clinical notes brief hospital course text summarization machine learning

Published: Feb. 3, 2025. Version: 1.2.0


Database Restricted Access

MIMIC-IV-Ext-DiReCT

Bowen Wang, Jiuyang Chang, Yiming Qian

A diagnostic reasoning dataset designed to evaluate the performance of large language models in aligning with human doctors when making diagnoses from clinical notes.

Published: Jan. 21, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext-GPT-3_5-Generated-Discharge-Summaries-for-Low-Resource-Codes

Matúš Falis, Aryo Pradipta Gema, Hang Dong, Luke Daines, Siddharth Basetti, Michael Holder, Rose Penfold, Alexandra Birch, Beatrice Alex

9,606 Synthetic Discharge Summaries generated by GPT-3.5 based on combinations of ICD-10-code descriptions associated with real discharge summaries in MIMIC-IV. Focus on low resource codes.

icd coding large language model data augmentation

Published: Dec. 16, 2024. Version: 1.0.0