Resources


Database Credentialed Access

Deidentified Medical Text

Margaret Douglass, Bill Long, George Moody, Peter Szolovits, Li-wei Lehman, Roger Mark, Gari Clifford

Gold standard corpus of 2,434 deidentified nursing notes

medical text nursing notes de-identification hipaa

Published: Dec. 18, 2007. Version: 1.0


Database Credentialed Access

Annotated Question-Answer Pairs for Clinical Notes in the MIMIC-III Database

Xiang Yue, Xinliang Frederick Zhang, Huan Sun

Annotated Question Answering Pairs for Clinical Notes in the MIMIC-III Database

clinical question answering clinical nlp clinical reading comprehension

Published: Jan. 15, 2021. Version: 1.0.0


Database Credentialed Access

MIMIC-III Clinical Database

Alistair Johnson, Tom Pollard, Roger Mark

MIMIC-III is a large, freely available database comprising deidentified health-related data associated with over forty thousand patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. The databas…

intensive care clinical natural language processing critical care machine learning

Published: Sept. 4, 2016. Version: 1.4


Database Credentialed Access

Phenotype Annotations for Patient Notes in the MIMIC-III Database

Edward Moseley, Leo Anthony Celi, Joy Wu, Franck Dernoncourt

Clinical notes, annotated by at least two expert annotators for over ten patient phenotypes, including advanced cancer, substance abuse, and treatment non-adherence.

patient classification natural language processing

Published: March 5, 2020. Version: 1.20.03


Database Open Access

MIMIC-III Clinical Database Demo

Alistair Johnson, Tom Pollard, Roger Mark

An open-source demo of the MIMIC-III Clinical Database

mimic critical care electronic health records

Published: April 24, 2019. Version: 1.4


Database Credentialed Access

MedNLI - A Natural Language Inference Dataset For The Clinical Domain

Chaitanya Shivade

This is a resource for training machine learning models for language inference in the medical domain.

recognizing textual entailment natural language inference

Published: Oct. 1, 2019. Version: 1.0.0


Database Credentialed Access

CLIP: A Dataset for Extracting Action Items for Physicians from Hospital Discharge Notes

James Mullenbach, Yada Pruksachatkun, Sean Adler, Jennifer Seale, Jordan Swartz, T Greg McKelvey, Yi Yang, David Sontag

Clinical action items annotated over MIMIC-III. 718 discharge summaries are labeled at the sentence and character level with multiple action labels, including Appointment, Lab, Procedure, Medication, Imaging, Patient Instructions, and Other.

Published: June 21, 2021. Version: 1.0.0


Model Credentialed Access

Clinical BERT Models Trained on Pseudo Re-identified MIMIC-III Notes

Eric Lehman, Sarthak Jain, Karl Pichotta, Yoav Goldberg, Byron Wallace

We explore recovering sensitive information from BERT models trained on non-deidentified EHR data. We make our models and data available to further facilitate research.

Published: April 28, 2021. Version: 1.0.0


Model Credentialed Access

Transformer models trained on MIMIC-III to generate synthetic patient notes

Ali Amin-Nejad, Julia Ive, Sumithra Velupillai

Machine learning models trained on MIMIC-III to enable the creation of synthetic discharge summaries.

Published: May 27, 2020. Version: 1.0.0


Model Credentialed Access

What's in a Note? Unpacking Predictive Value in Clinical Note Representations

Tristan Naumann, William Boag

Word vectors corresponding to the AMIA 2018 Informatics Summit paper of the same name.

Published: Jan. 7, 2018. Version: 0.1