Resources
Database Credentialed Access
MIMIC-III-Ext-VeriFact-BHC: Labeled Propositions From Brief Hospital Course Summaries for Long-form Clinical Text Evaluation
Philip Chung, Akshay Swaminathan, Alex Goodell, Yeasul Kim, Momsen Reincke, Lichy Han, Ben Deverett, Mohammad Amin Sadeghi, Abdel badih El Ariss, Marc Ghanem, David Seong, Andrew Lee, Caitlin Coombes, Brad Bradshaw, Mahir Sufian, Hyo Jung Hong, Teresa Nguyen, Mohammad Rasouli, Komal Kamra, Mark Burbridge, James McAvoy, Roya Saffary, Stephen Parnell Ma, Dev Dash, James Xie, Ellen Wang, Cliff Schmiesing, Nigam Shah, Nima Aghaeepour
artificial intelligence, clinical notes, natural language processing, large language models, brief hospital course, electronic health records, long-form text, chart review, text reranking, atomic claim, hybrid retrieval, clinical informatics, clinical medicine, fact verification, retrieval-augmented generation, logical atomism, text embedding, formal logic, llm-as-a-judge, llm evaluation
Published: April 9, 2025. Version: 1.0.0
Database Credentialed Access
SCRIPT X2B8 Dataset: per-day clinical features to model successful next-day extubation
Sam Fenske, Alec Peltekian, Mengjia Kang, Nikolay Markov, Anna Pawlowski, Luke Rasmussen, Thomas Stoeger, Benjamin Singer, GR Scott Budinger, Richard Wunderink, Alexander Misharin, Ankit Agrawal, Catherine A Gao
Published: Jan. 28, 2025. Version: 1.0.0
Database Credentialed Access
MIMIC-CXR Database
Alistair Johnson, Tom Pollard, Roger Mark, Seth Berkowitz, Steven Horng
computer vision, chest x-rays, natural language processing, radiology, machine learning, mimic
Published: July 23, 2024. Version: 2.1.0
Database Credentialed Access
MIMIC-CXR-JPG - chest radiographs with structured labels
Alistair Johnson, Matthew Lungren, Yifan Peng, Zhiyong Lu, Roger Mark, Seth Berkowitz, Steven Horng
computer vision, chest x-ray, radiology, deep learning, mimic
Published: March 12, 2024. Version: 2.1.0
Database Credentialed Access
ODD: A Benchmark Dataset for the NLP-based Opioid Related Aberrant Behavior Detection
Sunjae Kwon, Xun Wang, Weisong Liu, Emily Druhl, Minhee Sung, Joel Reisman, Wenjun Li, Robert Kerns, William Becker, Hong Yu
substance use, natural language processing, opioid related aberrant behavior
Published: Jan. 11, 2024. Version: 1.0.0
Database Credentialed Access
BOLD, a blood-gas and oximetry linked dataset
João Matos, Tristan Struja, Jack Gallifant, Luis Filipe Nakayama, Marie Charpignon, Xiaoli Liu, Jaime dos Santos Cardoso, Leo Anthony Celi, An Kwok Wong
pulse oximetry, intensive care unit, health equity, electronic health records
Published: Nov. 8, 2023. Version: 1.0
Database Credentialed Access
GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization
Xuhai Xu, Han Zhang, Yasaman Sefidgar, Yiyi Ren, Xin Liu, Woosuk Seo, Jennifer Brown, Kevin Kuehn, Mike Merrill, Paula Nurius, Shwetak Patel, Tim Althoff, Margaret Morris, Eve Riskin, Jennifer Mankoff, Anind Dey
health, ubiquitous computing, well-being, passive mobile sensing, human behavior modeling
Published: March 14, 2023. Version: 1.1
Database Credentialed Access
Annotation dataset of problematic opioid use and related contexts from MIMIC-III Critical Care Database discharge summaries
Melissa Poulsen, Vanessa Troiani, Philip Freda, Danielle Mowery, Anahita Davoudi
opioid use disorder, substance use, clinical notes, natural language processing
Published: Feb. 8, 2023. Version: 1.0.0
Database Credentialed Access
MIMIC-IV-Note: Deidentified free-text clinical notes
Alistair Johnson, Tom Pollard, Steven Horng, Leo Anthony Celi, Roger Mark
deidentification, critical care, clinical notes, natural language processing, electronic health record, mimic
Published: Jan. 6, 2023. Version: 2.2
Database Credentialed Access
NCH Sleep DataBank: A Large Collection of Real-world Pediatric Sleep Studies with Longitudinal Clinical Data
Harlin Lee, Boyue Li, Yungui Huang, Yuejie Chi, Simon Lin
eeg, ehr, pediatrics, clinical decision support, polysomnography, sleep study, ecg, sleep disorders, electronic health records
Published: Oct. 27, 2021. Version: 3.1.0