Resources
Model Credentialed Access
Characterization of Stigmatizing Language in Medical Records
Keith Harrigian, Ayah Zirikly, Brant Chee, Alya Ahmad, Anne Links, Somnath Saha, Mary Catherine Beach, Mark Dredze
clinical natural language processing domain transfer bias stigmatizing language large language models mimic
Published: Nov. 6, 2023. Version: 1.0.0
Database Credentialed Access
Nosocomial Risk Datasets from MIMIC-III
Travis Goodwin
pressure injury risk prediction acute kidney injury anemia forecasting natural language processing deep learning
Published: Sept. 15, 2022. Version: 1.0
Database Contributor Review
BRATECA (Brazilian Tertiary Care Dataset): a Clinical Information Dataset for the Portuguese Language
Henrique Dias, Ana Helena Dias Pereira dos Ulbrich
prescriptions exams tertiary care clinical notes natural language processing
Published: July 14, 2022. Version: 1.1
Database Credentialed Access
Paediatric Intensive Care database
Haomin Li, Xian Zeng, Gang Yu
intensive care pediatrics critical care natural language processing
Published: Nov. 12, 2020. Version: 1.1.0
Database Credentialed Access
MIMIC-III Clinical Database
Alistair Johnson, Tom Pollard, Roger Mark
MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over forty thousand patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. The databas…
clinical intensive care critical care natural language processing machine learning
Published: Sept. 4, 2016. Version: 1.4
Database Credentialed Access
EHR-DS-QA: A Synthetic QA Dataset Derived from Medical Discharge Summaries for Enhanced Medical Information Retrieval Systems
Konstantin Kotschenreuther
mimic-iv clinical question-answering medical discharge summaries large language models
Published: Jan. 11, 2024. Version: 1.0.0
Database Credentialed Access
MIMIC-III-Ext-VeriFact-BHC: Labeled Propositions From Brief Hospital Course Summaries for Long-form Clinical Text Evaluation
Philip Chung, Akshay Swaminathan, Alex Goodell, Yeasul Kim, Momsen Reincke, Lichy Han, Ben Deverett, Mohammad Amin Sadeghi, Abdel badih El Ariss, Marc Ghanem, David Seong, Andrew Lee, Caitlin Coombes, Brad Bradshaw, Mahir Sufian, Hyo Jung Hong, Teresa Nguyen, Mohammad Rasouli, Komal Kamra, Mark Burbridge, James McAvoy, Roya Saffary, Stephen Parnell Ma, Dev Dash, James Xie, Ellen Wang, Cliff Schmiesing, Nigam Shah, Nima Aghaeepour
artificial intelligence clinical notes natural language processing large language models brief hospital course electronic health records long-form text chart review text reranking atomic claim hybrid retrieval clinical informatics clinical medicine fact verification retrieval-augmented generation logical atomism text embedding formal logic llm-as-a-judge llm evaluation
Published: April 9, 2025. Version: 1.0.0
Database Credentialed Access
MIMIC-IV-Ext-BHC: Labeled Clinical Notes Dataset for Hospital Course Summarization
Asad Aali, Dave Van Veen, Yamin Arefeen, Jason Hom, Christian Bluethgen, Eduardo Pontes Reis, Sergios Gatidis, Namuun Clifford, Joseph Daws, Arash Tehrani, Jangwon Kim, Akshay Chaudhari
clinical notes natural language processing brief hospital course text summarization machine learning
Published: Feb. 3, 2025. Version: 1.2.0
Database Restricted Access
CXRGraph: Using Information Extraction to Normalize the Training Data for Automatic Radiology Report Generation
Yuxiang Liao, Hoisang Heung, Hantao Liu, Irena Spasic
relation extraction information extraction natural language processing named entity recognition structured radiology report
Published: Feb. 3, 2025. Version: 1.0.0