Resources
Database Credentialed Access
Deidentified Medical Text
Margaret Douglass, Bill Long, George Moody, Peter Szolovits, Li-wei Lehman, Roger Mark, Gari D. Clifford
medical text nursing notes hipaa de-identification
Published: Dec. 18, 2007. Version: 1.0
Database Credentialed Access
EHR-DS-QA: A Synthetic QA Dataset Derived from Medical Discharge Summaries for Enhanced Medical Information Retrieval Systems
Konstantin Kotschenreuther
mimic-iv clinical question-answering medical discharge summaries large language models
Published: Jan. 11, 2024. Version: 1.0.0
Database Credentialed Access
MIMIC-III-Ext-VeriFact-BHC: Labeled Propositions From Brief Hospital Course Summaries for Long-form Clinical Text Evaluation
Philip Chung, Akshay Swaminathan, Alex Goodell, Yeasul Kim, Momsen Reincke, Lichy Han, Ben Deverett, Mohammad Amin Sadeghi, Abdel badih El Ariss, Marc Ghanem, David Seong, Andrew Lee, Caitlin Coombes, Brad Bradshaw, Mahir Sufian, Hyo Jung Hong, Teresa Nguyen, Mohammad Rasouli, Komal Kamra, Mark Burbridge, James McAvoy, Roya Saffary, Stephen Parnell Ma, Dev Dash, James Xie, Ellen Wang, Cliff Schmiesing, Nigam Shah, Nima Aghaeepour
artificial intelligence clinical notes natural language processing large language models brief hospital course electronic health records long-form text chart review text reranking atomic claim hybrid retrieval clinical informatics clinical medicine fact verification retrieval-augmented generation logical atomism text embedding formal logic llm-as-a-judge llm evaluation
Published: April 9, 2025. Version: 1.0.0
Database Credentialed Access
Medical Expert Annotations of Unsupported Facts in Doctor-Written and LLM-Generated Patient Summaries
Stefan Hegselmann, Shannon Shen, Florian Gierse, Monica Agrawal, David Sontag, Xiaoyi Jiang
Published: April 30, 2025. Version: 1.0.1
Database Credentialed Access
RuMedNLI: A Russian Natural Language Inference Dataset For The Clinical Domain
Pavel Blinov, Aleksandr Nesterov, Galina Zubkova, Arina Reshetnikova, Vladimir Kokh, Chaitanya Shivade
natural language inference recognizing textual entailment russian language
Published: April 1, 2022. Version: 1.0.0
Database Credentialed Access
FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark
Mingjie Li, Wenjia Cai, Rui Liu, Yuetian Weng, Tengfei Liu, Cong Wang, Xin Chen, Zhong Liu, Caineng Pan, Mengke Li, Yingfeng Zheng, Yizhi Liu, Flora Salim, Karin Verspoor, Xiaodan Liang, Xiaojun Chang
fundus fluorescein angiography medical report generation vision and language explainable and reliable evaluation
Published: Jan. 21, 2025. Version: 1.1.0
Software Open Access
De-Identification Software Package
The deid software package includes code and dictionaries for automated location and removal of protected health information (PHI) in free text from medical records.
phi deidentification anonymization
Published: Dec. 18, 2007. Version: 1.1
Database Credentialed Access
Annotated Question-Answer Pairs for Clinical Notes in the MIMIC-III Database
Xiang Yue, Xinliang Frederick Zhang, Huan Sun
clinical question answering clinical nlp clinical reading comprehension
Published: Jan. 15, 2021. Version: 1.0.0
Database Credentialed Access
MIMIC-IV-Ext-BHC: Labeled Clinical Notes Dataset for Hospital Course Summarization
Asad Aali, Dave Van Veen, Yamin Arefeen, Jason Hom, Christian Bluethgen, Eduardo Pontes Reis, Sergios Gatidis, Namuun Clifford, Joseph Daws, Arash Tehrani, Jangwon Kim, Akshay Chaudhari
clinical notes natural language processing machine learning brief hospital course text summarization
Published: Feb. 3, 2025. Version: 1.2.0