Resources


Database Credentialed Access

PIFIR: PET-CT Invasive Fungal Infection Reports

Vlada Rozova, Anna Khanina, Jeremy Ong, et al.

A corpus of PET-CT reports annotated for terminology relevant to fungal infections. Ideal for validation of named entity recognition and relation extraction methods.

nlp clinical documentation information extraction invasive fungal infections

Published: Feb. 27, 2025. Version: 1.0.0


Database Credentialed Access

CHIFIR: Cytology and Histopathology Invasive Fungal Infection Reports

Vlada Rozova, Anna Khanina, Jasmine Teng, et al.

A corpus of cytology and histopathology reports annotated for terminology relevant to fungal infections. Ideal for validation of named entity recognition and relation extraction methods.

nlp clinical documentation information extraction invasive fungal infections

Published: Feb. 20, 2024. Version: 1.0.2


Database Credentialed Access

CORAL: expert-Curated medical Oncology Reports to Advance Language model inference

Madhumita Sushil, Vanessa Kennedy, Divneet Mandair, et al.

Medical oncology progress notes annotated with advanced, comprehensive oncology-relevant concepts and relationships.

artificial intelligence information extraction oncology natural language processing electronic health records large language models

Published: Feb. 7, 2024. Version: 1.0


Database Credentialed Access

C-REACT: Contextualized Race and Ethnicity Annotations for Clinical Text

Oliver Bear Don't Walk IV, Adrienne Pichon, Harry Reyes Nieva, et al.

Two sets of gold-standard annotations for race and ethnicity information from clinical notes in MIMIC-III. Contains race and ethnicity label assignments and related information such as country of origin and spoken language.

clinical notes patient country information race and ethnicity patient language information

Published: Oct. 21, 2024. Version: 1.0.0


Database Credentialed Access

EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records

Yeonsu Kwon, Jiho Kim, Gyubok Lee, et al.

Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records

Published: March 19, 2025. Version: 1.0.1


Database Restricted Access

MIMIC-III-Ext-Synthetic-Clinical-Trial-Questions

Elizabeth Woo, Michael Craig Burkhart, Emily Alsentzer, et al.

In our recent study, we used Llama-3.1-70B-Instruct to generate synthetic training examples resembling clinical trial eligibility criteria. We manually reviewed 1000 of these examples and release them here.

large language models synthetic data distillation clinical trial eligibility

Published: April 22, 2025. Version: 1.0.0


Database Restricted Access

MIMIC-IV-Ext-Apixaban-Trial-Criteria-Questions

Elizabeth Woo, Michael Craig Burkhart, Emily Alsentzer, et al.

We created 23 questions resembling eligibility criteria from the apixaban clinical trial and evaluated them on a random sample of 100 patient notes from MIMIC-IV. We release the 2300 total question-answer pairs as a dataset here.

clinical q and a evaluation set clinical trial eligibility

Published: April 30, 2025. Version: 1.0.0