Name: Immunosuppressive Condition and Medication Annotations for Admission Notes in the MIMIC-III Database
Published: Aug. 4, 2025
License: https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

Database Credentialed Access

Vijeeth Guggilla, Melissa Bak, Mengjia Kang, Theresa Walunas, Catherine A Gao

Version: 1.0.0

When using this resource, please cite:
Guggilla, V., Bak, M., Kang, M., Walunas, T., & Gao, C. A. (2025). Immunosuppressive Condition and Medication Annotations for Admission Notes in the MIMIC-III Database (version 1.0.0). PhysioNet. RRID:SCR_007345. https://doi.org/10.13026/etd0-dq69


Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220. RRID:SCR_007345.


Abstract

Immunosuppression due to underlying conditions or immunosuppressive medication use increases the risk of morbidity and mortality in the context of infectious disease. Identifying patients with immunosuppression is important for better studying and understanding the impact of immunosuppression on critical care outcomes. While structured data (e.g., diagnosis codes, medication orders) from the electronic health record (EHR) can help identify patients with immunosuppression, the reliability of structured data is limited as it can miss more nuanced information that is only present in unstructured data, such as patient notes.

We introduce a dataset for phenotyping immunosuppression, defined as identification of a patient’s immune status, based on admission notes. Patient admission notes were extracted from the Medical Information Mart for Intensive Care III (MIMIC-III) dataset, which contains health-related data and clinical notes associated with patients who stayed in critical care units at Beth Israel Deaconess Medical Center between 2001 and 2012. These notes were manually annotated for the presence of several immunosuppressive conditions and immunosuppressive medications.

Each admission note was independently annotated by two human annotators, and discrepancies were reviewed by an attending critical care physician. Annotated conditions include solid organ transplant, stem cell transplant, HIV, acute leukemia, lymphoma, multiple myeloma, and immunoglobulin deficiency. Annotated medications include azathioprine, cyclosporine, cyclophosphamide, mycophenolate, rituximab, and tacrolimus. This dataset can be leveraged for medical and computer science research, especially as related to the application of natural language processing and large language models (LLMs) in medicine. It can also be used as a starting point for research related to immunosuppression in critically ill patients.

Background

Structured data collected in the EHR is a powerful starting point for retrospective studies aiming to identify and study populations with specific characteristics. However, structured data such as diagnosis codes, which are most relevant to reimbursement and billing, do not always reflect the truth when it comes to patients' health status. For example, a patient may have a solid organ transplant diagnosis code even if they have never received one simply because they were evaluated for a transplant. Similarly, medication orders are not always reliable because they represent what a physician ordered rather than what a patient took. Thus, retrospective studies relying solely on structured data to group patients may suffer from inaccuracies that can misrepresent patient phenotypes [1].

Patient notes contain nuanced information not found in structured EHR data. When it comes to identifying pre-existing conditions and medications, this is especially true for admission notes, which often include dedicated History and Physical (H&P) and Medication sections that cover past medical history and current or recent medications in detail. However, clinical notes are considerably more cumbersome to work with than structured data, so many natural language processing approaches have been developed to automate data extraction from them [2,3]. Recently, LLMs have shown great promise in their ability to extract meaning from clinical text [4–7]. As LLMs continue advancing, the availability of high-quality annotated datasets spanning diverse clinical note types is crucial for rigorously benchmarking LLMs’ performance in extracting information from medical text. We therefore sought to identify and annotate immunosuppressive phenotypes in unstructured clinical notes, using admission notes from the publicly available MIMIC-III dataset.

Given our interest in immunosuppression and its impact in the context of critical infectious disease settings, we focused on patients in MIMIC-III who were admitted with a diagnosis of pneumonia and mechanically ventilated [8–10]. Patients with immunosuppression have especially high mortality in the context of pneumonia. Thus, these patients represent a critically ill population where the impact of immunosuppression is particularly important [11]. Defining our cohort in this manner also allowed us to capture a variety of immunosuppressive conditions and medications, as immunosuppression and critical illness requiring mechanical ventilation are expected to be correlated.

Methods

We identified MIMIC-III hospital admissions for patients who were mechanically ventilated, stayed in the medical ICU, and were diagnosed with pneumonia. The complete identification pipeline is described in MIMIC_corpus.ipynb in the project's GitHub repository [12]. One admission note for each identified hospital admission was included in our final corpus. Two notes were later removed from our corpus after manual review determined them to be empty templates.

The annotators for this project were co-authors VG (MD/PhD student in the second year of his PhD), MK (data scientist with 5+ years of experience working with healthcare data), MJB (fourth-year medical student), and CAG (attending pulmonary and critical care physician-scientist).

All four annotators collaborated to carry out text annotation in December 2024. Every note was reviewed in duplicate so that disagreements could be identified: co-authors VG and MK reviewed 100 unique notes each, and MJB reviewed all 200 notes. All reviewers were first trained on the conditions and medications to look for, as well as their definitions, variations, and abbreviations. A total of seven conditions (solid organ transplant, stem cell transplant, HIV, acute leukemia, lymphoma, multiple myeloma, immunoglobulin deficiency) and six medications (azathioprine, cyclosporine, cyclophosphamide, mycophenolate, rituximab, tacrolimus) were considered for annotation, with annotators labeling each as “Yes,” “No,” or “Unsure.” Cases were labeled “Unsure,” for example, when medications had unclear timelines of use or when conditions were highly suspected but not explicitly confirmed. All disagreements and all cases marked as “Unsure” were re-reviewed by attending physician CAG.

Here, we present an annotation example to better illustrate how annotations were made. For one of the notes, the following text was used to make a “Yes” determination for both lymphoma and stem cell transplant:

“AGE y.o. M admitted on [**DATE**] with h/o large B-cell lymphoma status post allogenic stem cell transplant in [**DATE**]”

Both VG and MJB agreed on these determinations. However, in this same note, the following text led to a disagreement between VG and MJB regarding the determination for rituximab:

“He has had severe chronic GVHD of the skin and oropharyngeal mucosa; he completed Rituxan weekly x4 for GVHD at end of [**DATE**].”

In this case, the date of Rituxan completion was very close to the 6-month cutoff for counting as “Yes,” and this led to disagreement between the reviewers. Thus, CAG re-reviewed this case, determined it was within the timeline, and confirmed it to be a “Yes.”

This example shows why a manual annotation approach is necessary. A simple keyword search would not account for the timing of medication use, and it would also not account for cases where a condition was evaluated for or suspected but never diagnosed. For example, there was a different case where a note mentioned suspicion for lymphoma, but lymphoma was not a pre-existing diagnosis. Thus, manual annotation improves on a keyword search by accounting for such cases where context is important.
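
To make this limitation concrete, the following minimal Python sketch runs a naive keyword search. The term list, the keyword_flags helper, and the second sentence are hypothetical illustrations; they are not part of the dataset or of the annotation procedure described above.

```python
import re

# Hypothetical keyword list for illustration; not the authors' annotation guide.
KEYWORDS = {
    "lymphoma": [r"lymphoma"],
    "rituximab": [r"rituximab", r"rituxan"],
}


def keyword_flags(text):
    """Return the set of variables whose keywords appear anywhere in the text."""
    lowered = text.lower()
    return {
        name
        for name, patterns in KEYWORDS.items()
        if any(re.search(pattern, lowered) for pattern in patterns)
    }


# Fragment of the sentence quoted in the annotation example above.
confirmed = "h/o large B-cell lymphoma status post allogenic stem cell transplant"
# Invented sentence standing in for a note that only raises suspicion.
suspected = "CT shows adenopathy, suspicion for lymphoma; biopsy pending."

print(keyword_flags(confirmed))  # flags lymphoma, matching the manual label
print(keyword_flags(suspected))  # also flags lymphoma, although none was diagnosed
```

The search treats both sentences identically because it ignores negation, suspicion, and timing, which are exactly the cues the annotators relied on.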

Data Description

We have created a dataset of admission notes from critically ill pneumonia patients, with each note labeled for 13 clinical variables associated with an immunosuppressed state. MIMIC-III-Ext-Immunosuppressive-Conditions-Medications.csv contains the final annotations, while Reviewer-1-Annotations.csv and Reviewer-2-Annotations.csv contain the initial annotations by VG/MK and MJB, respectively.

Each entry in this database consists of a MIMIC-III v1.4 derived Hospital Admission Identifier ("HADM_ID", integer), the index from MIMIC-III v1.4 NOTEEVENTS table ("ROW_ID", integer), seven immunosuppressive conditions (1 = Yes, 0 = No, 0.5 = Unsure), and six immunosuppressive medications (1 = Yes, 0 = No, 0.5 = Unsure).
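
As a rough illustration of this layout, the sketch below parses rows of this shape with the Python standard library and decodes the numeric labels. Only "HADM_ID" and "ROW_ID" are column names given above; the variable column names (LYMPHOMA, RITUXIMAB) and the sample values are assumptions made for the example, so check the actual CSV header before relying on them.

```python
import csv
import io

# Encoding described above: 1 = Yes, 0 = No, 0.5 = Unsure.
LABELS = {1.0: "Yes", 0.0: "No", 0.5: "Unsure"}
ID_COLUMNS = ("HADM_ID", "ROW_ID")


def read_annotations(handle):
    """Parse annotation rows into dicts with integer IDs and decoded labels."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {}
        for column, value in raw.items():
            if column in ID_COLUMNS:
                row[column] = int(value)
            else:
                row[column] = LABELS[float(value)]
        rows.append(row)
    return rows


# Tiny in-memory stand-in; the IDs and variable column names are made up.
sample = io.StringIO(
    "HADM_ID,ROW_ID,LYMPHOMA,RITUXIMAB\n"
    "100001,5001,1,0.5\n"
    "100002,5002,0,0\n"
)
for row in read_annotations(sample):
    print(row)
```

To read the released file instead, pass an open file handle, for example open("MIMIC-III-Ext-Immunosuppressive-Conditions-Medications.csv", newline="").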

Phenotype definitions are as follows:

Conditions

Acute leukemia – Documented history of acute lymphocytic leukemia or acute myelogenous leukemia
HIV – Documented history of HIV infection
Immunoglobulin deficiency – Documented history of a primary immunodeficiency, including IgA deficiency, IgG deficiency
Lymphoma – Documented history of Hodgkin lymphoma or non-Hodgkin lymphoma
Multiple myeloma – Documented history of multiple myeloma
Solid organ transplant – Documented history of a kidney, liver, heart, or lung transplant
Stem cell transplant – Documented history of an autologous or allogeneic stem cell transplant

Conditions were labeled “Yes” if there was ever a history of that condition, with no time frame cutoff.

Medications

Azathioprine – e.g., Imuran, Azasan
Cyclophosphamide – e.g., Cytoxan, Neosar, Procytox
Cyclosporine – e.g., Neoral, Sandimmune, Gengraf
Mycophenolate – e.g., Cellcept, Myfortic
Rituximab – e.g., Rituxan, Truxima, Ruxience, Riabni
Tacrolimus – e.g., Prograf, Advagraf, Astagraf, Envarsus, Hecoria

Medications were labeled “Yes” if a patient had been using the medication in the six months prior to admission.
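
The brand-name list and the six-month rule can be sketched in Python as follows. This is an illustration only: the helper names are hypothetical, "six months" is approximated as 183 days, and the dates are invented (dates in MIMIC-III notes are de-identified, so annotators judged timing from context rather than by date arithmetic).

```python
from datetime import date, timedelta

# Brand names listed above, mapped to the annotated generic name.
BRAND_TO_GENERIC = {
    "imuran": "azathioprine", "azasan": "azathioprine",
    "cytoxan": "cyclophosphamide", "neosar": "cyclophosphamide",
    "procytox": "cyclophosphamide",
    "neoral": "cyclosporine", "sandimmune": "cyclosporine",
    "gengraf": "cyclosporine",
    "cellcept": "mycophenolate", "myfortic": "mycophenolate",
    "rituxan": "rituximab", "truxima": "rituximab",
    "ruxience": "rituximab", "riabni": "rituximab",
    "prograf": "tacrolimus", "advagraf": "tacrolimus",
    "astagraf": "tacrolimus", "envarsus": "tacrolimus",
    "hecoria": "tacrolimus",
}
GENERICS = set(BRAND_TO_GENERIC.values())

# Assumption: "six months" approximated as 183 days.
LOOKBACK = timedelta(days=183)


def normalize_medication(name):
    """Map a brand or generic name to the generic name, or None if unlisted."""
    key = name.strip().lower()
    if key in GENERICS:
        return key
    return BRAND_TO_GENERIC.get(key)


def within_lookback(last_use, admission):
    """True if last use falls within the lookback window before admission."""
    return timedelta(0) <= admission - last_use <= LOOKBACK


print(normalize_medication("Rituxan"))
print(within_lookback(date(2020, 1, 15), date(2020, 6, 1)))
print(within_lookback(date(2019, 6, 1), date(2020, 6, 1)))
```

A fixed-day window is a simplification; borderline cases such as the Rituxan example in the Methods section were resolved by physician review.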

Dataset Distribution

Here, we report the frequency of “Yes” determinations for each of the immunosuppressive conditions and medications we adjudicated:

Acute leukemia – 2/200, 1%
HIV – 3/200, 1.5%
Immunoglobulin deficiency – 0/200, 0%
Lymphoma – 11/200, 5.5%
Multiple myeloma – 3/200, 1.5%
Solid organ transplant – 3/200, 1.5%
Stem cell transplant – 8/200, 4%
Azathioprine – 0/200, 0%
Cyclophosphamide – 3/200, 1.5%
Cyclosporine – 1/200, 0.5%
Mycophenolate – 6/200, 3%
Rituximab – 3/200, 1.5%
Tacrolimus – 5/200, 2.5%
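
The percentages above follow directly from the counts. A short Python check, using only the counts reported in this section (the percent helper is a hypothetical name), is:

```python
# "Yes" counts as reported above, out of 200 notes.
YES_COUNTS = {
    "Acute leukemia": 2,
    "HIV": 3,
    "Immunoglobulin deficiency": 0,
    "Lymphoma": 11,
    "Multiple myeloma": 3,
    "Solid organ transplant": 3,
    "Stem cell transplant": 8,
    "Azathioprine": 0,
    "Cyclophosphamide": 3,
    "Cyclosporine": 1,
    "Mycophenolate": 6,
    "Rituximab": 3,
    "Tacrolimus": 5,
}
N_NOTES = 200


def percent(count, total):
    """Return count as a percentage of total."""
    return 100 * count / total


for name, count in YES_COUNTS.items():
    print(f"{name}: {count}/{N_NOTES} = {percent(count, N_NOTES):g}%")
```

Because every variable has a "Yes" rate of 5.5% or lower, the classes are heavily imbalanced, which matters when choosing evaluation metrics.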

Inter-Annotator Disagreement

Overall, inter-annotator disagreement was very low. The rate of disagreement between VG and MJB was 8/1300 determinations (13 variables for 100 notes) or 0.62%. The rate of disagreement between MK and MJB was 9/1300 determinations or 0.69%. Thus, the average rate of disagreement was 0.65%.
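
The arithmetic behind these rates can be reproduced with a few lines of Python. The disagreement_count helper is a hypothetical sketch of how disagreements could be tallied from two reviewers' label sequences; the counts 8 and 9 are the figures reported above.

```python
def disagreement_count(labels_a, labels_b):
    """Count positions where two equally long label sequences differ."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    return sum(a != b for a, b in zip(labels_a, labels_b))


N_VARIABLES = 13
N_NOTES_PER_PAIR = 100
TOTAL = N_VARIABLES * N_NOTES_PER_PAIR  # 1300 determinations per reviewer pair

rate_vg_mjb = 100 * 8 / TOTAL  # VG vs MJB
rate_mk_mjb = 100 * 9 / TOTAL  # MK vs MJB
average = (rate_vg_mjb + rate_mk_mjb) / 2

print(f"{rate_vg_mjb:.2f}% {rate_mk_mjb:.2f}% {average:.2f}%")
# prints: 0.62% 0.69% 0.65%
```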

Usage Notes

This corpus of annotated patient notes contains protected health information (PHI) per the Health Insurance Portability and Accountability Act of 1996 (HIPAA) and can be joined to the MIMIC-III database. Therefore, those who wish to access this dataset must satisfy all requirements to access the MIMIC-III database.
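
For credentialed users, the join to MIMIC-III can be sketched as below. ROW_ID and TEXT are columns of the MIMIC-III NOTEEVENTS table; the attach_note_text helper, the LYMPHOMA column name, and the sample rows are hypothetical and only illustrate the join.

```python
def attach_note_text(annotations, noteevents):
    """Inner-join annotation rows to note text on ROW_ID."""
    text_by_row_id = {int(note["ROW_ID"]): note["TEXT"] for note in noteevents}
    joined = []
    for row in annotations:
        row_id = int(row["ROW_ID"])
        if row_id in text_by_row_id:
            joined.append({**row, "TEXT": text_by_row_id[row_id]})
    return joined


# Made-up rows standing in for the annotation CSV and NOTEEVENTS.
annotations = [
    {"HADM_ID": "100001", "ROW_ID": "5001", "LYMPHOMA": "1"},
    {"HADM_ID": "100002", "ROW_ID": "5002", "LYMPHOMA": "0"},
]
noteevents = [
    {"ROW_ID": "5001", "TEXT": "placeholder note text A"},
    {"ROW_ID": "5002", "TEXT": "placeholder note text B"},
    {"ROW_ID": "9999", "TEXT": "note outside the corpus"},
]

for row in attach_note_text(annotations, noteevents):
    print(row["ROW_ID"], row["LYMPHOMA"], row["TEXT"])
```

Both inputs can come from csv.DictReader. If a note is longer than the csv module's default field size limit, the limit may need to be raised with csv.field_size_limit.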

We used this annotated corpus to assess the ability of LLMs to abstract immunosuppression concepts from admission notes [13]. Analysis code for this study is available on GitHub [12]. The MIMIC_corpus.ipynb notebook in this repository also details how the cohort underlying this dataset was defined.

This dataset could be used to benchmark other LLM or natural language processing-based methods of extracting clinical information. Other potential applications of this dataset include performing outcome prediction and risk modeling in critical care patients based on immune status, as the cohort consists of mechanically ventilated pneumonia patients – a high-risk critical care cohort. This dataset could also be compared with other EHR datasets to perform larger-scale studies of risk factors and long-term outcomes among immunosuppressed populations.
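
A benchmark against these labels could be scored along the following lines. This is a sketch under stated assumptions, not the evaluation used in [13]: predictions are assumed to be binary (1 or 0), reference labels of 0.5 ("Unsure") are skipped, and the binary_metrics name and sample values are invented.

```python
def binary_metrics(truth, predicted):
    """Sensitivity, specificity, and precision for one variable.

    Assumptions: predictions are 1 or 0, and reference labels of 0.5
    ("Unsure") are excluded from scoring.
    """
    tp = fp = tn = fn = 0
    for t, p in zip(truth, predicted):
        if t == 0.5:
            continue
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        # None signals an undefined metric (no cases in the denominator).
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


truth = [1, 1, 0, 0, 0, 0.5]
predicted = [1, 0, 0, 1, 0, 1]
print(binary_metrics(truth, predicted))
```

Sensitivity and precision are reported rather than accuracy because each variable is rare in this cohort, so a method that always answers "No" would still score high accuracy.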

One limitation of this dataset is that it does not represent a comprehensive list of immunosuppressive conditions and medications. For example, corticosteroids and myelosuppressive chemotherapy, which are often used in critically ill patients and can lead to immunosuppression, were not adjudicated. Therefore, patients may be immunosuppressed for reasons beyond those we adjudicated. Another caveat is that we chose a specific critically ill population (mechanically ventilated pneumonia patients). Thus, the prevalence of certain immunosuppressive conditions or medication use may not match that of general critical care populations.

Ethics

This project builds upon the MIMIC-III database, and the approval for this project is based on the original MIMIC-III database being deidentified and approved for credentialed distribution. All authors completed CITI training, were credentialed by PhysioNet, and signed the Data Use Agreement to access the deidentified MIMIC-III notes.

Acknowledgements

The authors would like to thank the MIMIC-III team for creating and maintaining this dataset, from which this work is derived. The NU SCRIPT Study is funded by NIH NIAID U19AI135964. This work was also supported by NUCATS, SQLIFTS, and the Canning Thoracic Institute of Northwestern Medicine. TLW was supported by Gilead Sciences (award no. CO-US-540-6435) and the NIH (grant nos. U19AI135964, U19AI181102, and R21HD107571). CAG was supported by the NIH (grant no. K23HL169815), a Parker B. Francis Opportunity Award, and an ATS Unrestricted Grant.

Conflicts of Interest

No commercial conflicts of interest.

References

1. Pathak J, Kho AN, Denny JC. Electronic health records-driven phenotyping: challenges, recent advances, and perspectives. J Am Med Inform Assoc. 2013 Dec 1;20(e2):e206–11.
2. Hao T, Huang Z, Liang L, Weng H, Tang B. Health Natural Language Processing: Methodology Development and Applications. JMIR Med Inform. 2021 Oct 21;9(10):e23898.
3. Wu H, Wang M, Wu J, Francis F, Chang YH, Shavick A, et al. A survey on clinical natural language processing in the United Kingdom from 2007 to 2022. NPJ Digit Med. 2022 Dec 21;5(1):1–15.
4. Huang J, Yang DM, Rong R, Nezafati K, Treager C, Chi Z, et al. A critical assessment of using ChatGPT for extracting structured data from clinical notes. NPJ Digit Med. 2024 May 1;7(1):1–13.
5. Cai ZR, Chen ML, Kim J, Novoa RA, Barnes LA, Beam A, et al. Assessment of Correctness, Content Omission, and Risk of Harm in Large Language Model Responses to Dermatology Continuing Medical Education Questions. J Invest Dermatol. 2024 Aug 1;144(8):1877–9.
6. Performance of a Large Language Model on Practice Questions for the Neonatal Board Examination. JAMA Pediatrics [Internet]. [cited 2024 Dec 31]. Available from: https://jamanetwork.com/journals/jamapediatrics/fullarticle/2807329
7. Kim J, Leonte KG, Chen ML, Torous JB, Linos E, Pinto A, et al. Large language models outperform mental and medical health care professionals in identifying obsessive-compulsive disorder. NPJ Digit Med. 2024 Jul 19;7(1):1–5.
8. Johnson A, Pollard T, Mark R. MIMIC-III Clinical Database [Internet]. PhysioNet; 2015 [cited 2024 Dec 31]. Available from: https://physionet.org/content/mimiciii/1.4/
9. Johnson AEW, Pollard TJ, Shen L, Lehman L wei H, Feng M, Ghassemi M, et al. MIMIC-III, a freely accessible critical care database. Sci Data. 2016 May 24;3(1):160035.
10. Goldberger AL, Amaral LA, Glass L, Hausdorff JM, Ivanov PC, Mark RG, et al. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation. 2000 Jun 13;101(23):E215-220.
11. Ramirez JA, Chandler TR, Furmanek SP, Carrico R, Wilde AM, Sheikh D, et al. Community-Acquired Pneumonia in the Immunocompromised Host: Epidemiology and Outcomes. Open Forum Infect Dis. 2023 Nov 22;10(11):ofad565.
12. Guggilla V, Kang M, Walunas T, Gao C. LLM Identification of Immunosuppression [Internet]. 2025 [cited 2025 Jun 9]. Available from: https://github.com/NUSCRIPT/guggilla_immunosuppression_2025
13. Guggilla V, Kang M, Bak MJ, Tran SD, Pawlowski A, Nannapaneni P, et al. Large language models outperform traditional structured data-based approaches in identifying immunosuppressed patients [Internet]. medRxiv; 2025 [cited 2025 Jan 28]. p. 2025.01.16.25320564. Available from: https://www.medrxiv.org/content/10.1101/2025.01.16.25320564v1