Challenge Credentialed Access
MIT Critical Datathon 2023: a MIMIC-IV Derived Dataset for Pulse Oximetry Correction Models
João Matos, Tristan Struja, David S Restrepo, Luis Filipe Nakayama, Jack Gallifant, Luca Weishaupt, Nikita Mullangi, Maria Loureiro, Skyler Shapiro, Adrien Carrel, Leo Anthony Celi
Published: May 8, 2023. Version: 1.0.0
When using this resource, please cite:
Matos, J., Struja, T., Restrepo, D. S., Nakayama, L. F., Gallifant, J., Weishaupt, L., Mullangi, N., Loureiro, M., Shapiro, S., Carrel, A., & Celi, L. A. (2023). MIT Critical Datathon 2023: a MIMIC-IV Derived Dataset for Pulse Oximetry Correction Models (version 1.0.0). PhysioNet. https://doi.org/10.13026/jfpc-pz79.
Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
Pulse oximeters are medical devices used to assess peripheral arterial oxygen saturation (SpO2) noninvasively. In contrast, the "gold standard" requires arterial blood to be drawn to measure the arterial oxygen saturation (SaO2). Pulse oximeters currently on the market are less accurate in populations with darker skin tones. These inaccuracies can cause episodes of hidden hypoxemia, i.e., low SaO2 with high SpO2, to go undetected. Hidden hypoxemia can result in less treatment and increased mortality. Although flawed, pulse oximeters remain ubiquitous because of their ease of use; debiasing the underlying algorithms could alleviate the downstream repercussions of hidden hypoxemia. This dataset supports the building of pulse oximetry correction models. Derived from MIMIC-IV v2.2, it includes 81,797 Intensive Care Unit (ICU) SaO2-SpO2 pairs.
Because of lower pulse oximetry performance in people with darker skin pigmentation, questions have been raised about racial and ethnic bias, as pigmentation directly affects light absorption, the fundamental principle used in such devices.
Recent studies have shown substantial measurement differences in populations with darker skin tones, with devices often overestimating oxygen saturation values, leading to disparities in care [4,5].
The worldwide use of pulse oximeters demands urgent action to reduce this gap and prevent further downstream harm. To mitigate these race-based differences in performance, new approaches to detect hidden hypoxemia and recalibrate devices must be developed.
By making this dataset available, we hope to spotlight these racial and ethnic health disparities and foster research that can contribute to fixing this pressing issue. Using the MIMIC-IV dataset, we provide a dataset that aligns SaO2-SpO2 pairs with patient demographics, physiological data, and specific treatment information.
Our research focuses on building a system to overcome the racial and ethnic bias present in pulse oximetry technology. However, statistical and social biases need to be addressed before integrating our work into any real-world clinical setting. In particular, implicit biases towards vulnerable populations, which may or may not be present in our cohort, are genuine concerns that can transfer into the training of pulse oximetry correction models. Therefore, it should be assumed that this dataset carries embedded biases of different kinds that can affect fairness and equity during model training.
Before the deployment of any model of this kind, it is the responsibility of scientists and health care professionals to audit the model for fairness and equity in its performance across disparate health groups. Fairness and equity audits, alongside model explanations, are needed to ensure an ethical model that is trustworthy to all stakeholders, especially patients and providers.
This dataset will be piloted in the MIT Critical Datathon [8], taking place on May 18-19, 2023, in Cambridge, MA.
We strongly encourage the community to use this dataset to conduct research studies on pulse oximetry, as well as to serve as an inspiration for replication in other Electronic Health Records (EHR) databases.
Data was sourced from MIMIC-IV (v2.2), a publicly available dataset of de-identified EHR data from 50,920 unique patients at Beth Israel Deaconess Medical Center, in Boston, MA, between 2008 and 2019 [6,7].
A SaO2-SpO2 pair is created whenever an SpO2 measurement can be found within 90 minutes prior to an SaO2 value. Each pair's timestamp offset is reported, so users of the dataset can select the most appropriate window, depending on the precision required by the study design.
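The pairing rule can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the function name, the `(timestamp, value)` tuple layout, the dictionary keys, and the choice to keep only the most recent SpO2 reading are all assumptions made for the example.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def pair_spo2_to_sao2(sao2, spo2, window=timedelta(minutes=90)):
    """Pair each SaO2 reading with the latest SpO2 reading in the preceding window.

    sao2, spo2: lists of (timestamp, value) tuples for a single ICU stay
    (hypothetical layout). Returns one dict per successful pair.
    """
    spo2 = sorted(spo2, key=lambda reading: reading[0])
    spo2_times = [t for t, _ in spo2]
    pairs = []
    for sao2_time, sao2_value in sorted(sao2, key=lambda reading: reading[0]):
        # Index of the last SpO2 reading taken at or before the SaO2 timestamp.
        i = bisect_right(spo2_times, sao2_time) - 1
        if i < 0:
            continue  # no SpO2 reading precedes this SaO2 value
        spo2_time, spo2_value = spo2[i]
        gap = sao2_time - spo2_time
        if gap <= window:
            pairs.append({
                "sao2_time": sao2_time,
                "sao2": sao2_value,
                "spo2": spo2_value,
                # Offset reported so a stricter window can be applied later.
                "minutes_before": gap.total_seconds() / 60,
            })
    return pairs


if __name__ == "__main__":
    t0 = datetime(2023, 1, 1, 12, 0)
    sao2 = [(t0, 90.0)]
    spo2 = [(t0 - timedelta(minutes=30), 96.0)]
    print(pair_spo2_to_sao2(sao2, spo2))
```

Because the offset is kept on each pair, a narrower window (for example, a few minutes) can be applied afterwards by filtering on `minutes_before` rather than re-running the pairing.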
Each pair was aligned with baseline and time-varying variables (always the closest value). Time-varying variables are accompanied by their offset in time relative to the SaO2 timestamp (time delta) to reflect the accuracy of the time series alignment. The time offset can be either positive (i.e., value recorded after SaO2) or negative (i.e., value recorded before SaO2).
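The closest-value alignment and its signed time delta can be illustrated with a short sketch. The function name and the `(timestamp, value)` layout are hypothetical, and reporting the delta in minutes is an assumption; the unit used in the dataset is given in its variable dictionary.

```python
from datetime import datetime, timedelta


def closest_with_delta(anchor_time, observations):
    """Return (value, delta_minutes) for the observation nearest to anchor_time.

    observations: list of (timestamp, value) tuples (hypothetical layout).
    delta_minutes is positive when the value was recorded after the anchor
    and negative when it was recorded before. Returns None if there are
    no observations.
    """
    if not observations:
        return None
    obs_time, value = min(observations, key=lambda obs: abs(obs[0] - anchor_time))
    delta_minutes = (obs_time - anchor_time).total_seconds() / 60
    return value, delta_minutes


if __name__ == "__main__":
    sao2_time = datetime(2023, 1, 1, 12, 0)
    heart_rate = [
        (sao2_time - timedelta(minutes=20), 80),
        (sao2_time + timedelta(minutes=5), 85),
    ]
    print(closest_with_delta(sao2_time, heart_rate))
```

A large absolute delta signals that the aligned value may not reflect the patient's state at the time of the blood draw, which is why filtering on it can be useful.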
Variables included (but not limited to):
- Patient Information: Sex, Age, Race, English Proficiency, Admission SOFA, Charlson Comorbidity Index
- Specific Treatment Information: Mechanical Ventilation Status; Fraction of inspired Oxygen (FiO2); Use of Renal Replacement Therapy (RRT); Use of Vasopressor(s); Time elapsed since treatment start
- SOFA Score: Liver, Coagulation, Cardiovascular, Renal, Neurological, Respiratory functions
- Physiological Measurements: Blood Counts, Enzyme, Chemistry, and Coagulation related variables
- Vital Signs: Heart Rate, Mean Blood Pressure, Respiratory Rate, Temperature, and Heart Rhythm
Hidden hypoxemia was defined when , but .
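As a hedged sketch, the definition can be expressed as two threshold comparisons, one on SaO2 and one on SpO2. The function below deliberately takes both thresholds as required parameters instead of hard-coding them; the function name, the parameter names, and the choice of strict versus non-strict inequality are assumptions for illustration.

```python
def is_hidden_hypoxemia(sao2, spo2, sao2_threshold, spo2_threshold):
    """Flag a pair as hidden hypoxemia.

    True when the arterial reading (SaO2) is below sao2_threshold while the
    pulse oximeter reading (SpO2) is at or above spo2_threshold. Both
    thresholds are supplied by the caller; the inequality directions are an
    assumption of this sketch.
    """
    return sao2 < sao2_threshold and spo2 >= spo2_threshold


if __name__ == "__main__":
    # Placeholder thresholds chosen only to exercise the function.
    print(is_hidden_hypoxemia(85.0, 93.0, sao2_threshold=88.0, spo2_threshold=88.0))
```

Keeping the thresholds as arguments makes it easy to run sensitivity analyses over alternative cut-offs.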
The main known limitation of the dataset is sampling selection bias. Patients with no valid pairs, due to irregular sampling of lab values, are not included in this derived dataset.
The code behind the preparation of the dataset and further details are available in a publicly available GitHub repository [9].
The dataset can be found in mimic_pulseOx_data.csv, and a variable dictionary in mimic_pulseOx_dictionary.csv.
Modelling Evaluation is not defined a priori and depends on the study design. In a Machine Learning setting, we suggest three possible tasks:
- Hidden Hypoxemia Prediction (Classification)
- SaO2 Prediction (Regression)
- Gap Prediction (Regression)
Users are free to design the most suitable and fair evaluation metrics.
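One possible starting point for a fairness-aware evaluation of the regression tasks is to report the error separately for each demographic group instead of a single pooled figure. The sketch below computes mean absolute error per group; the function name and the group labels are hypothetical, and this is one illustrative metric, not a recommended or official one.

```python
from collections import defaultdict


def mae_by_group(y_true, y_pred, groups):
    """Mean absolute error of predictions, computed separately per group.

    y_true, y_pred: sequences of numeric targets and predictions.
    groups: sequence of group labels of the same length (hypothetical labels).
    Returns a dict mapping each group label to its mean absolute error.
    """
    abs_errors = defaultdict(list)
    for truth, pred, group in zip(y_true, y_pred, groups):
        abs_errors[group].append(abs(truth - pred))
    return {group: sum(errs) / len(errs) for group, errs in abs_errors.items()}


if __name__ == "__main__":
    y_true = [90, 92, 88, 95]
    y_pred = [91, 92, 92, 95]
    groups = ["A", "A", "B", "B"]
    print(mae_by_group(y_true, y_pred, groups))
```

Large differences between groups in such a breakdown would indicate that a correction model performs unevenly and warrants further auditing.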
Version 1.0.0: This is the initial release for the MIT Critical Datathon 2023: a SaO2-SpO2 Pairs Dataset derived from MIMIC-IV. In addition to the dataset, we provide a variables dictionary.
The data used in this research came from MIMIC-IV, a fully de-identified dataset (containing no protected health information), which we received permission to use under a PhysioNet Credentialed Health Data Use Agreement (v1.5.0). For the original dataset, MIMIC-IV, the collection of patient information and creation of the research resource were reviewed by the Institutional Review Board at the Beth Israel Deaconess Medical Center, which granted a waiver of informed consent and approved the data sharing initiative. The study was determined to be exempt from human subjects research. All experiments need to follow the PhysioNet Credentialed Health Data License Agreement. Medical charting by providers in the electronic health record is at risk of multiple types of bias.
Conflicts of Interest
No competing interests are declared.
1. Sjoding, M. W., Dickson, R. P., Iwashyna, T. J., Gay, S. E., & Valley, T. S. (2020). Racial Bias in Pulse Oximetry Measurement. The New England Journal of Medicine, 383(25), 2477–2478. https://doi.org/10.1056/NEJMc2029240
2. Holder, A. L., & Wong, A. I. (2022). The Big Consequences of Small Discrepancies: Why Racial Differences in Pulse Oximetry Errors Matter. Critical Care Medicine, 50(2), 335–337. https://doi.org/10.1097/CCM.0000000000005447
3. Wong, A. I., Charpignon, M., Kim, H., Josef, C., de Hond, A. A. H., Fojas, J. J., Tabaie, A., Liu, X., Mireles-Cabodevila, E., Carvalho, L., Kamaleswaran, R., Madushani, R. W. M. A., Adhikari, L., Holder, A. L., Steyerberg, E. W., Buchman, T. G., Lough, M. E., & Celi, L. A. (2021). Analysis of Discrepancies Between Pulse Oximetry and Arterial Oxygen Saturation Measurements by Race and Ethnicity and Association With Organ Dysfunction and Mortality. JAMA Network Open, 4(11), e2131674. https://doi.org/10.1001/jamanetworkopen.2021.31674
4. Fawzy, A., Wu, T. D., Wang, K., Robinson, M. L., Farha, J., Bradke, A., Golden, S. H., Xu, Y., & Garibaldi, B. T. (2022). Racial and Ethnic Discrepancy in Pulse Oximetry and Delayed Identification of Treatment Eligibility Among Patients With COVID-19. JAMA Internal Medicine, 182(7), 730–738. https://doi.org/10.1001/jamainternmed.2022.1906
5. Gottlieb, E. R., Ziegler, J., Morley, K., Rush, B., & Celi, L. A. (2022). Assessment of Racial and Ethnic Differences in Oxygen Supplementation Among Patients in the Intensive Care Unit. JAMA Internal Medicine, 182(8), 849–858. https://doi.org/10.1001/jamainternmed.2022.2587
6. Johnson, A. E. W., Bulgarelli, L., Shen, L., et al. (2023). MIMIC-IV, a freely accessible electronic health record dataset. Scientific Data, 10, 1. https://doi.org/10.1038/s41597-022-01899-x
7. Johnson, A. E. W., Stone, D. J., Celi, L. A., & Pollard, T. J. (2018). The MIMIC Code Repository: enabling reproducibility in critical care research. Journal of the American Medical Informatics Association, 25(1), 32–39. https://doi.org/10.1093/jamia/ocx084
8. MIT Critical Datathon 2023. https://criticaldatathon.github.io/. Accessed 4 May 2023.
9. "MIT Critical Datathon 2023". GitHub, https://github.com/CriticalDatathon. Accessed 4 May 2023.
Only credentialed users who sign the DUA can access the files.
License (for files):
PhysioNet Credentialed Health Data License 1.5.0
Data Use Agreement:
PhysioNet Credentialed Health Data Use Agreement 1.5.0
CITI Data or Specimens Only Research
machine learning pulse oximetry health equity
- be a credentialed user
- complete required training:
- CITI Data or Specimens Only Research
- sign the data use agreement for the project