Database Credentialed Access
MIMIC-III-Ext-CA: a MIMIC-III Derived Dataset of Cardiac Arrests in Photoplethysmographs
Gerben Hup , Xi Long , Rik Vullings
Published: March 10, 2026. Version: 1.0.0
When using this resource, please cite:
Hup, G., Long, X., & Vullings, R. (2026). MIMIC-III-Ext-CA: a MIMIC-III Derived Dataset of Cardiac Arrests in Photoplethysmographs (version 1.0.0). PhysioNet. RRID:SCR_007345. https://doi.org/10.13026/ec0n-y377
Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220. RRID:SCR_007345.
Abstract
Photoplethysmography (PPG) is increasingly considered for detecting out-of-hospital cardiac arrest (OHCA), but publicly available datasets remain scarce. We used a method for identifying cardiac arrest episodes captured by PPG in the MIMIC-III database, which combines automated screening with manual annotation of waveforms and clinical data. Using this approach, we compiled 36 annotated cardiac arrest episodes from 31 patients. The dataset serves as a valuable resource for developing and validating wearable OHCA detection technologies.
Background
Photoplethysmography (PPG) is an optical method to determine the relative changes in blood volume in the skin by measuring light absorption. In the context of wearable devices, it has gained interest as a method to measure cardiovascular parameters continuously and non-invasively, including heart rate, oxygen saturation, pulse rate variability, cuffless blood pressure, and blood glucose levels [1]. Special interest has been shown in the detection of obstructive sleep apnea [2], signs of infection [3], heart failure [4], and atrial fibrillation [5, 6].
One emerging application is the detection of out-of-hospital cardiac arrest (OHCA). As 29.7% to 63.4% of the OHCA episodes go unwitnessed [7], the use of automated monitoring with PPG could improve survival rates and patient outcomes. Several research teams, such as BECA [8], DETECT [9], HEART-SAFE [10], and Google Research [11], are actively working on this technology.
A major challenge is the availability of PPG measurements during cardiac arrests. Although the above-mentioned projects mostly simulate or approximate (out-of-hospital) cardiac arrest, real data remains vital for algorithm development and validation. To our knowledge, until now, PhysioNet has published only a single dataset containing PPG and annotated life-threatening arrhythmias, as part of the PhysioNet/Computing in Cardiology Challenge 2015 [12]. Other arrhythmia datasets, such as the MIT-BIH Arrhythmia Database, do contain occurrences of cardiac arrest, but not in PPG signals [13].
One PhysioNet dataset is of particular interest for this purpose: the MIMIC-III critical care database, which contains data from 38,597 adult ICU patients [14]. MIMIC-III contains not only bedside monitor waveforms, including PPG, but also anonymized electronic medical record (EMR) data. Given the large number of patients and the range of their health conditions, episodes of cardiac arrest are expected to occur in this dataset. However, the size of the dataset makes it highly impractical to screen for cardiac arrest episodes manually. The present dataset is the result of a method to identify episodes of PPG-captured cardiac arrest in the MIMIC-III database.
Methods
Data acquisition
We acquired the EMR data from the MIMIC-III Clinical Database (version 1.4) [15] and the waveform data, matched to the EMR data, from the MIMIC-III Waveform Database Matched Subset (version 1.0) [16]. The EMR data were inserted directly into a SQLite relational database. For each waveform record, we extracted the subject ID, the start and end time of the recording, and the available waveform signal names from the metadata, and inserted these into the waveforms table in the SQLite database.
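As an illustration of the waveforms table described above, the following minimal sketch stores per-record metadata in SQLite. The column names file and signals, the comma-separated storage of signal names, and the example record are assumptions made for this sketch; only the table name waveforms and the starttime and endtime columns appear in the text.

```python
import sqlite3

# Hypothetical metadata for one waveform record. In practice these values
# would be read from the header of each record in the matched subset.
records = [
    {
        "file": "rec_a",  # placeholder record name
        "subject_id": 10,
        "starttime": "2100-01-01 00:00:00",
        "endtime": "2100-01-01 06:00:00",
        "signals": ["II", "PLETH", "ABP"],
    },
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE waveforms (
        file       TEXT PRIMARY KEY,
        subject_id INTEGER NOT NULL,
        starttime  TEXT NOT NULL,
        endtime    TEXT NOT NULL,
        signals    TEXT NOT NULL  -- comma-separated signal names (assumed layout)
    )
    """
)
conn.executemany(
    "INSERT INTO waveforms VALUES "
    "(:file, :subject_id, :starttime, :endtime, :signals)",
    [{**r, "signals": ",".join(r["signals"])} for r in records],
)
conn.commit()

stored = conn.execute("SELECT subject_id, signals FROM waveforms").fetchall()
```

Storing timestamps as ISO-formatted text keeps them directly comparable in later queries.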
Identifying candidate cardiac arrests
Potential cardiac arrests were identified by searching for bedside monitor-annotated cardiac arrest events in the chartevents database table. Inspection of the d_items and chartevents tables led us to the required values of chartevents.itemid and chartevents.value that indicate ventricular tachycardia/fibrillation or asystole in both hospital systems used. For the CareVue hospital system, we require itemid = 212 and value ∈ {Asystole, Vent. Tachy, Ventricular Fib}. For the MetaVision system, we require itemid = 220048 and value ∈ {Asystole, VT (Ventricular Tachycardia), VF (Ventricular Fibrillation)}.
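The selection criteria above can be expressed as a single SQL filter. The sketch below runs it against a small in-memory table; the toy rows (including the "Normal Sinus" value and itemid 220045) are invented for illustration, while the itemid and value conditions are those stated in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chartevents "
    "(row_id INTEGER, subject_id INTEGER, itemid INTEGER, value TEXT)"
)
# Toy rows for illustration only, not real MIMIC-III data.
conn.executemany(
    "INSERT INTO chartevents VALUES (?, ?, ?, ?)",
    [
        (1, 10, 212, "Asystole"),                        # CareVue, kept
        (2, 10, 212, "Normal Sinus"),                    # other rhythm, dropped
        (3, 11, 220048, "VF (Ventricular Fibrillation)"),  # MetaVision, kept
        (4, 12, 220045, "Asystole"),                     # other itemid, dropped
    ],
)

QUERY = """
SELECT row_id
FROM chartevents
WHERE (itemid = 212
       AND value IN ('Asystole', 'Vent. Tachy', 'Ventricular Fib'))
   OR (itemid = 220048
       AND value IN ('Asystole',
                     'VT (Ventricular Tachycardia)',
                     'VF (Ventricular Fibrillation)'))
ORDER BY row_id
"""
candidate_rows = [row[0] for row in conn.execute(QUERY)]
```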
The waveform files were matched to the EMR data using the subject_id field. To exclude bedside monitor rhythm classifications that were not accompanied by waveform signals, we set the condition that the classification event should be charted between the start and end time of a matched waveform record, i.e., waveforms.starttime ≤ chartevents.charttime ≤ waveforms.endtime.
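A minimal sketch of this matching step, assuming timestamps are stored as ISO-formatted text (so that string comparison follows chronological order) and using invented toy rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE chartevents (row_id INTEGER, subject_id INTEGER, charttime TEXT);
    CREATE TABLE waveforms (file TEXT, subject_id INTEGER,
                            starttime TEXT, endtime TEXT);

    -- Toy rows for illustration only.
    INSERT INTO chartevents VALUES
        (1, 10, '2100-01-01 03:00:00'),  -- inside the record of subject 10
        (2, 10, '2100-01-02 03:00:00'),  -- subject matches, time does not
        (3, 11, '2100-01-01 03:00:00');  -- no waveform record for subject 11
    INSERT INTO waveforms VALUES
        ('rec_a', 10, '2100-01-01 00:00:00', '2100-01-01 06:00:00');
    """
)

matches = conn.execute(
    """
    SELECT c.row_id, w.file
    FROM chartevents AS c
    JOIN waveforms AS w
      ON w.subject_id = c.subject_id
     AND c.charttime BETWEEN w.starttime AND w.endtime
    ORDER BY c.row_id
    """
).fetchall()
```

Events 2 and 3 correspond to the two mismatch types listed in the flowchart: an event outside the record and an event without any record.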
Another condition is the presence of a PPG signal (in MIMIC-III: PLETH, PLETH_L, PLETH_R, PLETHl, PLETHr). Furthermore, to allow for verification of cardiac arrests in the PPG signal, we added the requirement that either the electrocardiogram (ECG, lead II/II+) or the continuous arterial blood pressure (ABP/ART) signal had to be present as a reference.
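The signal requirement can be sketched as a set check on the signal names of a record. The function name has_required_signals is hypothetical, and the exact header spelling of the reference signal names (II, II+, ABP, ART) is assumed from the labels given in the text.

```python
# Signal names as listed in the text.
PPG_NAMES = {"PLETH", "PLETH_L", "PLETH_R", "PLETHl", "PLETHr"}
REFERENCE_NAMES = {"II", "II+", "ABP", "ART"}


def has_required_signals(signal_names):
    """Return True if a record has a PPG signal and an ECG or ABP reference."""
    names = set(signal_names)
    return bool(names & PPG_NAMES) and bool(names & REFERENCE_NAMES)
```

For example, a record with signals II and PLETH qualifies, whereas a record with only II and ABP does not.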
Lastly, we excluded any patient who was underage (< 18 years) at the time of hospital admission, i.e., we required admissions.admittime - patients.dob ≥ 18 years.
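One way to evaluate the age condition is to count completed years between the date of birth and the admission time. The sketch below is an assumption about how this could be done, not the authors' implementation; the function names and the timestamp format are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def age_at_admission(dob, admittime):
    """Completed years between date of birth and hospital admission."""
    born = datetime.strptime(dob, FMT)
    admitted = datetime.strptime(admittime, FMT)
    had_birthday = (admitted.month, admitted.day) >= (born.month, born.day)
    return admitted.year - born.year - (0 if had_birthday else 1)


def is_adult(dob, admittime):
    """Apply the inclusion condition: age at admission of at least 18 years."""
    return age_at_admission(dob, admittime) >= 18
```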
This automated screening reduced the number of chartevents rows from 330,712,483 to 233 candidate cardiac arrest events in 113 unique patients.
Manual annotation of cardiac arrests
All 233 candidate cardiac arrest events were manually reviewed by a researcher with experience in physiological signals and cardiac events. The PPG, ECG, and ABP waveforms were inspected around the event timestamp, together with, if present, the discharge notes in the noteevents table corresponding to the hospital admission (noteevents.hadm_id = chartevents.hadm_id). To ensure that only clinically meaningful cardiac arrest episodes were retained, events were excluded whenever one of the following conditions was met:
- The PPG was unavailable/unreliable at the event timestamp, for example, due to motion artifacts or sensor disconnect.
- Both the ECG and ABP were unavailable/unreliable at the event timestamp, for example, due to clipping.
- Ventricular tachycardia was present, but was not associated with loss of pulse (non-life-threatening VT).
- No VF/VT/asystole was present between one hour before and one hour after the event timestamp (false detection).
- The event pointed to the same cardiac arrest episode as another event (duplicate).
All events that were not excluded form the final dataset, which comprises 36 events in 31 unique patients. For each of these events, the start of the cardiac arrest episode was determined based on the onset of loss of pulse in the ABP and/or VT/VF/asystole in the ECG. The end of the episode was determined based on the return of pulse in the ABP and/or the return of an organized rhythm in the ECG. If a waveform record ended before the end of the cardiac arrest, the end timestamp was set to the end of the record.
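The rule for the end timestamp can be sketched as follows. The function episode_bounds and its arguments are hypothetical; the onset and return-of-pulse times come from manual annotation and are simply passed in here.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def episode_bounds(onset, pulse_return, record_end):
    """Return (start, end) of an episode, limited to the end of the record.

    pulse_return is None when no return of pulse or organized rhythm was
    observed before the waveform record ended.
    """
    start = datetime.strptime(onset, FMT)
    rec_end = datetime.strptime(record_end, FMT)
    if pulse_return is None:
        end = rec_end
    else:
        end = min(datetime.strptime(pulse_return, FMT), rec_end)
    return start, end
```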
Flowchart of the study population
The flowchart below shows the number of events excluded at every subsequent exclusion step.
+------------------------------+
| All chart events             |
| 330,712,483 events           |     +------------------------------+
| 46,467 unique patients       |     | Excluded events              |
+------------------------------+  +->| No rhythm classification     |
               |                  |  | event (325,385,660)          |
               |------------------+  +------------------------------+
               v
+------------------------------+     +------------------------------+
| Rhythm classification events |     | Excluded classifications     |
| 5,326,823 events             |     | Sinus rhythms (4,215,743)    |
| 38,370 unique patients       |     | Atrial rhythms (656,029)     |
+------------------------------+     | Paced rhythms (308,885)      |
               |                     | Conduction abnorm. (120,080) |
               |-------------------->| Junctional rhythms (13,867)  |
               v                     | Supravent. tach. (6,184)     |
+------------------------------+     | Idioventricular (952)        |
| VT/VF or asystole events     |     | Other rhythms (2,376)        |
| 2,707 events                 |     +------------------------------+
| 1,473 unique patients        |
+------------------------------+     +------------------------------+
               |                     | Waveform record mismatch     |
               |-------------------->| No record found (208)        |
               v                     | Event not in record (2,152)  |
+------------------------------+     +------------------------------+
| Events with waveforms        |
| 347 events                   |
| 189 unique patients          |
+------------------------------+     +------------------------------+
               |                     | Signals missing              |
               |-------------------->| No PPG (114)                 |
               v                     | No ECG and ABP (0)           |
+------------------------------+     +------------------------------+
| Events with PPG & ECG/ABP    |
| 233 events                   |
| 113 unique patients          |
+------------------------------+     +------------------------------+
               |                     | Age restrictions             |
               |-------------------->| Underage (0)                 |
               v                     +------------------------------+
+------------------------------+
| Candidate CA events          |     +------------------------------+
| 233 events                   |     | No cardiac arrest            |
| 113 unique patients          |     | Non-life-threatening VT (86) |
+------------------------------+     | PPG unavailable (47)         |
               |                     | False detection (31)         |
               |-------------------->| PPG unreliable (29)          |
               v                     | Duplicate (3)                |
+------------------------------+     | ECG & ABP unreliable (1)     |
| Cardiac arrest events        |     +------------------------------+
| 36 events                    |
| 31 unique patients           |
+------------------------------+
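The event counts in the flowchart can be checked with simple arithmetic: subtracting the exclusions of each step from the number of events entering it should give the number of events in the next box. The short script below performs this check using the counts from the flowchart; the step labels are shortened for readability.

```python
all_events = 330_712_483

# (resulting box, exclusion counts of the step leading to it)
steps = [
    ("rhythm classification events", [325_385_660]),
    ("VT/VF or asystole events",
     [4_215_743, 656_029, 308_885, 120_080, 13_867, 6_184, 952, 2_376]),
    ("events with waveforms", [208, 2_152]),
    ("events with PPG & ECG/ABP", [114, 0]),
    ("candidate CA events", [0]),
    ("cardiac arrest events", [86, 47, 31, 29, 3, 1]),
]

remaining = all_events
counts = {}
for name, excluded in steps:
    remaining -= sum(excluded)
    counts[name] = remaining
```

The resulting values are 5,326,823, 2,707, 347, 233, 233, and 36 events, in agreement with the boxes of the flowchart.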
Data Description
The annotations of the cardiac arrest events are included in the dataset.csv file.
The row_id column links to the chartevents.row_id column in the MIMIC-III Clinical Database and refers to the specific chart event that led to the discovery of cardiac arrest. The subject_id and hadm_id columns link to several table columns in the clinical database, e.g., patients.subject_id and admissions.hadm_id.
The file column corresponds to the wfdb file in the MIMIC-III Waveform Database Matched Subset.
The cardiac_arrest_start and cardiac_arrest_end columns contain the timestamps of the start and end of the cardiac arrests, in the format YYYY-MM-DD HH:MM:SS. If a record ends before the cardiac arrest ends, the record end is taken as cardiac_arrest_end. Note that the timestamps are shifted in the original datasets for anonymization purposes.
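To locate an episode within a waveform record, the timestamps can be converted to sample indices relative to the start of the record. The sketch below uses a toy stand-in for dataset.csv; the column order, the example row, the record start time, and the sampling rate of 125 Hz are assumptions, and the actual start time and sampling rate should be read from the header of the corresponding waveform record.

```python
import csv
import io
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Toy stand-in for dataset.csv (values invented for illustration).
toy_csv = io.StringIO(
    "row_id,subject_id,hadm_id,file,cardiac_arrest_start,cardiac_arrest_end\n"
    "1,10,100,rec_a,2100-01-01 03:00:00,2100-01-01 03:02:30\n"
)


def to_sample_range(row, record_start, fs):
    """Convert episode timestamps to sample indices within the record."""
    t0 = datetime.strptime(record_start, FMT)
    start = datetime.strptime(row["cardiac_arrest_start"], FMT)
    end = datetime.strptime(row["cardiac_arrest_end"], FMT)
    return (
        round((start - t0).total_seconds() * fs),
        round((end - t0).total_seconds() * fs),
    )


rows = list(csv.DictReader(toy_csv))
# Record start time and sampling rate are assumed here.
sample_range = to_sample_range(rows[0], "2100-01-01 00:00:00", fs=125)
```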
Usage Notes
The dataset is intended to be used for the development of cardiac arrest detection algorithms based on PPG signals. To be able to view the actual cardiac arrest episodes in the waveform records, please acquire the corresponding files from the MIMIC-III Waveform Database Matched Subset. Clinical data from the patients can be acquired from the MIMIC-III Clinical Database.
Please note that this dataset is not exhaustive: it depends on the heart rhythm classification performance of the patient monitors that produced the chart events. Furthermore, because only about a third of the MIMIC-III Waveform Database has been matched to the MIMIC-III Clinical Database, many cases of cardiac arrest in the Waveform Database cannot be identified by the automated process described under Methods.
Release Notes
Version 1.0.0: First release.
Ethics
This project builds upon previously established datasets, which have been de-identified and approved for credentialed distribution. All data used in this research are sourced from the MIMIC-III databases.
Acknowledgements
This work is financed by the PPP Allowance made available by Top Sector Life Sciences & Health to the Dutch Heart Foundation to stimulate public-private partnerships, grant number 01-003-2021-B005, and by Philips Electronics Nederland B.V.
Conflicts of Interest
The authors have no conflicts of interest to declare.
References
1. Kim KB, Baek HJ. Photoplethysmography in Wearable Devices: A Comprehensive Review of Technological Advances, Current Challenges, and Future Directions. Vol. 12, Electronics (Switzerland). Multidisciplinary Digital Publishing Institute (MDPI); 2023.
2. Kim MW, Park SH, Choi MS. Diagnostic Performance of Photoplethysmography-Based Smartwatch for Obstructive Sleep Apnea. J Rhinol. 2022 Nov 1;29(3):155–62.
3. Conroy B, Silva I, Mehraei G, Damiano R, Gross B, Salvati E, et al. Real-time infection prediction with wearable physiological monitoring and AI to aid military workforce readiness during COVID-19. Sci Rep. 2022 Dec 1;12(1).
4. Shah AJ, Isakadze N, Levantsevych O, Vest A, Clifford G, Nemati S. Detecting heart failure using wearables: A pilot study. Physiol Meas. 2020 Apr 1;41(4).
5. Perez MV, Mahaffey KW, Hedlin H, Rumsfeld JS, Garcia A, Ferris T, et al. Large-Scale Assessment of a Smartwatch to Identify Atrial Fibrillation. N Engl J Med. 2019;381(20):1909–17.
6. Lubitz SA, Faranesh AZ, Selvaggi C, Atlas SJ, McManus DD, Singer DE, et al. Detection of Atrial Fibrillation in a Large Population Using Wearable Devices: The Fitbit Heart Study. Circulation. 2022 Nov 8;146(19):1415–24.
7. Nishiyama C, Kiguchi T, Okubo M, Alihodžić H, Al-Araji R, Baldi E, et al. Three-year trends in out-of-hospital cardiac arrest across the world: Second report from the International Liaison Committee on Resuscitation (ILCOR). Resuscitation. 2023;186(December 2022):109757.
8. Hup RG, Linssen EC, Eversdijk M, Verbruggen B, Bak MAR, Habibovic M, et al. Rationale and design of the BECA project: Smartwatch-based activation of the chain of survival for out-of-hospital cardiac arrest. Resusc Plus. 2024 Mar 1;17.
9. Edgar R, Scholte NTB, Ebrahimkheil K, Brouwer MA, Beukema RJ, Mafi-Rad M, et al. Automated cardiac arrest detection using a photoplethysmography wristband: algorithm development and validation in patients with induced circulatory arrest in the DETECT-1 study. Lancet Digit Health. 2024 Mar 1;6(3):e201–10.
10. Schober P, van den Beuken WMF, Nideröst B, Kooy TA, Thijssen S, Bulte CSE, et al. Smartwatch based automatic detection of out-of-hospital cardiac arrest: Study rationale and protocol of the HEART-SAFE project. Resusc Plus. 2022 Dec;12:100324.
11. Shah K, Wang A, Chen Y, Munjal J, Chhabra S, Stange A, et al. Automated loss of pulse detection on a consumer smartwatch. Nature. 2025 Jun 5;642(8066):174–81.
12. Clifford GD, Silva I, Moody B, Li Q, Kella D, Shahin A, et al. The PhysioNet/Computing in Cardiology Challenge 2015: Reducing False Arrhythmia Alarms in the ICU. Comput Cardiol (2010). 2015;42:273–6.
13. Moody GB, Mark RG. The impact of the MIT-BIH Arrhythmia Database. IEEE Eng Med Biol Mag. 2001;20(3):45–50.
14. Johnson AEW, Pollard TJ, Shen L, Lehman LWH, Feng M, Ghassemi M, et al. MIMIC-III, a freely accessible critical care database. Sci Data. 2016 May 24;3.
15. Johnson A, Pollard T, Mark R. PhysioNet. 2016. MIMIC-III Clinical Database (version 1.4).
16. Moody B, Moody G, Villarroel M, Clifford GD, Silva I. PhysioNet. 2020. MIMIC-III Waveform Database Matched Subset (version 1.0).
Parent Projects
Access
Access Policy:
Only credentialed users who sign the DUA can access the files.
License (for files):
PhysioNet Credentialed Health Data License 1.5.0
Data Use Agreement:
PhysioNet Credentialed Health Data Use Agreement 1.5.0
Required training:
CITI Data or Specimens Only Research
Discovery
DOI (version 1.0.0):
https://doi.org/10.13026/ec0n-y377
DOI (latest version):
https://doi.org/10.13026/5a4t-7324
Topics:
ppg
photoplethysmography
mimic-iii
cardiac arrest
out-of-hospital cardiac arrest
ohca
Files
To access the files, you must:
- be a credentialed user
- complete the required training: CITI Data or Specimens Only Research
- sign the data use agreement for the project