Database Restricted Access
ALarms, Outcomes Telemetry with Timing (ALOTT): a Bedside-EMR Database
John Lawrence , Mike Rayo , Timothy Huerta
Published: March 19, 2025. Version: 1.0.0
When using this resource, please cite:
Lawrence, J., Rayo, M., & Huerta, T. (2025). ALarms, Outcomes Telemetry with Timing (ALOTT): a Bedside-EMR Database (version 1.0.0). PhysioNet. RRID:SCR_007345. https://doi.org/10.13026/sbq5-dy17
Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220. RRID:SCR_007345.
Abstract
ALOTT is a pilot project that gathered telemetry data from 270 beds and over 15,000 hospital admissions from September 2018 through November 2020 at The James Cancer Hospital and Ross Heart Hospital. ALOTT contains telemetry waveforms, such as electrocardiogram and blood oxygen, at 60-240Hz and temperature, pulse, and perfusion measurements at two-second intervals linked to Electronic Medical Record (EMR) data. These EMR data include patient admissions, demographics, diagnosis, history, orders, labs, medications, allergies, and nurse flowsheet chart events, including vitals, risk scores, ventilator use, and emergency response team (ERT) events. This dataset was constructed to facilitate the creation of algorithms to predict ERT events.
Background
Past research suggests a gap exists between the information practitioners need to accurately anticipate decompensation events and the information current technologies provide [1]. Practitioners use a set of early warning signs, often weak and non-definitive signals, to anticipate clinical decompensation regardless of the physiologic origin [2-4]. Patient decompensation is a change in the overall ability to maintain physiological function, representing a potentially life-threatening condition requiring immediate intervention and stabilization.
In 2018, The Ohio State University (OSU), in collaboration with General Electric (GE), began to collect telemetry data from all beds at The James Comprehensive Cancer Center and Solove Research Institute (The James) and The Richard M. Ross Memorial Heart Hospital (The Ross) and match that telemetry data with data from the electronic medical record.
Telemetry data is a continuous dataset that is used for bedside patient surveillance. EMR data contains information about the patient in the bed and describes at a higher level who the patient is, what they are experiencing, and their outcomes. By combining telemetry and EMR datasets, we hope to use the continuous elements of the telemetry dataset to predict future decompensation events recorded in the EMR dataset. An example of such a Telemetry-EMR dataset is MIMIC-III [4], and one outcome of its use is sepsis prediction algorithms [5]. However, most of these Telemetry-EMR datasets, including MIMIC-III, do not preserve the time of patient events because of patient privacy concerns. The aim of this collaboration, ALOTT, is to collect data that can be used to identify patterns that predict patient decompensation events and help physicians anticipate them. These patterns can include temporal patterns, because ALOTT preserves the date and time of each patient event relative to the beginning of that patient’s encounter but obfuscates the absolute date and time, thereby continuing to preserve privacy.
Our hope in sharing the ALOTT dataset is to:
- Empower researchers to use these data to improve healthcare outcomes.
- Improve the rigor of existing algorithms through intersubjective verifiability.
- Encourage transparent and reproducible research using a publicly available and well-documented dataset.
- Act as a model for how other institutions can generate a similarly structured dataset from their patient population.
- Provide a dataset that facilitates the identification of temporal patterns.
Methods
ALOTT combines data from electronic medical records and bedside monitors. ALOTT was populated with data collected during routine hospital care, and there was no associated burden with the data collection. Data in ALOTT originated from either the streamed output of bedside monitors or from an EMR information warehouse (IW).
Telemetry Data Collection
The bedside data were collected from all bedside monitors in The James and Ross hospitals by streaming their outputs onto a secure network drive using GE’s Bernoulli system. The data from this system were the sources of Alarm, Measurement, and Waveform data files.
The Bernoulli system outputs bedside data in hierarchical, extensible markup language (XML) files containing up to 30 minutes of data split into two-second elements. Each two-second element contains patient identifiers, alarms, measurements, and waveforms that occurred during that two-second interval. These XML files were transformed into flat files and split into alarm, measurement, and waveform components by patient, hospital admission, and time. The patient identifiers were then mapped to a random non-repeating integer (aMRN) and were assigned a random date offset. The waveform flat files were then split into continuous reads and converted into waveform database (WFDB) files using code modified from the WFDB Python library [6]. The alarm and measurement data were combined into a tabular format and converted into comma-delimited (csv) files. All of the code used to perform these transformations can be found in the ALOTT repository [8].
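The flattening step described above can be sketched as follows. This is a minimal illustration and not the code from the ALOTT repository: the Bernoulli XML schema is not described here, so the element and attribute names (`segment`, `alarm`, `measurement`, `pollTime`, `name`, `value`, `unit`) and the sample values are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: one <segment> per two-second element.
# The real Bernoulli schema may use different names and nesting.
SAMPLE_XML = """
<export>
  <segment pollTime="2019-01-01T00:00:00">
    <alarm name="HR HI"/>
    <measurement name="HR" value="131" unit="bpm"/>
  </segment>
  <segment pollTime="2019-01-01T00:00:02">
    <measurement name="HR" value="128" unit="bpm"/>
  </segment>
</export>
"""


def flatten(xml_text):
    """Split hierarchical two-second elements into flat alarm and measurement rows."""
    alarms, measurements = [], []
    root = ET.fromstring(xml_text.strip())
    for segment in root.iter("segment"):
        poll_time = segment.get("pollTime")
        # Every child row inherits the timestamp of its two-second element.
        for alarm in segment.findall("alarm"):
            alarms.append({"pollTime": poll_time, "alarmName": alarm.get("name")})
        for meas in segment.findall("measurement"):
            measurements.append({
                "pollTime": poll_time,
                "mesname": meas.get("name"),
                "mtext": meas.get("value"),
                "muom": meas.get("unit"),
            })
    return alarms, measurements


alarms, measurements = flatten(SAMPLE_XML)
```

Each output list can then be written to its own flat file, which mirrors the split into alarm and measurement components.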
Some examples of data generated from bedside monitors include:
- Time-stamped alarms such as dissolved oxygen and heart rate high or low
- Two-second resolution vital measurements such as temperature and respiratory rate
- 240 Hz cardiac waveforms
EMR Data Collection
The other data source in the ALOTT dataset originated from the EMR. These data were generated using the EMR’s IW through an honest broker request. We requested data at the hospital admission level related to patient demographics, social history, orders, vitals, and diagnosis. The honest broker provided data for patients that were included in the bedside monitor dataset, including:
- Primary and secondary diagnosis for the admission
- Labs, Procedures, and Medications administered and performed during the admission
- Selected, timestamped nursing flowsheets related to vitals, risk scores, and emergency response events
The honest broker provided data as SQL tables. These tables were deidentified, date-shifted, and exported as .csv flat files.
The demographics of the ALOTT dataset are:
- 57.8% of patients are male
- The median length of stay (LOS) in the ICU is 128 hours (Q1: 75, Q3: 235)
- The median LOS of a hospital admission is 69 hours (Q1: 33.5, Q3: 131)
- Each hospital admission has, on average:
  - 779 charted observations
  - 266 laboratory measurements
  - 28,193 telemetry alarms
  - 524,744 two-second waveform segments at 240 Hz
  - 1,020,071 two-second resolution telemetry measurements
Data De-identification
Data from bedside monitors initially contained medical record numbers (MRN) and other patient identifiers. All patient identifiers other than MRN, contact serial number (CSN) and dates were removed during data processing. All patients were assigned a random nonrepeating identifier (aMRN), and each aMRN was assigned a random offset between -180 and 180 days. All dates were shifted by that random offset. The time of day was not shifted. Each CSN was also masked to a random non-repeating identifier (aCSN).
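The masking and date-shifting scheme can be illustrated with a short sketch. It is not the project's de-identification code: the function names, the range that masked identifiers are drawn from, and the sample MRNs are assumptions made for illustration. It shows why a per-patient offset in whole days keeps both the time of day and the intervals between a patient's events intact.

```python
import random
from datetime import datetime, timedelta


def build_masks(mrns, seed=None):
    """Give each MRN a unique random aMRN and a day offset in [-180, 180]."""
    rng = random.Random(seed)
    unique = sorted(set(mrns))
    # Sampling without replacement guarantees non-repeating identifiers.
    # The ID range used here is an arbitrary choice for the sketch.
    masked_ids = rng.sample(range(1, 10 * len(unique) + 1), len(unique))
    return {
        mrn: {"aMRN": amrn, "offset_days": rng.randint(-180, 180)}
        for mrn, amrn in zip(unique, masked_ids)
    }


def shift(timestamp, offset_days):
    """Shift the date by whole days; the time of day is left unchanged."""
    return timestamp + timedelta(days=offset_days)


masks = build_masks(["A1", "B2", "C3"], seed=0)  # synthetic MRNs
admitted = datetime(2019, 3, 5, 14, 30)
event = datetime(2019, 3, 6, 2, 15)
offset = masks["A1"]["offset_days"]
shifted_admitted = shift(admitted, offset)
shifted_event = shift(event, offset)
```

Because every timestamp belonging to one patient moves by the same number of days, `shifted_event - shifted_admitted` equals `event - admitted`.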
Data from the EMR only contained MRN and contact serial numbers as patient identifiers, as the honest broker removed all other identifiers. CSNs are the unique identifier for a hospital visit or encounter. The honest broker did not provide ages, and it is not a populated variable in this dataset.
EMR data were mapped to telemetry data using MRN and location-time data. The mapping algorithm first used MRN and date to map telemetry data to hospital admissions. If it could not find a match using MRN and date, it then used location and date to determine which patient should have been in a particular hospital bed at that time. Once an MRN for a telemetry file was identified, the dates of that file were shifted by the random offset assigned to that MRN. Once all EMR and telemetry files were mapped to aMRN and aCSN and all dates were shifted, the mapping table was deleted to convert the coded limited dataset into a limited dataset.
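The two-stage matching logic can be sketched as below. This is an assumption-laden illustration, not the algorithm in the ALOTT repository: it treats "date" matching as a timestamp falling inside an admission's start and end times, and the field names and sample admissions are synthetic.

```python
from datetime import datetime

# Synthetic admissions; field names are hypothetical.
ADMISSIONS = [
    {"mrn": "A1", "bed": "ICU-01",
     "start": datetime(2019, 1, 1, 8), "end": datetime(2019, 1, 4, 12)},
    {"mrn": "B2", "bed": "ICU-02",
     "start": datetime(2019, 1, 2, 9), "end": datetime(2019, 1, 3, 18)},
]


def match_admission(admissions, when, mrn=None, bed=None):
    """Return the admission covering `when`: try MRN first, then fall back to bed."""
    def covers(adm):
        return adm["start"] <= when <= adm["end"]

    # Stage 1: match on MRN and time.
    if mrn is not None:
        for adm in admissions:
            if adm["mrn"] == mrn and covers(adm):
                return adm
    # Stage 2: fall back to whoever occupied the bed at that time.
    if bed is not None:
        for adm in admissions:
            if adm["bed"] == bed and covers(adm):
                return adm
    return None


by_mrn = match_admission(ADMISSIONS, datetime(2019, 1, 2, 10), mrn="A1")
by_bed = match_admission(ADMISSIONS, datetime(2019, 1, 2, 10), mrn=None, bed="ICU-02")
```

A telemetry file with no usable MRN is therefore still assigned to an admission as long as its bed and time identify exactly one patient.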
The Institutional Review Board (IRB) of OSUWMC approved the project. The requirement for individual patient consent was waived because the project did not impact clinical care, and all protected health information was deidentified through an IRB-approved process.
Data Description
ALOTT comprises ten EMR IW files and a collection of WFDB and csv telemetry files that are grouped by patient and admission. All files can be linked to one another using aMRN (the masked patient identifier) or aCSN (the masked hospital admission identifier).
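As a minimal sketch of linking tables on the shared identifiers, the example below groups flowsheet rows under their admission using aCSN. Only the `aMRN` and `aCSN` keys come from the description above; the other column names and all values are synthetic, and the data dictionary in the repository should be consulted for the actual columns.

```python
import csv
import io
from collections import defaultdict

# Synthetic stand-ins for two of the csv tables.
ADMISSIONS_CSV = "aMRN,aCSN,los_hours\n101,9001,72\n101,9002,30\n"
FLOWSHEET_CSV = (
    "aMRN,aCSN,entry\n"
    "101,9001,vitals\n"
    "101,9001,risk score\n"
)


def read_rows(text):
    """Parse csv text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def index_by(rows, key):
    """Group rows by the value of one column."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)
    return grouped


admissions = read_rows(ADMISSIONS_CSV)
flowsheet_by_admission = index_by(read_rows(FLOWSHEET_CSV), "aCSN")

# One entry per admission, holding that admission's flowsheet rows.
linked = {
    adm["aCSN"]: flowsheet_by_admission.get(adm["aCSN"], [])
    for adm in admissions
}
```

The same pattern applies to the telemetry csv files, since their folder and file names carry the same aMRN and aCSN values.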
The ten EMR IW csv tables are:
The allergies, demographics, and social_Hx tables contain information about the patient. Every row of the allergies table details a specific allergen and reaction for a patient. Every row of the demographics table contains a patient’s reported race, ethnicity, marital status, sex, and language, along with blood type. Every row of the social_Hx table contains a patient’s reported tobacco, alcohol, and non-medical drug use.
The admissions table contains information about hospital admissions. Every row of this table includes when and where the patient was admitted and discharged, the primary and secondary diagnosis, and the length of stay of the admission.
The location and intensive care unit (ICU) tables contain information about where patients were during their admission and whether they were in an inpatient or ICU bed. Every row of these tables contains information about when a patient was in a bed and how long they were there.
The orders, lab, and medication tables contain information about what was ordered for the patient, when that order occurred, and the result of that order. Every row of these tables is an order, when that order was administered, and its result.
The flowsheet table contains time-stamped, provider-entered data about the patient. Every row of this table is an entry that contains vitals, risk scores, and emergency response events.
The telemetry data are organized into folders by patient (aMRN) and then by that patient's hospital admissions (aCSN). Each folder contains the alarm, measurement, and waveform files belonging to that hospital admission. Alarm files are named ALARMp<aMRN>e<aCSN>.csv, measurement files are named MEASUREMENTp<aMRN>e<aCSN>.csv, and waveform files are named p<aMRN>e<aCSN>d<yyyymmdd>T<hhmmss>.hea and p<aMRN>e<aCSN>d<yyyymmdd>T<hhmmss><A-Z>.dat. One alarm and measurement file exists for each hospital admission. One .hea and one or more .dat files exist for each continuous waveform on each day. The d and T components of the filenames are the date and time that a waveform starts.
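The naming convention above can be parsed with two regular expressions. This is a sketch under one assumption: aMRN and aCSN appear in filenames as plain non-negative integers. The function name `parse_name` and the sample filenames in the usage line are illustrative only.

```python
import re
from datetime import datetime

# ALARMp<aMRN>e<aCSN>.csv and MEASUREMENTp<aMRN>e<aCSN>.csv
TABLE_RE = re.compile(
    r"^(?P<kind>ALARM|MEASUREMENT)p(?P<aMRN>\d+)e(?P<aCSN>\d+)\.csv$"
)
# p<aMRN>e<aCSN>d<yyyymmdd>T<hhmmss>.hea and ...<A-Z>.dat
WAVE_RE = re.compile(
    r"^p(?P<aMRN>\d+)e(?P<aCSN>\d+)"
    r"d(?P<date>\d{8})T(?P<time>\d{6})(?P<part>[A-Z]?)\.(?P<ext>hea|dat)$"
)


def parse_name(filename):
    """Return the identifiers encoded in a telemetry filename, or None."""
    m = TABLE_RE.match(filename)
    if m:
        return {
            "kind": m["kind"].lower(),
            "aMRN": int(m["aMRN"]),
            "aCSN": int(m["aCSN"]),
        }
    m = WAVE_RE.match(filename)
    if m:
        return {
            "kind": "waveform",
            "aMRN": int(m["aMRN"]),
            "aCSN": int(m["aCSN"]),
            # Start of the continuous waveform (date-shifted, like all dates).
            "start": datetime.strptime(m["date"] + m["time"], "%Y%m%d%H%M%S"),
            "part": m["part"] or None,  # letter suffix on .dat files
            "ext": m["ext"],
        }
    return None


info = parse_name("p12e345d20190102T030405A.dat")
```

Grouping parsed waveform names by `(aMRN, aCSN, start)` collects the .hea file and all of its .dat parts for one continuous read.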
The alarm table contains timestamped bedside monitor alarms and their source for that patient. Every row of this table contains the alarms occurring during a two-second period.
- alarmName is the name of the alarm that has been triggered.
- pollTime is the time at which this alarm is active.
- setLow and setHigh are the normal range that would not activate the alarm.
- chanValue is the value that triggered the alarm.
- sil is equal to "1" if the alarm was silenced.
- inactivationState states if the alarm has been disabled.
- abnormalFlags are a textual representation of the level and type of alarm reported.
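A short sketch of working with these alarm fields: the example counts alarms by name and skips rows where sil is "1". The column names follow the field list above, but the alarm names, timestamp format, and values in the sample are synthetic and may not match the real files.

```python
import csv
import io
from collections import Counter

# Synthetic rows using a subset of the alarm table columns.
SAMPLE_ALARMS = """alarmName,pollTime,setLow,setHigh,chanValue,sil
HR HI,2019-01-01 00:00:00,50,120,131,0
HR HI,2019-01-01 00:00:02,50,120,133,1
SPO2 LO,2019-01-01 00:00:02,90,100,86,0
"""


def count_alarms(csv_text, include_silenced=False):
    """Count alarm rows per alarmName, optionally keeping silenced rows."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not include_silenced and row["sil"] == "1":
            continue
        counts[row["alarmName"]] += 1
    return counts


active_counts = count_alarms(SAMPLE_ALARMS)
all_counts = count_alarms(SAMPLE_ALARMS, include_silenced=True)
```

Because each row covers a two-second period, a sustained alarm appears as many consecutive rows, so these counts measure alarm duration rather than distinct alarm episodes.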
The measurements table contains non-waveform measurements that are gathered by bedside monitors for that patient. Every row of this table is a bedside measurement for a device, the units of that device, and where that device is located.
- pollTime is when a measurement is being collected.
- mesname is the name of that measurement.
- msite is where that measurement is being collected from.
- muom is the unit of measure for that measurement.
- mtext is the value of that measurement.
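Since the measurements table stores one measurement per row, analyses often need it reshaped so that each pollTime has one column per measurement. The sketch below does this with the standard library. The column names follow the field list above; the measurement names, sites, units, and values are synthetic.

```python
import csv
import io

# Synthetic rows in the long (one measurement per row) layout.
SAMPLE_MEASUREMENTS = """pollTime,mesname,msite,muom,mtext
2019-01-01 00:00:00,HR,ECG,bpm,88
2019-01-01 00:00:00,TEMP,SKIN,C,36.9
2019-01-01 00:00:02,HR,ECG,bpm,90
"""


def pivot(csv_text):
    """Map each pollTime to a {mesname: value} dictionary."""
    wide = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            value = float(row["mtext"])
        except ValueError:
            value = row["mtext"]  # keep non-numeric values as text
        wide.setdefault(row["pollTime"], {})[row["mesname"]] = value
    return wide


wide = pivot(SAMPLE_MEASUREMENTS)
```

If the same measurement name is reported from more than one site at the same time, the key would need to include msite as well; this sketch keeps only the last value seen.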
The WFDB files are split into continuous reads by patient. These files contain all waveforms that were collected for a patient during an admission by day.
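A waveform record can be read with the WFDB Python package, which takes the record name without the .hea extension. The sketch below assumes the `wfdb` package is installed; the helper names `record_name` and `load_waveform` are illustrative and not part of the dataset's example code.

```python
from pathlib import Path


def record_name(header_path):
    """Strip the .hea extension to get the record name WFDB expects."""
    path = Path(header_path)
    if path.suffix != ".hea":
        raise ValueError(f"expected a .hea header file, got {path.name}")
    return str(path.with_suffix(""))


def load_waveform(header_path):
    """Read one continuous waveform record and return signals and metadata."""
    import wfdb  # third-party package, imported lazily: pip install wfdb

    record = wfdb.rdrecord(record_name(header_path))
    # p_signal is a (samples x channels) array of physical values,
    # fs is the sampling frequency, sig_name lists the channel names.
    return record.p_signal, record.fs, record.sig_name
```

For example, `load_waveform("p12e345d20190102T030405.hea")` would read the header and the .dat files it references, provided they sit in the same folder; the filename here is a made-up instance of the naming pattern.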
A full data dictionary can be found in the repository [8].
Usage Notes
ALOTT is provided as a collection of 10 csv files and folders containing csv and WFDB files by patient, along with example code for using the data with Spark in Python. While the dataset has all 18 Health Insurance Portability and Accountability Act (HIPAA) identifiers removed, researchers must still treat these data carefully and abide by the rules defined in the Data Use Agreement (DUA). Before users can access the ALOTT dataset, they must sign a DUA outlining appropriate security standards and forbidding efforts to identify individual patients.
Our primary goal in releasing the ALOTT dataset is to enable the prediction of ERT events in the ICU. By analyzing the continuous telemetry data alongside EMR data, we hope to identify early warning signs and patterns that precede critical events, allowing for timely intervention and the potential of improved patient outcomes. Example use cases include:
- Predictive Modeling: Researchers can develop and validate predictive algorithms for early identification of patient decompensation, sepsis, cardiac events, and other critical conditions.
- Clinical Decision Support: The dataset can be used to enhance clinical decision support systems by integrating predictive models that utilize telemetry and EMR data, similar to the use of MIMIC.
- Machine Learning Research: The dataset provides a rich resource for testing and improving machine learning models, especially those focusing on time-series analysis and anomaly detection.
Together, we believe these efforts will lead us towards the development of robust algorithms that alert medical staff to impending emergencies, thereby enhancing patient safety and care efficiency.
Release Notes
Version 1.0.0 is the first public release of the ALOTT Database.
Ethics
This project was approved by the Institutional Review Boards (IRB) of the Wexner Medical Center and The Ohio State University (Columbus, Ohio) under IRB identifiers 2013H0419 and 2019H0058. Requirement for individual patient consent was waived because the project did not impact clinical care and all protected health information was deidentified.
Conflicts of Interest
The authors have no conflicts of interest to declare.
References
1. Horwood, Chelsea R., et al. “Gaps Between Alarm Capabilities and Decision-Making Needs: An Observational Study of Detecting Patient Decompensation.” Proceedings of the International Symposium on Human Factors and Ergonomics in Health Care, vol. 7, no. 1, 2018, pp. 112–116., doi:10.1177/2327857918071028.
2. Ludikhuize, J., Smorenburg, S. M., de Rooij, S. E., & de Jonge, E. (2012). Identification of deteriorating patients on general wards; measurement of vital parameters and potential effectiveness of the Modified Early Warning Score. J Crit Care, 27(4), 424.e427-413. doi:10.1016/j.jcrc.2012.01.003
3. Wickens, T. D. (2008). Elementary Signal Detection Theory. In: Oxford Univ. Press.
4. Johnston, M. J., Arora, S., Pucher, P. H., Reissis, Y., Hull, L., Huddy, J. R., . . . Darzi, A. (2016). Improving Escalation of Care: Development and Validation of the Quality of Information Transfer Tool. Ann Surg, 263(3), 477-486. doi:10.1097/SLA.0000000000001164
5. Scherpf, Matthieu, et al. "Predicting sepsis with a recurrent neural network using the MIMIC III database." Computers in biology and medicine 113 (2019): 103395.
6. Xie, C., McCullum, L., Johnson, A., Pollard, T., Gow, B., & Moody, B. (2023). Waveform Database Software Package (WFDB) for Python (version 4.1.0). PhysioNet. https://doi.org/10.13026/9njx-6322.
7. Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
8. Lawrence, J. (2023). Pre Publication Release (Version v0.0.1) [Computer software]. https://doi.org/10.5281/zenodo.7922122
Access
Access Policy:
Only registered users who sign the specified data use agreement can access the files.
License (for files):
PhysioNet Restricted Health Data License 1.5.0
Data Use Agreement:
PhysioNet Restricted Health Data Use Agreement 1.5.0
Discovery
DOI (version 1.0.0):
https://doi.org/10.13026/sbq5-dy17
DOI (latest version):
https://doi.org/10.13026/46e5-hh23
Project Website:
https://github.com/johnclawrence/Carescape
Files
- sign the data use agreement for the project