Database Open Access

MIMIC-IV-ECG Demo - Diagnostic Electrocardiogram Matched Subset Demo

Brian Gow, Tom Pollard, Larry A Nathanson, Benjamin Moody, Alistair Johnson, Dana Moukheiber, Nathaniel Greenbaum, Seth Berkowitz, Parastou Eslami, Elizabeth Herbst, Roger Mark, Steven Horng

Published: June 30, 2022. Version: 0.1

When using this resource, please cite:
Gow, B., Pollard, T., Nathanson, L. A., Moody, B., Johnson, A., Moukheiber, D., Greenbaum, N., Berkowitz, S., Eslami, P., Herbst, E., Mark, R., & Horng, S. (2022). MIMIC-IV-ECG Demo - Diagnostic Electrocardiogram Matched Subset Demo (version 0.1). PhysioNet.

Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.


The MIMIC-IV-ECG Demo module contains 659 diagnostic electrocardiograms across 92 unique patients. These 92 patients overlap with the patients from the MIMIC-IV Clinical Demo and are also part of the MIMIC-IV Clinical Database. These diagnostic ECGs use 12 leads and are 10 seconds in length. They are sampled at 500 Hz. This open access demo allows researchers to assess whether the MIMIC-IV-ECG module is appropriate for their research and facilitates demonstration of these diagnostic ECGs.


An electrocardiogram (ECG or EKG) measures the electrical activity associated with the heart [1]. Diagnostic ECGs are a standard part of a patient's care [2]. The standard ECG leads are denoted as leads I, II, III, aVF, aVR, aVL, V1, V2, V3, V4, V5, and V6. Diagnostic ECGs are routinely obtained when a patient is admitted to the Emergency Department (ED) or to a hospital floor. ECGs will typically be repeated for patients who exhibit cardiac symptoms such as chest pain or abnormal rhythms. Daily ECGs may be obtained following acute cardiovascular events such as myocardial infarction. Patients in the Intensive Care Unit (ICU) are continuously monitored to detect rhythm abnormalities, but full ECGs are needed to evaluate evidence of cardiac ischemia or infarction. However, diagnostic ECGs typically make up only a small part of understanding the overall condition of a subject at the hospital. To fully understand how best to treat a given patient, a broader set of data is collected, which may include patient demographics, diagnoses, medications, lab tests, and additional information.

This broader set of clinical information is shared as part of the MIMIC-IV Clinical Database [3]. The MIMIC-IV-ECG Matched Subset Demo contains ECGs for 92 of the 100 subjects in the MIMIC-IV Clinical Database Demo [4]. These 92 subjects are a small subset of the forthcoming full version of the MIMIC-IV-ECG module. This demo database is open access, giving researchers a means of assessing whether the full ECG database is appropriate for their research. It also provides easy access for demonstration in the classroom or at workshops. The overlap of subjects between the MIMIC-IV Clinical Database Demo and the MIMIC-IV-ECG Demo provides a means to link subjects to their clinical data. This gives a researcher a better understanding of the patient's health status.


As part of routine care, diagnostic ECGs are collected across Beth Israel Deaconess Medical Center (BIDMC). These diagnostic ECGs are collected on machines from various manufacturers. When the ECG is collected, the machine is populated with the patient's demographics and their medical record number (MRN).

Patients from the MIMIC-IV Clinical Database who had ECGs collected between 2008 and 2019 are included as part of MIMIC-IV-ECG and this associated demo database. We converted these ECGs from the manufacturers' formats to the open WFDB format [5], with each WFDB record consisting of a header (.hea) file and a signal (.dat) file. The files were then transferred from BIDMC to MIT for additional processing.

At MIT, the files were matched to the subjects in the MIMIC-IV Clinical Database and further deidentified. The patient's MRN was used to match a given 12-lead ECG to the corresponding subject ID in the MIMIC-IV Clinical Database.

We scrubbed the WFDB header files for PHI such that only the signal information and the subject ID are provided. As another part of the deidentification, the date-time information was shifted to obscure the actual date and time. The shifted date-times were matched against date-times in the subject's MIMIC-IV Clinical Database records. Therefore, timestamps for events in the MIMIC-IV Clinical Database and the associated demo, such as drug administration, are aligned with the timestamps in the MIMIC-IV-ECG Demo. However, some of the diagnostic ECGs provided here were collected outside of ED or Intensive Care Unit (ICU) visits at the hospital. Since the MIMIC-IV Clinical Database is comprised solely of ED and ICU data, the ECG timestamp can occur before or after a visit from the clinical database. 
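The alignment described above holds whenever one shift is applied to all of a subject's timestamps. The snippet below is only an illustrative sketch: the offset and the unshifted timestamps are made up, and it assumes a single fixed offset per subject, which is not confirmed here as the exact deidentification procedure.

```python
from datetime import datetime, timedelta

# Hypothetical offset for one subject; the real offsets are not published.
SHIFT = timedelta(days=36500, hours=3)


def shift(ts: datetime, offset: timedelta = SHIFT) -> datetime:
    """Apply a subject-level offset to one timestamp."""
    return ts + offset


# Made-up unshifted timestamps for an ECG and a drug administration.
ecg_time = datetime(2013, 5, 1, 13, 58)
drug_time = datetime(2013, 5, 1, 15, 30)

# The interval between the two events is unchanged by the shift.
interval_before = drug_time - ecg_time
interval_after = shift(drug_time) - shift(ecg_time)
print(interval_before == interval_after)  # True
```

Because intervals are preserved, the relative ordering of an ECG and a clinical event can still be studied even though the absolute dates are obscured.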

Data Description

A total of 659 ten-second-long 12-lead diagnostic ECGs across 92 unique subjects are provided in the MIMIC-IV-ECG Demo. The ECGs are sampled at 500 Hz. The patients in this module have been matched with the MIMIC-IV Clinical Database Demo. All available diagnostic ECGs for a particular patient have been placed under a single subdirectory, named according to the patient's MIMIC-IV subject ID.
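As a quick sanity check, the figures above imply a fixed number of samples per record. The arithmetic below is a minimal sketch that assumes every record has exactly the stated length, sampling frequency, and lead count; the constant names are chosen here for illustration.

```python
FS_HZ = 500        # sampling frequency stated above
DURATION_S = 10    # record length in seconds
N_LEADS = 12       # leads per diagnostic ECG
N_RECORDS = 659    # records in the demo

samples_per_lead = FS_HZ * DURATION_S
samples_per_record = samples_per_lead * N_LEADS
total_samples = samples_per_record * N_RECORDS

print(samples_per_lead)    # 5000
print(samples_per_record)  # 60000
print(total_samples)       # 39540000
```

When a record is read with WFDB-Python, its fs, sig_len, and n_sig attributes can be compared against these expected values.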

Each waveform record path is named as files/pXXXXXXX/sZZZZZZZZ/ZZZZZZZZ, where XXXXXXX is the subject ID, and ZZZZZZZZ is the study ID. An example of the file structure is as follows:

├── p10001725
│   └── s102147240
│       ├── 102147240.dat
│       └── 102147240.hea
└── p10023771
    ├── s104496507
    │   ├── 104496507.dat
    │   └── 104496507.hea
    ├── s108135749
    │   ├── 108135749.dat
    │   └── 108135749.hea
    └── s105384473
        ├── 105384473.dat
        └── 105384473.hea

Above we find two subjects: p10001725 and p10023771. Subject p10001725 has one study: s102147240. Subject p10023771 has three studies: s104496507, s108135749, and s105384473. The study identifiers are completely random, and their order says nothing about the chronological order of the actual studies. Each study has a like-named .hea and .dat file, which together make up the WFDB record.
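The naming convention can be expressed as a small helper. This is a sketch only: the function name record_path and the default root "files" are choices made here for illustration, with the root assumed to be the top-level directory of the downloaded project.

```python
from pathlib import PurePosixPath


def record_path(subject_id: int, study_id: int, root: str = "files") -> str:
    """Build the WFDB record path (no file extension) for one study."""
    return str(PurePosixPath(root) / f"p{subject_id}" / f"s{study_id}" / str(study_id))


print(record_path(10001725, 102147240))
# files/p10001725/s102147240/102147240
```

The returned path omits the extension because the .hea and .dat files share the same base name.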

The record_list.csv file contains the file name and path for each WFDB record. It also provides the corresponding subject ID and study ID. The subject ID can be used to link a subject from the MIMIC-IV-ECG Demo to the other modules in the MIMIC-IV Clinical Database and the associated demo. 
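A lookup of all records for one subject might look like the sketch below. The column names subject_id, study_id, and path are assumptions and should be checked against the header row of the actual record_list.csv. The in-memory CSV is a stand-in built from the example file tree shown earlier.

```python
import csv
import io


def record_paths_for_subject(csv_file, subject_id, subject_col="subject_id", path_col="path"):
    """Return the record paths listed for one subject."""
    reader = csv.DictReader(csv_file)
    return [row[path_col] for row in reader if row[subject_col] == str(subject_id)]


# Stand-in for record_list.csv (assumed column names, example records only).
sample_csv = io.StringIO(
    "subject_id,study_id,path\n"
    "10001725,102147240,files/p10001725/s102147240/102147240\n"
    "10023771,104496507,files/p10023771/s104496507/104496507\n"
    "10023771,108135749,files/p10023771/s108135749/108135749\n"
    "10023771,105384473,files/p10023771/s105384473/105384473\n"
)

paths = record_paths_for_subject(sample_csv, 10023771)
print(len(paths))  # 3
```

With the downloaded file, the same function can be called on a handle opened with open("record_list.csv", newline="").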

Usage Notes

This MIMIC-IV-ECG Demo provides a small subset of the forthcoming MIMIC-IV-ECG Database. These diagnostic ECGs provide a new, potentially important, piece of information for researchers using MIMIC-IV. 

A limitation of this dataset is that the 12-lead ECG timestamps may not be perfectly synchronized with the other waveforms in MIMIC, as they are collected from different machines. An additional limitation, as noted above, is that some of the ECGs provided here were collected outside of the ED and ICU at the hospital. This means that the timestamps for those ECGs won't overlap with data from the MIMIC-IV Clinical Database. Finally, this database consists solely of the diagnostic ECG waveforms themselves. The reports associated with each ECG are not included and will be released at a later time.

The signals can be viewed in Lightwave by clicking the Visualize waveforms links in the Files section below. Additionally, the signals can be read by using the WFDB toolboxes provided on PhysioNet: WFDB (in C) [7], WFDB-Matlab [8], and WFDB-Python [9]. Here is a basic script for reading a downloaded record from this project and plotting it by using the WFDB-Python toolbox:

import wfdb 
rec_path = 'files/p10001725/s102147240/102147240'  # relative to the download directory, no file extension
rd_record = wfdb.rdrecord(rec_path) 
wfdb.plot_wfdb(record=rd_record, figsize=(24,18), title='Study 102147240 example', ecg_grids='all')

where rec_path is the path to the name of the .hea and .dat files for the record you'd like to plot.
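Because a record needs both its .hea and .dat file, it can help to check for them before reading. The helper below is a minimal sketch using only the standard library. The function name missing_record_files is introduced here for illustration and is not part of any WFDB toolbox.

```python
from pathlib import Path


def missing_record_files(rec_path):
    """Return the names of the .hea/.dat files that are absent for a record."""
    base = Path(rec_path)
    expected = [base.parent / (base.name + ext) for ext in (".hea", ".dat")]
    return [p.name for p in expected if not p.is_file()]


# For a path that does not exist, both files are reported as missing.
print(missing_record_files("no_such_dir/102147240"))
# ['102147240.hea', '102147240.dat']
```

An empty list means both files are present and the record path can be passed to the reading function.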

Here we provide an example of how subject p10023771 from the MIMIC-IV-ECG Demo can be linked to their admission information in the MIMIC-IV Clinical Database.  Executing this from BigQuery [6]:

SELECT * FROM `physionet-data.mimic_core.admissions` WHERE subject_id=10023771

we see that the patient has only one admission to the hospital, with an admittime = 2113-08-25T07:15:00 and a dischtime = 2113-08-30T14:15:00.

Next, we get the timestamps from the diagnostic ECGs by checking base_date and base_time and save the result to a csv file:

from pathlib import Path
import pandas as pd

import wfdb

# get the path to all the study .hea files for p10023771
paths = list(Path("p10023771/.").rglob("*.hea"))

# get date and time for each study
date_times = {'study':[],'date':[],'time':[]} # use a dictionary to store the date and time for each study
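for file in paths:
    study = file.stem
    # rdheader expects the record path without the .hea extension
    metadata = wfdb.rdheader(f'{file.parent}/{file.stem}')
    # store the study ID with its (shifted) date and time
    date_times['study'].append(study)
    date_times['date'].append(metadata.base_date)
    date_times['time'].append(metadata.base_time)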

df_date_times = pd.DataFrame(data=date_times)
df_date_times.to_csv('p10023771_date_times.csv', index=False)

We observe the following for the 3 diagnostic ECGs for p10023771:

study datetime
104496507 2110-07-23T08:43
108135749 2113-08-19T07:18
105384473 2113-08-25T13:58

where the date is given before the T as YYYY-MM-DD and the time is given after the T as HH:MM. Comparing this to the subject's admission in the MIMIC-IV Clinical Database:

admittime dischtime
2113-08-25T07:15 2113-08-30T14:15

we observe that s104496507 and s108135749 occurred prior to their only hospital admission while s105384473 occurred during their hospital admission. 
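The comparison above can also be done programmatically. The sketch below combines a date and a time into one timestamp and classifies each study relative to the admission window. It assumes base_date and base_time are available as datetime.date and datetime.time objects. The helper names are introduced here for illustration, and the values are copied from the tables above.

```python
from datetime import date, datetime, time


def to_datetime(base_date: date, base_time: time) -> datetime:
    """Combine a record's date and time into a single timestamp."""
    return datetime.combine(base_date, base_time)


def relative_to_admission(ecg_time: datetime, admittime: datetime, dischtime: datetime) -> str:
    """Classify an ECG as before, during, or after an admission."""
    if ecg_time < admittime:
        return "before"
    if ecg_time > dischtime:
        return "after"
    return "during"


admittime = datetime.fromisoformat("2113-08-25T07:15")
dischtime = datetime.fromisoformat("2113-08-30T14:15")

ecg_times = {
    "104496507": to_datetime(date(2110, 7, 23), time(8, 43)),
    "108135749": to_datetime(date(2113, 8, 19), time(7, 18)),
    "105384473": to_datetime(date(2113, 8, 25), time(13, 58)),
}

labels = {s: relative_to_admission(t, admittime, dischtime) for s, t in ecg_times.items()}
for study, label in labels.items():
    print(study, label)
# 104496507 before
# 108135749 before
# 105384473 during
```

For subjects with several admissions, the same check would need to be repeated for each admission window.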


The project was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center (Boston, MA) and the Massachusetts Institute of Technology (Cambridge, MA). Requirement for individual patient consent was waived because the project did not impact clinical care and all protected health information was deidentified.


NRG is supported by National Institutes of Health National Library of Medicine Biomedical Informatics and Data Science Research Training Program grant number T15LM007092-30.

Conflicts of Interest

The author(s) have no conflicts of interest to declare.


  1. Geselowitz DB. On the theory of the electrocardiogram. Proceedings of the IEEE. 1989 Jun;77(6):857-76.
  2. Harris PR. The Normal electrocardiogram: resting 12-Lead and electrocardiogram monitoring in the hospital. Critical Care Nursing Clinics. 2016 Sep 1;28(3):281-96.
  3. Johnson, A., Bulgarelli, L., Pollard, T., Horng, S., Celi, L. A., & Mark, R. (2021). MIMIC-IV (version 1.0). PhysioNet.
  4. Johnson, A., Pollard, T., & Mark, R. (2019). MIMIC-III Clinical Database Demo (version 1.4). PhysioNet.
  5. Documentation for the Waveform Database (WFDB) file format. [Accessed 21 June 2022]
  6. Documentation about using the Medical Information Mart for Intensive Care (MIMIC) Database with Google BigQuery. [Accessed 21 June 2022]
  7. Documentation for the Waveform Database (WFDB) toolbox in C. [Accessed 21 June 2022]
  8. Documentation for the Waveform Database (WFDB) toolbox for Matlab. [Accessed 21 June 2022]
  9. Documentation for the Waveform Database (WFDB) toolbox for Python. [Accessed 21 June 2022]

Parent Projects
MIMIC-IV-ECG Demo - Diagnostic Electrocardiogram Matched Subset Demo was derived from: Please cite them when using this project.

Access Policy:
Anyone can access the files, as long as they conform to the terms of the specified license.

License (for files):
Open Data Commons Open Database License v1.0



Total uncompressed size: 76.1 MB.

Access the files

Visualize waveforms

Name Size Modified
LICENSE.txt 25.2 KB 2022-06-30
RECORDS 23.8 KB 2022-06-10
SHA256SUMS.txt 136.7 KB 2022-06-30
record_list.csv 55.4 KB 2022-06-10