Software Open Access
PhysioTag: An Open-Source Platform for Collaborative Annotation of Physiological Waveforms
Lucas McCullum, Benjamin Moody, Hasan Saeed, Tom Pollard, Xavier Borrat Frigola, Li-wei Lehman, Roger Mark
Published: April 25, 2023. Version: 1.0.0
When using this resource, please cite:
McCullum, L., Moody, B., Saeed, H., Pollard, T., Borrat Frigola, X., Lehman, L., & Mark, R. (2023). PhysioTag: An Open-Source Platform for Collaborative Annotation of Physiological Waveforms (version 1.0.0). PhysioNet. https://doi.org/10.13026/g06j-3612.
Lucas McCullum, Hasan Saeed, Benjamin Moody, Diane Perry, Eric Gottlieb, Tom Pollard, Xavier Borrat Frigola, Qiao Li, Gari Clifford, Roger Mark, and Li-wei H Lehman. PhysioTag: An Open-Source Platform for Collaborative Annotation of Physiological Waveforms, Proceedings of the Computing in Cardiology, September 2022.
Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
To develop robust algorithms for automated diagnosis of medical conditions such as cardiac arrhythmias, researchers require large collections of data with human expert annotations. Currently, there is a lack of accessible, open-source platforms for human experts to collaboratively develop these annotated datasets through a web interface. In this work, we developed a flexible, generalizable, web-based framework to enable multiple users to create and share annotations on multi-channel physiological waveforms. The software is simple to install and offers a range of features, including: user management and task customization; a programmatic interface for data import and export; and a leaderboard for annotation progress tracking.
Physiological waveform monitoring (e.g., electrocardiographic, or EKG, monitoring) is becoming increasingly commonplace, with modern devices allowing long-term, continuous capture from a broad population. To develop robust algorithms for automated diagnosis and characterization of medical conditions such as ventricular tachycardia (VT), researchers require high-quality annotations provided by human experts. Currently, there is a lack of freely available, maintained, open-source software to enable collaborative human annotation of physiological waveforms such as EKG. Prior annotation tools, such as MetaAnn and WAVE, focused on single-user annotation in a local desktop setting without support for remote web-based access. Web-based EKG annotators such as WaveformECG and LabelECG were developed to enable remote visualization and annotation of EKG waveforms; however, they are not actively maintained.
We developed an open-source annotation platform, PhysioTag, that enables experts to collaboratively annotate physiological waveform records using a standard web browser. The software is simple to install, following best practice in Python packaging, and offers a range of features, including: user management and task customization; a programmatic interface for data import and export; and a leaderboard for annotation progress tracking.
PhysioTag Front-End for VTach Annotations
Using the platform, we carried out a pilot study to assess the validity of ventricular tachycardia (VTach) alarms from several commercial hospital monitors. In the current release, the front end and user interface are customized for the VTach annotation task. In this task, the interface displays, by default, the 10 seconds of multichannel waveforms immediately prior to the VTach alarm onset, and asks the annotators to classify the monitor-identified arrhythmia as “true”, “false”, or “uncertain”, or to “reject” the event if a decision cannot be made due to low data quality. Each alarm event is presented to two annotators. In the event of disagreement, the alarm is annotated by an adjudicator.
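The default display window can be thought of as a range of sample indices ending at the alarm onset. The sketch below is purely illustrative and is not taken from the PhysioTag source: the function name, the 250 Hz sampling rate in the example, and the choice to clamp the window at the start of the record are all assumptions.

```python
def window_before_alarm(alarm_sample, fs, seconds=10.0):
    """Return (start, stop) sample indices covering `seconds` before the alarm.

    alarm_sample: index of the alarm onset within the record
    fs: sampling frequency in Hz
    The window is clamped at 0 so that alarms near the start of a record
    yield a shorter window rather than negative indices.
    """
    if fs <= 0:
        raise ValueError("sampling frequency must be positive")
    start = max(0, alarm_sample - int(round(seconds * fs)))
    return start, alarm_sample


# Hypothetical example: alarm at sample 75000 of a 250 Hz record.
start, stop = window_before_alarm(75000, fs=250)
print(start, stop)  # 72500 75000
```

Passing a different `seconds` value would correspond to changing the amount of signal shown before the event.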
Web-based access: server/client architecture
Since the main purpose of this software is to create as many annotations as possible, our goal was to minimize the amount of time and effort required by the individual annotators. To avoid the need to install a custom application on the annotator's machine, we opted to build the system as a web-based application that can be used from any modern web browser. The interface is designed to present the viewer with the most relevant information, allow them to make their decision quickly, and save their responses to the server automatically.
Waveform Visualization and Annotation Workflow
After registering and logging in, an annotator can access the annotation interface on the "Create More Annotations" page. The left column displays the current annotation project, patient record, patient event, arrhythmia label (e.g., VTach for ventricular tachycardia when annotating cardiac events), annotation decisions, accompanying comments, a button to submit the annotation, and arrows to move to the previous or next annotation.
For the annotation decisions, users are given the options of “True” for when they believe the annotation event was correct, “False” for when they believe the annotation event was incorrect, “Uncertain” for when they are unsure which annotation to assign, “Reject” for when the alarm is unreadable due to noise, artifacts, or other hindrance, and “Save for Later” for when the user would like to return to annotate this event at another time. The right column shows the physiological waveforms needed for annotation (e.g., EKG with blood pressure for cardiac events), with an optional scalable and draggable caliper for precise interval measurement.
Small buttons at the top of the figure take a screenshot of the current waveforms, zoom in, zoom out, and reset the view. At the bottom, a compressed version of the entire signal range can be dragged to move the signal window through time around the physiological event. The user also has the option to manually adjust the y-axis scaling for each waveform by clicking on the y-axis labels. The vertical blue line extending across each signal represents the time of the physiological event, while the vertical dashed black line represents the current cursor location for comparing signals at the same time point.
To provide self-evaluation and training, the platform supports the integration of a small set of expert-annotated sample waveforms as a practice set on the "Practice Test" page. With this feature enabled, new users are encouraged to review and annotate these sample records in order to practice using the platform and to refresh their knowledge. At the end of the practice test, the user can compare their answers with those of the experts. These scores could be used to qualify and rank the annotations provided by many annotators. Further support is provided on the "Annotator Tutorial" page which provides instructions for the annotator's usage through descriptive text and live demonstration videos.
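One simple way to score a practice test is the fraction of events on which the user agrees with the expert label. The following is a minimal sketch under that assumption; the function name, the dictionary layout, and the rule that skipped events count as disagreements are hypothetical and may differ from how PhysioTag computes its scores.

```python
def practice_score(user_answers, expert_answers):
    """Fraction of practice events where the user matches the expert label.

    Both arguments map an event identifier to a decision string.
    Events the user skipped count as disagreements.
    """
    if not expert_answers:
        raise ValueError("the practice set is empty")
    matches = sum(
        1 for event, label in expert_answers.items()
        if user_answers.get(event) == label
    )
    return matches / len(expert_answers)


# Hypothetical practice set of four events; the user skipped "e4".
expert = {"e1": "True", "e2": "False", "e3": "False", "e4": "Reject"}
user = {"e1": "True", "e2": "True", "e3": "False"}
print(practice_score(user, expert))  # 0.5
```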
In order to build a "gold standard" corpus of event annotations, each event needs to be reviewed by two annotators independently. However, the experience of past annotation projects has shown that we rarely know in advance how many total records will be available for annotation, how many expert reviewers will be able to participate in the project, or how quickly each annotator will be able to work. So, rather than deciding in advance which annotators will review each event, events are assigned to annotators dynamically.
When a new annotator joins the project, they can navigate to the "Current Assignment" page where they can view their previously annotated events, see which events they have left to annotate, and assign themselves 10 randomly-selected events that have not yet been annotated. After they have finished reviewing those 10 events, they can assign themselves a new batch of 10, and so on. The number of randomly-selected events does not have to be 10 and can be adjusted in the settings file. This "self-assignment" strategy is intended to ensure a diverse annotation dataset between the annotators, while encouraging productivity through multiple small tasks instead of one large task.
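The self-assignment strategy can be sketched as follows. This is an illustration only, not the PhysioTag implementation: the function and parameter names are hypothetical, and the eligibility rule (an event is offered until two distinct annotators have taken it) is an assumption based on the two-reviewer requirement described above.

```python
import random


def self_assign(user, assignments, all_events, batch_size=10,
                reviewers_per_event=2, rng=None):
    """Pick up to `batch_size` random events that still need a reviewer.

    assignments: dict mapping event id -> set of users already assigned
    An event is eligible if it has fewer than `reviewers_per_event`
    reviewers and `user` is not already one of them.
    The chosen events are recorded in `assignments` and returned.
    """
    if rng is None:
        rng = random.Random()
    eligible = [
        event for event in all_events
        if user not in assignments.get(event, set())
        and len(assignments.get(event, set())) < reviewers_per_event
    ]
    batch = rng.sample(eligible, min(batch_size, len(eligible)))
    for event in batch:
        assignments.setdefault(event, set()).add(user)
    return batch


# Hypothetical project with 25 events and one annotator.
events = [f"event_{i}" for i in range(25)]
assignments = {}
first_batch = self_assign("alice", assignments, events)
print(len(first_batch))  # 10
```

Because eligibility is recomputed on every call, the same function keeps working as new records or new annotators are added mid-project.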
Significant inter-annotator variability can exist among manual labels by clinical experts. To resolve conflicts between two annotators' decisions, an adjudication framework was added to recruit extra opinions on particularly difficult alarm events. Project administrators can designate trusted annotators as adjudicators based on their level of expertise. Adjudicators can then access the "Adjudicator Console" page, which displays similar information to the "Create More Annotations" page, with additional information about the previous annotations of the event being adjudicated. Disagreements between annotators can then be resolved either through an active one-on-one discussion between the annotators involved, or by an adjudicator voting to break the tie.
Throughout the annotation process, users can view their current complete, incomplete, and save-for-later annotations and make changes (e.g., adjust their comments or change their decision) through the "Viewer Settings" page. At any point, the user can adjust the settings for the annotator interface (e.g., signal thickness, time before and after the alarm event, colors, and downsampling) to optimize their efficiency in completing the annotations.
A public leaderboard, available on the "Leaderboard" page, ranks annotators by the number of annotations completed over the past day, week, month, and all time, to motivate healthy competition and stimulate productivity. Pie charts on the leaderboard also track the ratio of annotation decisions and completion statistics for all the possible physiological events in the dataset.
Admin users of the platform have access to the "Admin Console" page, which provides functionality to invite new users, view current users' waveform annotator settings, and designate current users as adjudicators or admins. The admin console also shows all complete and incomplete annotations and adjudications along with their associated user, decision, comments, and timestamp.
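The tie-breaking path of this workflow can be summarized in a few lines. The sketch below is a hypothetical illustration, not PhysioTag's code: the function name and the status strings are assumptions, and it models only the adjudicator vote, not resolution by discussion.

```python
def resolve_event(decisions, adjudicator_decision=None):
    """Return (final_label, status) for one event.

    decisions: list of labels from the independent annotators
    status is "pending" until two decisions exist, "agreed" when they
    match, "needs_adjudication" when they differ and no adjudicator has
    voted, and "adjudicated" once the adjudicator's vote is recorded.
    """
    if len(decisions) < 2:
        return None, "pending"
    first, second = decisions[:2]
    if first == second:
        return first, "agreed"
    if adjudicator_decision is None:
        return None, "needs_adjudication"
    return adjudicator_decision, "adjudicated"


print(resolve_event(["True", "True"]))            # ('True', 'agreed')
print(resolve_event(["True", "False"]))           # (None, 'needs_adjudication')
print(resolve_event(["True", "False"], "False"))  # ('False', 'adjudicated')
```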
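A leaderboard of this kind amounts to counting completed annotations per user within a time window. The following sketch shows one way to do that; it is not PhysioTag's implementation, and the names, the tuple layout, and the 30-day definition of a month are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOWS = {
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),  # assumption: a month is 30 days
    "all": None,                  # no cutoff
}


def leaderboard(annotations, window, now):
    """Rank annotators by completed annotations within a time window.

    annotations: iterable of (username, completed_at) tuples
    Returns a list of (username, count), highest count first; ties are
    broken alphabetically so the ordering is deterministic.
    """
    span = WINDOWS[window]
    counts = Counter(
        user for user, completed_at in annotations
        if span is None or now - completed_at <= span
    )
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical annotation log.
now = datetime(2023, 4, 25, 12, 0)
records = [
    ("alice", now - timedelta(hours=2)),
    ("alice", now - timedelta(days=3)),
    ("bob", now - timedelta(hours=5)),
    ("bob", now - timedelta(days=2)),
    ("bob", now - timedelta(days=20)),
    ("carol", now - timedelta(days=200)),
]
print(leaderboard(records, "month", now))  # [('bob', 3), ('alice', 2)]
```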
The design of our platform is driven by the following objectives and considerations:
- Scalable collaborative annotation: allow multiple concurrent annotators, potentially from diverse geographic locations, to annotate datasets in parallel.
- Fast response time in waveform visualization with customizable display settings.
- Flexible adjudication user interface to enable human adjudicators to review and resolve conflicting annotator decisions.
- Functional annotation management to track project progress and manage user accounts and annotations.
- Lightweight and easy to deploy annotation servers.
- Open data format using the standard WFDB Python library to enable remote access to datasets on PhysioNet through an Application Programming Interface (API).
The annotation platform was implemented in Python using Django, enabling ease of deployment, as well as ease of developing new features and customizations for specific annotation projects. The user interface uses Django Plotly Dash to provide a featureful, efficient, cross-browser display of the waveforms. The sample annotations were created from a subset of the PhysioNet 2015 challenge and are saved to an SQLite3 database in the `db` folder, which can be manipulated in the backend or frontend.
On the back end, the server uses the WFDB Python package to read the input waveform files. This allows the system to be used to annotate any waveform records stored in WFDB format. The `merge_new_data.sh` script in the `record-files` folder may provide some automated assistance when adding new data and merging it into an existing project. Further, the `sample_data` folder contains additional shell scripts that may be adapted to automate more of this process (see the README for details). To provide an interface to open-access web-based data resources such as PhysioNet, an API was developed using GraphQL to extract annotations from the platform’s database for external applications.
Installation and Requirements
The following commands show how to run a local instance using a Django server.
Install sqlite3 for database management:
$ sudo apt-get install sqlite3
Install Redis for caching of waveforms (version 6.2.6 is shown; a more recent version can be used instead):
$ wget https://download.redis.io/releases/redis-6.2.6.tar.gz
$ tar xzf redis-6.2.6.tar.gz
$ cd redis-6.2.6
$ make
$ make install
Create the Python virtual environment with Python 3.6 (or similar):
$ python3 -m venv env
Activate the Python virtual environment:
$ source env/bin/activate
Install packages using Pip:
$ pip install -r requirements.txt
Set up the Django environment:
$ cp .env.example .env
To run the server, within the `waveform-django` directory:
$ python manage.py runserver
You should now be able to access the waveform landing page at (if running on localhost port 8000): http://localhost:8000/waveform-annotation/waveforms/.
To have access to the cache run the following command in a new terminal. You should be able to see the content of the website which would have been sent on the live site. If you do not run this command first before testing out the parts of the site which need cache, you will receive a `ConnectionRefusedError: [Errno 61] Connection refused` error.
If you would like to test out the email features run the following command in a new terminal. You should be able to see the content of the email which would have been sent on the live site. If you do not run this command first before testing out the email features, you will receive a `ConnectionRefusedError: [Errno 61] Connection refused` error.
$ python -m smtpd -n -c DebuggingServer localhost:1025
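Both `ConnectionRefusedError` messages above indicate that nothing is listening on the expected port. A quick check like the one below can confirm this before testing the site. It is a minimal sketch: the helper name is hypothetical, 6379 is Redis's default port (whether the PhysioTag settings use that port is an assumption), and 1025 matches the debugging SMTP server command above.

```python
import socket


def service_is_up(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError and timeouts alike.
        return False


# Assumed ports: 6379 (Redis default) and 1025 (debugging SMTP server).
for name, port in [("Redis cache", 6379), ("debug SMTP server", 1025)]:
    state = "running" if service_is_up("localhost", port) else "not reachable"
    print(f"{name} on port {port}: {state}")
```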
The source code for the PhysioTag software is publicly available on GitHub. Details on how to use the platform are as follows:
Overview of folder structure
backups: stores the backup annotations created by `waveform-django/cron.py`
db: stores the SQLite3 database which holds all of the website's backend data such as user account information, annotations, etc.
debug: stores errors in the platform if `DEBUG` is set to `True` in the settings file (`waveform-django/website/settings/base.py`)
deploy: stores the files necessary to be deployed onto a server using NGINX
record-files: stores the data to be annotated in WFDB format (including `RECORDS` files and specific structure)
Basic server commands (for Mac / Linux users; Windows users may have to adjust some commands, such as the one that activates the virtual environment)
To migrate new models:
$ python manage.py migrate --run-syncdb
To reset the database:
$ python manage.py flush
After you are finished, deactivate the Python virtual environment:
$ deactivate
To view annotations when using an SQLite3 database:
$ cd waveform-django
$ sqlite3 db.sqlite3
Once inside of the SQLite3 database, you can try out a sample command:
select * from waveforms_annotation;
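The same query can be run from Python with the standard-library `sqlite3` module. The sketch below is illustrative: only the table name `waveforms_annotation` comes from the command above, while the helper name and the column names in the demonstration are hypothetical stand-ins, not PhysioTag's actual schema.

```python
import sqlite3


def fetch_annotations(connection, limit=5):
    """Return up to `limit` rows of waveforms_annotation as dictionaries."""
    connection.row_factory = sqlite3.Row
    cursor = connection.execute(
        "SELECT * FROM waveforms_annotation LIMIT ?", (limit,)
    )
    return [dict(row) for row in cursor.fetchall()]


# Demonstration on an in-memory database with made-up columns.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE waveforms_annotation "
    "(id INTEGER, decision TEXT, comments TEXT)"
)
demo.execute(
    "INSERT INTO waveforms_annotation VALUES (1, 'True', 'clear VT run')"
)
print(fetch_annotations(demo))
# [{'id': 1, 'decision': 'True', 'comments': 'clear VT run'}]
```

To read a real project database, replace the in-memory connection with `sqlite3.connect("db.sqlite3")`, using the same path as in the `sqlite3` command above.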
Version 1.0.0: First public release.
The authors have no ethics statement to declare.
This research is supported by NIH grant R01EB030362.
Conflicts of Interest
The authors have no conflicts of interest to declare.
- Winslow RL, Granite S, JC. WaveformECG: A platform for visualizing, annotating, and analyzing ECG data. Comput Sci Eng Sep-Oct 2016;18(5):36–46.
- Ding Z, Qiu S, Guo Y, Lin J, Sun L, Fu D, Yang Z, Li C, Yu Y, Meng L, Lv T, Li D, Zhang P. LabelECG: A web-based tool for distributed electrocardiogram annotation. https://arxiv.org/abs/1908.06553, August 2019.
- Clifford GD, Silva I, Moody B, Li Q, Kella D, Shahin A, Kooistra T, Perry D, Mark RG. The PhysioNet/computing in cardiology challenge 2015: reducing false arrhythmia alarms in the ICU. In 2015 Computing in Cardiology Conference (CinC) 2015 Sep 6 (pp. 273-276). IEEE.
- Xie C, McCullum L, Johnson A, Pollard T, Gow B, Moody B. Waveform Database Software Package (WFDB) for Python. PhysioNet 2022. URL https://doi.org/10.13026/MMPM-2V55.
- PhysioTag on GitHub. https://github.com/MIT-LCP/waveform-annotation/ [Accessed: 6 Feb 2023]
Anyone can access the files, as long as they conform to the terms of the specified license.
License (for files):
Total uncompressed size: 56.8 MB.
Access the files
- Download the ZIP file (45.5 MB)
- Access the files using the Google Cloud Storage Browser here. Login with a Google account is required.
Access the data using the Google Cloud command line tools (please refer to the gsutil documentation for guidance):
gsutil -m -u YOUR_PROJECT_ID cp -r gs://physiotag-1.0.0.physionet.org DESTINATION
Download the files using your terminal:
wget -r -N -c -np https://physionet.org/files/physiotag/1.0.0/