Challenge Open Access
Improving the Quality of ECGs Collected using Mobile Phones - The PhysioNet Computing in Cardiology Challenge 2011
Published: April 19, 2011. Version: 1.0.0
Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation. 2000;101(23):e215-e220.
According to the World Health Organization, cardiovascular diseases (CVD) are the number one cause of death worldwide. Of these deaths, 82% take place in low- and middle-income countries. Given their computing power and pervasiveness, is it possible for mobile phones to aid in delivery of quality health care, particularly to rural populations distant from physicians with the expertise needed to diagnose CVD?
Advances in mobile phone technology have resulted in global availability of portable computing devices capable of performing many of the functions traditionally requiring desktop and larger computers. In addition to their technological features, mobile phones have a large cultural impact. They are user-friendly and are among the most efficient and most widely used means of communication. Currently there is about one cell phone for every two humans in the world.
India is experiencing a double burden of disease with persistent infectious disease coupled with increasing incidence of chronic disease. Two chronic diseases, CVD and cancer, currently account for nearly 20% of the total disease burden, which is expected to double to 40% by 2016. Unfortunately, due to a lack of adequate primary care capacity, most chronic diseases are diagnosed at an advanced stage, when the cost of treatment and rehabilitation is prohibitive for the masses, particularly the poor. This is true for other middle-income developing countries such as Brazil, China, Indonesia and South Africa as well.
India's widely dispersed population, together with the rising incidence of heart disease, is a public health concern that has led to a collaboration between Narayana Hrudayalaya (one of India’s leading health-care providers) and Sana (an open-source, student-managed, mobile telemedicine group at MIT; see http://www.sanamobile.org/). Sana's specific objective in this venture is to enable an inexperienced nurse or paramedic to collect and transmit electrocardiograms (ECGs) from rural patients for remote analysis by cardiologists at a city hospital. While Sana has been successful in developing open-source software for transmitting and archiving ECGs through Bluetooth recording, significant obstacles remain. PhysioNet has partnered with Sana to identify some of the crucial obstacles involved in having an inexperienced person record ECGs usable for diagnostic interpretation from a mobile device.
The aim of the PhysioNet/Computing in Cardiology Challenge 2011 is to develop an efficient algorithm, able to run in near real time on a mobile phone, that can provide useful feedback to a layperson in the process of acquiring a diagnostically useful ECG recording. At a minimum, the software should be able to indicate within a few seconds, while the patient is still present, if the ECG is of adequate quality for interpretation, or if another recording should be made. Ideally, the software should identify common problems (such as misplaced electrodes, poor skin-electrode contact, external electrical interference, and artifact resulting from patient motion) and either compensate for these deficiencies or provide guidance for correcting them.
Data to support development and evaluation of challenge entries are being collected by the Sana project, and will be provided freely via PhysioNet. The data set will include ten-second recordings of twelve-lead ECGs; age, sex, weight, and possibly other relevant information about the patients; and (for some patients) a photo of the electrode placement taken using the mobile phone. Although some of the recordings will be identified initially as acceptable or unacceptable, challenge participants and others interested will have an opportunity to assist in establishing a "gold standard" classification of the quality of the recordings in the challenge data set.
Participants may enter the challenge by completing the classification task described below. Awards will be given to the most successful participants who attend Computing in Cardiology 2011 (18-21 September 2011 in Hangzhou, China) to present their work and discuss their findings with other participants and CinC attendees; see Awards below for details.
Data for the Challenge
The challenge data are standard 12-lead ECG recordings (leads I, II, III, aVR, aVL, aVF, V1, V2, V3, V4, V5, and V6) with full diagnostic bandwidth (0.05 through 100 Hz). The leads are recorded simultaneously for a minimum of 10 seconds; each lead is sampled at 500 Hz with 16-bit resolution.
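The record layout described above can be turned into a quick sanity check. The following is a minimal sketch in Python, not part of the Challenge software: the `record` mapping from lead name to a list of samples is a hypothetical in-memory container, and the size estimate ignores any file headers or compression.

```python
# Record layout taken from the description above.
LEADS = ["I", "II", "III", "aVR", "aVL", "aVF",
         "V1", "V2", "V3", "V4", "V5", "V6"]
SAMPLING_RATE_HZ = 500
MIN_DURATION_S = 10
BYTES_PER_SAMPLE = 2  # 16-bit resolution

# 500 samples per second for at least 10 seconds.
MIN_SAMPLES_PER_LEAD = SAMPLING_RATE_HZ * MIN_DURATION_S


def min_raw_size_bytes():
    """Smallest uncompressed size of one record's samples (headers ignored)."""
    return len(LEADS) * MIN_SAMPLES_PER_LEAD * BYTES_PER_SAMPLE


def check_record(record):
    """Return a list of layout problems for a {lead name: samples} mapping.

    The mapping is a hypothetical container used only for this sketch.
    """
    problems = []
    for lead in LEADS:
        samples = record.get(lead)
        if samples is None:
            problems.append(f"missing lead {lead}")
        elif len(samples) < MIN_SAMPLES_PER_LEAD:
            problems.append(f"lead {lead} has only {len(samples)} samples")
    return problems
```

A record that passes this check holds at least 5000 samples in each of the twelve leads, or at least 120,000 bytes of raw sample data.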
Nurses, technicians, and volunteers with varying amounts of training recorded the ECGs for this project. In the intended application, the recordists (those making ECG recordings) will not necessarily have had experience. Since the goal of this challenge is to investigate if laypersons can be assisted via software in collecting high-quality ECGs reliably, the recordings gathered for this challenge include ECGs made by volunteers with minimal training.
Data for the Challenge are available in the "Files" section below. The data are provided both in CSV format, compatible with the Challenge Android API (see the project files), and in standard PhysioBank (compact binary) formats, readable using the WFDB software package.
Three challenge data sets have been created from the collected ECGs:
- Set A: training data, with reference quality assessments provided to the participants (available now)
- Set B: test data for events 1 and 2, with reference quality assessments withheld (available now)
- Set C: test data for event 3 (not released to participants)
A series of events prevented us from collecting a sufficient number of ECGs with the hardware we had originally planned to use for this Challenge, and in order to permit the Challenge to go forward, the ECGs in sets A, B, and C have been collected using conventional ECG machines. (A pilot dataset containing synthetic ECGs recorded using the target hardware was posted previously and is still available.) Our originally planned data acquisition process is continuing, and we expect to have a set of ECGs acquired using that process available soon. As soon as possible, we'll assemble a set D of these ECGs, we'll develop a set of reference classifications for them, and we will classify them using the entries for events 2 and 3 as well. We will share the results of this experiment with event 2 and 3 participants, and we will post set D as a supplement to the Challenge 2011 data sets on PhysioNet. Since participants have not had an opportunity to study samples of ECGs collected using this process, however, these results will not be used for scoring any of the Challenge events.
ECG Quality Assessment
ECGs collected for the challenge were reviewed by a group of annotators with varying amounts of expertise in ECG analysis, in blinded fashion for grading and interpretation. Between 3 and 18 annotators, working independently, examined each ECG, assigning it a letter grade (A (0.95): excellent, B (0.85): good, C (0.75): adequate, D (0.60): poor, or F (0): unacceptable) for signal quality. The average grade was calculated in each case, and each record was assigned to one of 3 groups:
- Group 1 (acceptable): If the average grade is 0.70 or more, and at most one grade is F.
- Group 2 (indeterminate): If the average grade is 0.70 or more, but two or more grades are F.
- Group 3 (unacceptable): If the average grade is less than 0.70.
Approximately 70% of the collected records were assigned to group 1, 30% to group 3, and fewer than 1% to group 2, reflecting a high degree of agreement among the annotators.
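The grouping rule above can be written out as a short function. This is an illustrative Python sketch, not the organizers' code; the function name and the use of integer hundredths (to avoid floating-point rounding exactly at the 0.70 threshold) are choices made for this example.

```python
# Grade values from the text, in hundredths (A = 0.95, ..., F = 0).
GRADE_POINTS = {"A": 95, "B": 85, "C": 75, "D": 60, "F": 0}
THRESHOLD_POINTS = 70  # average grade of 0.70


def assign_group(grades):
    """Map one record's letter grades to group 1, 2, or 3.

    grades: sequence of letters, one per annotator.
    """
    if not grades:
        raise ValueError("at least one grade is required")
    total = sum(GRADE_POINTS[g] for g in grades)
    # Same test as "average >= 0.70", without dividing.
    average_ok = total >= THRESHOLD_POINTS * len(grades)
    f_count = sum(1 for g in grades if g == "F")
    if not average_ok:
        return 3  # unacceptable
    if f_count <= 1:
        return 1  # acceptable
    return 2      # indeterminate


print(assign_group(["A", "B", "C"]))       # average 0.85, no F
print(assign_group(["A"] * 7 + ["F"] * 2)) # average about 0.74, two F grades
print(assign_group(["D", "D", "F"]))       # average 0.40
```

The three calls fall into groups 1, 2, and 3 respectively.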
Challenge Events and Scoring
ECGs from all three quality groups will be presented to the challenge participants in blinded fashion. Participants may enter one or more of the following challenge events:
- Event 1: (closed source, open data) Participants must submit classifications of the challenge test set recordings only. The score is the fraction of group 1 and group 3 ECGs in set B that are correctly classified by the participant's algorithm.
- Event 2: (open source, open data) Participants must submit Java software for performing the classification function compatible with a template API and Android phone emulator provided by the challenge organizers. The score is the fraction of group 1 and group 3 ECGs in set B that are correctly classified by the participant's algorithm.
- Event 3: (open source, closed data) Same as event 2, but using set C and a different scoring algorithm. The score is the product of classification accuracy (as in events 1 and 2) and a function of execution time in the reference Android phone (faster is better).
How to submit an entry
The final deadline for entries has now passed; the information in this section is for reference only.
To enter any of the three Challenge events, log in to PhysioNetWorks (create an account first if you don't have one already) and follow the link from your PhysioNet home page to "PhysioNet/CinC Challenge 2011" to get started. Joining the project creates a Challenge Participant Page for you, where you will submit your entries and receive your scores.
Instructions for submitting entries to event 1 are on your Challenge Participant Page. Participants may submit up to five entries in event 1, at any time until the final deadline of noon GMT on Friday, 5 August 2011; their highest-scoring entry will determine their ranking in event 1.
Events 2 and 3 are open to event 1 participants who qualified by submitting an event 1 entry no later than noon GMT on Saturday, 30 April 2011. A single submission enters both events 2 and 3. Participants may resubmit their event 2 and 3 entry as many times as they wish until the final deadline of noon GMT on Friday, 5 August 2011, but only the final submission received before the deadline will be scored.
Events 2 and 3 require algorithms capable of running on a reference Android phone. The challenge files include an API, with a working sample algorithm, that serves as a framework for these algorithms; you must use this API to participate in events 2 and 3. Each open-source entry that can run on an Android phone will be tested on a reference phone.
Algorithms are not required to produce a classification for each record, but only correct classifications contribute to the scores (thus a missing classification is equivalent to an incorrect classification).
Note that although a few group 2 ECGs are present in the challenge data sets, they do not influence participants' scores. Given that the expert annotators disagree about their acceptability, it is unreasonable to expect participants' algorithms to classify them in any specific way, so group 2 ECGs are not counted. Nevertheless, since participants are not told which records belong to group 2, participants can improve their chances of obtaining a high score by classifying all records.
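The accuracy score used in events 1 and 2 can be sketched as follows. This is an illustration in Python under stated assumptions, not the official scoring code: the dictionary inputs and the labels "acceptable" and "unacceptable" are hypothetical stand-ins for the actual submission format, which is described on the Challenge Participant Page.

```python
# Expected label for each scored reference group; group 2 is not scored.
EXPECTED_LABEL = {1: "acceptable", 3: "unacceptable"}


def accuracy_score(reference_groups, submitted_labels):
    """Fraction of group 1 and group 3 records classified correctly.

    reference_groups: {record id: 1, 2, or 3}
    submitted_labels: {record id: "acceptable" or "unacceptable"}
    A record missing from submitted_labels counts as incorrect.
    """
    scored = [(rec, EXPECTED_LABEL[group])
              for rec, group in reference_groups.items()
              if group in EXPECTED_LABEL]
    if not scored:
        raise ValueError("no group 1 or group 3 records to score")
    correct = sum(1 for rec, label in scored
                  if submitted_labels.get(rec) == label)
    return correct / len(scored)


reference = {"r1": 1, "r2": 3, "r3": 2, "r4": 1}
submitted = {"r1": "acceptable", "r2": "acceptable", "r3": "unacceptable"}
score = accuracy_score(reference, submitted)
```

In this example three records are scored (r3 belongs to group 2 and is ignored), and only r1 is correct: r2 is misclassified and r4 is missing, so the score is 1/3. The event 3 score multiplies this accuracy by a function of execution time that is not specified here, so it is left out of the sketch.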
Awards
A generous donation from the GSM Association, in addition to support from Computing in Cardiology, has allowed us to increase Challenge awards this year to amounts that will offset most or all of the costs of registration, accommodation and travel to CinC 2011.
The GSMA represents the interests of the worldwide mobile communications industry. Spanning 219 countries, the GSMA unites nearly 800 of the world’s mobile operators, as well as more than 200 companies in the broader mobile ecosystem, including handset makers, software companies, equipment providers, Internet companies, and media and entertainment organisations. The GSMA is focused on innovating, incubating and creating new opportunities for its membership, all with the end goal of driving the growth of the mobile communications industry.
To be eligible for one of the major awards, you must:
- Submit a preliminary entry in event 1 no later than noon GMT on Saturday, 30 April 2011.
- Submit an acceptable abstract (about 300 words) about your work on the Challenge to CinC no later than noon GMT on Sunday, 1 May 2011.
- Submit a final entry in at least one event no later than noon GMT on Friday, 5 August 2011.
- Submit a paper (4 pages) describing your work on the Challenge no later than noon GMT on Sunday, 11 September 2011.
- Attend CinC 2011 and discuss your work. All Challenge presentations are scheduled for Tuesday, 20 September. Challenge awards will be presented during the closing plenary session on Wednesday, 21 September.
Each of the three most successful eligible participant teams, including the winners of each of the three events, will receive an award of US$2000, but no team or individual will receive more than one such award. If an eligible team achieves top results in more than one event, they will receive one award, and the other award(s) will be distributed to the next most successful team(s). Our objective is to ensure that at least three of the best entries are represented and discussed at CinC. An additional US$2000 will be divided among other participants who have contributed to the development of the data used in the challenge.
We thank the GSMA and Computing in Cardiology for their support of this year's Challenge.
Following the final deadline for submission of entries on 5 August 2011, here are the final scores for the Challenge.
Event 1 (closed source, open data set B)
The top 10 (of 49) participants in event 1 are listed above. In this event, participants submitted classifications of each ECG in data set B; they were not required to submit their code for this event, and they were not required to use Java code as in events 2 and 3. Each score above is the accuracy of the participant's most successful entry. Accuracy is the fraction of reference classifications in set B that match those in the entry; the range is 0 to 1, where 1 would be perfect matching. Up to five entries were allowed for each participant.
Event 2 (open-source, open data set B)
In this event, participants submitted Java code that the challenge organizers tested in a reference Android mobile phone using the same data and the same scoring method as in event 1. Only one entry was tested and scored for each participant; the same entry was also used for event 3.
Scores obtained by the challenge organizers using their own code are unofficial.
Event 3 (open source, closed data set C)
In the final event, the same code submitted by participants for event 2 was tested by the challenge organizers using data set C, which has not been released to participants, thus eliminating any possibility of "tuning" the code to specific ECGs in the data set. Scores for this event are calculated as a product of the accuracy as defined for events 1 and 2, and a function of mean run time in the reference Android phone.
As in event 2, scores obtained by the challenge organizers using their own code are unofficial.
The papers below were presented at Computing in Cardiology 2011. Please cite this publication when referencing any of these papers. These papers have been made available by their authors under the terms of the Creative Commons Attribution License 3.0 (CCAL). We wish to thank all of the authors for their contributions.
Update: Inspired by this Challenge, the journal Physiological Measurement has devoted a focus issue [2012 Sept;33(9)] to the subject of signal quality in cardiorespiratory monitoring, with eleven articles on this topic, including nine written by Challenge participants.
The first of these papers is an introduction to the challenge topic, with a summary of the challenge results and a discussion of their implications.
Improving the Quality of ECGs Collected Using Mobile Phones: The PhysioNet/Computing in Cardiology Challenge 2011
Ikaro Silva, George B Moody, Leo Celi
The remaining papers were presented by participants in the Challenge, who describe their approaches to the challenge problem.
CinC Challenge - Assessing the Usability of ECG by Ensemble Decision Trees
Sebastian Zaunseder, Robert Huhle, Hagen Malberg
An Algorithm for Assessment of Quality of ECGs Acquired via Mobile Telephones
Philip Langley, Luigi Y Di Marco, Susan King, David Duncan, Costanzo Di Maria, Wenfeng Duan, Marjan Bojarnejad, Dingchang Zheng, John Allen, Alan Murray
Signal Quality Indices and Data Fusion for Determining Acceptability of Electrocardiograms Collected in Noisy Ambulatory Environments
GD Clifford, D Lopez, Q Li, I Rezek
Assessment of Signal Quality and Electrode Placement in ECGs using a Reconstruction Matrix
Arie C Maan, Erik W van Zwet, Sumche Man, Suzanne MM Oliveira-Martens, Martin J Schalij, Cees A Swenne
ECG Quality Assessment for Patient Empowerment in mHealth Applications
Dieter Hayn, Bernhard Jammerbund, Günter Schreier
Real-time Signal Quality Assessment for ECGs Collected using Mobile Phones
Chengyu Liu, Peng Li, Lina Zhao, Feifei Liu, Ruxiang Wang
Rule-Based Methods for ECG Quality Control
Benjamin E Moody
Electrocardiogram Quality Classification based on Robust Best Subsets Linear Prediction Error
Kai Noponen, Mari Karsikas, Suvi Tiinanen, Jukka Kortelainen, Heikki Huikuri, Tapio Seppänen
Computer Algorithms for Evaluating the Quality of ECGs in Real Time
Henian Xia, Gabriel A Garcia, Joseph C McBride, Adam Sullivan, Thibaut De Bock, Jujhar Bains, Dale C Wortham, Xiaopeng Zhao
Recognition of Diagnostically Useful ECG Recordings: Alert for Corrupted or Interchanged Leads
Irena Jekova, Vessela Krasteva, Ivan Dotsinsky, Ivaylo Christov, Roger Abächerli
Assessment of ECG Quality on an Android Platform
Using Machine Learning to Detect Problems in ECG Data Collection
Nir Kalkstein, Yaron Kinar, Michael Na'aman, Nir Neumark, Pini Akiva
Physionet Challenge 2011: Improving the Quality of Electrocardiography Data Collected Using Real Time QRS-Complex and T-Wave Detection
Thomas Ho Chee Tat, Chen Xiang, Lim Eng Thiam
Simple Scoring System for ECG Quality Assessment on Android Platform
Václav Chudáček, Lukás Zach, Jakub Kuzilek, Jirí Spilka, Lenka Lhotská
Data Driven Approach to ECG Signal Quality Assessment using Multistep SVM Classification
Jakub Kuzilek, Michal Huptych, Václav Chudáček, Jirí Spilka, Lenka Lhotská
Anyone can access the files, as long as they conform to the terms of the specified license.
License (for files):
Open Data Commons Attribution License v1.0
Access the files
- Download the ZIP file (169.7 MB)
- Access the files using the Google Cloud Storage Browser. A Google account login is required.
- Access the data using Google Cloud "gsutil":
gsutil -m cp -r gs://challenge-2011-1.0.0.physionet.org DESTINATION
- Download the files using your terminal:
wget -r -N -c -np https://alpha.physionet.org/files/challenge-2011/1.0.0/
|Name|Size|Modified|
|---|---|---|
|gsma.jpg|59.9 KB|2019-04-17|
|set-a.tar.gz|102.9 MB|2011-07-20|
|set-b.tar.gz|51.2 MB|2011-04-20|
|sim.tar.gz|5.7 MB|2011-02-16|
|sim.zip|5.6 MB|2011-02-16|
|top-ten.txt|100 B|2019-04-17|