from PhysioNet, the research resource for complex physiologic signals


Frequently Asked Questions about PhysioNet

General

Sign-in, Accounts, and Passwords

Where is ...

Downloading

PhysioBank Files

Reading and Writing Digitized Signals

Reading and Writing Annotations

Software

Help!


 
 

How can I get an answer to my question?

Have you read this FAQ? If not, please take a few minutes to do so. It answers many common questions.

Have you tried searching for keywords using the Search tool? All text on the PhysioNet web site is indexed and can be found by searching for it. To do this, type one or more terms related to your topic or question into the search box at the top right corner of this and almost every other page on PhysioNet, then click on the "Search" button to its right.

If you have not found an answer to your question in the PhysioNet FAQ or by a PhysioNet search, you may wish to ask your question by email. If everyone who took the time to ask a question by email first took the time to read How to Ask Questions the Smart Way, we would be able to take the time to answer all of the questions we receive with the detailed and pertinent answers they deserve. Since this will never happen, we give priority in answering questions to those who have read this FAQ. How can we tell who has done so? That's easy; we look for the magic word in the subject line of the email. (Important: the author of "How to Ask Questions the Smart Way" cannot answer your questions; read his disclaimer!)



   

What is all of this, anyway?

You're looking at the PhysioNet web site, or one of its mirrors. Read more about PhysioNet and the NIH-sponsored research resource to which it belongs here.

We have large collections of physiologic signals (time series) and software that can be used to study these signals, and smaller but growing collections of research papers, tutorials, and reference materials that relate to the signals and software.


Who are you?

We are a diverse group of computer scientists, physicists, mathematicians, biomedical researchers, clinicians, and educators at MIT (Cambridge, MA, USA), the Beth Israel Deaconess Medical Center/Harvard Medical School (Boston, MA, USA), Boston University (Boston, MA, USA), and McGill University (Montréal, QC, Canada). Many of us have worked together for 20 years or even longer on problems relating to characterizing and understanding the dynamics of human physiology and the implications of dynamical change in diagnosis and treatment of pathophysiology.

PhysioNet receives contributions of data, software, publications, and tutorials from researchers worldwide; see the PhysioNet Contributors page for a list.


Why is PhysioNet here?

You can't learn everything there is to know about snow by studying a single snowflake, or even a few hundred of them. In much the same way (and for some of the same reasons), physiologic signals display astonishing diversity, between individuals and even within individual subjects over time. To study them seriously requires large amounts of data that are difficult and expensive to gather and to characterize, and software that can be flexibly and efficiently modified to meet the unique requirements of new research.

PhysioNet is here first of all because we (see the previous question) needed to gather such data and to design such software for our own work. Having done so, we believe that other researchers should not be forced to do the same, and that making our data and software available should allow others to explore them and to develop, test, and refine hypotheses; in short, to do investigations that would not be possible otherwise.

Many researchers around the world share this vision of open science, in which investigators who need data with which to test their ideas can bootstrap their studies using large, freely available, and well-understood data collections, and in which investigators wishing to explore their data using a wide variety of methods can find verifiable, open-source, reference implementations of analysis software that can be adapted to their own studies. PhysioNet began in 1999 with our own collections of data and software, but its archives continue to grow in scope and depth thanks to the contributions of many others.


Who can use data and software from PhysioNet?

These materials have been placed here for the use of researchers anywhere in the world (our visitors in the month of December 2009 came from at least 149 countries and territories on every continent, including Antarctica). Many of them are biomedical and clinical researchers in academia and industry, but others include physicists, mathematicians, computer scientists, educators, graduate and undergraduate university students, and even secondary school students.


Have the PhysioBank data been fully deidentified (anonymized), and may they be used without (further) IRB approval?

Yes.

If you are planning to contribute data to PhysioNet, it is your responsibility to ensure that they have been fully deidentified before transmitting them to us. Please review our guidelines for contributors. Our software for deidentification of free text medical records may be helpful while preparing data to be contributed.


Is all of this really free?

Yes.

We encourage contributions of data and software to PhysioNet, but only if contributors are willing to allow their contributions to be used freely. See our guidelines for contributors and our copying policy.


How can I buy a copy of ...?

See the answer to the previous question.

Printed copies of some of our books are now available at the PhysioNet Bookstore.


Please send me a copy of ...

Everything we have is free, and can be freely redistributed. You can download it yourself, or you can ask a friend to download it for you. We understand that web access can be slow or expensive in some locations; please understand that preparing and mailing materials from this web site to individual users would also be slow and expensive.

For downloading tips, read the questions and answers below, beginning with How can I download binary files?.


What are the license terms?

The software is licensed under the GNU General Public License (GPL), or (if noted in the source files) other licenses that conform to the Open Source Definition. These licenses permit verbatim copying and redistribution of the source files, and generally grant other permissions as well. For further details, see Can I use your code in my commercial application? (below).

There is nothing analogous to the GPL for data, but we permit copying and redistribution of unaltered data from this site without restrictions, in the spirit of the GPL. We do not allow distribution of altered data except under conditions that make it clear that the data have been altered, because it is very important that users should be able to distinguish between original data from this site and modified versions of those data.

Other materials from this site (books, tutorials, papers, and commentary) may be reproduced freely, with appropriate credit to the original authors.

See the PhysioNet Copying Policy for further details.


Is this software Y2K-compliant?

Yes. See our statement of Y10K compliance.

This really isn't a frequently-asked question any more. The last person who asked it sent his question by email, dated 1 January 103.


My connection is slow. Is there a mirror?

Yes. See Mirrors for a list.


Can I set up a mirror?

Yes. Please use rsync as described in How to set up a mirror of PhysioNet.


Will you post a link to my web site?

Probably not, unless it is directly relevant to the content of PhysioNet. Most external links on this site reference publications and other materials that provide additional information, examples of use, or context for PhysioBank data or PhysioToolkit software. We also maintain short and highly selective lists of other data and software resources likely to be of interest to PhysioNet visitors. These lists are limited to non-commercial sites that provide access unavailable elsewhere to collections of physiologic signals or related data, or open-source software for study of such data.



   

Why should I sign in?

You are not required to log in to use PhysioBank, PhysioToolkit, or the PhysioNet Library, all of which can be accessed freely. Use of PhysioNetWorks is also free, but it requires logging in.

PhysioNetWorks workspaces are available to members of the PhysioNet community for works in progress that will be made publicly available via PhysioNet when complete. Unlike other areas of PhysioNet, these workspaces are password-protected.


Why would I need an account and how do I get one?

Most visitors don't need accounts (see the previous question).

If you wish to create a PhysioNetWorks project, to join an existing one, or to participate in an annual PhysioNet/Computing in Cardiology Challenge, you will need an account and a password in order to establish your identity and gain access to password-protected workspaces. Owners of PhysioNetWorks projects, which are works in progress, may allow access to invited collaborators only, or they may allow access to PhysioNetWorks members only under the terms of a Data Use Agreement (DUA).

The MIMIC II Clinical Database is an example of a PhysioNetWorks project that requires a password and DUA for access.

To create an account, go to the PhysioNetWorks login page, enter your email address, and click on 'Create account'. Instructions for setting up your account and choosing your password are sent immediately to the address you enter, with the subject line "PhysioNetWorks login" and the sender address "DoNotReply" at physionet.org, so be sure to enter a valid email address at which you can receive it. If it doesn't arrive within a few minutes, check that your spam filter has not discarded it. If you forget your password, or wish to change it at any time, simply return to the login page and request a new one.

Since your access to PhysioNet's restricted or protected content will be interrupted if you lose both your password and access to your registered email address, we suggest not using a temporary address as your account name.


I can't log in!

Check your assumptions: most users don't need to log in (see the previous question and answer).

If you really do need to log in, go to the PhysioNetWorks login page and follow the instructions there.


How can I change my PhysioNetWorks password?
How can I change my MIMIC II Explorer/Query Builder password?

PhysioNetWorks users: Go to the PhysioNetWorks login page, enter your email address, and click "Reset password". Follow the instructions that will be sent to your email address by the autoresponder within a minute or two.

MIMIC II Explorer/Query Builder users: A dedicated server, mimic2app.csail.mit.edu, provides access to the MIMIC II Explorer/Query Builder, for which a separate MIMIC password is required. (Your PhysioNetWorks password does not work on mimic2app.csail.mit.edu.) To get a new MIMIC password, request one using the MIMIC project contact form. Requests for new MIMIC passwords are reviewed by the MIMIC project, and they are normally answered within one or two business days (usually not on weekends or US holidays).



 

Where can I find the specific type of data I need?

Some of the most popular versions of this question are answered in this section; read it first.

The next place to look is in the PhysioBank Archive Index. It lists all of the data collections in PhysioBank, with brief descriptions and links to longer descriptions of each.

If you are looking for records with specific combinations of signals, durations, time or amplitude resolution, annotations of specific types, or female or male subjects of particular ages, try a PhysioBank Record Search to locate relevant data. A limited amount of information about diagnoses and medications is also searchable in this way. A tutorial introduction to this tool is available here.

A PhysioNet (text) search can also be helpful. Using the search box at the top of almost any page on this web site, look for keywords that describe the data you seek.


Where can I find data for healthy subjects?

Most data in PhysioBank have been obtained from subjects with a variety of health problems. About twenty PhysioBank databases, however, include healthy subjects.

The control records (c01, c02, ... c10) from the Apnea-ECG Database were obtained from healthy volunteers during sleep; the recordings each contain a single ECG signal and are each about 8 hours long. Simultaneously recorded respiration and oxygen saturation signals are available for one of these recordings.

The CAP Sleep Database includes 16 full-length polysomnograms of healthy subjects. Signals include 3 or more EEG channels, 2 EOG channels, submentalis and bilateral anterior tibial EMG, airflow, abdominal and thoracic respiratory effort, SaO2, and ECG.

The Fantasia Database is a very well-controlled set of 2-hour recordings of ECG (with beat annotations) and respiration signals from 40 rigorously-screened healthy subjects (20 young, 20 elderly, with equal numbers of men and women in each group). Half of the recordings also include an uncalibrated continuous non-invasive blood pressure signal.

Heart rate time series from five additional groups of healthy volunteers are available in a collection of data used for a study of exaggerated heart rate oscillations during meditation (two groups of meditators recorded before and during meditation, a group of volunteers recorded during sleep, a group of volunteers recorded during metronomic (fixed-rate) breathing, and a group of elite athletes recorded during sleeping hours).

The PTB Diagnostic ECG Database includes records from 52 healthy volunteers; here is a list of them.

The MIT-BIH Normal Sinus Rhythm Database consists of ECG recordings from subjects who were found to have had no significant arrhythmias, ST changes, or known cardiac disease. Since these subjects were recorded for medical reasons, however, they are not necessarily "healthy" -- but their medical problems are not heart-related. Subjects included in the Normal Sinus Rhythm RR Interval Database were known to be healthy, however.

The Sleep-EDF Database [Expanded] contains EEG, EOG, and other signals from 42 healthy subjects. (Twenty-two of these had mild difficulty falling asleep, but were otherwise healthy.)

ECG, EMG, GSR, and respiration from seventeen healthy volunteers are included in data collected for a study of Stress Recognition in Automobile Drivers.

All six of PhysioBank's gait and balance databases include at least some data collected from healthy volunteers. Among PhysioBank's neuro- and myoelectric databases, several include data from healthy volunteers.

This is not a comprehensive list; depending on your interests, you may find other relevant data in PhysioBank. Read the descriptions of the data collections in the PhysioBank Archive Index to learn about them, and follow the links there and above for additional information.

This list also does not include data sets that are in development within projects on PhysioNetWorks. These data sets are currently accessible to members of the respective projects only. When they are complete, they will become open-access data within PhysioBank. To learn about them, join PhysioNetWorks (it's free, and it takes only a minute or two) and browse through the list of works in progress. Many project owners welcome other interested researchers to join their projects, so in some cases it may be possible to get access while development is still in progress.


Where can I find serial data (multiple recordings of the same subjects)?

These databases include multiple recordings of some or all subjects:

A few other PhysioBank databases include multiple recordings of a few subjects, but lack information about the sequence of the recordings and the intervals between them:

Studies requiring data collected at different times of the day, or during sleep and non-sleep, etc., may also be able to use segments of long continuous recordings (see the next question).


Where can I find long-duration signals and time series?

Many PhysioBank databases include at least some records that are on the order of 24 hours or longer in duration. These include:


Is the AHA Database available on PhysioNet?

No, it is currently available only from ECRI. Additional information about the AHA Database is available here.

A single sample record that was prepared as an example by the creators of the AHA Database is available in PhysioBank.


Are there any 12-lead (diagnostic) ECGs in PhysioBank?

The PTB Diagnostic ECG Database contains 549 twelve-lead ECGs from 294 subjects. Most of these ECGs are two minutes in duration. (They also include simultaneously recorded Frank XYZ leads.)

The St.-Petersburg Institute of Cardiological Technics 12-lead Arrhythmia Database contains 75 twelve-lead ECGs from 32 subjects. Each recording is 30 minutes in duration.

The PhysioNet/Computing in Cardiology Challenge 2011 addressed the problem of quality assessment of 12-lead ECGs, making use of 1500 twelve-lead ECGs (a training set of 1000 ECGs, and a test set of 500 ECGs). These ECGs are unscored, although those in the training set have been classified individually with respect to acceptability for purposes of diagnostic interpretation.

Twelve-lead ECGs are also available from sources other than PhysioBank, including ECG Wave-Maven, the CSE Database and the 12-lead ECG Library.


Are updates for CD-ROM databases of physiologic signals available here?

Yes. Find them here.


Where is [some file]?
I can't find [something]!

The search box is your friend. It's at the top right corner of nearly every single page on PhysioNet. Use the search box!



   

How can I download binary files?

The details of doing this depend on your web browser, not on anything specific to PhysioNet or to the specific files you wish to download, so the first thing you should do is to learn how to use your web browser. Most browsers have a Help button that can get you started.

In Firefox or Chrome, right-click on the link, and choose "Save Link As..." from the popup menu.

In Lynx, press d to download the target of the highlighted link.

If you are using MS Internet Explorer, it is often possible to download a file simply by left-clicking on the link to that file. This is not a foolproof method, however, since MSIE attempts to guess the file type and may attempt to open the file rather than downloading it. A more reliable method is to right-click on the link and then to choose Save Target As... from the pop-up menu that appears. In most cases, you can accept the suggested file name, but be aware that MSIE will generate a .txt extension for any file that has a name without an extension (such as the files named Makefile that are found throughout PhysioToolkit), so you will need to correct these file names.

In Safari, right-click (or, with a single-button mouse, press the Control key and click), and choose "Download Linked File".

Many other web browsers, including Galeon, Konqueror, Mozilla, Netscape, and Opera, allow you to download a file by pressing and holding the Shift key while left-clicking on the link to the file you wish to download.
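
If MSIE has already saved extensionless files with a spurious .txt suffix, as described above, the names can be corrected in one pass from a Cygwin or Unix shell. The following is a minimal sketch, assuming that the only affected files are those named Makefile.txt beneath the directory you pass in; the function name fix_makefile_names is invented for this example.

```shell
# fix_makefile_names DIR
# Rename every file called Makefile.txt under DIR back to Makefile,
# skipping any directory that already contains a file named Makefile.
fix_makefile_names() {
    find "$1" -type f -name 'Makefile.txt' | while IFS= read -r f; do
        target="${f%.txt}"              # strip the spurious extension
        if [ -e "$target" ]; then
            echo "skipping $f: $target already exists" >&2
        else
            mv "$f" "$target"
        fi
    done
}
```

For example, "fix_makefile_names ." repairs the names beneath the current directory.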


Can I download an entire PhysioBank database in one step?

Yes. Before you do so, however, note that this may not be necessary.

The recommended way to read PhysioBank data files is by using either PhysioToolkit software linked to the WFDB library, or (for those who like to write their own code) your own software linked to the WFDB library. In either case, the WFDB library does the work of finding and reading PhysioBank files. If you have a local copy of a PhysioBank file, the WFDB library reads that copy; otherwise, it reads the file from PhysioNet using the same HTTP protocol that your web browser uses.

If you want to read PhysioBank files without using the WFDB library (but why?) you will probably need to reformat the files into some less storage-efficient format first, and to do that you will need to read the original files using the WFDB library. In that case, you may as well allow the WFDB library to read the original files via HTTP, and write only the reformatted files to local storage.
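
The lookup order described above (a local copy if there is one, otherwise PhysioNet via HTTP) can be pictured with a short sketch. This is only an illustration of the idea, not the WFDB library's actual code; the function name locate_header and the choice to look for the record's .hea file are assumptions made for this example.

```shell
# Base URL of the PhysioBank database tree on the master server;
# a mirror would use a different host name.
PHYSIOBANK_URL="http://physionet.org/physiobank/database"

# locate_header DIR RECORD
# Print where the header file for RECORD would be read from: the local copy
# under DIR if one exists, otherwise the corresponding PhysioNet URL.
# Hypothetical illustration only; the real search is done inside the WFDB library.
locate_header() {
    if [ -f "$1/$2.hea" ]; then
        printf '%s\n' "$1/$2.hea"
    else
        printf '%s\n' "$PHYSIOBANK_URL/$2.hea"
    fi
}

# Demonstration in a scratch directory that holds a local copy of one header only.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/mitdb"
: > "$demo_dir/mitdb/100.hea"
locate_header "$demo_dir" mitdb/100    # prints the local path
locate_header "$demo_dir" mitdb/101    # prints the PhysioNet URL
rm -rf "$demo_dir"
```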

If (despite all of the above) you decide to download a local copy of an entire database, there are two ways to do so that are much more efficient than downloading the files one at a time using a web browser.

The first method uses rsync, which is the same free software used by the PhysioNet mirrors. Install rsync if you don't have it already, and then use the command

rsync physionet.org::

to get a list of databases available via rsync. The output of this command will contain lines such as

aami-ec13       ANSI/AAMI EC13 Test Waveforms (1 Mb)
afdb            MIT-BIH Atrial Fibrillation Database (607 Mb)
afpdb           PAF Prediction Challenge Database (195 Mb)
aftdb           AF Termination Challenge Database (3 Mb)

The entries in the first column are names of available "modules" (sets of files). To download (for example) the AF Termination Challenge Database into a subdirectory of its own within /usr/database, type:

rsync -Cavz physionet.org::aftdb /usr/database/aftdb

(You may, of course, use any directory for storage of the downloaded files. The suggested directory, /usr/database, is searched by default by the WFDB library, so it's a good choice.)
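
The module names in the first column can be picked out mechanically. The sketch below is only an illustration: it parses the sample listing quoted above (pasted into a here-document rather than fetched from the server) and prints one module name per line. The function name list_modules is made up for this example.

```shell
# list_modules: read an rsync module listing on standard input and print
# the first whitespace-delimited field (the module name) of each non-blank line.
# (Hypothetical helper name; not part of rsync or PhysioToolkit.)
list_modules() {
    awk 'NF > 0 { print $1 }'
}

# Sample input copied from the listing shown above; in real use you would
# pipe in the output of "rsync physionet.org::" instead.
modules=$(list_modules <<'EOF'
aami-ec13       ANSI/AAMI EC13 Test Waveforms (1 Mb)
afdb            MIT-BIH Atrial Fibrillation Database (607 Mb)
afpdb           PAF Prediction Challenge Database (195 Mb)
aftdb           AF Termination Challenge Database (3 Mb)
EOF
)
printf '%s\n' "$modules"
```

Each name printed this way can then be used in place of aftdb in the download command shown above.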

To download the MIMIC II Waveform Database Matched Subset, the recommended procedure is slightly different; see these notes.

Using rsync is particularly convenient if you have an unreliable connection; if the transfer is interrupted, simply repeat the command once the connection has been re-established, and rsync will quickly determine where it needs to resume the download. Another advantage of using rsync is that it preserves the timestamps of the original files on PhysioNet, so that if you return to PhysioNet, it will be easy to see if the original files have been updated since the last time you downloaded them. If there have been any updates, you can bring your local copy up-to-date by running the same rsync command that you used to create it, without copying anything that hasn't changed.

Note that rsync has its own IANA-assigned port (873); if you can reach PhysioNet with your web browser (port 80, HTTP) but not via rsync, your firewall may be blocking port 873.

The second method is described in the answer to the next question. Choose it if you can't use rsync, or if your connection to one of the PhysioNet mirrors (which do not generally support rsync access) is much better than your connection to the PhysioNet master server.


There are so many files in .... Can I get a zip file or a tar archive of it?

You can obtain a tar archive or zip file of any single PhysioBank record using the PhysioBank ATM.

If you would like to download an entire PhysioBank database, see the previous question.

Otherwise, you can try looking for a .zip or a .tar.gz archive in the directory that contains the files of interest, or in its parent directory. If you don't find one, however, the answer is no. It is necessary to keep individual files available, and maintaining redundant copies of these files within archives would not be the best use of available resources.

There are excellent alternatives, however, to downloading many files one at a time using a web browser. Try using a utility that can do batch-oriented HTTP transfers, such as wget, available from this site in source form for Unix, Mac OS X, or MS-Windows, or in binary form for MS-Windows. Once you have installed wget, retrieve a batch of files using a command such as

wget -r -np http://physionet.org/physiobank/database/mitdb/

(or substitute the name of a nearby PhysioNet mirror for physionet.org above).
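
To make the host substitution explicit, the command can be wrapped in a small function. This is a sketch under assumptions: the function name is invented, and it assumes that a mirror serves the same /physiobank/database/ path as physionet.org, which you should check against the mirror's own layout. The function only prints the command, so that it can be inspected before it is run.

```shell
# physiobank_wget_cmd HOST DATABASE
# Print (do not run) the wget command that would fetch DATABASE from HOST.
# Assumes HOST serves the same /physiobank/database/ tree as physionet.org.
physiobank_wget_cmd() {
    printf 'wget -r -np http://%s/physiobank/database/%s/\n' "$1" "$2"
}

physiobank_wget_cmd physionet.org mitdb
```

To run the printed command rather than just display it, pipe the output to sh.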


How can I unpack a .tar.gz archive (a "tarball")?

These files are gzip-compressed tar archives.

MS Windows: The free 7-zip file archiver can unpack .tar.gz archives as well as most other common compressed formats.

Alternatively, if you have installed Cygwin, follow the instructions below for using GNU tar.

GNU/Linux, Mac OS X: Using GNU tar, you can decompress and unpack foo.tar.gz in one step:

tar xfvz foo.tar.gz

If your browser decompressed the archive while downloading it, just unpack it:

tar xfv foo.tar

Other Unix platforms: Traditional versions of tar may not support GNU tar's z option. If you have one of these, decompress using gzip before unpacking, and then unpack the decompressed archive, like this:

gzip -d foo.tar.gz
tar xfv foo.tar

(If you don't have gzip, free versions are available for all popular operating systems from gzip.org.)

I unpacked the tarball, now where are the files?

An archive named foo.tar.gz would normally be unpacked into a subdirectory (folder) named foo within the current directory (folder). Look at the file names shown during the unpacking process to see where the unpacked files have been written.
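
To see where files will land before unpacking anything, you can list the archive's contents first. The sketch below builds a tiny throwaway archive in a temporary directory so that it can be run anywhere; the names foo and README are placeholders, and list_tarball is a name invented for this example. It uses the gzip-then-tar pipeline so that it also works with versions of tar that lack the z option.

```shell
# list_tarball FILE
# Print the names of the files stored in a gzip-compressed tar archive
# without unpacking it.
list_tarball() {
    gzip -dc "$1" | tar tf -
}

# Build a small example archive (placeholder names) and list it.
work=$(mktemp -d)
mkdir "$work/foo"
echo "example" > "$work/foo/README"
(cd "$work" && tar cf - foo | gzip > foo.tar.gz)
list_tarball "$work/foo.tar.gz"
rm -rf "$work"
```

Names in the listing that begin with foo/ indicate that unpacking will create a subdirectory named foo in the current directory.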


Can I get these files via FTP?

No. It is considered insecure for an FTP server and a web server to share a file system, and it is not practical for us to maintain separate web and FTP servers.

If you are interested in batch file transfers, read the answer to "There are so many files in ....", above.


Can I look at the waveforms using only my web browser?

Yes. Go to the PhysioBank ATM and fill in the form. For a sample, click here.



   

What are PhysioBank-compatible (or WFDB-compatible) formats?

The contents of almost all PhysioNet databases are collections of flat files (not relational databases). These files can be read by programs that use the WFDB library to do so. The WFDB library reads files in a variety of formats, presenting their contents in a uniform manner to the programs that use it, so that those programs need not be concerned with the details of the storage formats used in each case. The formats that can be read by the WFDB library are referred to as "PhysioBank-compatible formats", because they are permissible for files within standard PhysioBank databases. The terms "PhysioBank-compatible" and "WFDB-compatible" are synonymous. Note, however, that the WFDB library is capable of reading a wider variety of formats than those that are actually used within PhysioBank.

Many visitors who ask this question assume that they need to understand the details of PhysioBank's file formats in order to use PhysioBank. This is not necessary, however. Numerous options exist for reading and writing files in PhysioBank-compatible formats; read the other questions and answers in this section of the PhysioNet FAQ for pointers to many of them. If you really need to know the details of the formats, however, follow the pointers in the next paragraph.

There are several types of files in standard PhysioNet databases:


What are .dat, .hea, .atr, .qrs, ... files?

Files belonging to PhysioBank databases have two-part names: the first part is the record name, and the second part (following the ".") indicates the file type. For example, a file named "chf08.hea" is a file of type .hea (see below) belonging to a record named "chf08".
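
In a shell script, the two parts of such a name can be separated with standard parameter expansion. A minimal sketch follows; the function names record_of and type_of are invented for this example.

```shell
# record_of FILE: print the record-name part of a PhysioBank file name.
record_of() {
    printf '%s\n' "${1%.*}"     # drop the shortest suffix starting at the last "."
}

# type_of FILE: print the file-type part (without the leading ".").
type_of() {
    printf '%s\n' "${1##*.}"    # drop everything up to and including the last "."
}

record_of chf08.hea    # prints chf08
type_of chf08.hea      # prints hea
```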

All of these file types are found in PhysioBank databases:


What are .xws files and how can I view them?

These are short text files that point to specific locations within the records with which they are associated. You can view the same locations using, for example, the PhysioBank ATM. If you haven't set up a browser helper application for viewing .xws files, you can read them as text and copy the database, record, and annotator into the PhysioBank ATM, then navigate to the location of interest.

You can set up WAVE (actually, wavescript) as a helper application for your browser so that when you click on a .xws file, WAVE will open the associated record at the specified location using its built-in HTTP client code (this is much faster and more flexible than using the PhysioBank ATM). See Controlling WAVE from a web browser in the WAVE User's Guide for details.


What is a "record name" or an "annotator name"?

Records are identified by record names, which contain letters, digits, and underscores. For example, the MIT-BIH Arrhythmia Database has record names consisting of three-digit numbers beginning with '1' or '2', and the European ST-T Database has record names that are four-digit numbers prefixed by 'e'. Case is significant in record names that contain letters, even in environments such as MS-DOS for which case translation is normally performed by the operating system on file names; thus 'e0104' is the name of a record found in the European ST-T Database, whereas 'E0104' is not. A record comprises several files, which contain signals, annotations, and specifications of signal attributes; each file belonging to a given record normally includes the record name as part of its name. The files named RECORDS found in the PhysioBank database directories list the record names for each database.

There may be many annotation files associated with the same record; they are distinguished by annotator names. The name of an annotation file is the record name, followed by a '.', followed by the annotator name. The files named ANNOTATORS found in the PhysioBank database directories list the annotator names for the annotation files that are available here. The annotator name 'atr' is reserved to identify reference annotation files supplied by the developers of the databases to document correct beat labels. You may use other annotator names (which may contain letters, digits, and underscores, as for record names) to identify annotation files that you create. You may wish to adopt the convention that the annotator name is the name of the file's creator (a program or a person).
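
The naming rules above are easy to check in a script before creating new files. The following sketch applies to bare names without any path information; the function names is_valid_name and annotation_file are hypothetical, and the check is simply the rule as stated above (letters, digits, and underscores only).

```shell
# is_valid_name NAME
# Succeed (exit status 0) if NAME is non-empty and consists only of
# letters, digits, and underscores.
is_valid_name() {
    case "$1" in
        ''|*[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

# annotation_file RECORD ANNOTATOR
# Print the annotation file name: record name, ".", annotator name.
# Fails without printing anything if either name is invalid.
annotation_file() {
    is_valid_name "$1" && is_valid_name "$2" || return 1
    printf '%s.%s\n' "$1" "$2"
}

annotation_file e0104 atr    # prints e0104.atr
```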


How can I run ... on all of the records in a PhysioBank database?

Write a shell script to iterate over the records. You can use the (text) file called RECORDS in each database directory (see the previous question) as the list of records to be processed; wfdbcat can be used to get this file from PhysioBank. For example:

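  # Run sigamp on every record listed in the database's RECORDS file.
  for R in $(wfdbcat mitdb/RECORDS)
  do
    # printf is used because "echo -n" is not portable across sh implementations
    printf 'Record mitdb/%s ...' "$R"
    sigamp -r "mitdb/$R" -a atr -p
  done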

The example above runs sigamp on each record in the MIT-BIH Arrhythmia Database (mitdb). Use whatever scripting language you wish; the example is written in the standard POSIX sh scripting language and can be run in a terminal window on GNU/Linux, Mac OS X, or any other UNIX, or in a Cygwin window under MS-Windows. For a tutorial introduction to writing shell scripts, try the three-part Bash by Example series or the more comprehensive Unix/Linux Shell Scripting Tutorial.
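
A variation on the loop above, for readers who keep a local copy of a RECORDS file: the sketch below reads record names line by line and runs whatever command you supply on each. The function name for_each_record and its argument order are inventions for this example, and it assumes that the command accepts its -r option after its other options.

```shell
# for_each_record DB RECORDS_FILE COMMAND [ARGS...]
# Run "COMMAND ARGS... -r DB/RECORD" for each record name listed (one per
# line) in RECORDS_FILE. Blank lines are skipped; stops at the first failure.
for_each_record() {
    db=$1
    list=$2
    shift 2
    while IFS= read -r rec; do
        [ -n "$rec" ] || continue
        # Redirect stdin so the command cannot consume the list being read.
        "$@" -r "$db/$rec" < /dev/null || return 1
    done < "$list"
}

# Dry run with a two-line list: "echo" just prints the command that would run.
records_file=$(mktemp)
printf '100\n101\n' > "$records_file"
for_each_record mitdb "$records_file" echo sigamp -a atr -p
rm -f "$records_file"
```

Dropping the word echo from the last call would run sigamp itself instead of printing the command lines.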


Where are the annotation, signal, or header files I just created?

WFDB applications can read from local files or directly from remote locations such as the PhysioNet web site, but they always write to local files. In order to read annotation, signal, or header files that you have written, it will usually be simplest to begin within the directory (folder) that was current when they were created.

If you use WFDB applications to create new annotation, signal, or header files, those files are created within the current working directory (or, in some cases, its subdirectories). Thus, for example, the output annotation file created by the command

wqrs -r 100s

is a file in the current directory named 100s.wqrs. If the record name contains additional path information, the output file is written in a location accessible by following that path from the current directory. For example, the command

wqrs -r mitdb/100

writes its output annotation file (100.wqrs) in the mitdb subdirectory of the current working directory. If mitdb doesn't exist in the current directory, wqrs creates it.

Applications that use the WFDB library behave this way so that their output files can be located by other WFDB applications. For example, given the above, the command

wave -r mitdb/100 -a wqrs

displays the annotations from the local file created by wqrs together with the corresponding signals from PhysioBank. Neither wqrs nor wave needs to read local copies of the header or the (much larger) signal files, however. If no local copies exist, they are read directly from the PhysioNet server, using the additional path information in the record name (mitdb/, in this example) to find them.

How were the signals in PhysioBank digitized?

They come from many sources, but in all cases the signals have been digitally recorded or digitized from analog recordings. See the descriptions of the individual databases for details.

We are occasionally asked about digitizing paper ECGs and other hard-copy data. A brief survey of this subject is available here.

Should I use PhysioBank formats for my project?

Perhaps we're biased, but we think so, and here's why:

Some PhysioBank formats that are good choices for new projects are format 16 (very easy to read using almost any software, though not as storage efficient as format 212 if you have 12-bit or lower resolution) and format 212 (most used in PhysioBank, because it's ideal for widely available 12-bit resolution data, and it's still relatively easy to read). Format 8 is not recommended for new projects, because it does not preserve the DC offset when used in random-access mode, and because it limits the maximum slew rate of the signals that can be recorded. (It is supported for historical reasons, and it was devised as a way to circumvent memory and storage limits encountered in recording the MIT-BIH Arrhythmia Database.)

If you need even more storage efficiency than is provided by PhysioBank formats, consider using gzip or bzip2 to compress files stored in format 16 or 212, or (especially for a commercial product) consider SCP ECG.

If you need an easy-to-read format, and efficiency is not a concern, use rdsamp's output (text) format (see this note).

How can I find out what signals were recorded?

If you are looking for recordings with a specific type of signal, look first in the PhysioBank Archive Index, which indicates in general terms the types of signals in each of the available databases.

Within each database, there may be variations in the choice of signals from record to record. The pages that describe each database (see the links from the PhysioBank Archive Index) can help in locating subsets of records that contain specific signals of interest.

Each recording has an associated header file that lists (among other things) the names of the signals included in that recording. The wfdbdesc utility reads header files and produces a readable summary of their contents, including the signal names. Many other PhysioToolkit applications that read PhysioBank data are capable of printing or displaying signal names. The PhysioBank ATM shows the names of the signals belonging to the selected record (in the drop-down Signals list).

What do the signal names MLII, V2, ... mean?

Short answer: MLII and V2 are ECG signals. The names refer to the electrode positions, using standard nomenclature for lead names. MLII is "modified lead II", a bipolar lead parallel to the standard limb lead II, but acquired using electrodes placed on the torso (a requirement for long-term ECG monitoring). V2 is a precordial lead that is roughly orthogonal to MLII. These two leads are favored for many recordings, since MLII yields high-amplitude normal QRS complexes in most subjects, and V2 usually offers a nearly optimal frontal-plane projection of any ectopic beats that happen to be of low amplitude in MLII.

Long answer: Signals in PhysioBank databases have standardized names (see the previous question). Most of these are in common clinical use for designating signals such as arterial blood pressure (ABP), respiration (RESP), or heart rate (HR). Generally the pages that describe the databases in which these signals appear include definitions of any unusual signal names (see the links to these pages from the PhysioBank Archive Index).

Most ECG recordings contain two or more simultaneously recorded ECG signals, called "leads." Since the heart generates an electrical field that varies spatially as well as temporally, there is no uniquely determined (scalar) signal that offers a complete view of cardiac electrical activity. The standard practice among clinicians and researchers interested in the ECG is to record two or more signals (leads) derived using sensing electrodes placed at certain specific locations. Some leads are bipolar (they are potential differences between pairs of electrodes); others are unipolar (they are potentials measured with respect to an artificial "zero" reference potential typically derived by summing potentials measured at multiple locations). Confusingly, the wires that connect the electrodes to the recording equipment are also called "leads"; thus, for example, a five-lead (five-wire) harness is generally used to record a two-lead (two-signal) ECG!

The three Einthoven bipolar limb leads (designated I, II, and III) are determined by the pairwise potential differences between electrodes placed on the left arm (LA), right arm (RA), and left leg (LL); specifically, lead II is the potential difference between LL and RA. In most subjects, the axis defined by these points is roughly parallel to the mean cardiac electrical axis, so it is a lead in which QRS complexes are typically observable at nearly maximum amplitude.

In long-term ECG recordings (including most of those on PhysioNet), limb leads are not generally used, since physical activity causes significant interference in these leads. Commonly, equivalent "modified" leads are used, with electrodes placed on the torso in positions chosen so that the signals closely match the limb leads. This is possible because the cardiac electrical field is (to a good approximation) a time-varying dipole field, so that it is generally sufficient to choose positions that allow one to observe the same projections of the dipole field onto the axes defined by the limb leads.

For MLII (modified lead II), the LL equivalent electrode is ideally placed at the left iliac crest, and the RA equivalent electrode is ideally placed in the infraclavicular fossa, medial to the border of the deltoid muscle and 2 cm below the lower border of the clavicle.

Additional information about ECG lead systems can be found in many textbooks about electrocardiography. A clear and comprehensive discussion can be found in chapter 15 of Bioelectromagnetism: Principles and Applications of Bioelectric and Biomagnetic Fields by J Malmivuo and R Plonsey (Oxford University Press, 1995; also available on-line here).

What do the signal names 'signal 0', 'signal 1', ... mean?

As noted in the answers to the previous two questions, signal names recorded in header files describe the signals in each record. In rare cases, however, this information is missing, usually because it was not preserved when the signals were recorded. Names of the form 'signal N' are default signal names used in such cases; they may appear in header files explicitly, or they may be displayed by PhysioToolkit software if no explicit signal names appear in a header file.

What is the format of the signal files?

Many formats are supported. Most signal files are written in "format 212", in which two 12-bit samples are bit-packed into three 8-bit bytes, or in "format 16", in which a 16-bit sample is written as two bytes, least significant byte first ("little-endian"). See signal(5) in the WFDB Applications Guide for details on formats 16 and 212 and on the other supported formats.
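As a rough illustration of the two formats, the Python sketch below decodes raw bytes into integer samples. The format 16 decoder follows directly from the description above. The bit layout used for format 212 (the low 12 bits of the first byte pair hold the first sample; the remaining 4 bits plus the third byte hold the second) is our reading of signal(5) and should be checked against that page. The function names are hypothetical, and WFDB library users do not need any of this, since getvec() does the unpacking.

```python
import struct


def decode_format16(data: bytes) -> list:
    """Decode 16-bit two's complement samples, least significant byte first."""
    count = len(data) // 2
    return list(struct.unpack(f"<{count}h", data[: 2 * count]))


def _sign_extend_12(value: int) -> int:
    """Interpret a 12-bit value as a two's complement number."""
    return value - 4096 if value & 0x800 else value


def decode_format212(data: bytes) -> list:
    """Unpack pairs of 12-bit samples from consecutive 3-byte groups."""
    samples = []
    for i in range(0, len(data) - 2, 3):
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        # First sample: low nibble of b1 supplies the 4 high bits.
        samples.append(_sign_extend_12(((b1 & 0x0F) << 8) | b0))
        # Second sample: high nibble of b1 supplies the 4 high bits.
        samples.append(_sign_extend_12(((b1 & 0xF0) << 4) | b2))
    return samples
```

In a file containing more than one signal, the decoded samples still have to be separated by signal; see signal(5) for how samples are ordered.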

To determine which format is used for a given signal file, look in the associated header file. (This is a text file that usually has the same name as the signal file, except for a suffix of .hea instead of .dat.) Each line of the header file that begins with the name of the signal file describes the format and contents of a signal within the signal file. See header(5) in the WFDB Applications Guide for details.
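A minimal Python sketch of reading one such line follows. It assumes the field order given in header(5) as we understand it: file name, format, gain (optionally followed by a baseline in parentheses and by a slash and units), ADC resolution, ADC zero, initial value, checksum, block size, and description. It also assumes that the baseline defaults to the ADC zero and the units to mV when they are omitted. The example lines in the usage note are made up, the function name is hypothetical, and optional fields such as format modifiers are ignored.

```python
import re

# gain, optional "(baseline)", optional "/units"
_GAIN_FIELD = re.compile(r"([0-9.eE+-]+)(?:\((-?\d+)\))?(?:/(\S+))?")


def parse_signal_line(line: str) -> dict:
    """Parse a signal specification line that has all of its fields present."""
    fields = line.strip().split(maxsplit=8)
    gain, baseline, units = _GAIN_FIELD.fullmatch(fields[2]).groups()
    adczero = int(fields[4])
    return {
        "file": fields[0],
        # Keep only the leading digits of the format field (drop modifiers).
        "format": int(re.match(r"\d+", fields[1]).group()),
        "gain": float(gain),
        "baseline": int(baseline) if baseline is not None else adczero,
        "units": units or "mV",
        "resolution": int(fields[3]),
        "adczero": adczero,
        "description": fields[8] if len(fields) > 8 else "",
    }
```

For instance, parse_signal_line("Rec01.dat 212 200 11 1024 995 -22131 0 MLII") reports format 212, a gain of 200 adu per mV, and a baseline of 1024. For routine work, wfdbdesc gives the same information without any parsing on your part.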

How can I read signal files?

If you would like to read signal files within a C, C++, Fortran, Java, Perl, or Python program, see the WFDB Programmer's Guide for information on doing this using the WFDB library. Other programming languages supported by SWIG may also be usable with the WFDB library, but have not been tested. Briefly, use isigopen() to open the files, and getvec() to read them.

If you would like to do this within a Matlab or Octave program, we recommend using the WFDB Toolbox for MATLAB. For an overview of this solution and a variety of alternatives of varying degrees of complexity, see Reading and writing PhysioBank and compatible data on the Contributed software for Matlab and Octave page. Note that Matlab and Octave are not able to import most signal files directly; for exceptions, see wfdb2mat.

Another possibility is to convert the portions of interest into text format using rdsamp (described in detail in How to obtain PhysioBank data in text form). To save rdsamp's output in a file, or to read rdsamp's output using another program, see this note. Alternatively, segments of up to 100,000 samples in length of signal files found on this web server can be converted into text using the PhysioBank ATM, which can be accessed using your web browser. This may be useful if you wish to read signals using Excel or another spreadsheet (although spreadsheets in general are not recommended as tools for signal processing, visualization, or analysis; there are much better choices freely available in PhysioToolkit).

MS-Windows Media Player and similar software for reading audio and multimedia files cannot be used to read these files (or any others in PhysioBank).

How can I use Matlab's import feature to read signal files?

You can't in general, because Matlab doesn't know how to figure out which of the many supported formats is used in any given signal file, because it can't understand the most commonly used formats in any case, and because (in many cases) signal files are orders of magnitude larger than any matrix that Matlab can handle.

You can export a segment of a signal file, up to a million samples in length, as a .mat file readable by Matlab or Octave, using the PhysioBank ATM. Longer segments may be difficult to handle, but you can make them if you wish using wfdb2mat, included in the WFDB software package and used by the PhysioBank ATM. The .mat files produced in this way can be read and plotted using plotATM.m.

See How can I read signal files? for a variety of other ways to read signal files from Matlab without using its import feature.

Is there any direct way of converting sample values to physical units using wfdb2mat?

The program plotATM.m reads the output of wfdb2mat and converts the raw samples into physical units. The conversion is very simple and easily incorporated in your own Matlab or Octave code if you prefer for whatever reason not to use plotATM.m; see plotATM.m for details.

wfdb2mat doesn't do this conversion itself since this would (1) increase the size of the generated .mat files by a factor of 4 or 8 (depending on the ADC resolution), (2) slow down and significantly complicate wfdb2mat, because of the need to convert from native (IEEE 754 on most platforms) floating-point format to the VAX floating-point format required by Matlab, (3) make the .mat files incompatible with WFDB applications, and (4) make it unnecessarily difficult to distinguish the effects of quantization error from other sources of noise in the signal, for those who might wish to do so.

How can I use Excel's import feature to read signal files?

You can't, because Excel doesn't know how to figure out which of the many supported formats is used in any given signal file, because it can't understand any of those formats in any case, and because signal files are almost always orders of magnitude larger than any spreadsheet that Excel can handle. Spreadsheets are not suitable for studying, visualizing, or analyzing digitized signals; many better tools are freely available in PhysioToolkit.

If, despite the above, you still wish to read a piece of a signal file into a spreadsheet, see How can I read signal files?.

Where do I get rdsamp?

It's part of the free, open source WFDB Software Package. Both C sources and binaries for several popular operating systems are available.

How do I use rdsamp?

Install it, then type

rdsamp

for a brief summary of options. For details, see rdsamp(1) in the WFDB Applications Guide.

The output of rdsamp is in text format. Unless you have used the -v option, the output contains data only (no column labels) and can be plotted directly using, for example, plt. The first column contains sample numbers (or elapsed times in seconds and milliseconds if you have used the -p option), and each of the remaining columns contains samples for one signal (in raw ADC units unless you have used the -p option).
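If you want to read this text output from your own program, a sketch along the following lines may be enough. It assumes rdsamp's default output (no -v and no -p option), so that every column is an integer and columns are separated by whitespace. The function name and the sample values in the usage note are illustrative only.

```python
def parse_rdsamp_text(text: str):
    """Split rdsamp's default output into sample numbers and per-signal lists."""
    sample_numbers = []
    signals = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        sample_numbers.append(int(fields[0]))
        values = [int(v) for v in fields[1:]]
        if not signals:
            signals = [[] for _ in values]  # one list per signal column
        for column, value in zip(signals, values):
            column.append(value)
    return sample_numbers, signals
```

Given the two lines "0 995 1011" and "1 994 1010", this returns the sample numbers [0, 1] and the two signals [995, 994] and [1011, 1010].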

To save rdsamp's output in a file, or to read rdsamp's output using another program, see this note.

What do the sample values represent?

Analog-to-digital converters (ADCs) are usually used to produce PhysioBank signal files, which consist of sequences of integer samples in unscaled analog-to-digital converter units (adus). Samples are stored in this way not only because doing so usually requires less space than most alternatives, but also because this scheme introduces no loss of precision beyond the quantization error of the ADC. By default, rdsamp outputs sample values in unscaled adus (raw ADC units).

The header file for each record contains fields that describe the characteristics of each signal and of the ADC used to digitize it. These fields include the signal type (such as ECG, ABP, or SpO2), the physical units of the original analog signal (such as mV, mmHg, or degreesC), the gain (the number of adus per physical unit), the baseline (the sample value that would correspond to a physical value of zero, which is often but not always at the center of the ADC range, and may even lie outside of the ADC range), the adczero (the sample value at the center of the ADC range, which is 0 for a bipolar ADC and a non-zero value for an offset binary ADC), and the number of bits of ADC precision (most often 12 for PhysioBank recordings). Taken together, the values specified for these parameters allow identification of the signals, conversion of sample values from raw ADC units to baseline-corrected physical units and back again, and calculation of the ADC range in raw or physical units.
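The conversions described above follow from the definitions of gain and baseline: a physical value is (sample - baseline) / gain. The Python sketch below shows this, along with one plausible way to compute the ADC range from adczero and the number of bits. The range calculation assumes a range of 2^bits values centered on adczero, and the function names are hypothetical.

```python
def adu_to_physical(sample: int, gain: float, baseline: int) -> float:
    """Convert a raw sample (adu) to baseline-corrected physical units."""
    return (sample - baseline) / gain


def physical_to_adu(value: float, gain: float, baseline: int) -> int:
    """Convert a physical value back to the nearest raw sample value."""
    return round(value * gain) + baseline


def adc_range(adczero: int, bits: int):
    """Return (lowest, highest) raw values for an ADC centered on adczero."""
    half = 1 << (bits - 1)
    return adczero - half, adczero + half - 1
```

With a gain of 200 adu/mV and a baseline of 1024, for example, a raw sample of 1224 corresponds to 1.0 mV, and an 11-bit ADC with adczero 1024 spans raw values 0 through 2047.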

Using rdsamp's -p (physical unit output) option, or using the PhysioBank ATM (which uses rdsamp), the sample values are presented in baseline-corrected physical units. The signal types and units used appear in the first two lines of rdsamp's output when using rdsamp's -v (verbose output) option. Note that in these cases, the sample values are given to exactly three decimal places by default regardless of the precision of the integer samples, although additional precision can be obtained using rdsamp's -P option.

The database of Evoked Auditory Responses in Normals across Stimulus Level is the first (and so far, the only) PhysioBank database containing 24-bit and 32-bit signals. All other current PhysioBank databases were recorded using ADCs with resolutions of 16 bits or fewer. A raw sample value of -32768 has a special meaning: it signifies that no valid observation of the associated signal was made during the corresponding sampling interval. This value is the most negative number that can be represented in 16 bits, so (in data with 16 or fewer bits of ADC resolution) it is less than any valid sample value.
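When processing raw samples yourself, it is worth screening out this special value before doing any arithmetic. A minimal Python sketch, applicable to data with 16 or fewer bits of resolution (the names are hypothetical):

```python
INVALID_SAMPLE = -32768  # marks "no valid observation" in the raw data


def mask_invalid(samples):
    """Replace the invalid-sample marker with None; keep other values."""
    return [None if s == INVALID_SAMPLE else s for s in samples]
```

Using None rather than a number makes it harder to average or filter a missing observation by accident.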

What does the message "init: can't open header for ..." mean?

This message can be produced by any application linked to the WFDB library, including rdsamp and rdann. In order to read data files, these applications need to find a header (.hea) file for the input record you specify. The message indicates that the header file was not found in any of the expected places, or that it was unreadable. There are three common reasons why this can happen:

I can't run rdsamp. Can you please send me a copy of ... in text format?

Yes. Go to the PhysioBank ATM, request the data of interest, and save the results in your browser.

How can I get more than 100,000 samples?

The PhysioBank ATM "Show samples as text" and "Export signals as CSV" tools are intended to offer short segments of data in text form without the need for anything more than a web browser. They are not intended to be methods for obtaining large amounts of data.

The ATM's "Export signals as .mat" tool also limits the amount of data converted, but since .mat format is significantly more compact than text or CSV formats, the limit is 1,000,000 samples per signal. Larger amounts may be difficult to load into Matlab.

You may download an entire record in EDF or PhysioNet (binary) formats using the ATM's "Export signals as EDF", "Make tarball of record", or "Make zip file of record" tools (or by simply downloading the files you need from the PhysioBank archive with your web browser).

If you need more data in text, CSV, or .mat format than allowed by the ATM's tools, first convert a short segment in the desired format, then read the notes immediately below the ATM's control panel to learn how to run the ATM's format conversion applications (rdsamp and wfdb2mat) on your own computer, without limitations on the length of the output. If you install the WFDB Software Package and run rdsamp on your own computer, for example, you can convert entire records to text if you wish.

The PhysioBank ATM limits you to 100,000 samples in text or CSV formats at a time, because signal files converted to text can be very large, and reading them would not be possible with standard web browsers. If you do not wish to use any of the alternatives above, you may concatenate successive segments obtained by multiple requests to the ATM; this will allow you to obtain the data you wish without significantly affecting other PhysioNet users or crashing your web browser.

How can I create a PhysioBank-compatible record from my own data?

There are many ways to create a PhysioBank-compatible record. Here is an easy way to do so:

  1. If your signals are still in analog format, digitize them. For ECGs, we recommend using a sampling frequency of at least 120 Hz, with at least 8-bit resolution over a ±5 mV range (ideally, 250 Hz to 1 kHz, with 12-bit or higher resolution over a ±10 mV range). As is necessary whenever digitizing any signal, use an appropriate antialiasing filter (a low-pass filter with a cutoff no higher than about 40% of the sampling frequency).
  2. Write the samples into a file in text form, as a column of decimal numbers. If you have digitized more than one signal, use a separate column for each signal. (The software that was used to digitize your signals may include a means for doing this.)
  3. Read about wrsamp to see how to prepare a binary signal file and a header file from the text file. Typically, you will need to use a command such as
      wrsamp -F 128 -G 102.4 -i data.txt -o Rec01 0
    This example reads a text file named data.txt, and creates the files needed for a record named Rec01, namely, a signal file named Rec01.dat and a header file named Rec01.hea. (See What are .dat, .hea, .atr, .qrs, ... files? for definitions of signal and header files.) The arguments of -F and -G specify that the signal was sampled at 128 Hz and that the signal was amplified in such a way that a step of 1 millivolt would appear as sample values that differ by 102.4 units. The final argument (0) indicates that the leftmost column in the input (column 0) contains the data.
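Steps 2 and 3 can be sketched in Python as follows. The helper names are hypothetical. The gain calculation assumes an ADC whose full range of 2^bits values spans a known physical range; a 12-bit ADC spanning 40 mV is one configuration that would give the 102.4 adu/mV used in the example above, though the original example does not say how that figure was obtained.

```python
def adc_gain(bits: int, full_scale_span: float) -> float:
    """ADC units per physical unit, for an ADC spanning full_scale_span."""
    return (2 ** bits) / full_scale_span


def write_columns(path: str, rows) -> None:
    """Write one line per sample time, one decimal column per signal."""
    with open(path, "w") as out:
        for row in rows:
            out.write(" ".join(str(value) for value in row) + "\n")


def wrsamp_command(text_file, record, fs, gain, columns=(0,)):
    """Build the wrsamp command line shown above (it is not run here)."""
    return (
        ["wrsamp", "-F", str(fs), "-G", str(gain), "-i", text_file, "-o", record]
        + [str(c) for c in columns]
    )
```

With these helpers, wrsamp_command("data.txt", "Rec01", 128, adc_gain(12, 40)) reproduces the command in step 3, which you can then run in a terminal or pass to subprocess.run().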

Records that belong to PhysioBank never have names that include upper-case characters, so you may wish to follow the example above and include at least one upper-case character in the names of any records you create, to avoid any possibility of confusing them with PhysioBank records.

There are shortcuts that may be useful if your data happen to be in a format for which a converter is available:

If you wish to annotate your record, see How can I annotate a record?

What is an annotation?

Informally, an annotation is a note about some feature of a signal. On this web site, an annotation is a tag (label) that "points" to a specific sample of a digitized recording.

Most PhysioBank databases include annotations for each record. In some cases, these may be reference annotations that have been independently reviewed by one or more (usually, two) human experts; in others, they may be machine annotations generated by automated signal-processing and analysis software. The documentation for each database indicates what types of annotations are available.

Usually, annotations mark events that are localized in time (such as individual heart beats); sometimes, they are used to indicate persistent attributes (such as the beginning of a period of sleep). In recordings that contain two or more simultaneously recorded signals, an annotation can "point" to all signals at once, or to a specific signal.

Each annotation can be thought of as an object having six attributes: the time (the number of sample intervals that precede the sample that the annotation marks); an annotation type (anntyp [sic], usually displayed as a mnemonic annotation code; see the next question); three numeric attributes (subtyp [sic], chan, and num); and an optional string (the aux string). Only the time attribute has a fixed meaning; all of the others can be redefined to fit the characteristics of the data and the needs of the investigator.

Annotations are kept in files that exist independently of the signals that they annotate; this means, among other things, that multiple sets of annotations (created by different applications or people) can coexist, and that annotations can be read even if the signals to which they refer are not available.

Within an annotation file, annotations are stored in a compact binary format. See the questions and answers below for information about reading annotation files.

What do the annotation codes (N, V, S, F, ...) mean?

WFDB applications such as those used by the PhysioBank ATM display annotations using these and other codes. When these codes are used to annotate ECGs, N is a normal sinus beat, V is a ventricular ectopic beat, S is a supraventricular ectopic beat, and F is a fusion of a normal beat and a ventricular ectopic beat. These and many others are described here.

What is the format of the annotation files?

Most annotations occupy two bytes, of which 10 bits contain the time interval (in units of sample intervals) from the previous annotation, and 6 contain an annotation type code. Special type codes allow for annotations at intervals that exceed 1023 sample intervals, and for other numeric and text fields to be associated with individual annotations. See annot(5) in the WFDB Applications Guide for details.
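The two-byte layout can be illustrated with a short Python sketch. It assumes, following our reading of annot(5), that each pair is stored least significant byte first and that the type code occupies the six most significant bits. It handles a single ordinary annotation only; the special type codes mentioned above are not interpreted. The function names are hypothetical.

```python
def decode_annotation_word(pair: bytes):
    """Return (type_code, interval) from one two-byte annotation."""
    word = pair[0] | (pair[1] << 8)   # least significant byte first
    return word >> 10, word & 0x3FF   # top 6 bits, low 10 bits


def encode_annotation_word(type_code: int, interval: int) -> bytes:
    """Pack a 6-bit type code and a 10-bit interval into two bytes."""
    if not 0 <= type_code <= 63:
        raise ValueError("type code must fit in 6 bits")
    if not 0 <= interval <= 1023:
        raise ValueError("interval must fit in 10 bits")
    word = (type_code << 10) | interval
    return bytes([word & 0xFF, word >> 8])
```

The 10-bit interval field is the reason a single two-byte annotation cannot follow the previous one by more than 1023 sample intervals.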

How can I read annotation files?

If you would like to read annotation files within a C, C++, Fortran, Java, Perl, or Python program, see the WFDB Programmer's Guide. Other programming languages supported by SWIG may also be usable with the WFDB library, but have not been tested. Briefly, use annopen() to open the files, and getann() to read annotations from them.

If you would like to do this within a Matlab or Octave program, we recommend using the WFDB Toolbox for MATLAB. For an overview of this solution and a variety of alternatives of varying degrees of complexity, see Reading and writing PhysioBank and compatible data on the Contributed software for Matlab and Octave page. Note that Matlab and Octave are not able to import annotation files directly.

Another possibility is to convert the portions of interest into text format using rdann (described in detail in How to obtain PhysioBank data in text form). rdann can be downloaded as part of the WFDB Software Package and run on your own computer. To save rdann's output in a file, or to read rdann's output using another program, see this note. Alternatively, annotation files found on this web server can be converted into text using the PhysioBank ATM, which can be accessed using your web browser. The format of rdann's text output is described below.

What does the error "annopen: can't read annotator ... for record ..." mean?

This message can be produced by any application linked to the WFDB library that attempts to read annotation files. In order to do so successfully, these applications need to find the annotation file for the annotator and input record you specify. The message indicates that the annotation file was not found in any of the expected places, or that it was unreadable. There are several common reasons why this can happen:

Where do I get rdann?

It's part of the free, open source WFDB Software Package. Both C sources and binaries for several popular operating systems are available.

How do I use rdann?

Install it, then type

rdann

for a brief summary of options. For details, see rdann(1) in the WFDB Applications Guide.

The format of rdann's output is described in the answer to the next question.

To save rdann's output in a file, or to read rdann's output using another program, see this note.

What are the columns in rdann's output?

If you add the -v option at the end of the command line, rdann prints a set of column headings above the first annotation line.

The output contains one annotation per line; from left to right, each line contains the time of the annotation in hours, minutes, seconds, and milliseconds; the time of the annotation in sample intervals; a mnemonic for the annotation type; the annotation subtyp [sic], chan, and num fields; and the auxiliary information string, if any. The meanings of the annotation type mnemonics and of the other fields are discussed here.

For example, if we read the first five seconds of the reference (atr) annotations for record 200 of the MIT-BIH Arrhythmia Database using the command

rdann -r mitdb/200 -a atr -t 5

then we obtain this output:

    0:00.186       67     +    0    0    0      (B
    0:00.625      225     V    1    0    0
    0:01.352      487     N    0    0    0
    0:01.913      689     V    1    0    0
    0:02.677      964     N    0    0    0
    0:03.186     1147     V    1    0    0
    0:03.980     1433     N    0    0    0
    0:04.472     1610     V    1    0    0

Each of these eight lines contains one annotation. The third column shows the annotation mnemonics, and by referring to the table of mnemonics, we can see that the '+' in the first annotation indicates that it marks the underlying rhythm of the beats that follow; the rhythm type is ventricular bigeminy, specified by the "(B" that appears in the aux field at the end of the line; see this table for descriptions of rhythm annotation strings such as "(B". The remaining seven lines each mark a QRS complex, associated with either a normal (N) or premature ventricular (V) beat. Times in the first and second columns indicate when the events marked by the annotations occur. For example, the first V beat occurs 0.625 seconds (625 milliseconds), or 225 sample intervals, after the beginning of the record. (A quick calculation shows that one sample interval is 2.777... milliseconds for this record, or that its sampling frequency is 360 Hz. Sample intervals may vary between records.) The subtyp, chan, and num fields in columns 4, 5, and 6 are usually zero in reference annotation files, but occasionally one or more of these fields is used to indicate additional information, as in this case, in which the subtyp field in the V annotations indicates which of several ventricular ectopic beat morphologies has occurred. See the documentation for the associated database to see how to interpret these fields.

In some cases, the times in the first column may be enclosed in square brackets [like this]. This format indicates that the times are given as times of day (in the local time zone where the recording was made). Bracketed times may also include the date (in DD/MM/YYYY format), if this information is available. If the time of the beginning of the recording is not available, the times in the first column are not bracketed, and in this case they represent the elapsed time from the beginning of the recording.
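The column layout described above is easy to parse in a program. The Python sketch below is one way to do so; it assumes unbracketed elapsed times (bracketed times of day are not handled), and the function names are hypothetical. It also repeats the "quick calculation" of the sampling frequency from the example output.

```python
def elapsed_seconds(time_text: str) -> float:
    """Convert 'H:MM:SS.mmm' or 'M:SS.mmm' elapsed time to seconds."""
    seconds = 0.0
    for part in time_text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_rdann_line(line: str) -> dict:
    """Split one line of rdann output into its seven fields."""
    fields = line.split(None, 6)
    return {
        "seconds": elapsed_seconds(fields[0]),
        "sample": int(fields[1]),
        "type": fields[2],
        "subtyp": int(fields[3]),
        "chan": int(fields[4]),
        "num": int(fields[5]),
        "aux": fields[6].strip() if len(fields) > 6 else "",
    }


if __name__ == "__main__":
    ann = parse_rdann_line("    0:00.625      225     V    1    0    0")
    # 225 sample intervals in 0.625 s implies the sampling frequency.
    print(ann["sample"] / ann["seconds"])  # 360.0
```

Because the printed times are limited to millisecond precision, the time in sample intervals (second column) is the better choice for any calculation that needs exact positions.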

I can't run rdann. Can you please send me a copy of ... in text format?

Yes. Go to the PhysioBank ATM, request the data of interest, and save the results in your browser.

How can I create an annotation file?
How can I annotate a record?

If the signals you wish to annotate are not already in a PhysioBank-compatible format, including .dat and .hea files, follow the instructions in How can I create a PhysioBank-compatible record from my own data?

If your record contains ECG or blood pressure signals, you may wish to make a beat annotation file. There are several ways to create one using PhysioToolkit software:

You should always review the beat annotation file generated by any of these detectors; although all of them work well in most cases, there is wide variability among recordings, and any detector will make errors if the data quality is insufficient. There are, once again, several ways to do this:

All four of these detectors mark all detected beats as normal (N). If your record includes abnormal beats, change the N annotations for these beats to the correct annotations (a complete list of annotation types can be found here). This can be done manually using WAVE.

Another possibility is to use OSAS, free software for QRS detection and beat classification available from its author (follow the link for details). This may be particularly helpful if your records contain more than a handful of abnormal beats, since OSAS can find most abnormal beats and annotate them appropriately, but it is still necessary to review the automatically-generated annotation file and correct any errors.

If you have annotations (or their equivalent) that must be converted into PhysioBank-compatible annotation file format, it may be easiest to convert them first into a text format that can be read by rr2ann, which can then be used to produce the desired (binary) annotation file. If you wish to use any of the optional annotation attributes (subtyp, chan, num, or aux), rr2ann will not be sufficient. In this case, you may wish to convert your data first into rdann's output (text) format; this can be read as input by wrann, which will convert the data into PhysioBank-compatible annotation format. If you do this, note that the first column (time in hours, minutes, and seconds) must be present but need not be valid, since wrann determines the annotation times from the second column (time in sample intervals); note also that entries in the last column may be omitted for any annotations that have an empty aux field. Both rr2ann and wrann read text-format data from their standard input.

I double-clicked on the program icon, and nothing happens!
I typed the program name in the 'Run...' dialog, and nothing happens!

Don't do this!

With few exceptions, PhysioToolkit applications run in text mode (i.e., they do not include a graphical user interface). These programs are intended to be run within a terminal emulator using a command-line interface. In most cases, if you attempt to run them by clicking on their icons or names, or by entering the program name in the MS-Windows Run... dialog box, these programs will open a DOS box, print a usage summary, and exit, usually much too fast for you to read anything.

By far the best way to use these programs under MS-Windows is to install a Unix-compatible terminal emulator and shell in which to run them. The best of these is also free; if you have not already done so, download and install the Cygwin software package. This package includes bash (the GNU Bourne Again Shell), and a terminal emulator in which to run it. After a standard installation of Cygwin, you can launch a terminal emulator and bash by clicking on the Cygwin icon that will have been installed on your desktop.

If you do not wish to use Cygwin, it is possible to run text-mode applications under MS-Windows within a DOS box, but there are many limitations of command.com that may prove frustrating. In particular, command.com supports a relatively small space for environment variables that is not secure against buffer overruns, and has idiosyncratic filename globbing behavior.

Top  

What is a "standard input" or a "standard output"?

These concepts are common to all text-mode applications (see the previous question). A program's standard input is whatever it reads from the keyboard (i.e., whatever you type into its terminal emulator window once the program begins to run). A program's standard output is whatever it prints in its terminal emulator window. There are (of course) exceptions, and the exceptions are what make these ideas useful!

First, it's possible to redirect either or both of the standard input and the standard output before the program begins to run, by adding appropriate parameters to the command line. So, for example, a program named pour can read its standard input from a file named teapot, and then write its standard output to another file named teacup, using a command such as:

pour <teapot >teacup

(For an explanation of this command, see the answer to the next question.)

Second, most applications have an additional standard error output that is also printed in the terminal window, intermingled with the standard output. The standard error output is reserved for warning and error messages. If you redirect the standard output to a file, the standard error output still appears in the terminal window (and is not copied into the file). In most cases, this is useful behavior, since it allows you to see quickly if there have been any errors or warnings without the need to look through what may be lengthy output. If you wish, you can capture the standard error output in its own file using a command such as:

frobnicate <input >output 2>errors
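To try this yourself, here is a minimal sketch in which a shell function stands in for a real program. The names report, teapot, teacup, and errors are hypothetical, chosen only for illustration.

```shell
# report: a stand-in for a real program (hypothetical).
# It copies its standard input to its standard output, then writes a
# warning to its standard error output.
report() {
    cat
    echo "warning: end of input reached" >&2
}

# Create a small input file.
printf 'earl grey\noolong\n' >teapot

# Read from teapot, write results to teacup, and capture messages in errors.
report <teapot >teacup 2>errors

cat teacup    # shows the two lines copied from teapot
cat errors    # shows only the warning
```

If you leave off the final 2>errors, the warning appears in the terminal window while teacup still receives only the copied lines.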

Top  

How can I save the output of ... in a file?
How can one program read another's output?

If you are running programs from a command prompt (by typing commands into a terminal emulator window or an MS-DOS box), these things can be done easily.

If you have ever used GNU/Linux, Unix, or MS-DOS, you may have captured the output of a program by redirecting it to a file, like this:

foo >bar

The > operator redirects foo's standard output (which would normally appear on-screen) into a file named bar. If bar exists already, its contents are replaced. If you wish to append foo's output to whatever is already contained in bar, use a command such as this instead:

foo >>bar

There is an analogous operator that arranges for a program's standard input (which would normally be read from whatever you type on the keyboard) to be read from a file instead:

baz <bar

Here, the < operator arranges for baz to read its input from a file named bar. If bar was created by foo, then this command allows baz to read foo's output.

You can combine input and output redirection in a single command using the pipe (|) operator:

foo | baz

This command runs foo and sends its standard output directly to baz, without requiring an intermediate file. True multitasking operating systems such as Unix, GNU/Linux, and Mac OS X allow both programs to run (apparently) simultaneously; under MS-DOS (and in MS-DOS boxes that use command.com), the first program runs to completion before the second one begins execution.
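The four operators can be tried with standard utilities in place of foo and baz. This is only an illustrative sketch; the file names fruit and sorted are hypothetical.

```shell
# Write two lines to a file, replacing any previous contents.
printf 'pear\napple\n' >fruit

# Append one more line to the same file.
echo 'fig' >>fruit

# sort reads its standard input from fruit and writes its output to sorted.
sort <fruit >sorted
cat sorted

# The pipe sends sort's standard output directly to wc, which counts the
# lines, with no intermediate file.
sort <fruit | wc -l
```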

You can use these techniques whenever you run programs from a command prompt, whether those programs are among those available here or obtained from some other source. You can use the same techniques with programs you write yourself; the only requirement is that your programs must read from the standard input and write to the standard output (i.e., they must not attempt to bypass the standard input/output mechanism by reading directly from the keyboard or writing directly to the screen).
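A program of your own that follows this rule can be very small. The sketch below defines a hypothetical filter named upcase as a shell function; because it reads only its standard input and writes only to its standard output, it can be used with redirection and pipes like any other program. The file name beats.txt is also hypothetical.

```shell
# upcase: a hypothetical filter. It reads lines from its standard input
# and writes them, converted to upper case, to its standard output.
upcase() {
    while IFS= read -r line; do
        printf '%s\n' "$line" | tr '[:lower:]' '[:upper:]'
    done
}

# Feed it through a pipe and capture its output in a file.
printf 'normal\nventricular\n' | upcase >beats.txt
cat beats.txt
```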

These operators (>, >>, <, and |) are supported by all shells (command interpreters) under Unix, GNU/Linux, Mac OS X, and MS-DOS (including those that run within MS-DOS boxes or other types of terminal emulators under MS-Windows). For further information, please refer to the documentation for your shell or command interpreter.

Top  

My question is about WAVE (or gtkwave). Is there a WAVE FAQ?

Yes, look here for answers to many frequently asked questions about WAVE. The gtkwave project is no longer active, since WAVE now runs on all of the popular platforms, including those formerly supported by gtkwave only.

Top  

I tried to compile ... but the compiler can't find wfdb.h (or ecgcodes.h, or ecgmap.h).

These files are included with the WFDB library. Most of the PhysioToolkit applications use at least one of them; if you are trying to compile such an application, you will need to have installed the WFDB library and its *.h files first. The easiest way to do this is to install the WFDB Software Package, which includes the WFDB library and many of the PhysioToolkit applications. Find instructions for doing so in the quick start guide for your platform (FreeBSD, GNU/Linux, Mac OS X (Darwin), MS-Windows, and Solaris), or on the WFDB Software Package introductory page.

If you have already installed the WFDB Software Package and your compiler is still complaining, the WFDB *.h files may not be installed in any of the directories where your compiler is looking for them. Use wfdb-config to find out where they are.
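For example, the sketch below asks wfdb-config for the compiler options that locate the WFDB *.h files and prints the compile command that would use them. The source file name myprog.c is hypothetical, and the fallback directory is only an assumed placeholder for systems where wfdb-config is not on the PATH.

```shell
# Ask wfdb-config for the compiler options needed to find wfdb.h.
if command -v wfdb-config >/dev/null 2>&1; then
    cflags=$(wfdb-config --cflags)
else
    # Assumed placeholder; adjust to wherever the headers are installed.
    cflags="-I/usr/local/include"
fi

# Show the compile command that would use those options.
echo "cc $cflags -c myprog.c"
```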

Top  

I tried to compile ... but the compiler complains that isigopen (or iannopen, or strtim, or wfdbinit) is undefined.

These are among the functions defined in the WFDB library; most of the PhysioToolkit applications use at least one of these functions. If you are trying to compile such an application, it must be linked to the WFDB library. If you have not yet installed the WFDB library, see the answer to the previous question.

For details on how to link to the WFDB library, see Compiling a Program with the WFDB Library in the WFDB Programmer's Guide.
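In outline, linking usually means adding the library to the end of the compile command. The sketch below prints such a command, using wfdb-config --libs when it is available and falling back to -lwfdb otherwise; myprog and myprog.c are hypothetical names, and your compiler may need additional options.

```shell
# Get the linker options for the WFDB library.
if command -v wfdb-config >/dev/null 2>&1; then
    libs=$(wfdb-config --libs)
else
    libs="-lwfdb"    # usual fallback when wfdb-config is unavailable
fi

# The library options go after the source file on the command line.
echo "cc -o myprog myprog.c $libs"
```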

Top  

I'm writing a program to work with PhysioBank data, but my compiler can't link to the WFDB library. What should I do?

If you are using one of the precompiled versions of the library, be sure that you have the correct version for use with your compiler and operating system. If there is none available, you have two reasonable choices:

Top  

Where on this site can I find software for my favorite operating system/compiler?

Look in PhysioToolkit, the repository for all software available on this site. With very few exceptions, the software available here is portable among all popular operating systems, including GNU/Linux, Mac OS X, MS-Windows, and Unix. Since all of it is provided in source form, you can compile it (using free or proprietary compilers) into binaries that can run under any of these operating systems.

For convenience, some PhysioToolkit software is also available as ready-to-run binaries for a variety of operating systems.

Generally, the same sources can be compiled without modification under any supported OS or compiler; you will not find separate sets of sources for different compilers or platforms. Following conventions used by most free or open-source software, look for files named README or INSTALL in each software package; these files indicate what's included in the package, and how to compile it from the sources.

Most PhysioToolkit software is written in portable (ANSI/ISO standard) C. ANSI/ISO C code can be compiled by all standard C++ compilers. There is a small amount in other languages, including Fortran 77 and Matlab/Octave m-code. If you don't have a C or C++ compiler, we strongly recommend the excellent and free GNU Compiler Collection (gcc), which includes C, C++, and Fortran 77 compilers (among others), and is available for a vast range of platforms, including GNU/Linux, Mac OS X, MS-Windows, and all versions of Unix.

If you wish to write your own software to work with PhysioBank data, the WFDB library provides standard, portable interfaces in C, C++, and Fortran for doing so. The wfdb-swig package provides Perl, Python, Java, and C# interfaces to the WFDB library. Matlab can use any of several compatible APIs.

Although it is possible to compile PhysioToolkit software using proprietary compilers, you are generally on your own if you choose to do so; we don't use these compilers ourselves, and we can't help you learn how to use them.

Top  

Can I use your code in my commercial application?

Yes. There are two different categories of PhysioToolkit code, and the rules for using them are slightly different.

The WFDB library is free under the GNU Lesser General Public License (LGPL). The LGPL permits you to use (or sell, or give away) the library with your own code. The only significant restriction is that you must make the sources for the library itself freely available. You do not need to disclose the sources for your own code simply because you have used the WFDB library with it.

All of the remaining PhysioToolkit software (the applications) is free under the GNU General Public License (GPL). What this means in simple terms is that you can sell it or give it away to others, but if you do so, you must distribute the sources under the same terms as those under which you received them.

If you incorporate GPL code into your own code, the resulting code must be distributed under the GPL or not at all; this is the so-called "viral" property of the GPL. What this implies is that you cannot simply make minor (or even major) modifications to free code and then sell it without honoring the original terms under which you received it.

There are ways to use GPL code together with proprietary code, however. For example, software that reads output from a GPL program (or that writes data to be read by a GPL program) does not automatically fall under the GPL. As another example, you may incorporate GPL code in a plugin for a proprietary program, but the sources for the plugin itself would have to be made available under the GPL.

Contributors of software may choose another license conforming to the Open Source Definition (OSD), so that, in the future, other licenses may apply. Other OSD licenses have provisions very similar to those outlined above.

Top  

How should I report a bug?

First, be sure that it is a bug. Try to reproduce it. Try doing so on another computer if possible.

If you have not read How to Report Bugs Effectively, please take a few minutes to do so. (Important: do not send bug reports about PhysioNet to the author of How to Report Bugs Effectively; he is an innocent bystander.)

Bug reports should provide enough specific information to permit duplicating your problem. At a minimum, this information includes:

  1. the name and version number of the software in which you found the bug, and the location on PhysioNet where the software can be found
  2. the name and version number of your operating system (e.g., Mac OS X 10.4, Fedora Core 4, Windows XP Professional)
  3. the exact command or sequence of events needed to replicate the problem
  4. an exact copy of any text output, including any errors or warning messages encountered
  5. the symptoms of the bug (how the output varied from what you expected, e.g., "v0 is smaller than it should be by a factor of 400")

Do not send binary input or output files or core dumps unless requested. If you can reproduce the problem using input data available on PhysioNet, please tell us how to do so.

Carefully written bug reports are very valuable to us; we want our software to work reliably, we are grateful for information that helps us to fix defects, and we acknowledge the help of those who send us useful bug reports. If you wish to remain anonymous, please let us know when you write.

If you are able, by inspection of the sources, to locate the cause of a problem, tell us what you discover. If you can fix the problem yourself, send us a patch against the latest sources. These things, though very much appreciated, are not essential components of a useful bug report; what is essential is an accurate description of the symptoms of the bug. In some cases, what we think of as a feature may be what you think of as a bug; please help us understand what looks wrong to you. Without this context, a patch may be of little use to us.

All software on this site is provided in source form. If the documentation for the software in which you have found a bug does not provide an email address for bug reports, find it at the top of the source file. Please send all bug reports to both the author/maintainer and PhysioNet.

Top  


   

Some links don't work, but I don't see any error. Why not?

On PhysioNet, links to external sites (URLs that point outside of the PhysioNet domain) are designed to open the external URL in a separate window or tab. In most cases, this window or tab will open on top (in front) of the window that contains the link, but your browser and your window manager or operating system may override this behavior, especially if the second window was already open and was hidden. If clicking on a link doesn't seem to do anything, check to see if there is a second browser window that is hidden behind other windows, or iconified (minimized, closed).

If you are sure that a link is broken, please send a note about it to webmaster@physionet.org.

Top  

I'm having trouble viewing images on this site. Why?

Most of the graphics on PhysioNet, including all of the dynamically generated graphics, are PNG images. PNG has been a W3C recommendation since 1996, and is one of only three standard image types that are rendered by all current graphical web browsers, most of which have supported PNG for ten years or more. The other supported types are JPEG (which uses lossy compression and is best suited for continuous-tone graphics such as photos) and GIF (which uses a lossless compression algorithm that is inferior to PNG's). If you have an obsolete browser, upgrading it should fix this problem.

The QuickTime plugin, however, sometimes interferes with some browsers' built-in ability to render PNG images, notably in MSIE. To avoid these problems, update or uninstall QuickTime, or use another browser such as Chrome or Firefox.

Top  

I'm having trouble printing PostScript files from this site. Why?

PostScript versions of books and papers available here are ready to be printed on a PostScript printer without any additional formatting. Some users have experienced problems, particularly with older PostScript versions of the WFDB Applications Guide (which consists of several PostScript documents concatenated together into one file). Software that attempts to insert additional PostScript code, or that attempts to reformat these files rather than simply printing them as is, is generally the cause of these difficulties. MS-Windows users can use GSView to view or print these files. Under Unix, GNU/Linux, or Mac OS X, simply print the files using lp or lpr, or view them using gv. (Both GSView and gv require Ghostscript to render PostScript or PDF input.) If your printer has insufficient memory, it may stop after printing part of the file; in this case, try using GSView or gv to print the file in sections.

If a PDF version of the file is available here, you may also wish to try printing it using GSView, gv, or xpdf (all three of these are free and open-source), or Adobe Acrobat Reader (free binaries, closed-source).

Top  

I don't understand how to use the software or data on this site.

Go back to the beginning of this FAQ and read it carefully. Still confused? Read on....

Are you looking for something specific? Examples might include:

If so, try using the Search tool. All text on the PhysioNet web site is indexed and can be found by searching for it. To do this, type one or more terms related to your topic or question into the search box below, then click on the "Search" button to its right:

A similar search box and button appear in the top right corner of this and almost every other page on PhysioNet.

Have you found something relevant to your interest, but don't know how to use it? If so, look for tutorial materials that can help you get started. Browse through PhysioNet's list of tutorials, or use a PhysioNet search to find information to help you get started.

If you have a question about a specific page in the PhysioNet web site, click on the "webmaster@physionet.org" link at the bottom of that page; doing this opens a preaddressed email window, with the URL of the page filled in as the subject, which will help us to understand the context of your question and to give you a relevant answer.

Before writing, please formulate specific questions ("I don't understand how to use the data." or "The software doesn't work, please help me!" are examples of non-specific questions that cannot be usefully answered). Whoever replies to your question cannot read your mind; if you don't say clearly what you need to know, you will not get a satisfactory answer.

Don't be offended if the reply to your question is "Read the FAQ!" (this page). If the answers aren't here, or if they aren't clear, write again, and try to be more specific, or to point out what's confusing or missing in the FAQ. Doing so will not only help you to get a useful answer, but it will also help us to write a better FAQ.

Top  

What's a man page?

A man page is a concise description of how to use a piece of software, intended to be read using man or one of its work-alikes. Think of man pages as pages from a reference manual.

All Unix platforms, including GNU/Linux and Mac OS X, as well as Cygwin/MS-Windows, include a program called man that can be used to find and display man pages. This is the standard form of documentation for all Unix software. Almost all PhysioToolkit applications have man pages. The near-universality of man pages means that you are very likely to be able to learn about any program by typing its name as an argument to a man command, as in:

man tar

which will display the man page that describes the standard tar command. On most platforms, the output of man is sent through a program such as more, which allows you to read it one screenful at a time; you may usually advance to the next screenful by pressing the space bar, go back a screenful by typing 'b', or exit by typing 'q'.

The format of man pages is fairly rigid, which allows a variety of software to extract useful information from them for purposes of indexing, cross-referencing, etc. They are not intended as tutorial material, but once you are familiar with their format, reading them is usually the quickest way to learn how to use the software they document.

The largest collection of man pages on PhysioNet is the WFDB Applications Guide, which includes not only the man pages for the roughly 70 applications in the WFDB Software Package, but also those for a number of contributed applications that are compatible with WFDB Software. These pages can be read within your web browser; if you download and install the WFDB Software Package on your own computer, you can use man to read the local copies of these man pages that will be installed together with the software itself.

A distinctive feature of PhysioNet's man pages is the Sources section at the end of each one, with one or more URLs that give the location of the software sources. This feature is particularly handy if you are reading a man page in your web browser and would like to refer to the source of a program in order to see how it is implemented.

The Computer Science Department of McGill University offers a gentle introduction to reading man pages that will help you get started if you haven't used man pages previously.

Although you won't find the acronym "RTFM" used elsewhere on this site, it refers to the usefulness of Reading The Fine Manual (i.e., the man pages) to inform yourself about software that may be unfamiliar. Try it!

Top  

Are PhysioNet or its mirror maintainers responsible for the content of external sites?

No.

Top  

Why isn't my question here?

This FAQ is revised frequently, and we may not have got to your question yet. It's possible that yours is an infrequently, maybe even never-before, asked question. No matter, we'd like to hear it, and we'll try to answer it as quickly as possible. Please send us feedback by following the link below!

Top  

What's the magic word?

"Please."

Top