# Term-Preterm EHG Database

Published: Aug. 21, 2012. Version: 1.0.1

Gašper Fele-Žorž, Gorazd Kavšek, Živa Novak-Antolič and Franc Jager. A comparison of various linear and non-linear signal processing techniques to separate uterine EMG records of term and pre-term delivery groups. Medical & Biological Engineering & Computing, 46(9):911-922 (2008).

Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.

### Abstract

The Electrohysterogram records (uterine EMG records) included in the Term-Preterm ElectroHysteroGram DataBase (TPEHG DB) were obtained from 1997 to 2005 at the University Medical Centre Ljubljana, Department of Obstetrics and Gynecology. The records were obtained during regular check-ups either around the 22nd week of gestation or around the 32nd week of gestation. The women participating in the study represented a sample of the general population. In all, almost 1300 records were obtained during these years, and a preliminary database was built and used for studies by Ivan Verdenik, Gorazd Kavšek, Marjan Pajntar and Živa Novak-Antolič [1],[2].

### Data Description

The TPEHG DB posted here contains 300 uterine EMG records from 300 pregnancies (one record per pregnancy) carefully selected from the original database of which:

• 262 records were obtained during pregnancies where delivery was on term (duration of gestation at delivery > 37 weeks):
  • 143 records were obtained before the 26th week of gestation and
  • 119 records were obtained later during pregnancy, during or after the 26th week of gestation;
• 38 records were obtained during pregnancies which ended prematurely (pregnancy duration ≤ 37 weeks), of which:
  • 19 records were obtained before the 26th week of gestation and
  • 19 records were obtained during or after the 26th week of gestation.

During the selection of records, all records with apparent recording artifacts, all records from pregnancies where labor was induced, and all records where delivery was performed using a Cesarean section, were rejected.

Each record is composed of three channels, recorded from 4 electrodes:

• the first electrode (E1) was placed 3.5 cm to the left and 3.5 cm above the navel;
• the second electrode (E2) was placed 3.5 cm to the right and 3.5 cm above the navel;
• the third electrode (E3) was placed 3.5 cm to the right and 3.5 cm below the navel;
• the fourth electrode (E4) was placed 3.5 cm to the left and 3.5 cm below the navel.

The differences in the electrical potentials of the electrodes were recorded, producing 3 channels:

• S1 = E2–E1 (first channel);
• S2 = E2–E3 (second channel);
• S3 = E4–E3 (third channel).

The individual records are 30 minutes in duration. Each signal has been digitized at 20 samples per second per channel with 16-bit resolution over a range of ±2.5 millivolts.
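
As a quick check of what these acquisition parameters imply, the sketch below computes the nominal number of samples per channel and the nominal quantization step. The function names are illustrative, and the figures are nominal: the actual record lengths and gains should be taken from the header files.

```python
# Acquisition parameters as stated in the description above.
SAMPLING_RATE_HZ = 20
RECORD_DURATION_MIN = 30
ADC_BITS = 16
RANGE_MV = (-2.5, 2.5)


def samples_per_channel(duration_min=RECORD_DURATION_MIN, fs=SAMPLING_RATE_HZ):
    """Nominal number of samples in one channel of one record."""
    return duration_min * 60 * fs


def nominal_step_microvolts(bits=ADC_BITS, range_mv=RANGE_MV):
    """Nominal quantization step: full input span divided by 2**bits levels."""
    span_mv = range_mv[1] - range_mv[0]
    return span_mv * 1000.0 / (2 ** bits)


print(samples_per_channel())                # 36000
print(round(nominal_step_microvolts(), 4))  # 0.0763
```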

Each signal was digitally filtered using 3 different 4-pole digital Butterworth filters with a double-pass filtering scheme. The band-pass cut-off frequencies were:

• from 0.08Hz to 4Hz;
• from 0.3Hz to 3Hz;
• from 0.3Hz to 4Hz.

The records in the database contain both the original and filtered signals. The records are in WFDB format. Each record consists of two files, a header file (.hea) containing information regarding the record and the data file (.dat) containing signal data.

The comment section in the header files (.hea) includes clinical information, such as:

• record number;
• pregnancy duration;
• gestation duration at the time of recording;
• maternal age;
• number of previous deliveries (parity);
• previous abortions;
• weight at the time of recording; *
• hypertension; *
• diabetes; *
• placental position; *
• bleeding first trimester; *
• bleeding second trimester; *
• funneling; *
• smoker. *

* These eight items of clinical information were added to the header files in August 2012. No other changes were made.
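
In WFDB header files, comment lines begin with `#`. A minimal sketch for pulling those lines out of a header's text is shown below. The example header content and its field name are hypothetical and only illustrate the mechanism; the exact layout of the clinical fields should be checked against an actual .hea file from the database.

```python
def header_comments(header_text):
    """Return the comment lines of a WFDB header, without the leading '#'."""
    comments = []
    for line in header_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            comments.append(stripped.lstrip("#").strip())
    return comments


# Hypothetical header text, for illustration only; the field name is invented.
example = """\
example_record 12 20 36000
# Comments:
# ExampleField 42
"""
print(header_comments(example))  # ['Comments:', 'ExampleField 42']
```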

The signal data in the data files (.dat) are in the following order:

• first channel (S1), unfiltered;
• first channel (S1), filtered using a 4-pole band-pass Butterworth filter from 0.08Hz to 4Hz;
• first channel (S1), filtered using a 4-pole band-pass Butterworth filter from 0.3Hz to 3Hz;
• first channel (S1), filtered using a 4-pole band-pass Butterworth filter from 0.3Hz to 4Hz;
• second channel (S2), unfiltered;
• second channel (S2), filtered using a 4-pole band-pass Butterworth filter from 0.08Hz to 4Hz;
• second channel (S2), filtered using a 4-pole band-pass Butterworth filter from 0.3Hz to 3Hz;
• second channel (S2), filtered using a 4-pole band-pass Butterworth filter from 0.3Hz to 4Hz;
• third channel (S3), unfiltered;
• third channel (S3), filtered using a 4-pole band-pass Butterworth filter from 0.08Hz to 4Hz;
• third channel (S3), filtered using a 4-pole band-pass Butterworth filter from 0.3Hz to 3Hz;
• third channel (S3), filtered using a 4-pole band-pass Butterworth filter from 0.3Hz to 4Hz.
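
The ordering above means each channel occupies four consecutive signals: the unfiltered one followed by the three filtered variants. The following sketch maps a channel number and pass band to a zero-based signal position. The names `FILTERS` and `signal_index` are illustrative and not part of the database or of any WFDB tool.

```python
# Signal variants stored per channel, in the order listed above.
# None stands for the unfiltered signal; tuples are (low, high) cut-offs in Hz.
FILTERS = (None, (0.08, 4.0), (0.3, 3.0), (0.3, 4.0))


def signal_index(channel, band=None):
    """Zero-based position of a signal within a record.

    channel: 1, 2 or 3 (S1, S2, S3).
    band:    None for the unfiltered signal, or one of the (low, high) tuples.
    """
    if channel not in (1, 2, 3):
        raise ValueError("channel must be 1, 2 or 3")
    # FILTERS.index raises ValueError for an unknown pass band.
    return (channel - 1) * len(FILTERS) + FILTERS.index(band)


print(signal_index(1))              # 0  (S1, unfiltered)
print(signal_index(2, (0.3, 3.0)))  # 6  (S2, 0.3Hz to 3Hz)
print(signal_index(3, (0.3, 4.0)))  # 11 (S3, 0.3Hz to 4Hz)
```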

When using filtered channels, note that the first and last 180 seconds of the signals should be ignored since these intervals contain transient effects of the filters.
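
At 20 samples per second, 180 seconds correspond to 3600 samples at each end. A minimal sketch of the trimming step, assuming the signal is already loaded as a Python sequence (the function name is illustrative):

```python
def trim_filter_transients(samples, fs=20, margin_s=180):
    """Drop the first and last margin_s seconds of a filtered signal."""
    margin = int(round(margin_s * fs))
    if len(samples) <= 2 * margin:
        raise ValueError("signal too short to trim")
    return samples[margin:len(samples) - margin]


# A nominal 30-minute signal (36000 samples) keeps 24 minutes (28800 samples).
signal = list(range(36000))
print(len(trim_filter_transients(signal)))  # 28800
```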

An accompanying file (tpehgdb.smr) summarizes clinical information of each record, describing whether the corresponding pregnancy ended on term (> 37 weeks) or prematurely (≤ 37 weeks), and whether the record was obtained before the 26th week of gestation or during or after the 26th week of gestation.

The columns in the tpehgdb.smr file represent:

• Record - the name of the record;
• Gestation - pregnancy duration (in weeks);
• Rec. time - gestation duration at the time of recording (in weeks);
• Group - record group according to gestation duration at the time of recording (<26 weeks, >=26 weeks) and pregnancy duration (PRE: pre-term, TERM: term);
• Premature - true (t), if delivery was premature (before 37 weeks of gestation); false (f), otherwise;
• Early - true (t), if the record was obtained before the 26th week of gestation; false (f), otherwise.
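
The two boolean columns follow directly from the two durations. The sketch below derives them using the thresholds from the record counts above (premature: pregnancy duration ≤ 37 weeks; early: recorded before week 26). The function name is illustrative, and the treatment of a delivery at exactly 37 weeks is an assumption based on the "≤ 37 weeks" wording.

```python
def classify(gestation_weeks, rec_time_weeks):
    """Return (premature, early) for one record.

    premature: pregnancy duration of at most 37 weeks.
    early:     record obtained before the 26th week of gestation.
    """
    premature = gestation_weeks <= 37
    early = rec_time_weeks < 26
    return premature, early


print(classify(38.5, 22.3))  # (False, True)
print(classify(35.0, 31.0))  # (True, False)
```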

The TPEHG DB as posted here was the database used during a study of separating uterine EMG records of term and pre-term delivery groups using various linear and non-linear signal processing techniques [3]. Please cite reference [3] if using the records of the TPEHG DB.

During the study comparing various linear and non-linear signal processing techniques for separating uterine EMG records of term and pre-term delivery groups [3], the following features, among others, were calculated for the records:

• The root mean square value for each filtered signal;
• The median frequency for the power spectrum of each filtered signal;
• The peak frequency for the power spectrum of each filtered signal;
• The sample entropy for each signal.

The feature values for all filtered signals of the records of the database are included in the feature values files (.fvl) of the database. For further details on how the features were calculated, see [3].
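
For orientation, the sketch below shows common textbook definitions of three of these features: the root mean square of a signal, and the peak and median frequency of an already computed power spectrum. These are assumptions for illustration and not necessarily the exact procedures of [3]; in particular, how the power spectrum was estimated is described there, not here.

```python
import math


def rms(samples):
    """Root mean square value of a non-empty signal."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def peak_frequency(freqs, power):
    """Frequency at which the power spectrum reaches its maximum."""
    i = max(range(len(power)), key=power.__getitem__)
    return freqs[i]


def median_frequency(freqs, power):
    """First frequency at which the cumulative power reaches half the total."""
    half = sum(power) / 2.0
    running = 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= half:
            return f
    return freqs[-1]


print(rms([3, -3, 3, -3]))                                   # 3.0
print(peak_frequency([0.5, 1.0, 1.5], [1, 5, 2]))            # 1.0
print(median_frequency([0.5, 1.0, 1.5, 2.0], [1, 1, 1, 1]))  # 1.0
```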

The .fvl files are organised according to the filter used:

• tpehgdb_features__filter_0.08_Hz-4.0_Hz.fvl - the features as calculated using the filter from 0.08Hz to 4Hz;
• tpehgdb_features__filter_0.3_Hz-3.0_Hz.fvl - the features as calculated using the filter from 0.3Hz to 3Hz;
• tpehgdb_features__filter_0.3_Hz-4.0_Hz.fvl - the features as calculated using the filter from 0.3Hz to 4Hz.

The columns in the .fvl files represent:

• Record - the name of the record;
• Chann - the channel number (1, 2 or 3);
• Gestation - pregnancy duration (in weeks);
• Rec. time - gestation duration at the time of recording (in weeks);
• Group - record group according to gestation duration at the time of recording (<26 weeks, >=26 weeks) and pregnancy duration (PRE: pre-term, TERM: term);
• RMS - the root mean square value of the signal;
• Fmed - the median frequency of the power spectrum;
• Fpeak - the peak frequency of the power spectrum;
• Samp. en. - the sample entropy of the signal, calculated at m=3 and r=0.15 (see [3]);
• Premature - true (t), if delivery was premature (before 37 weeks of gestation); false (f), otherwise;
• Early - true (t), if the record was obtained before the 26th week of gestation; false (f), otherwise.
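
To make the parameters m and r concrete, here is a minimal sketch of sample entropy in its usual form: the negative logarithm of the ratio between the number of matching template pairs of length m+1 and of length m, with matches judged by the maximum absolute difference. It treats r as an absolute tolerance on the signal as given; whether [3] scales r by the signal's standard deviation or normalizes the signal first is not stated here, so that part is an assumption.

```python
import math


def sample_entropy(samples, m=3, r=0.15):
    """Sample entropy with template length m and absolute tolerance r."""
    n = len(samples)
    if n <= m + 1:
        raise ValueError("signal too short for the chosen m")

    def count_matches(length):
        # The same number of templates (n - m) is used for both lengths.
        templates = [samples[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # self-matches excluded
                distance = max(abs(a - b)
                               for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined in the strict sense; no matches found
    return -math.log(a / b)


# A perfectly periodic signal is fully predictable, so its sample entropy is 0.
print(sample_entropy([0, 1] * 10) == 0)  # True
```

This direct implementation compares every pair of templates, so its cost grows quadratically with the signal length; it is meant for illustration rather than for processing full 30-minute records.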

### Release Info

Version 1.0.0 was released in November 2010. Eight items of clinical information were added to the header files in August 2012. No other changes were made.

### Contact

Franc Jager
Laboratory of Biomedical Computer Systems and Imaging
University of Ljubljana
Faculty of Computer and Information Science
1000 Ljubljana, Slovenia
email: franc.jager@fri.uni-lj.si

### References

1. Ivan Verdenik. Multilayer prediction model for preterm delivery. PhD thesis. University of Ljubljana, Medical faculty, Ljubljana, 2002.
2. Gorazd Kavšek. Electromyographic activity of the uterus in threatened preterm delivery. MSc thesis. University of Ljubljana, Medical faculty, Ljubljana, 2001.
3. Gašper Fele-Žorž, Gorazd Kavšek, Živa Novak-Antolič and Franc Jager. A comparison of various linear and non-linear signal processing techniques to separate uterine EMG records of term and pre-term delivery groups. Medical & Biological Engineering & Computing, 46(9):911-922 (2008).

