Background & Summary

According to the World Health Organization (WHO), preterm birth, or premature birth, is defined as the live birth of a baby before 37 weeks of pregnancy are completed1. The WHO estimates that about 15 million babies are born prematurely each year, i.e., preterm birth occurs in about 10 percent of all pregnancies. Besides medically indicated or induced preterm birth, and preterm premature rupture of membranes2, other pathological processes might be responsible for initiating preterm labor, such as intrauterine infection or inflammation, burst blood vessels, uterine ischemia, and uterine over-distention2, as well as other risk factors, such as hypertension, diabetes, conization, uterine abnormalities, alcohol and drug use, smoking, and lifestyle3.

Despite exhaustive research, accurate prediction of preterm birth based on these factors remains far from certain. One promising diagnostic tool for better prediction of preterm birth, weeks or even months before delivery, is a low-cost, fully or semi-automated analysis of the uterine electromyogram recorded from the abdominal wall of a pregnant woman, also termed the electrohysterogram (EHG). The mechanical uterine contractions present during pregnancy, which are of central importance for diagnosing labor, are the result of discontinuous bursts of action potentials. The EHG records capture these measurable changes in the electrical potentials of the uterus, thus allowing efficient non-invasive quantitative assessment of the contractions4,5,6,7,8,9,10. Studies applying different analysis methods have shown that EHG records contain sufficient information to diagnose labor more accurately than traditional clinical methods6,9,11,12, and provide adequate data to predict preterm labor8,9,11,12,13,14.

The appearance of the publicly available Term-Preterm EHG DataBase (TPEHG DB)15,16 (https://physionet.org/content/tpehgdb/) in 2011, containing 300 preterm and term spontaneous EHG records (see Table 1), recorded early (around the 23rd week) or later (around the 31st week) during pregnancy, allowed in-depth studies of non-linear signal processing techniques and machine learning approaches for accurate classification between entire preterm and term EHG records with the goal of predicting preterm birth. Because the sets of preterm and term EHG records are highly imbalanced (38 versus 262), researchers used a synthesis-partition over-sampling approach, in which synthetic samples are generated with the SMOTE17 or ADASYN18 algorithm before the data are partitioned, in order to balance the sets. Using this over-sampling approach, a number of studies using the TPEHG DB have reported near-perfect results in distinguishing between preterm and term EHG records19,20,21,22,23,24,25. In 2021, an important study26 revealed that over-sampling has to be applied after data partitioning, i.e., a partition-synthesis over-sampling approach is needed, to achieve realistic classification performance and realistic preterm birth prediction in the case of imbalanced sets. Recently, many studies related to preterm birth prediction using the TPEHG DB have been published using traditional feature engineering27,28,29,30,31,32,33 and deep learning34,35,36,37 approaches. A review of the literature dealing with the use of EHG records for predicting premature birth and for understanding the underlying physiological processes during pregnancy can be found in ref. 38.
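The difference between the two over-sampling orders can be illustrated with a minimal sketch. It is not the SMOTE or ADASYN algorithm: synthetic samples are simply interpolated between random pairs of minority samples (real SMOTE interpolates toward nearest neighbors), and the two-feature toy data, the function names, and the 20% test fraction are assumptions made for illustration only. Only the class sizes (262 term, 38 preterm) come from the text.

```python
import random


def interpolate_minority(samples, n_new, rng):
    """Create n_new synthetic points on segments between random minority pairs.

    Simplified stand-in for SMOTE, which would pick among nearest neighbors.
    """
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(samples, 2)      # two distinct minority samples
        t = rng.random()                   # position along the segment a -> b
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic


def partition_then_oversample(term, preterm, test_fraction, seed=0):
    """Partition-synthesis: split first, then balance the training part only."""
    rng = random.Random(seed)
    term, preterm = term[:], preterm[:]
    rng.shuffle(term)
    rng.shuffle(preterm)
    n_term_test = int(len(term) * test_fraction)
    n_pre_test = int(len(preterm) * test_fraction)
    test = {"term": term[:n_term_test], "preterm": preterm[:n_pre_test]}
    train_term = term[n_term_test:]
    train_pre = preterm[n_pre_test:]
    # Synthetic preterm samples are derived from training samples only,
    # so no information from the test set leaks into the training set.
    n_missing = len(train_term) - len(train_pre)
    synthetic = interpolate_minority(train_pre, n_missing, rng)
    train = {"term": train_term, "preterm": train_pre + synthetic}
    return train, test


# Hypothetical two-feature toy data with the TPEHG DB class sizes.
data_rng = random.Random(1)
term = [[data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)] for _ in range(262)]
preterm = [[data_rng.gauss(1.0, 1.0), data_rng.gauss(1.0, 1.0)] for _ in range(38)]

train, test = partition_then_oversample(term, preterm, test_fraction=0.2)
print(len(train["term"]), len(train["preterm"]),
      len(test["term"]), len(test["preterm"]))
```

In the synthesis-partition order, by contrast, the interpolation would run on all 38 preterm samples before the split, so test samples could lie on segments between training samples, which is one plausible reason for the overly optimistic results mentioned above.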

Table 1 Comparison of publicly available databases/datasets containing EHG records for EHG research, discrimination between pregnancy and labor EHG records, discrimination between preterm and term EHG records, and for predicting preterm birth.

Another important publicly available EHG database, the Icelandic 16-electrode Electrohysterogram Database16,39 (https://physionet.org/content/ehgdb/) published in 2015, containing 122 EHG records (45 pregnancies), recorded during the 3rd trimester of pregnancy or during labor (see Table 1), and with human-expert annotated contractions, allowed distinguishing between pregnancy and labor. Several excellent studies have been published using this database. The studies were dedicated to discrimination40 and classification41 between pregnancy and labor groups of records, understanding human uterine electrical propagation42, analysis of uterine synchronization43, attenuation of maternal respiration signal44, automatic uterine contraction detection45 and contraction clustering46, recognizing uterine contractions using convolutional neural networks47, and to detection of preterm birth48.

A study dedicated to characterization of contraction and non-contraction (dummy) intervals of uterine EHG records, and automatic classification of preterm and term spontaneous EHG records23, yielded another publicly available dataset, i.e., Term-Preterm EHG DataSet with Tocogram (TPEHGT DS)16,23 (https://physionet.org/content/tpehgt/). The TPEHGT DS was published in 2018 and contains 13 preterm and 13 term spontaneous EHG records (see Table 1), recorded later (around the 31st week), with simultaneously recorded tocogram (TOCO) signal (measuring mechanical uterine activity obtained by external tocodynamometer), and with human-expert annotated contraction and non-contraction (dummy) intervals. Using this dataset, several novel approaches for classification of preterm versus term births were reported. These include the use of entire EHG records49,50, or individual contraction and/or dummy intervals23,51,52,53.

Unfortunately, none of these databases/datasets provides a sufficient number of EHG records for reliable assessment of the accuracy of predicting imminent preterm birth. Merging EHG records from different databases/datasets may be questionable due to differences in signal acquisition protocols. Since the EHG records of the TPEHG DB and TPEHGT DS were acquired under the same acquisition protocol and using the same recording device, several authors merged the EHG records from these two database/dataset in studies using traditional feature engineering24,25,32,33 and deep learning37 approaches.

Existing methods that use uterine EHG records for predicting preterm birth are based solely on classification between EHG records of pregnancies that ended in preterm spontaneous or term spontaneous delivery, and do not take into account other delivery modes such as induced and cesarean section delivery. A robust and realistic approach for accurate prediction of preterm birth based on the analysis of EHG records should also take into account the characteristics of EHG records of term induced and term cesarean section deliveries. Moreover, in the last 10 years the number of induced and cesarean deliveries has increased even where there is no apparent medical reason54,55. The latest related studies using EHG records of induced or cesarean section delivery modes focus on characterization of antepartum, labor, and post-partum records56, characterization of bursts in late antepartum records of pregnant women with complete placenta previa57, differentiation between term spontaneous labor and induced late-term labor58, prediction of labor induction success in the first hours after induction of labor59, prediction of cesarean section and spontaneous vaginal delivery modes60, and prediction of uterine atony after spontaneous or cesarean deliveries using post-partum EHG records61. However, these delivery-mode prediction methods relied on antepartum or labor EHG records recorded after the 37th week of gestation and considered neither EHG records recorded earlier (before the 37th week) nor preterm birth prediction.

For these reasons, we decided to build a new EHG dataset with term induced, cesarean, and induced and cesarean EHG records, i.e., the Induced Cesarean EHG DataSet (ICEHG DS)62, under the same acquisition protocol as was used for obtaining the uterine EHG records of our previously developed TPEHG DB and TPEHGT DS. The ICEHG DS contains 126 EHG records (91 pregnancies), recorded early (around the 23rd week) and/or later (around the 31st week) during pregnancy (see Table 1), ending in induced, cesarean, or induced and cesarean delivery. The publicly available ICEHG DS62 will allow researchers to conduct further studies in order to answer the following important questions: (1) Can the induced and cesarean section delivery modes be predicted early, already in the 23rd or in the 31st week of pregnancy? and (2) Can the characteristics of the EHG records of induced and cesarean section delivery modes influence the understanding of the underlying mechanisms involved during pregnancy and, more importantly, the understanding of the mechanisms responsible for preterm birth? Moreover, the ICEHG DS, used alongside the TPEHG DB and TPEHGT DS, will provide a robust and more realistic assessment of the performance of automated preterm birth prediction. To address some of these questions, the EHG records of the ICEHG DS, alongside the EHG records of the TPEHG DB and TPEHGT DS, have already been used successfully in one of our studies63. Characterization and separation of all later recorded preterm and term spontaneous, induced, cesarean, and induced and cesarean groups of EHG records of these three database/datasets showed that the peak amplitude of the normalized power spectra of EHG signals in the frequency band 0.125–0.575 Hz (which approximately matches the Fast Wave Low band) efficiently separates the later preterm group from all other later term delivery groups taken together (p = 2.5·10⁻⁸), and from each of the other later term delivery groups individually (p ≤ 4.0·10⁻³)63.

In summary, the areas of EHG research which have the potential to benefit from this new ICEHG DS are the following: (1) Development of efficient automated methods for pregnancy monitoring via visualization of electrical uterine activity in the time and/or frequency domain; (2) Characterization and understanding of the physiological mechanisms involved during pregnancy that lead to induced and/or cesarean section delivery modes; (3) Development of non-invasive automated methods for prediction of induced and/or cesarean section delivery modes; (4) Mathematical modeling of electrical uterine activity; and (5) Identification of simple and efficient EHG biomarkers for predicting pregnancy outcome. In this paper, we provide a detailed description of the ICEHG DS.

Methods

Data collecting

In the period from 1997 to 2006, a large number of uterine EHG records (a total of 1,211) were collected at the Clinical Department of Perinatology, University Medical Center Ljubljana, Ljubljana, Slovenia. Records were collected from the general population during routine checkups, and from patients admitted to the hospital with the diagnosis of impending preterm labor. The records were collected either early, around the 23rd week of gestation (early records), and/or later in the pregnancy, around the 31st week of gestation (later records). The 23rd and 31st weeks were chosen for the following reasons: the period from the 22nd to the 24th week of pregnancy (or the end of the second trimester) is an estimated border at which termination of a pregnancy is considered either an abortion or a delivery (i.e., extreme preterm delivery), while the 31st week of pregnancy (within the third trimester) is an estimated border after which a newborn can survive outside the uterus. (Note that these borders differ from country to country.) It was expected that characterization of the EHG records collected at these two milestones during pregnancy would provide valuable insight into changes of the physiological mechanisms involved over the course of pregnancy. From this entire pool of uterine EHG records, in 2011 and in 2018, we developed the TPEHG DB15,16 (https://physionet.org/content/tpehgdb/) and TPEHGT DS16,23 (https://physionet.org/content/tpehgt/), respectively, and made them publicly available in the PhysioNet repository. At those times, we were interested only in EHG records of pregnancies with spontaneous preterm and spontaneous term delivery, and not in those ending in induced or cesarean section delivery. The availability of the TPEHG DB and TPEHGT DS resulted in a large number of valuable studies dedicated to predicting preterm birth, as outlined in the Background & Summary section.

The EHG records selected for the dataset described in this paper, i.e., the Induced Cesarean EHG DataSet (ICEHG DS), also come from the pool of EHG records collected between 1997 and 2006. The collection of the uterine EHG records was approved by the National Medical Ethics Committee of the Republic of Slovenia (No. 32/01/97). All women gave their written signed consent for the EHG data to be shared in a repository. The records selected for the ICEHG DS are those collected early and/or later for pregnancies which were expected to progress normally toward the spontaneous start of labor and vaginal term delivery, but ultimately ended either in term vaginal delivery for which labor failed to start spontaneously and had to be induced (induced records), in term delivery by emergency cesarean section without prior induction of labor (cesarean records), or in term delivery by emergency cesarean section after a failed induction (induced-cesarean records). The ICEHG DS is stored in the PhysioNet repository62.

Recording protocol

The recording protocol and the recording equipment were the same as those used for collecting the records of the TPEHG DB15,16 and TPEHGT DS16,23. The recording equipment consisted of a custom-made physiological signal measurement device (conforming to the required ISO standards) connected to a personal computer with an integrated eight-channel A/D converter. The records were collected from the abdominal surface using four Ag/AgCl electrodes. The electrodes were placed symmetrically above and under the navel, at a distance of 7 cm (see Fig. 1). The reference electrode was attached to the woman’s left thigh. Prior to the attachment of the electrodes, the corresponding area of 12 × 12 cm was cleaned using acetone and ether. The precise electrode attachment positions were determined by an electrode attachment model made for this purpose. In order to lower the resistance between the electrodes, the electrode attachment positions were additionally cleaned. The four surface electrodes, with a contact area of 20 mm², were coated with contact conducting gel (electrode gel). In order to improve the quality of the measurements, a special protocol was used64. According to this protocol, the measured resistance between each pair of electrodes had to be lower than 20 kΩ. If this requirement was not met, the electrode attachment procedure was repeated.

Fig. 1 Positions of electrodes. The electrodes were placed symmetrically above and under the navel in two rows spaced at a distance of 7 cm (ref. 23).

The acquired EHG records are approximately 30 minutes long and consist of three bipolar EHG signals (Fig. 1). The first bipolar EHG signal was measured between the upper two electrodes, S1 = E2-E1, the second between the left two electrodes, S2 = E2-E3, and the third between the lower two electrodes, S3 = E4-E3. Prior to sampling, the signals were filtered using an analog anti-aliasing low-pass three-pole Butterworth filter with a cut-off frequency of 5.0 Hz. The sampling frequency, FS, was 20 Hz. The resolution of the signal acquisition equipment was 16 bits with an amplitude range of ±2.5 mV (an A/D value of 13107 units corresponds to 1.0 mV). The sampled signals were stored on the personal computer hard disk in real time in ASCII files, while general data about the records were stored in separate record-configuration ASCII files. No annotations of the records were provided during recording.
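The acquisition parameters above can be turned into a few conversion helpers. The following is a minimal sketch; the constant and function names are hypothetical, and a record length of exactly 30 minutes is assumed although the text states that the records are only approximately that long.

```python
FS_HZ = 20.0            # sampling frequency stated in the text
UNITS_PER_MV = 13107    # A/D units corresponding to 1.0 mV
ADC_BITS = 16           # resolution of the acquisition equipment


def adc_to_millivolts(raw_units):
    """Convert raw A/D integer values to millivolts."""
    return [u / UNITS_PER_MV for u in raw_units]


def sample_times(n_samples):
    """Time stamps in seconds for n_samples consecutive samples."""
    return [i / FS_HZ for i in range(n_samples)]


# Amplitude of one A/D step in microvolts (1 mV / 13107 units).
lsb_microvolts = 1000.0 / UNITS_PER_MV

# Largest positive value of a signed 16-bit converter, expressed in mV;
# it should agree with the stated ±2.5 mV amplitude range.
full_scale_mv = (2 ** (ADC_BITS - 1) - 1) / UNITS_PER_MV

# Nominal number of samples per signal, assuming exactly 30 minutes.
samples_per_record = int(30 * 60 * FS_HZ)

print(round(lsb_microvolts, 4), round(full_scale_mv, 4), samples_per_record)
```

The full-scale value of a signed 16-bit converter at 13107 units per mV comes out at almost exactly 2.5 mV, which is consistent with the stated amplitude range.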

A record ID was assigned to each EHG record. Information on the recording time (the week of pregnancy at the time of recording) and the accompanying clinical information (age, weight, placental position, and height) were noted and stored in an .xlsx table, and added to the corresponding record-configuration ASCII file, for each participating woman. Moreover, at the time of delivery, the following data were added to each record-configuration ASCII file: type of delivery (induced, cesarean, or induced-cesarean), gestation age, newborn weight, and the ID of the pair record if EHG records were collected both early and later during pregnancy.

Data processing

In order to provide an additional version of the EHG signals without extremely slow signal drifts (the analog anti-aliasing filter passed frequencies from 0.0 to 5.0 Hz), the original EHG signals stored in the ASCII files were filtered using a four-pole digital band-pass Butterworth filter with cut-off frequencies at 0.08 Hz and 5.0 Hz, applied bidirectionally to cancel the non-linear phase shift of the filter. (The Butterworth filter was selected because its transfer characteristic has no ripple in the pass- and stop-bands.) This processing was performed in MATLAB using the readmatrix, butter, filter, flip, and plot functions. The filtered signals were then added to the original ASCII signal files of the records. An example of the EHG signals prior to and after this filtering for a selected record of the ICEHG DS is shown in Fig. 2. After that, the ASCII signal files containing the original and filtered EHG signals, and the contents of the record-configuration ASCII files of the participants, were converted into the WFDB (WaveForm DataBase Software Package) record format (https://www.physionet.org/content/wfdb/) using the wrsamp WFDB application to produce binary signal (.dat) files containing the original and filtered EHG signals, and ASCII header (.hea) files. The .hea files contain the general data of the records and, in the comments section, the accompanying clinical information of the participants. No further processing was performed beyond this conversion. Further processing of the records is expected to be performed by the users of the dataset according to their research aims.
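The bidirectional filtering step can be approximated in Python with SciPy, mirroring the filter-flip-filter-flip sequence suggested by the MATLAB functions listed above. This is a sketch under stated assumptions, not the authors' script: the exact MATLAB arguments are not given in the text, and passing order 2 to `butter` for a band-pass design (which yields four poles) is only one reading of "four-pole". The synthetic test signal and the function name are hypothetical.

```python
import numpy as np
from scipy.signal import butter, lfilter

FS_HZ = 20.0                 # sampling frequency from the recording protocol
LOW_HZ, HIGH_HZ = 0.08, 5.0  # cut-off frequencies stated in the text


def bandpass_bidirectional(x, order=2):
    """Band-pass x forward and then backward so that the phase shifts cancel.

    Assumption: order=2 with btype="bandpass" gives a four-pole filter.
    """
    b, a = butter(order, [LOW_HZ, HIGH_HZ], btype="bandpass", fs=FS_HZ)
    forward = lfilter(b, a, x)               # first pass, forward in time
    backward = lfilter(b, a, forward[::-1])  # second pass on the flipped signal
    return backward[::-1]                    # flip back to the original order


# Synthetic 30-minute signal: slow drift plus a 1 Hz component.
t = np.arange(36000) / FS_HZ
drift = 0.5 + 0.3 * np.sin(2 * np.pi * 0.005 * t)
activity = np.sin(2 * np.pi * 1.0 * t)
filtered = bandpass_bidirectional(drift + activity)

# Compare away from the edges, where filter transients may be present.
edge = int(150 * FS_HZ)
residual = filtered[edge:-edge] - activity[edge:-edge]
print(float(np.max(np.abs(residual))))
```

Away from the record edges, the drift is removed while the 1 Hz component passes essentially unchanged in amplitude and phase, which is the purpose of applying the filter in both directions.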

Fig. 2 The electrohysterogram (EHG) signals of the record icehg1185 (induced, delivery in the 41st week, recorded early in the 22nd week of pregnancy). Black: original signals, blue: filtered signals. Signal samples of the first and last 150 seconds of the filtered signals are set to zero.

Data Records

The ICEHG DS is available in the PhysioNet repository62. Altogether, the dataset contains 126 three-signal 30-minute surface EHG records from 91 pregnancies, recorded early, around the 23rd week (62 records), and later, around the 31st week (64 records), of pregnancy. More precisely, the dataset includes 38 early and 43 later induced EHG records of 59 pregnancies (see Table 2); 11 early and 8 later cesarean EHG records of 13 pregnancies (see Table 3); and 13 early and 13 later induced-cesarean EHG records of 19 pregnancies (see Table 4). The mean times of gestation in weeks were 39.8 ± 1.4 for induced, 39.7 ± 1.1 for cesarean, and 39.4 ± 0.9 for induced-cesarean records.

Table 2 General data and accompanied clinical information (the contents of the .hea files of the EHG records of the ICEHG DS) of the participants with pregnancies ending in induced delivery.
Table 3 General data and accompanied clinical information (the contents of the .hea files of the EHG records of the ICEHG DS) of the participants with pregnancies ending in cesarean delivery.
Table 4 General data and accompanied clinical information (the contents of the .hea files of the EHG records of the ICEHG DS) of the participants with pregnancies ending in induced-cesarean delivery.

The records are named icehgXXX[X], where XXX[X] represents the record ID. The entire list of records is contained in the ASCII file named RECORDS. The records are stored in sub-directories according to the period of recording and the delivery mode, i.e., early_induced, early_cesarean, early_induced-cesarean, later_induced, later_cesarean, and later_induced-cesarean. The lists of records for each group of EHG records per delivery mode (induced, cesarean, or induced-cesarean) are contained in the accompanying ASCII files named RECORDS_induced, RECORDS_cesarean, and RECORDS_induced-cesarean. Each row in these three files corresponds to a pregnancy and contains the name of the early and/or the name of the later EHG record of the pregnancy. Considering the columns of these three files, the first column contains the names of all early EHG records, and the second column the names of all later EHG records, for the given delivery mode. As an example, the content of the RECORDS_cesarean file is shown and described in Fig. 3.

Fig. 3 The contents of the RECORDS_cesarean file. Each row corresponds to a pregnancy ending in cesarean section and contains the name (and sub-directory name) of the early and/or the name (and sub-directory name) of the later EHG record of the pregnancy. Columns correspond to all early, and all later, cesarean EHG records. A zero indicates that no early, or later, EHG record for that pregnancy exists.

Each EHG record is composed of the following three files:

  • a figure (icehgXXX[X]_fltrd.jpg) showing the three original EHG signals and their filtered versions (an example is shown in Fig. 2);

  • a binary signal (icehgXXX[X].dat) file containing the three original EHG signals (S1, S2, and S3) and their filtered versions;

  • a header (icehgXXX[X].hea) ASCII file containing the general data of the record and accompanied clinical information of the participant.

The signal data in the .dat data files are in the following order:

  • original, unfiltered, signal S1;

  • filtered signal S1 using a four-pole band-pass Butterworth filter from 0.08 Hz to 5.0 Hz applied bidirectionally;

  • original, unfiltered, signal S2;

  • filtered signal S2 using a four-pole band-pass Butterworth filter from 0.08 Hz to 5.0 Hz applied bidirectionally;

  • original, unfiltered, signal S3;

  • filtered signal S3 using a four-pole band-pass Butterworth filter from 0.08 Hz to 5.0 Hz applied bidirectionally.
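The interleaved order of the six signals listed above can be captured in a small mapping. The sketch below assumes that the samples have already been read into rows of six values each (for example with one of the WFDB tools); the label strings and function names are hypothetical and need not match the signal labels stored in the .hea files.

```python
# Column order of the six signals in each .dat file, as listed above.
# The label strings are illustrative, not the labels stored in the .hea files.
SIGNAL_ORDER = [
    "S1", "S1_filtered",
    "S2", "S2_filtered",
    "S3", "S3_filtered",
]


def split_channels(rows):
    """Turn rows of six samples (in .dat order) into a dict of named signals."""
    columns = list(zip(*rows))
    if len(columns) != len(SIGNAL_ORDER):
        raise ValueError("expected six signals per sample row")
    return {name: list(col) for name, col in zip(SIGNAL_ORDER, columns)}


def original_signals(channels):
    """Keep only the unfiltered signals S1, S2, and S3."""
    return {k: v for k, v in channels.items() if not k.endswith("_filtered")}


# Two made-up sample rows, used only to show the column layout.
rows = [
    [10, 1, 20, 2, 30, 3],
    [11, 0, 21, 1, 31, 2],
]
channels = split_channels(rows)
print(channels["S2"], channels["S2_filtered"])
```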

The topmost part of each .hea header file contains the general data of the EHG record, including: record name, sampling frequency, length of the record in samples, list of signals with their specifications according to the WFDB format, calibration constants, and signal labels. The rest of the header file is the comments section containing the accompanying clinical information of the participant. An example of the comments section of the .hea files for a selected record is shown in Fig. 4. This comments section contains the following information: record ID, type of delivery (Induced, Cesarean, or Induced-cesarean), gestation age in weeks, recording time in weeks, age of the participant in years, weight at the recording time in kg, placental position (front/end), height of the participant in cm, newborn weight in g, and ID of the pair record (if records were collected both early and later during pregnancy). Tables 2–4 contain the general data and accompanying clinical information (the contents of the comments section of the .hea files) of the participants whose pregnancies ended in induced (Table 2), cesarean (Table 3), and induced-cesarean (Table 4) delivery.

Fig. 4 The comments section of the icehg1185.hea file of the record icehg1185 (induced, delivery in the 41st week, recorded early in the 22nd week of pregnancy, the pair record, recorded later, is icehg1184).

The records of the ICEHG DS are also available in MATLAB format (.mat signal and .hea header files) in the icehgdsmat sub-directory.

Technical Validation

The EHG records were recorded in a clinical environment. A researcher stayed with the participants throughout the recording. There were no limitations for the participants regarding talking or changing position. The only request was not to make fast movements. The researcher continuously monitored the signals and the equipment and periodically checked the attachment of the electrodes. If the signal traces seemed to be very noisy (spikes, sudden step changes, signal bursts not related to contractions), all the electrode connections were verified and improved. During recording, the researcher did not record or annotate contractions experienced by the participant, fetal movements, other contraction-like electrical activities (signal bursts), noise due to movements of the participant (movement artifacts) such as spikes and sudden step changes, or other noise due to, e.g., smiling or coughing. Unfortunately, severe noise and artifacts appeared in some records; therefore, not all EHG records were usable for the final dataset.

The EHG records of the final dataset were carefully selected from the pool of EHG records of pregnancies that ended in induced, cesarean, or induced and cesarean delivery. For the final selection of the records, the original EHG signals, and the filtered versions of the original EHG signals (using a four-pole digital band-pass Butterworth filter with cut-off frequencies at 0.08 Hz and 5.0 Hz, applied bidirectionally), were thoroughly checked, i.e., visually inspected in the time domain for signal quality. Only those EHG records with relatively clean signals were included in the final ICEHG DS. EHG records showing loss of signal, extreme spikes, sudden step changes, or severe noise (bursts) of unreasonably high amplitude and duration, in their signals or in the filtered versions of the signals, were rejected. An example of clean EHG signals (no loss of signal, no spikes, no sudden step changes, and no severe noise) of a selected EHG record is shown in Fig. 2.

The technique of visual inspection of signals used in this study for selecting the final EHG records of the ICEHG DS is the same as that used for selecting the EHG records of our previously developed TPEHG DB and TPEHGT DS. Both the TPEHG DB and the TPEHGT DS have already been used successfully by many research groups, and resulted in a large number of valuable studies (see the Background & Summary section). Moreover, the EHG records of the ICEHG DS have already been used successfully in one of our studies63. Therefore, we conclude that the technique of visual inspection of signals is reliable.

Usage Notes

The ICEHG DS is intended for studying the physiological mechanisms involved during pregnancy that lead to induction, cesarean section, or both. Characterization and separation of the EHG records of the ICEHG DS can help answer the question of whether the induced and cesarean section delivery modes can be predicted early, already in the 23rd week, or later, in the 31st week of pregnancy.

Moreover, the ICEHG DS is intended to provide a more realistic pool of EHG records ending in term delivery. Since the same acquisition protocol and the same recording device were used for the TPEHG DB15,16 (https://physionet.org/content/tpehgdb/), the TPEHGT DS16,23 (https://physionet.org/content/tpehgt/), and the ICEHG DS62, the EHG records of the ICEHG DS can be used alongside the term spontaneous, early and later, EHG records of the TPEHG DB and TPEHGT DS to better understand the underlying physiological mechanisms leading to different kinds of term delivery modes. Furthermore, adding the preterm spontaneous, early and later, EHG records of the TPEHG DB and TPEHGT DS can help answer the question of how the characteristics of EHG records of the induced and cesarean section delivery modes influence the understanding of the underlying mechanisms responsible for preterm birth. Such a composed pool of EHG records from all three database/datasets will provide a robust and more realistic evaluation of non-invasive automatic or semi-automatic methods for predicting preterm birth. (Note that the EHG records of these three database/datasets contain the same three bipolar signals S1, S2, and S3.)

Table 5 summarizes the numbers of uterine EHG records contained in the ICEHG DS, and in our previously developed TPEHG DB and TPEHGT DS, with regard to the period of recording (early, later) and delivery mode. The percentages of EHG records in the three EHG database/datasets per delivery mode match the estimated percentages of types of deliveries in the real world only to a certain degree. According to the WHO1, about 10 percent of all pregnancies end in preterm birth. The three EHG database/datasets together contain 11.3% preterm spontaneous EHG records. According to NHS Maternity Statistics65, spontaneous delivery is the most common type of delivery, and its share decreased from 66% to 47% in the period from 2011/12 to 2021/22. The three EHG database/datasets contain 72.1% preterm spontaneous and term spontaneous EHG records. Furthermore, according to NHS Maternity Statistics65, induced deliveries increased from 22% to 33%, and cesarean deliveries from 12% to 20%, in the period from 2011/12 to 2021/22. The three EHG database/datasets contain 17.9% induced, 4.2% cesarean, and 5.8% induced and cesarean EHG records.
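The percentages quoted above can be reproduced from the record counts given earlier in the text (38 preterm and 262 term records in the TPEHG DB, 13 and 13 in the TPEHGT DS, and the early and later counts of the ICEHG DS). The following short check is a sketch with hypothetical variable names; it assumes that the three database/datasets share no records.

```python
# Record counts per delivery mode, taken from the figures quoted in the text.
counts = {
    "preterm spontaneous": 38 + 13,   # TPEHG DB + TPEHGT DS
    "term spontaneous": 262 + 13,     # TPEHG DB + TPEHGT DS
    "induced": 38 + 43,               # ICEHG DS, early + later
    "cesarean": 11 + 8,               # ICEHG DS, early + later
    "induced-cesarean": 13 + 13,      # ICEHG DS, early + later
}

total = sum(counts.values())

# Share of each delivery mode in percent, rounded to one decimal place.
shares = {mode: round(100.0 * n / total, 1) for mode, n in counts.items()}

# Combined share of all spontaneous (preterm and term) records.
spontaneous = counts["preterm spontaneous"] + counts["term spontaneous"]
spontaneous_share = round(100.0 * spontaneous / total, 1)

for mode, share in shares.items():
    print(f"{mode}: {share}%")
print(f"all spontaneous: {spontaneous_share}% of {total} records")
```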

Table 5 The numbers of uterine EHG records contained in the TPEHG DB, TPEHGT DS, and ICEHG DS.

In the EHG records of the ICEHG DS three bipolar original EHG signals are stored. The first, S1, was measured between the upper two electrodes (see Fig. 1), the second, S2, between the left two electrodes, and the third, S3, between the lower two electrodes. Signal S1 and signal S3 estimate the uterine electrical activity in the horizontal direction, while signal S2 in the vertical direction. In order to better characterize the electrical activity in the vertical direction, the users of the ICEHG DS (and of the TPEHG DB and TPEHGT DS) may synthetically derive the fourth bipolar signal, S4, to estimate the uterine electrical activity in the vertical direction between the right two electrodes E4 and E1, S4 = E4 - E1. Since S1 = E2 - E1, S2 = E2 - E3, and S3 = E4 - E3, and using E4 = S3 + E3 and E1 = E2 - S1, it follows that:

$$S4=E4-E1=S3+E3-E2+S1=S1-S2+S3. \tag{1}$$

Similarly, bipolar signals estimating the uterine electrical activity in both diagonal directions, between the electrodes E2 and E4, and between the electrodes E1 and E3, can be derived.
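A minimal sketch of these derivations follows. The expression for S4 is Eq. (1); the two diagonal expressions follow from the same definitions, since E2 - E4 = (E2 - E3) - (E4 - E3) = S2 - S3 and E1 - E3 = (E2 - E3) - (E2 - E1) = S2 - S1. The function name and the polarity chosen for the diagonal signals are assumptions, and the electrode potentials in the usage part are made-up integers used only to check the algebra.

```python
def derive_signals(s1, s2, s3):
    """Derive additional bipolar signals from S1, S2, and S3.

    Returns (s4, d24, d13), where
      s4  = E4 - E1 = S1 - S2 + S3   (Eq. 1, right vertical pair)
      d24 = E2 - E4 = S2 - S3        (diagonal, polarity chosen here)
      d13 = E1 - E3 = S2 - S1        (diagonal, polarity chosen here)
    """
    s4 = [a - b + c for a, b, c in zip(s1, s2, s3)]
    d24 = [b - c for b, c in zip(s2, s3)]
    d13 = [b - a for a, b in zip(s1, s2)]
    return s4, d24, d13


# Made-up unipolar potentials at the four electrodes (arbitrary units).
e1 = [1, 2, 3]
e2 = [4, 0, -2]
e3 = [7, 5, 1]
e4 = [-3, 6, 2]

# Bipolar signals as defined in the recording protocol.
s1 = [b - a for a, b in zip(e1, e2)]   # S1 = E2 - E1
s2 = [b - c for b, c in zip(e2, e3)]   # S2 = E2 - E3
s3 = [d - c for c, d in zip(e3, e4)]   # S3 = E4 - E3

s4, d24, d13 = derive_signals(s1, s2, s3)
print(s4, d24, d13)
```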

The discrepancies between synthetically derived signal, S4, and the signal as it would be actually measured between the right two electrodes E4 and E1, are negligible. The estimated discrepancy between the calculated, S4, and actually measured signal, i.e., the standard deviation between the samples of these two signals (calculated throughout the 30-minute signals), is less than the difference between two adjacent integer values (0.076 μV) of the signal amplitudes (1 mV/13107 = 0.076 μV, where 13107 is the calibration constant relating to 1 mV), and close to the quantization error (0.038 μV) of the A/D converter.

The unipolar EHG signals, as measured at the electrodes E1, E2, E3, and E4, are not stored in the EHG records, nor is it possible to synthetically derive them from the bipolar signals S1, S2, and S3.

It is a good idea to set the values of the signal samples of the first and last 150 seconds of the filtered versions of the signals to zero (or to reject them) due to the transient effects of the Butterworth filter, which was applied bidirectionally for filtering the signals.
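This recommendation can be sketched as a small helper. The function name and its default arguments are hypothetical; the 20 Hz sampling frequency and the 150-second margin are taken from the text.

```python
def zero_edges(signal, fs_hz=20.0, edge_seconds=150.0):
    """Return a copy of signal with the first and last edge_seconds set to zero."""
    n_edge = int(round(edge_seconds * fs_hz))
    if 2 * n_edge >= len(signal):
        # Record shorter than the two margins: nothing usable remains.
        return [0.0] * len(signal)
    middle = list(signal[n_edge:-n_edge])
    return [0.0] * n_edge + middle + [0.0] * n_edge


# Usage on a made-up 30-minute signal of constant value 1.0.
filtered = [1.0] * 36000
cleaned = zero_edges(filtered)
print(len(cleaned), sum(cleaned))
```

Rejecting the margins instead of zeroing them amounts to keeping only the middle slice, which shortens a nominal 30-minute signal to 25 minutes.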

Besides the generic WFDB (WaveForm DataBase Software Package) software (https://www.physionet.org/content/wfdb/), which was used to derive the EHG records of the ICEHG DS, users can also use PhysioNet’s WFDB for MATLAB and Octave (https://www.physionet.org/content/wfdb-matlab/) and WFDB for Python (https://www.physionet.org/content/wfdb-python/) for efficient further analysis of the records of the ICEHG DS. The records of the ICEHG DS are also readily available in MATLAB format in the icehgdsmat sub-directory. Moreover, users can use LightWAVE, PhysioNet’s on-line signal viewer and annotation editor (https://www.physionet.org/lightwave/).