The Icelandic 16-electrode electrohysterogram database

External recordings of the electrohysterogram (EHG) can provide new knowledge on uterine electrical activity associated with contractions. Better understanding of the mechanisms underlying labor can contribute to preventing preterm birth, which is the main cause of mortality and morbidity in newborns. Promising results using the EHG for labor prediction and other uses in obstetric care are the drivers of this work. This paper presents a database of 122 4-by-4 electrode EHG recordings performed on 45 pregnant women using a standardized recording protocol and a placement guide system. The recordings were performed in Iceland between 2008 and 2010. Of the 45 participants, 32 were measured repeatedly during the same pregnancy and participated in two to seven recordings. Recordings were performed in the third trimester (112 recordings) and during labor (10 recordings). The database includes simultaneously recorded tocographs, annotations of events and obstetric information on participants. The publication of this database enables independent and novel analysis of multi-electrode EHG by researchers in the field and, hopefully, development towards new life-saving technology.


Background & Summary
Preterm birth is defined as birth before 37 completed weeks of gestation. On average, 12% of babies are born preterm in lower-income countries and 9% in higher-income countries 1 . Babies who are born preterm often require special care and face greater risks of serious health problems, including cerebral palsy, intellectual impairment, chronic lung disease, and vision and hearing loss. Preterm birth is the leading cause of newborn deaths (babies in the first four weeks of life) and the second leading cause of death in children under five years (after pneumonia). Preterm birth rates are increasing in almost all countries 1 . Currently, there is no effective way of preventing preterm birth. The main reason is that no good objective method currently exists to evaluate the stepwise progression of pregnancy through to labor, neither at term nor preterm. Various techniques have been adopted to monitor and/or diagnose labor, but they are either subjective or indirect and do not provide an accurate prediction of when labor will take place 2-5 . Studies have shown that external monitoring of uterine electrical activity using an electrohysterogram (EHG) is representative of uterine contractility 6 and shows promising results in predicting preterm labor 3-5,7,8 . Most of the early studies on EHG used two to five electrodes 2,6 and therefore concentrated on the activity in a single location of the uterus. In 2007, a collaborative group from France and Iceland, involving biomedical researchers, engineers and medical doctors, started using monopolar electrodes in a 4-by-4 configuration on the abdomen aimed at providing information on uterine electrical activity propagation 9 . Guided by this preliminary work, the 16-electrode system was used to perform pregnancy and labor recordings in Iceland at Landspitali University Hospital, Akureyri Hospital and the Akureyri Primary Health Care Centre.
In total, a database of 122 EHG recordings was created between 2008 and 2010. The majority of participants were measured multiple times during the same pregnancy and took part in two to seven recordings. These multiple, or longitudinal, measurements were aimed at observing the evolution of contractions during pregnancy and towards labor.
Parts of the data have already been used for developing and analyzing various signal processing methods and have led to several publications 8,10-23 . In particular, the work has concentrated on efforts to accurately distinguish true labor contractions from normal pregnancy contractions, with some success 8,10-17,20-23 . Compared to linear methods, non-linear methods may provide a superior way to differentiate between pregnancy and labor contractions 8,11,22 and multichannel recordings seem to improve this classification rate 10 . Interest in EHG propagation parameters has been increasing among researchers 24 and efforts have been made to exploit the information on the propagation and/or synchronization of different locations of the uterus which the extended geometry of the measurement system provides. The results indicate that propagation parameters can be important in accurately recognizing true labor 25,26 . Parts of the data were included in the BioModUE_PTL project which led to the development and validation of a biophysics-based multiscale model of the EHG, going from the cell to the electrical signal measured on the abdomen 27 . None of these past publications, however, have described the database in detail and studies have so far only used parts of the data presented.
We provide open access to the database so that the international scientific community can freely generate greater understanding of the mechanics of the uterus and develop applications that improve obstetric care and hopefully accurately predict preterm labor. This paper describes the recording methods used and gives a detailed description of the Icelandic 16-electrode electrohysterogram database.

Data collection
The recordings were performed between 2008 and 2010 in Iceland. Pregnancy recordings, defined as recordings performed at antenatal care clinics on participants in the third trimester and not suspected to be in labor, were performed at Akureyri Primary Health Care Centre and Landspitali University Hospital. Labor recordings, defined as measurements performed on participants suspected to be in labor, present in the labor wards and who delivered within 24 h, were performed at Landspitali University Hospital and Akureyri Hospital. Participants were invited to take part in the recordings during antenatal care visits or at the labor wards and had normal singleton pregnancies and no known risk factors for preterm birth. Informed consent was obtained from every participant and the protocol was approved by the National Bioethics Committee in Iceland (VSN 02-006-V4). After each pregnancy recording, the participant was invited to take part in another recording one to two weeks later. All data can be found in PhysioNet (Data Citation 1).

Recording protocol
Reusable Ag/AgCl electrodes with a 13.0 mm outer diameter and an 8.0 mm inner diameter were used for the recordings. An alignment frame, a double sided hypoallergenic adhesive sheet and a silicone backing were designed and manufactured to enable a standardized electrode setup with a 17.5 mm distance between adjacent electrode centers. The alignment frame was used to align and attach an uncovered side of the double sided adhesive sheet to the silicone backing. The dimensions of the double sided adhesive sheet and silicone backing can be seen in Fig. 1 along with the back view of the silicone backing attached to the double sided adhesive sheet.
The electrodes were then placed into the holes in the silicone backing and attached to the adhesive sheet. The abdominal skin of the participant was carefully prepared using an abrasive paste and alcohol solution. After filling the electrode holes with electrode gel, wiping the excess gel away with a straight edged card and uncovering the other side of the double sided adhesive sheet, the electrode-adhesive-silicone matrix was attached to the abdomen. The electrode numbering scheme, as seen when looking at the abdomen of the participant, can be seen in Fig. 2.
The desired position on the abdomen was with the third vertical line of electrodes (electrodes 9 to 12) placed on the median axis of the uterus and the 10th-11th pair of electrodes half way between the uterine fundus and pubic symphysis. The navel was avoided by displacing the matrix up or down whilst staying as close as possible to the desired position. The skin over the iliac crests on both sides was prepared in the same way as the abdomen and a patient ground electrode and reference electrode with electrode gel were then attached on each side using adhesive washers, with inner diameters corresponding to the inner diameter of the electrodes. The locations of the reference and patient ground electrodes were not standardized to certain sides for the recordings. The electrode positions can be seen in Fig. 3.
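The grid geometry described above can be sketched in a few lines of Python. This is only an illustration: it assumes that electrodes are numbered consecutively along each vertical line (line 1 holds electrodes 1 to 4, line 3 holds electrodes 9 to 12, as the text states for the third line), and the function names are hypothetical. The left/right and up/down orientation of the numbering is defined by Fig. 2 and is not encoded here.

```python
import math

# Centre-to-centre distance between adjacent electrodes, from the recording protocol.
PITCH_MM = 17.5


def grid_position(electrode):
    """Return (vertical_line, position_in_line), both zero-based.

    ASSUMPTION: electrodes are numbered consecutively along each vertical
    line, so electrodes 9-12 form the third line (index 2).
    """
    if not 1 <= electrode <= 16:
        raise ValueError("electrode number must be in 1..16")
    return divmod(electrode - 1, 4)


def distance_mm(a, b):
    """Centre-to-centre distance between two electrodes of the matrix."""
    line_a, pos_a = grid_position(a)
    line_b, pos_b = grid_position(b)
    return PITCH_MM * math.hypot(line_a - line_b, pos_a - pos_b)


# Electrodes 10 and 11 are neighbours on the third vertical line.
print(grid_position(10), grid_position(11), distance_mm(10, 11))
```

Such a mapping can be convenient when computing propagation-related quantities, since it turns electrode numbers into relative positions in millimetres.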
A tocodynamometer was also attached to the abdomen during recordings. For pregnancy recordings, the participants were seated in recliner chairs and a support, such as a small pillow, was positioned under the right side of the participants to prevent potential aortocaval compression syndrome. For labor recordings, the participants were lying on their beds in the maternity wards and the researcher did not try to affect their positioning. A photo of the setup during a recording can be seen in Fig. 4.
The intended duration of a pregnancy recording was one hour and the intended duration of a labor recording was at least half an hour, but the participant could stop the measurement at any time.
The measurements were performed using a sixteen channel multi-purpose physiological signal recorder (Embla A10), most commonly used for investigating sleep disorders. An anti-aliasing filter with a high cut-off frequency of 100 Hz was used but no high pass filter was used. The signal sampling rate was 200 Hz and the signal was digitized to 16 bits. The sixteen monopolar electrode signals were originally stored in the European Data Format (EDF) by the Somnologica software used to control the Embla A10.
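As a rough orientation on data volume, the acquisition parameters above translate into sample counts as in the following Python sketch. The function names are hypothetical, and the storage estimate assumes that each 16-bit sample occupies two bytes with no header or compression, which is an assumption rather than a statement about the actual files.

```python
FS_HZ = 200            # sampling rate stated in the text
N_CHANNELS = 16        # monopolar electrode channels
BYTES_PER_SAMPLE = 2   # ASSUMPTION: one 16-bit sample stored in two bytes


def samples_per_channel(duration_s):
    """Number of samples recorded on one channel in duration_s seconds."""
    return int(round(duration_s * FS_HZ))


def raw_size_bytes(duration_s):
    """Approximate raw signal size for all channels, ignoring any headers."""
    return samples_per_channel(duration_s) * N_CHANNELS * BYTES_PER_SAMPLE


# A one-hour pregnancy recording (the intended duration):
print(samples_per_channel(3600))  # 720000
print(raw_size_bytes(3600))       # 23040000
```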
All the recordings were performed by the same person (A.A.). Each participant was assigned an ID number and for each recording, information on participant age, body mass index (BMI), gestational age, placental position, gravidity, parity, history of cesarean section, eventual mode of delivery and gestational age at delivery was noted. During a recording, the researcher recorded participant movements, equipment

Data processing
No processing of data was performed beyond converting between file formats. The aim is that those that use the data can do so with a fresh start. Future additions to this database may include pre-treated signals and segmented contractions.
The EDF files obtained during the recordings were converted into WFDB (WaveForm DataBase--www.physionet.org/physiotools/wfdb.shtml) compatible signal (.dat) and header (.hea) files using the edf2mit WFDB application (www.physionet.org/physiotools/wag/edf2mi-1.htm) and Cygwin software (cygwin.com). Annotation (.atr) files containing events during recordings were created manually by using the WFDB-compatible signal and header files and WAVE, an X Window System client application (www.physionet.org/physiotools/wfdb.shtml#WAVE). The wfdb2mat WFDB application (www.physionet.org/physiotools/wag/wfdb2m-1.htm) was used to convert the WFDB-compatible signal and header files into MATLAB (.mat) and corresponding header (.hea) files. The rdann function from the WFDB Toolbox for MATLAB (www.physionet.org/physiotools/matlab/wfdb-app-matlab) was used to create MATLAB (.mat) files containing the information from the annotation files. The tocograph paper traces were scanned to JPEG images and a time axis corresponding to the recording was inserted onto the scanned images. All information that could possibly lead to the identification of the participant, such as personal data and dates, was manually removed from these images using a graphics editor.

Data Records
The data records in the Icelandic 16-electrode electrohysterogram database are stored in a PhysioBank database in PhysioNet 28 (Data Citation 1).
A total of 122 recordings were performed on 45 participants. Of the 45 participants, 32 were measured more than once during the same pregnancy and the highest number of recordings for a participant was seven recordings. Ten recordings were performed during labor and five participants took part in measurements during both pregnancy and labor. The lowest gestational age was twenty nine weeks and five days (29w5d-pregnancy recording) and the highest gestational age was forty one weeks and five days (41w5d-labor recording). The average recording duration for pregnancy recordings was 61 min (range 19-86 min) and the average recording duration for labor recordings was 36 min (range 8-64 min).
File names in the database are of the form ice###_*type*_*record number*, where ice### is the ID of the participant (e.g., ice001), *type* refers to the type of recording: p (pregnancy) or l (labor), and *record number* is the number of recording for that particular participant (e.g., 1of3).
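The naming scheme above lends itself to automated parsing. The following Python sketch shows one possible way to split a record name into its parts. The function name and the returned dictionary keys are hypothetical, the pattern assumes a three-digit participant number as in the example ice001, and the optional trailing m or m_ann covers the MATLAB file names mentioned elsewhere in this section.

```python
import re

# ice### _ type (p or l) _ <n>of<total>, optionally followed by the
# MATLAB suffixes "m" or "m_ann".
RECORD_RE = re.compile(r"^(ice\d{3})_([pl])_(\d+)of(\d+)(m(?:_ann)?)?$")
TYPES = {"p": "pregnancy", "l": "labor"}


def parse_record_name(name):
    """Split a record name into participant ID, type and record number."""
    match = RECORD_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected record name: {name!r}")
    participant, rec_type, number, total, suffix = match.groups()
    return {
        "participant": participant,
        "type": TYPES[rec_type],
        "number": int(number),
        "total": int(total),
        "matlab_suffix": suffix or "",
    }


# Illustrative name built from the examples in the text (ice001, p, 1of3).
print(parse_record_name("ice001_p_1of3"))
```

Grouping records by the participant field is a simple way to assemble the longitudinal series for participants who were measured more than once.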
Each recording has three associated files:
- A scanned tocograph with a manually inserted recording time axis (.jpg file). Each small square represents 30 s.
The types of event are:
- C: Contraction. Used when the participant feels a contraction or there is a very likely contraction on the tocograph (not always used when there is an obvious contraction on the tocograph).
- (c): Possible contraction. Used when there is not a very likely contraction but the participant has pressure sensation or a contraction is suspected on the tocograph.
- pm: Participant movement.
- The zip file icelandic16ehgmat.zip that includes MATLAB (.mat) versions of all the signal files along with header files (file names of the form ice###_*type*_*record number*m) and MATLAB (.mat) versions of the annotations (file names of the form ice###_*type*_*record number*m_ann). This is provided for the convenience of users that want to analyze the data in MATLAB.
- RECORDS.txt containing a list of the recordings by record name, with one record name per line.
Table 1 (available online only) contains the clinical information from each recording (information from the header files excluding comments) along with the recording duration and whether or not the recording has a corresponding annotation file. This information can also be found in info.txt in the database.

Technical Validation
The EHG signal has been shown to be representative of uterine contractions 6 and EHG is, in general, a well-proven technique 5 . The proof of concept and technical validation for the recording method used for the database was made in a preliminary study in 2007 (ref. 9).
The preliminary study applied recognized EHG techniques to a new recording setup involving a 4-by-4 grid. This was mainly to better observe and analyze the spatial characteristics and propagation of the electrical activity during contractions rather than just the activity in a single location of the uterus as had been done before 2,6 . The results from the preliminary study showed a very acceptable signal-to-noise ratio (SNR) of bipolar signals (difference between adjacent monopolar electrodes) 29 . The use of the monopolar signals singly was however considered problematic, even with adaptive filtering methods, due to external common mode noise (maternal ECG, respiration movements etc.). Efforts to use recent techniques such as empirical mode decomposition (EMD) and canonical correlation analysis (CCA) to clean up the signal have since met with some success 30 . The preliminary study data was also used to present a moving picture of the electrical activity during contractions. The activity was clearly observed and correlated well with the simultaneous tocograph trace 9 .
In the preliminary study, electrodes were placed one at a time for the 4-by-4 grid, which was a time-consuming task and achieving the desired inter-electrode distance required great operator precision. To address these issues, a placement guide system was specifically developed (described in Methods). The system has a standardized setup ensuring a consistent distance between electrodes. The database therefore contains recordings made in a very similar way to the preliminary study, with the same electrode configuration, the same electrodes and same recording device but with a slightly smaller inter-electrode distance (17.5 versus 21 mm). The system has a shorter electrode attachment time than the one-by-one electrode attachment method from the preliminary study and the placement of the electrodes into the silicone backing can be done before a recording, shortening the setup time for an actual recording to around five minutes. This enables recordings to be performed when there is little time and reduces any inconveniences for participants and health professionals. The data from the preliminary study is not included in the database presented here.
All recordings were performed by the same researcher (A.A.) using the same protocol. The researcher (A.A.) stayed with the participants during recordings, recorded events first hand and monitored the equipment and electrode readouts continually. The tocodynamometer was recalibrated to 20 if readouts were zero. In the Embla A10 machine, an electrode that is unconnected or floating will give a signal which very quickly goes to saturation and is therefore easily recognized. If during a recording, an electrode gave a trace that was visually very unlike the traces of other electrodes or displayed values notably different from other values, it was pressed more firmly onto the abdomen to ensure connection to the skin. If all electrode traces seemed suboptimal (e.g., very noisy visually), the patient ground and reference electrode connections were inspected and the electrodes pressed more firmly onto the skin. In a few cases when suboptimal traces did not improve, the electrodes were reconnected and the recording then resumed. We therefore assume that although there may be parts of the data where the contact is faulty between the skin and the electrode, they are few and moreover they are easily recognizable.
Parts of the data have already been used for developing and analyzing various signal processing methods and have led to several publications 8,10-23 . The technical quality of the data has been thoroughly checked throughout this work and has never been found to be lacking.

Usage Notes
Although EHG has become, in general, a well-proven technique, the EHG signals are known to be problematic in the sense that they are very low frequency and very low amplitude.
The individual monopolar signals contain the measurement of the electric potential at each site in the matrix as referenced to the patient ground, with the reference electrode as a reference for the patient ground circuit. The patient ground and reference electrodes were always more than 20 cm away from each electrode in the matrix (varying between participants and gestational ages) and positioned over the iliac crest where little electrical activity is suspected. No use was made of the other electrodes as reference, average or otherwise. The raw signals therefore contain everything that creates a potential difference between the monopolar electrode and the patient ground. This includes the maternal and fetal ECG and some EMG from striated muscles. The signal also contains artifacts related to movements of the participant, fetal movement, and even fetal respiration has been observed. Fetal hiccups can give periodic spikes that are clearly visible. Our system proved very robust to 50 Hz line noise and this was never observed during measurements. Neither active shielding nor active grounding was used in the recordings.
The setup resulted in a high level of electrode offset varying very slowly over time. There were considerable differences between electrodes during the same measurement sessions, with a typical value of the offset being several millivolts while the EHG signal is at the level of several tens of microvolts. Any processing of the signals could include high pass filtering with a very low cut-off frequency and/or a creation of a new reference using adjacent electrodes (i.e., bipolar or Laplacian).
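The two options mentioned above (a new reference from adjacent electrodes, and high pass filtering with a very low cut-off) can be sketched in plain Python as follows. This is a minimal illustration and not the processing used in any of the cited studies. It assumes that consecutive electrode numbers within a vertical line (1 to 4, 5 to 8, 9 to 12, 13 to 16) are physical neighbours, the function names are hypothetical, and the first-order filter and its 0.1 Hz cut-off in the usage lines are placeholders chosen only for the example.

```python
import math


def vertical_bipolar(channels):
    """Differences between neighbouring electrodes on each vertical line.

    channels: list of 16 equal-length sequences; channels[0] is electrode 1.
    ASSUMPTION: consecutive electrode numbers within a line are adjacent.
    Returns a dict keyed by (electrode_a, electrode_b), 12 pairs in total.
    """
    if len(channels) != 16:
        raise ValueError("expected 16 monopolar channels")
    pairs = {}
    for line in range(4):
        for pos in range(3):
            a = line * 4 + pos          # zero-based index of first electrode
            b = a + 1                   # its neighbour on the same line
            pairs[(a + 1, b + 1)] = [x - y for x, y in zip(channels[a], channels[b])]
    return pairs


def highpass(signal, cutoff_hz, fs_hz=200.0):
    """First-order (RC-style) high pass filter; removes constant offsets."""
    if not signal:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + 1.0 / fs_hz)
    out = [0.0]
    for i in range(1, len(signal)):
        out.append(alpha * (out[-1] + signal[i] - signal[i - 1]))
    return out


# Toy data: each channel is a different constant offset (no EHG content).
toy = [[float(ch)] * 5 for ch in range(16)]
print(len(vertical_bipolar(toy)))   # 12 bipolar signals
print(highpass(toy[3], 0.1))        # a constant offset is filtered to zeros
```

A bipolar derivation cancels whatever is common to both electrodes of a pair, which is why it also suppresses much of the common mode content described above. A higher-order or zero-phase filter may be preferable in practice.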
During pregnancy recordings, the fetal or maternal heart rates were not recorded. During labor recordings, fetal heart rate was usually monitored by the clinical team, but was not collected as a part of the study.
The annotations and the tocograph complement each other. Some contractions that are present in the annotations are not obvious on the tocograph and some obvious contractions on the tocograph are not in the annotations. This explains in part why some recordings are without annotations. Noted fetal movement could last for differing amounts of time and participants did not always notify if they felt fetal movement. Sometimes fetal movement can be seen on the tocograph.
The first sample of a signal file is indexed with 1 in the .mat files but with 0 in the .dat files and so there is an index number discrepancy of one between the two file formats (.mat and .dat).
The MATLAB files contain the absolute raw units. Division by 131.068 gives the physical units in mV.
Even though some pregnancies ended in cesarean, the participant was on occasion already in spontaneous labor. These cases are explained in the comments sections of the header files.
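The index and unit conventions stated above for the .mat and .dat files can be captured in a few helper functions. The following Python sketch uses hypothetical function names; the scale factor, the index offset of one and the 200 Hz sampling rate are taken from the text.

```python
RAW_UNITS_PER_MV = 131.068   # divide raw values by this to obtain mV
FS_HZ = 200.0                # sampling rate of the recordings


def raw_to_mv(raw_value):
    """Convert a raw value from the MATLAB files to millivolts."""
    return raw_value / RAW_UNITS_PER_MV


def dat_index_to_mat_index(dat_index):
    """The .dat files count samples from 0, the .mat files from 1."""
    return dat_index + 1


def mat_index_to_dat_index(mat_index):
    """Inverse of dat_index_to_mat_index."""
    return mat_index - 1


def dat_index_to_seconds(dat_index):
    """Time of a zero-based sample index from the start of the recording."""
    return dat_index / FS_HZ


print(raw_to_mv(131.068))           # 1.0 (mV)
print(dat_index_to_mat_index(0))    # 1
print(dat_index_to_seconds(200))    # 1.0 (s)
```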
If a recording was close to birth, then the timing of the recording with regard to the birth is in the comments of the header file.
Our aim is to publish this database as is, without giving the user any detailed directions and thus encouraging open-minded exploitation of the data. Making raw data available is the best way to do this. There are however some pointers about how to make sense of the data which we would like to communicate to the users.
1. The data is sampled at 200 Hz. The EHG signals are generally assumed to be of very low frequency, from almost dc up to 3 Hz maximum. Decimating the raw signal (after low pass filtering) is advised, before or after creating a bipolar/multipolar signal or other de-noising. This will create more manageable files and it is often better to work with signals when the frequency of the signal to be observed is not as far away from the sampling frequency as in this case. This will also cut down calculation time in any complex analysis. Results have shown that decimation of this signal has little or even a positive effect on the performance of analysis methods 20 .
2. Please be aware that there are inherent imprecisions in the synchronization between the EHG recordings and the annotations and tocographs. There can be differences in when participants start to feel a contraction or fetal movements and so differences in when participants notify about events. Events were therefore occasionally approximated to the nearest whole minute. There are also delays internal to the tocodynamometers. Due to factors such as these, the inserted recording times on the tocographs and the annotation times may be up to ± 30 s from the actual recording or event times.
3. A user doing intensive work using these signals will have to develop an effective methodology to keep track of the signals, the way they have been pre-treated and the associated clinical parameters. Inspiration on how to organize such work can be sought in the structure of the SQL database framework which we developed 31 .
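The first two pointers can be illustrated with a short Python sketch. It is not the decimation procedure used in the cited work: block averaging is used here as a deliberately simple low pass filter followed by downsampling, the factor of 10 (200 Hz to 20 Hz) is a placeholder, and the function names are hypothetical. A properly designed anti-aliasing filter, for example the one applied by scipy.signal.decimate, would usually be preferable. The second function turns the stated uncertainty of up to 30 s into a search window of sample indices around an annotation.

```python
def decimate_block_average(signal, factor):
    """Average non-overlapping blocks of `factor` samples.

    Equivalent to a boxcar low pass filter followed by keeping every
    `factor`-th output. Trailing samples that do not fill a block are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    n_blocks = len(signal) // factor
    return [
        sum(signal[k * factor:(k + 1) * factor]) / factor
        for k in range(n_blocks)
    ]


def annotation_window(sample_index, fs_hz=200.0, tolerance_s=30.0):
    """Range of sample indices in which an annotated event may really lie."""
    half_width = int(round(tolerance_s * fs_hz))
    return max(0, sample_index - half_width), sample_index + half_width


# Ten seconds of toy data at 200 Hz, reduced to 20 Hz.
toy = [float(i % 4) for i in range(2000)]
print(len(decimate_block_average(toy, 10)))   # 200
# An event annotated one minute (12000 samples) into the recording.
print(annotation_window(12000))               # (6000, 18000)
```

With a 20 Hz output rate the Nyquist frequency is 10 Hz, which remains well above the 3 Hz upper bound generally assumed for the EHG.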
Users can view the data and annotations through two web interfaces: LightWAVE (a JavaScript viewer-www.physionet.org/lightwave) and ATM (Automated Teller Machine-www.physionet.org/cgi-bin/atm/ATM). ATM's toolbox includes software that can convert WFDB signal files to text, CSV, EDF or .mat files and can show samples and annotations as text.
The WFDB Software Package can be used to work with the recordings (www.physionet.org/physiotools/wfdb.shtml). For example, the command 'rdann -r *record name* -a atr -f 0' can be used to read the annotation files (www.physionet.org/physiotools/wag/rdann-1.htm). The WFDB Toolbox for MATLAB and Octave can also be used to work with the recordings (www.physionet.org/physiotools/matlab/wfdb-app-matlab).
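For users who script their analysis, the rdann command quoted above can be assembled and run from Python. The sketch below is illustrative only: the function names are hypothetical, the record name in the usage line is an example built from the naming scheme, and read_annotations_text assumes that the WFDB Software Package is installed, that rdann is on the PATH and that the record files are in the working directory.

```python
import shlex
import subprocess


def rdann_command(record_name, annotator="atr", start="0"):
    """Argument list for: rdann -r <record name> -a atr -f 0"""
    return ["rdann", "-r", record_name, "-a", annotator, "-f", start]


def read_annotations_text(record_name):
    """Run rdann and return its text output.

    ASSUMPTION: rdann is installed and the record is in the current directory.
    """
    result = subprocess.run(
        rdann_command(record_name),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Show the command line without running it (illustrative record name).
print(shlex.join(rdann_command("ice001_p_1of3")))
```

Passing the arguments as a list avoids shell quoting problems if record names are read from RECORDS.txt.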

Author Contributions
A.A. wrote most of the text in the Data Descriptor, was the main developer of the placement guide system, participated in designing the recording protocol and performed the EHG measurements. T.S. was responsible for applications to the National Bioethics Committee and the Data Protection Authority in Iceland, evaluated and recruited possible participants and was responsible for the clinical aspects of the recordings. J.T. assisted in the development of the placement guide system, participated in designing the recording protocol and taught A.A. how to perform EHG measurements. C.M. heads the French group and participated in designing the recording protocol. B.K. heads the Icelandic group, assisted in the development of the placement guide system and participated in designing the recording protocol. Both C.M. and B.K. led the original development of 4-by-4 electrode EHG recordings, managed the associated grants and supervised the students working on the project.
Table 1 is only available in the online version of this paper.

Additional information
Competing financial interests: The authors declare no competing financial interests.