Introduction

Severe neonatal hyperbilirubinemia (SNH, total serum bilirubin (TSB) ≥ 342 µmol/L) can cause bilirubin-induced neurological disorders (BIND).1,2,3 The risk of BIND increases with longer duration of exposure to SNH.4 Early phototherapy to reduce the TSB level has been shown to reduce/prevent BIND.5,6

The current gold standard for measuring TSB to guide management in many developing countries is the laboratory method, which is often time-consuming: blood specimens must be collected from patients, transported manually to hospital laboratories, and the results awaited. An alternative is the transcutaneous bilirubinometer (TcB), which uses optical spectroscopy to provide an instantaneous, noninvasive, point-of-care (POC) estimate of TSB from the cutaneous bilirubin concentration. However, the major limitations of TcB are underestimation of TSB at levels above 250 µmol/L, and inaccurate estimation of TSB after phototherapy has commenced and in multi-ethnic populations with variable shades of skin pigmentation.7

The Bilistick System (Bilimetrix s.r.l., Trieste, Italy) is an in vitro POC diagnostic system for measuring TSB level in capillary or venous blood samples. A previous study on 118 neonates from Egypt and Italy showed that TSB measured by the Bilistick System correlated closely with TSB measured by hospital laboratories (Pearson’s correlation coefficient r = 0.963).8 However, that study recruited few neonates (n = 20) with TSB ≥ 300 µmol/L, did not report the accuracy of the Bilistick System, and did not compare the turn-around-time (TAT) of the two methods.

In Malaysia, neonatal jaundice affects more than 60% of live births, with 17.8% of them requiring phototherapy and 0.8% developing SNH.9 Neonates with clinically detected jaundice are referred to health clinics or hospital emergency departments (ED) for measurement of TSB. Due to the shortage of hospital beds, jaundiced neonates are admitted only when they require phototherapy or exchange transfusion.10 The TAT of TSB measured in hospital laboratories is usually several hours, causing delays in diagnosis and treatment and over-crowding of the ED. The objectives of this study were to determine: (a) the accuracy of TSB levels measured by the Bilistick System against those measured by the hospital laboratories, and (b) the TAT of the Bilistick versus the hospital laboratory method.

Methods

This was a prospective study carried out in the neonatal nurseries of two government hospitals (Selayang Hospital and Ampang Hospital) in Malaysia over an 8-month period (26 May 2017 to 19 January 2018). The Malaysian Research Ethics Committee of the Ministry of Health of Malaysia (NMRR-16-2569-32628-IIR), the Research and Ethics Committee of Universiti Tunku Abdul Rahman (U/SERC/23/2017), and the respective hospital authorities approved this study. Written parental consent was obtained for participation in this study.

Neonates were included if they were well, born at term gestation (>36 weeks), and clinically jaundiced. Preterm neonates, and term neonates without jaundice or with illness, were excluded. Before commencement of this study, all investigators underwent training using the video tutorial provided by the Bilistick manufacturer.8

From each participating neonate, a venous blood sample was collected for simultaneous measurement of TSB by the hospital laboratory (1.0–1.5 mL in a plain test tube) and the Bilistick System (25 µL transferred with a plain transfer pipette to a Bilistick test strip). After the sample was loaded into the Bilistick Reader, the TSB reading was displayed on the screen of the device. Each hospital was provided with two Bilistick Readers for the study. The coefficient of variability of the Bilistick System reported by the manufacturer ranged from 0.1% at a serum bilirubin of 1.7 mg/dL (29 µmol/L) to 1.7% at 31.4 mg/dL (533.8 µmol/L).

Measurement of TSB at the hospital laboratories

The hospital laboratories in both hospitals were located away from the wards, on different levels of the buildings. To prevent degradation of serum bilirubin by ambient light, each test tube was wrapped with a piece of paper tape during transport to the laboratories by human porters. At the laboratories, each blood sample was centrifuged and 250 µL of serum was pipetted for measurement of TSB by the 2,5-dichlorophenyl diazonium method using a Cobas modular instrument (Roche) in Ampang Hospital, and by the Jendrassik–Gróf chemical assay method using an AU2700 analyzer (Beckman Coulter) in Selayang Hospital. The TSB results and the time of report were loaded automatically into the hospital computer systems and were accessible to the investigators. The coefficient of variability of the laboratory method for TSB was 0.1% in each hospital.

Data collection

The following variables were collected from each participating neonate: gender, birth weight, gestational age, the time when a venous blood sample was collected, and the time when TSB was measured and reported by the hospital laboratory and by the Bilistick system, respectively.

Definitions

The TAT of TSB measured by hospital laboratories was defined as the time interval from when a specimen of venous blood was drawn from a neonate to when the TSB was reported on the hospital computer systems. The TAT of TSB measured by the Bilistick System (TAT-Bilistick) was defined as the time interval from when a specimen of venous blood was drawn from a neonate to when the TSB measured by the Bilistick System was displayed on the device’s screen.
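As an illustration of these definitions, the following minimal Python sketch computes a TAT in minutes from two timestamps. The function name `tat_minutes` and the timestamps are hypothetical and are not study data; the sketch only assumes that the time of blood draw and the time the result became available were both recorded.

```python
from datetime import datetime


def tat_minutes(drawn_at: datetime, available_at: datetime) -> float:
    """Minutes from venous blood draw to the time the TSB result was available."""
    if available_at < drawn_at:
        raise ValueError("result time precedes blood draw time")
    return (available_at - drawn_at).total_seconds() / 60.0


# Hypothetical timestamps for one neonate (illustration only, not study data).
draw_time = datetime(2017, 6, 1, 9, 0)
lab_tat = tat_minutes(draw_time, datetime(2017, 6, 1, 10, 45))       # result on hospital computer system
bilistick_tat = tat_minutes(draw_time, datetime(2017, 6, 1, 9, 2))   # result on device screen
```

Both intervals share the same starting point (the blood draw), so the two TAT values for a neonate are directly comparable.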

Outcome measures

The primary outcome measures were as follows: (1) TSB results of each neonate measured by the hospital laboratory and Bilistick, and (2) TAT of both methods.

Sample size

Based on the study of Coda Zabetta et al.,8 the sample size needed was 97, as the mean difference of TSB between the Bilistick and laboratory methods of 118 neonates was reported to be 10.3 µmol/L (standard deviation = 24.1 µmol/L), and the standardized effect size was Δ = 0.427. However, in order to detect a desired standardized effect size of Δ = 0.2, with 95% level of confidence with a power of 90% in a one-sided test, a sample size of at least 429 was required,11 as shown below:

$$n = \frac{{2\left( {Z_\alpha + Z_{1 - \beta }} \right)^2}}{{\Delta ^2}} = \frac{{2\left( {Z_{0.05} + Z_{1 - 0.1}} \right)^2}}{{\Delta ^2}} = \frac{{2\left( {1.649 + 1.28} \right)^2}}{{\left( {0.2} \right)^2}} \approx 429$$

where \(\Delta = \frac{{{\mathrm {mean}}\,{\mathrm {difference}}\,{\mathrm {of}}\,{\mathrm {TSB}}\,{\mathrm {between}}\,{\mathrm {the}}\,{\mathrm {Bilistick}}\,{\mathrm {and}}\,{\mathrm {Laboratory}}\,{\mathrm {methods}}}}{{{\mathrm {standard}}\,{\mathrm {deviation}}}}\).
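The sample size formula can be checked numerically. The sketch below is an illustration only: the function name `sample_size` is hypothetical, the normal quantiles are taken from Python's standard library (approximately 1.645 and 1.282) rather than from tables, and the result is rounded up to the next whole neonate.

```python
from math import ceil
from statistics import NormalDist


def sample_size(delta: float, alpha: float = 0.05, power: float = 0.90) -> int:
    """n = 2 * (Z_alpha + Z_(1-beta))^2 / delta^2 for a one-sided test, rounded up."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    z_power = NormalDist().inv_cdf(power)      # quantile corresponding to the power
    return ceil(2 * (z_alpha + z_power) ** 2 / delta ** 2)


n_required = sample_size(0.2)  # standardized effect size of 0.2
```

Under these assumptions, `n_required` evaluates to 429 for Δ = 0.2, and smaller effect sizes require larger samples because n scales with 1/Δ².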

Statistical analysis

The data were analyzed using SPSS (IBM, V.24.0). Continuous variables were presented as mean and standard deviation (SD), and categorical variables as number and percentage. Pairs of TSB measured by both methods were analyzed using Pearson correlation coefficients and linear regression for each hospital and for both hospitals combined. Bland–Altman plots were constructed to compare the variability of TSB measured by the two methods for each hospital and for both hospitals combined. The predictive indices of Bilistick TSB for the different laboratory TSB levels (gold standard) recommended by the Malaysian national clinical practice guidelines10 for initiation of treatment at different ages of neonates were calculated using standard sensitivity, specificity, and accuracy calculations from a confusion matrix12 as follows:

 

Predicted class | Actual class: ≥Lab TSB | Actual class: <Lab TSB
≥Bilistick TSB | TP | FP
<Bilistick TSB | FN | TN

where \({\mathrm {Accuracy}} = \frac{{{\mathrm {TP}} + {\mathrm {TN}}}}{{{\mathrm {TP}} + {\mathrm {TN}} + {\mathrm {FP}} + {\mathrm {FN}}}}\), \({\mathrm {Sensitivity}} = \frac{{{\mathrm {TP}}}}{{{\mathrm {TP}} + {\mathrm {FN}}}}\) and \({\mathrm {Specificity}} = \frac{{{\mathrm {TN}}}}{{{\mathrm {TN}} + {\mathrm {FP}}}}\)

TP = true positive, TN = true negative, FP = false positive, FN = false negative. p-values of less than 0.05 were considered statistically significant.
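The calculation of these indices from paired readings can be sketched in Python as follows. This is a minimal illustration, not the SPSS procedure used in the study: the function name `predictive_indices` and the six paired values are hypothetical, and a reading is counted as positive when it is at or above the stated threshold.

```python
def predictive_indices(lab, bilistick, lab_threshold, bilistick_cutoff):
    """Accuracy, sensitivity, and specificity of a Bilistick cutoff against a lab TSB threshold."""
    tp = fp = fn = tn = 0
    for lab_tsb, bili_tsb in zip(lab, bilistick):
        actual = lab_tsb >= lab_threshold          # gold standard (laboratory TSB)
        predicted = bili_tsb >= bilistick_cutoff   # test result (Bilistick TSB)
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical paired readings in µmol/L (illustration only, not study data).
lab_values = [370, 365, 340, 300, 250, 362]
bilistick_values = [330, 310, 320, 280, 230, 345]
indices = predictive_indices(lab_values, bilistick_values,
                             lab_threshold=360, bilistick_cutoff=315)
```

With these made-up values there are two true positives, one false negative, one false positive, and two true negatives, so each of the three indices equals 2/3. Lowering `bilistick_cutoff` trades specificity for sensitivity.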

Results

During the study period, 882 jaundiced neonates were referred to the investigators (Fig. 1). In Selayang Hospital, blood samples of 63 (15.1%) neonates produced error codes in the Bilistick System with no TSB readings, and six neonates had their blood specimens rejected by the hospital laboratory. In Ampang Hospital, 52 (19.7%) neonates had error code readings in the Bilistick System. The error codes in the Bilistick Reader indicated that the likely cause was high hematocrit of the specimens or blood clots in the specimens. Only 561 neonates had both written parental consent and valid paired TSB results. We present here the data of these 561 neonates.

Fig. 1

Recruitment of patients from two hospitals. Note: TSB total serum bilirubin

The mean birth weight of these neonates was 3038 (±482) g, the mean gestational age was 38.4 (±1.6) weeks, the median age was 3 days, 49.7% were males, and the majority (80.2%) were Malays. Table 1 shows the mean TSB results measured by the two methods. The mean Bilistick TSB was lower than the mean laboratory TSB. There was a positive linear relationship between the TSB measured by the two methods (Fig. 2). The Pearson’s correlation coefficients of TSB measured by the two methods were r = 0.830 for Ampang Hospital, r = 0.952 for Selayang Hospital, and r = 0.901 when the results of both hospitals were combined (p < 0.001). The linear regression equations for each hospital and for both hospitals combined are shown in Fig. 2. The three regression lines intersected at Bilistick TSB levels ranging from 258.3 to 300.5 µmol/L, where they gave the most consistent predicted laboratory TSB levels (280.7–320.9 µmol/L) for each hospital and for both combined. For example, a laboratory TSB of 290 µmol/L would be predicted in neonates from Selayang Hospital, Ampang Hospital, and both hospitals combined when their Bilistick TSB was 268.7, 268.4, and 268.0 µmol/L, respectively. The regression model based on the combined hospital data provides a simple method of conversion between Bilistick TSB and laboratory TSB for health providers (Supplementary Figure S1).

Table 1 Descriptive statistics of total bilirubin levels measured by the two methods
Fig. 2

Positive linear relationship between laboratory TSB (Lab TSB) and Bilistick TSB levels in the two hospitals. Stars (*) represent Selayang Hospital, and open circles (o) represent Ampang Hospital. For Selayang Hospital, the regression line (R² = 0.906) is Lab TSB = 29.38 + 0.97 × Bilistick TSB, shown as the dash–dot–dot–dash line. For Ampang Hospital, the regression line (R² = 0.690) is Lab TSB = 45.72 + 0.91 × Bilistick TSB, shown as the dashed line. For both hospitals combined, the regression line (R² = 0.812) is Lab TSB = 35.39 + 0.95 × Bilistick TSB, shown as the solid line
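The conversion between Bilistick TSB and laboratory TSB implied by the three regression equations reported for Fig. 2 can be sketched as follows. The intercepts and slopes are those reported in this study; the dictionary and function names are hypothetical, and the sketch is an arithmetic illustration rather than a clinical tool.

```python
# (intercept, slope) of Lab TSB = intercept + slope × Bilistick TSB, in µmol/L.
REGRESSION = {
    "Selayang": (29.38, 0.97),
    "Ampang": (45.72, 0.91),
    "Combined": (35.39, 0.95),
}


def predict_lab_tsb(bilistick_tsb: float, site: str = "Combined") -> float:
    """Predicted laboratory TSB for a given Bilistick TSB reading."""
    intercept, slope = REGRESSION[site]
    return intercept + slope * bilistick_tsb


def bilistick_for_lab_tsb(lab_tsb: float, site: str = "Combined") -> float:
    """Bilistick TSB reading at which the regression line predicts the given laboratory TSB."""
    intercept, slope = REGRESSION[site]
    return (lab_tsb - intercept) / slope


# Bilistick readings that correspond to a predicted laboratory TSB of 290 µmol/L.
needed = {site: round(bilistick_for_lab_tsb(290, site), 1) for site in REGRESSION}
```

Here `needed` evaluates to 268.7, 268.4, and 268.0 µmol/L for Selayang Hospital, Ampang Hospital, and both hospitals combined, matching the worked example given for a laboratory TSB of 290 µmol/L.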

Figure 3 shows the Bland–Altman plots of the TSB. The mean [laboratory TSB − Bilistick TSB] of all neonates was 26.48 (±29.41) µmol/L. The 95% limits of agreement (−31.1577, 84.11772 µmol/L) contained 94.7% (531/561) of the differences in TSB of all neonates. The [laboratory TSB − Bilistick TSB] values of Ampang Hospital had a wider scatter than those of Selayang Hospital. The same 95% limits of agreement contained a higher proportion of the differences in TSB in Selayang Hospital (97.1%) than in Ampang Hospital (92.9%). Figure 4 shows that the majority of the [Bilistick TSB − laboratory TSB] values were in the negative range when plotted against laboratory TSB, indicating that the Bilistick TSB tended to underestimate the laboratory TSB. At laboratory TSB above 300 µmol/L, this trend continued, and the underestimation was greater in Ampang Hospital.
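For readers unfamiliar with the Bland–Altman calculation, the following Python sketch shows how the mean difference and the 95% limits of agreement (mean ± 1.96 SD) are obtained. It is a minimal illustration: the function names and the four paired values are hypothetical, the sample SD is assumed, and the limits computed from the rounded summary values (26.48 and 29.41 µmol/L) differ slightly from the reported limits, presumably because the latter were derived from unrounded data.

```python
from statistics import mean, stdev


def limits_of_agreement(bias: float, sd: float, z: float = 1.96):
    """95% limits of agreement from the mean difference and its SD."""
    return bias - z * sd, bias + z * sd


def bland_altman(lab, bilistick):
    """Mean difference, SD, limits of agreement, and fraction of differences inside the limits."""
    diffs = [l - b for l, b in zip(lab, bilistick)]  # laboratory TSB − Bilistick TSB
    bias, sd = mean(diffs), stdev(diffs)
    lower, upper = limits_of_agreement(bias, sd)
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, sd, lower, upper, inside


# Limits recomputed from the rounded summary statistics reported in this study.
lower_limit, upper_limit = limits_of_agreement(26.48, 29.41)

# Hypothetical paired readings in µmol/L (illustration only, not study data).
bias, sd, lo, hi, inside = bland_altman([100, 200, 300, 250], [90, 170, 280, 220])
```

A positive mean difference, as found here, indicates that the Bilistick reading is on average lower than the laboratory reading.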

Fig. 3

Bland–Altman plot of [laboratory TSB − Bilistick TSB] of each of the jaundiced neonates (n = 561) versus their respective mean TSB. The mean [laboratory TSB − Bilistick TSB] is: 26.48 ± 29.41 µmol/L. The 95% limits of agreement (−31.1577, 84.11772) contain 94.65% (=531/561) of the difference in scores of all neonates. The same 95% limits of agreement contain 97.13% (=339/349) and 92.92% (=197/212) of the difference in scores for Selayang Hospital and Ampang Hospital, respectively. Stars (*) represent Selayang Hospital, and open circles (O) represent Ampang Hospital. Note: TSB total serum bilirubin, SD standard deviation

Fig. 4

Relationship between [Bilistick TSB – Laboratory TSB] and Laboratory TSB. The majority of the [Bilistick TSB – Laboratory TSB] values are in the negative range, indicating that the Bilistick tends to underestimate the laboratory TSB. At laboratory TSB above 300 µmol/L, this trend persists, and the underestimation is greater in measurements taken in Ampang Hospital. Stars (*) represent Selayang Hospital, and open circles (o) represent Ampang Hospital

Table 2 shows the predictive indices of laboratory TSB by Bilistick TSB at the different recommended TSB thresholds used by clinicians for management of jaundiced neonates in Malaysia.10 High sensitivity and accuracy of the Bilistick TSB in predicting laboratory TSB were achieved at Bilistick cutoff levels lower than the corresponding laboratory thresholds. The Bilistick had 99% accuracy and 100% sensitivity in predicting laboratory TSB of ≥80 µmol/L and of ≥360 µmol/L at Bilistick TSB levels of ≥55 µmol/L and ≥315 µmol/L, respectively.

Table 2 Predictive indices of laboratory TSB levels from ≥80 to ≥360 µmol/L at various Bilistick TSB cutoff values (n = 561)

The median TAT of laboratory TSB was 98 min (range 24–424) in Selayang Hospital, 114 min (range 34–1039) in Ampang Hospital, and 105 min (range 24–1039) in both hospitals combined. The TAT of the hospital laboratory methods was significantly longer than the 2-min TAT of the Bilistick method (p < 0.0001).

Discussion

This study compared the accuracy and TAT of TSB measured by the Bilistick against hospital laboratory methods in 561 jaundiced neonates with a wide range of hyperbilirubinemia in two hospitals. The Bilistick had a significantly shorter TAT than the laboratory methods, and its TSB readings had a high positive correlation with laboratory TSB; however, it generally underestimated laboratory-measured TSB levels.

At the time of preparation of this manuscript, two additional Bilistick studies had been reported in the literature.13,14 Neither of them compared the TAT of Bilistick TSB with that of laboratory TSB. Similar to their findings, we found that the combined coefficient of correlation of TSB between the Bilistick and laboratory methods in our two hospitals (r = 0.901) was lower than that reported by Coda Zabetta et al. (r = 0.963).8 Furthermore, the mean difference in TSB between the two methods was much larger in our study (26.5 µmol/L) than those reported by Coda Zabetta (10 µmol/L)8 and Thielemans (20 µmol/L).13 One possible explanation could be the difference in sample sizes, as both of those studies had much smaller sample sizes than ours. We also did not measure the hematocrit or the humidity in this study, both of which Thielemans et al.13 showed could affect the accuracy of Bilistick readings.

Between the two hospitals where we conducted our study, there was also a difference in the extent of correlation between the two methods. Some possible explanations for this difference include: (a) variability in the accuracy of the Bilistick readers; (b) variability in the users’ levels of skill in handling the Bilistick readers, as suggested by the higher wastage of Bilistick test strips used to obtain TSB results in Ampang Hospital and by the participation of more senior, experienced doctors in Selayang Hospital; (c) the larger number of patients recruited in Selayang Hospital; and (d) the different methods and machines used for measurement of TSB in the two hospital laboratories.

In Malaysia, due to the shortage of hospital beds, all normal term neonates are discharged home at 6–12 h after birth. They are then visited daily by maternal and child health clinic (MCHC) nurses during the first week of life for detection of jaundice.9 Once jaundice is detected, they are referred to the MCHC or hospital ED for measurement of TSB. Jaundiced neonates are admitted only when their TSB reaches the phototherapy level. For clinicians and MCHC nurses, it is not enough to have a POC device whose readings correlate highly with hospital laboratory TSB. They need a low-cost, user-friendly device with a short TAT and high accuracy to help make on-the-spot decisions during home visits regarding admission and timeliness of treatment of jaundiced neonates, without under-diagnosing severe hyperbilirubinemia, with resultant BIND, or over-diagnosing it, with resultant over-admission and further over-crowding of the wards.

In view of these reasons, we have constructed a reference chart (Supplementary Figure S1) and a table (Table 2) to guide health care providers in clinics or home visits on the timing and need for admission for phototherapy. In Table 2, we calculated the Bilistick TSB level for predicting laboratory TSB of 80 µmol/L because phototherapy is indicated in early-onset jaundice when this TSB is reached by 6 h of life.15,16 We have also included the Bilistick TSB level to predict laboratory TSB of 360 µmol/L which is the level for exchange transfusion for later onset jaundice.10 At these cutoff levels, the Bilistick has high sensitivity and accuracy to predict these laboratory levels based on our study findings.

The limitations of this study included: (a) no test was carried out to compare the accuracy of the two laboratory methods using a common standard sample, and (b) loss of data (15.9%) from some potential patients due to failure of the Bilistick System to measure TSB in neonates with high hematocrit, and when blood clotting occurred in the specimens before insertion into the Bilistick Reader. Although the problem was similar, our failure rate was lower than that reported by Thielemans et al., in whose study 48.6% of the 173 patients tested produced no TSB readings by the Bilistick System.13 One possible explanation could be the level of skill of the testers: the testers in our study were doctors, whereas in the study of Thielemans et al. the testers were non-medical health workers.

In conclusion, the POC Bilistick method markedly reduced the TAT of TSB, and its readings had a high positive correlation with TSB measured by laboratory methods. It tends to underestimate laboratory TSB levels but has high accuracy in predicting them when its cutoff measurements are set at lower levels.