Effect of image quality fluctuations on the repeatability of thickness measurements in swept-source optical coherence tomography

This study investigated the effect of image quality fluctuations on the repeatability of thickness measurements of the peripapillary retinal nerve fibre (PP-RNFL) and ganglion cell-inner plexiform (GC-IPL) layers using swept-source optical coherence tomography (SS-OCT). Three consecutive OCT scans were performed on each of 56 healthy subjects, and the resulting 168 SS-OCT results were analysed. Based on the tertile values of the mean absolute difference of image quality score, all subjects were divided into three groups: low (LIQD), moderate (MIQD), and high (HIQD) image quality score difference groups. A linear mixed model and intraclass correlation coefficients (ICCs) were used for analyses. Despite high ICC values (> 0.9), several sectors showed significant differences in the ICC values in intergroup comparisons. For LIQD-HIQD and MIQD-HIQD, most PP-RNFL sectors showed significant differences. For GC-IPL sectors, the LIQD-HIQD comparison showed significant differences in the temporosuperior (p = 0.012), inferior (p < 0.001), and temporoinferior (p = 0.042) sectors. For the MIQD-HIQD comparison, significant differences existed in the average GC-IPL (p = 0.009), nasoinferior (p = 0.035), and inferior (p < 0.001) GC-IPL sectors. With higher image quality fluctuations, the repeatability of SS-OCT decreased in several sectors that are considered clinically relevant in evaluating glaucoma status. Therefore, maintaining high-quality image status is essential to enhance the reliability of SS-OCT.


Comparison of PP-RNFL and GC-IPL thicknesses among the three groups.
Table 2 shows the results of the comparison of PP-RNFL and GC-IPL thicknesses among the three groups at each measurement sector. The linear mixed model showed no significant differences in PP-RNFL or GC-IPL thickness in any sector among the three groups. However, when the difference in image quality between OCT examinations was large, GC-IPL tended to be thicker; this tendency was not seen in the peripapillary sectors.
Correlations between image quality and SS-OCT results at each measurement sector. Correlation analyses between image quality and OCT results at each measurement sector were performed for repeated measurements (Table 3). After adjusting for age and sex, five sectors showed significant negative correlations between image quality and PP-RNFL (average PP-RNFL, superotemporal, superior, inferior, and temporoinferior sectors) or GC-IPL (average GC-IPL, temporosuperior, nasoinferior, inferior, and temporoinferior sectors).
Comparisons of repeatability among the three groups at each measurement sector. The ICC of the three consecutive measurement values was calculated and compared among the groups (Table 4). The overall repeatability was high in all sectors for all groups (ICC > 0.8). The ICC values were the lowest for the HIQD group in every measurement sector. Figure 1 shows representative results for the difference in thickness at each PP-RNFL measurement sector by image quality difference. As the image quality difference increased, the difference between the measured values increased accordingly. Results of between-group comparisons showed significant differences in repeatability at only two sectors (temporoinferior for PP-RNFL; inferior for GC-IPL) in the LIQD and MIQD groups. In addition, comparisons between the LIQD and HIQD groups, and between the MIQD and HIQD groups, showed significant differences in repeatability at most PP-RNFL sectors, except at the superior, nasal, superior nasal, and nasoinferior sectors. In the GC-IPL sectors, significant differences in repeatability were seen at the temporosuperior, inferior, and temporoinferior sectors between the LIQD and HIQD groups, and at the average GC-IPL, nasoinferior, and inferior sectors between the MIQD and HIQD groups. No sector showed significant differences in repeatability when compared between LIQD and MIQD groups. The proportion of sectors affected by image quality fluctuations was higher in PP-RNFL than in GC-IPL.

Discussion
The results of this study, which investigated the association between image quality fluctuations and repeatability of SS-OCT measurements, showed that repeatability decreases with an increase in image quality fluctuation in several sectors of PP-RNFL and GC-IPL. These observations were made in healthy subjects with an OCT image quality > 60, which was calculated as per manufacturer's recommendation for clinical use. Therefore, it can be said that our study was conducted under settings wherein the factors affecting OCT results, such as low image quality (image quality score < 60) and structural alteration by ocular disease, were controlled. In addition, when the study groups were compared based on the mean absolute difference among three consecutive OCT measurements, no significant differences were noted in the measured thickness at any of the measurement sectors (Table 2). This result also indicates that there was no large deviation in the measured values of our data set. Nevertheless, even with good image quality (recommended for clinical use) and high repeatability (based on ICC), the measurement repeatability was affected by image quality fluctuations in several sectors, especially in comparisons involving the HIQD group. Moreover, this phenomenon affected sectors that are considered important in glaucoma management. Thus, it is crucial to maintain not only a high level of image quality but also a constant value of image quality for the clinical application of SS-OCT.
Table 1. Comparison of demographics and clinical characteristics among groups. LIQD low image quality difference group, MIQD moderate image quality difference group, HIQD high image quality difference group, SD standard deviation. *Analysis of variance or chi-square test; all values are represented as mean ± SD or ratio.
Interestingly, although the HIQD group had the lowest ICC value in each measurement sector among the three groups, not all sectors showed significant differences on comparison with the LIQD or MIQD groups. In addition, only five sectors of the clock-hour map for PP-RNFL (superotemporal, nasal, inferonasal, inferotemporal, and temporoinferior sectors) showed ICC values under 0.9. If repeatability were determined exclusively by image quality, the repeatability of the OCT results obtained from subjects in the HIQD group should be lower regardless of the location of the measurement sector. Segmentation is important for analysing the thickness of the retinal layer using OCT results. Although image quality is a critical factor for segmentation, ocular structural factors such as axial length, shape of the optic disc, or tortuosity of the retinal vessels also affect segmentation 3,10,11. The superotemporal, inferonasal, inferotemporal, and temporoinferior sectors contain retinal blood vessels, which contribute to the structural variation of the parapapillary area. Thus, the anatomic structure around the optic disc, which varies largely even in healthy eyes, could have influenced the repeatability.
Inter-individual diversity in the optic disc shape and peripapillary structures contributes to inaccuracies in the measurement of PP-RNFL thickness by OCT. In contrast, the macular area is well-known for its inter-individual similarities [12][13][14]. Such inaccuracies might influence clinical decision-making in glaucoma management. Therefore, several studies have emphasised the usefulness of GC-IPL parameters for the diagnosis of glaucoma in myopic eyes [15][16][17]. In the present study, the repeatability of GC-IPL sectors was relatively less affected by image quality fluctuations as compared to PP-RNFL sectors. This result further supports the usefulness of macular GC-IPL thickness evaluation for estimating glaucoma status, although further studies on patients with glaucoma are required to confirm this finding. Previous studies have shown a positive correlation between image quality and OCT-based measurement of macular or PP-RNFL thickness [18][19][20][21], i.e., a reduction in image quality decreases the measured macular or PP-RNFL thickness, thereby leading to incorrect OCT interpretations of glaucoma progression. In this study, image quality correlated significantly with thickness in several PP-RNFL and GC-IPL sectors, and this result did not change even after adjusting for age and sex. Therefore, image quality remains an essential factor in the interpretation of SS-OCT results. The negative correlation between the thickness values and image quality observed here, unlike the positive correlation reported previously, may be due to repeated measurements, small sample size, or unknown intrinsic characteristics of SS-OCT. It is possible that a study on patients with glaucoma may yield negative correlation between the thickness values and image quality.  9 . Both studies inferred that substantial differences in the signal strength lower the repeatability. Our study presents similar results using SS-OCT.
Compared to previous studies, the use of three consecutive measurements for statistical analysis provides more reliability to this study, and this strategy is more appropriate for identifying the impact of image quality fluctuation on OCT results. This study has several limitations. First, although the data were collected prospectively, the number of subjects included was relatively small. Second, the effect of image quality fluctuation on repeatability was studied in healthy subjects. A similar study on patients with glaucoma will help to understand the clinical significance of image quality fluctuations on SS-OCT results. Third, the results of our study cannot be applied directly to studies of other types of OCT. This is because the image quality score used for calculating image quality fluctuation in the present study was developed by the manufacturer of DRI OCT, although it is not difficult to predict that the accuracy of OCT segmentation will be lowered if the quality of the image deteriorates. Further studies involving other types of OCT seem necessary to clarify the effect of image quality fluctuation on repeatability in each type of OCT. Despite these limitations, our findings are meaningful because this is the first study to investigate the effect of image quality fluctuation on repeatability in SS-OCT using prospectively collected data.
In conclusion, this study reported that higher image quality fluctuation leads to lower repeatability of SS-OCT results in several sectors of PP-RNFL and GC-IPL. Interestingly, the identified sectors were clinically important for glaucoma management. In addition, the repeatability of GC-IPL sectors was relatively less affected than that of PP-RNFL sectors by image quality fluctuations. Thus, maintaining a high-quality image status is vital to enhance the reliability of SS-OCT for PP-RNFL and GC-IPL measurements, more so in the PP-RNFL region.

Methods
This study collected raw data retrospectively from the dataset used in a previous study to compare the repeatability and agreement between SD-OCT and SS-OCT in healthy eyes 5. The institutional review board of Yonsei University Severance Hospital, Seoul, Korea, approved this study (1-2019-0043), and the need for written informed consent was waived because of the retrospective study design. The study adhered to the tenets of the Declaration of Helsinki. The detailed characteristics of the subjects in the dataset have been described previously 5. Normal subjects who had visited the glaucoma clinic at our hospital between August 2014 and December 2014 were enrolled. Medical history, Snellen best-corrected visual acuity (BCVA), slit-lamp biomicroscopy findings, intraocular pressure (IOP; Goldmann applanation tonometry), and indirect ophthalmoscopy findings were obtained. In addition, the following data were acquired: axial length estimated using the IOL Master (Carl Zeiss Meditec AG, Jena, Germany); central corneal thickness calculated using ultrasound pachymetry (DGH-1000; DGH Technology Inc., Frazer, PA, USA); optic disc and RNFL thickness measurements performed using a +90 diopter (D) lens, colour disc, and red-free photography (VISUCAM200, Carl Zeiss Meditec AG, Jena, Germany). Optic nerve function had been estimated using a Humphrey Visual Field analyser (24-2 Swedish Interactive Threshold Algorithm; Carl Zeiss Meditec, Inc., Dublin, CA, USA).
Healthy subjects aged > 19 years with a BCVA ≥ 20/25 and no evidence of glaucomatous optic disc changes, RNFL defects, or visual field changes with IOP < 21 mmHg were included retrospectively. The eye that was analysed in each subject was selected randomly. Exclusion criteria were the presence of cataract grade of Lens Opacities Classification System III > 3, axial length > 24.5 mm, refractive errors with spherical equivalent > ±5D,
Thickness measurement using SS-OCT for repeatability. In this study, we used the DRI OCT-1 system (Topcon, Tokyo, Japan, analysis software version 9.1.2.28693), which had a high-speed wavelength tuning laser source with a central wavelength of 1,050 nm. This SS-OCT system had an image acquisition speed of 100,000 A-scans/second, with axial and transverse resolutions of 8 and 20 µm, respectively. Three consecutive SS-OCT scans were acquired on the same day with an interval of at least 5 min between the scans. A single technician performed all scans using an internal fixation target. Pupillary dilation was performed in all subjects. Three-dimensional (3D) optic disc and 3D wide scan protocols were used to measure PP-RNFL and GC-IPL thicknesses, respectively. The 3D optic disc scan covered a 6 × 6-mm area on the optic disc and comprised 512 A-scans × 256 B-scans. PP-RNFL thickness was measured in a 3.4-mm-diameter scan circle centred on the optic disc. The 3D wide scan protocol covered a 12 × 9-mm rectangular area centred between the optic disc and fovea and comprised 512 A-scans × 256 B-scans. PP-RNFL thickness was measured in each quadrant (evenly spaced 4 sectors), 12 clock-hour sectors (evenly spaced 12 sectors), and as an average. The quadrant PP-RNFL sector names started with the number 4, while the clock-hour sector names started with the number 12. The average GC-IPL thickness and measurement in each of six sectors (evenly configured sectors centred on the fovea) were collected.
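The scan parameters quoted above imply a few derived quantities that are easy to check by hand. The short sketch below works them out for the 3D optic disc protocol. It assumes that the 512 A-scans span one 6-mm side and the 256 B-scans span the other, and it treats the acquisition time as a lower bound that ignores scanner flyback and processing overhead; the variable names are illustrative only.

```python
import math

# Values quoted in the text for the 3D optic disc scan.
A_SCANS_PER_B_SCAN = 512
B_SCANS = 256
A_SCAN_RATE_HZ = 100_000        # 100,000 A-scans/second
SCAN_SIDE_MM = 6.0              # 6 x 6-mm scan area
CIRCLE_DIAMETER_MM = 3.4        # PP-RNFL measurement circle

# Total number of A-scans in one volume.
total_a_scans = A_SCANS_PER_B_SCAN * B_SCANS

# Lower bound on acquisition time (assumption: no flyback or overhead).
min_acquisition_s = total_a_scans / A_SCAN_RATE_HZ

# Sampling spacing, assuming A-scans and B-scans each span one 6-mm side.
a_scan_spacing_um = SCAN_SIDE_MM * 1000 / A_SCANS_PER_B_SCAN
b_scan_spacing_um = SCAN_SIDE_MM * 1000 / B_SCANS

# Length of the circle along which PP-RNFL thickness is sampled.
circle_circumference_mm = math.pi * CIRCLE_DIAMETER_MM

print(f"A-scans per volume: {total_a_scans}")
print(f"Minimum acquisition time: {min_acquisition_s:.2f} s")
print(f"A-scan spacing: {a_scan_spacing_um:.1f} um, B-scan spacing: {b_scan_spacing_um:.1f} um")
print(f"Scan circle circumference: {circle_circumference_mm:.2f} mm")
```

Under these assumptions one volume takes a little over a second to acquire, which is short relative to the interval of at least 5 min between repeated scans.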
Built-in automated segmentation algorithms were used to distinguish each retinal layer. Two investigators (S.Y.L. and Y.H.) independently reconfirmed the image quality, segmentation, and alignment of the measurement window. SS-OCT images with image quality scores > 60 were selected for analysis according to the manufacturer's recommendation. The mean absolute difference of the image quality score among three consecutive OCT measurements was calculated as follows:
$$\text{Mean absolute difference of image quality score} = \frac{|IQ_1 - IQ_2| + |IQ_2 - IQ_3| + |IQ_3 - IQ_1|}{3}$$
where $IQ_n$ is the image quality score at the nth measurement. The subjects were stratified into three groups based on the tertile values of the mean absolute difference of image quality score: LIQD (n = 18), MIQD (n = 19), and HIQD (n = 19). Because subjects in the LIQD group were included in the first third when the mean absolute difference of image quality score was listed in ascending order, they had similar image quality scores among the three consecutive OCT results. In contrast, subjects in the HIQD group showed substantial variation among the three image quality scores because these subjects were in the last third.
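The grouping step can be illustrated with a short sketch. It assumes that the mean absolute difference is the mean of the pairwise absolute differences among the three image quality scores, and that tertiles are formed by rank with cut points at n/3 and 2n/3 (rounded down); how ties at the tertile boundaries were handled is not stated in the text. The function names and the example scores are hypothetical.

```python
from itertools import combinations


def mean_abs_iq_difference(scores):
    """Mean of the pairwise absolute differences among image quality scores.

    Assumption: 'mean absolute difference' means the average over all
    pairs of the repeated measurements (three pairs for three scans).
    """
    pairs = list(combinations(scores, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def tertile_groups(mad_by_subject):
    """Rank subjects by mean absolute difference and split into thirds.

    Assumption: rank-based cut points; ties are broken by sort order.
    """
    ranked = sorted(mad_by_subject, key=mad_by_subject.get)
    n = len(ranked)
    cut1, cut2 = n // 3, (2 * n) // 3
    return {
        "LIQD": ranked[:cut1],      # smallest fluctuation
        "MIQD": ranked[cut1:cut2],
        "HIQD": ranked[cut2:],      # largest fluctuation
    }


# Hypothetical image quality scores for three subjects (three scans each).
example_scores = {
    "s01": (70, 72, 68),
    "s02": (65, 65, 66),
    "s03": (80, 61, 75),
}
mad = {sid: mean_abs_iq_difference(iq) for sid, iq in example_scores.items()}
groups = tertile_groups(mad)
print(mad)
print(groups)
```

With 56 subjects, these rank-based cut points give group sizes of 18, 19, and 19, which matches the sizes reported above, although that agreement does not confirm the exact procedure used.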

Statistical analyses.
Analyses of variance and chi-square tests were performed for the comparison of continuous and categorical variables, respectively, between the groups. A linear mixed model compared the thickness values among the three groups. To determine the repeatability of three consecutive measurements, intraclass correlation coefficients (ICCs) were used. The degree of repeatability was graded according to the ICC value: almost perfect (0.81-1), substantial (0.61-0.8), moderate (0.41-0.6), fair (0.21-0.4), and slight (0-0.2) 22. To compare the between-group ICC values, the z-score test was used [22][23][24]. Pearson's correlation coefficients with and without adjustment for age and sex were used to investigate the correlation between image quality and thickness values. Correlation coefficients were estimated using a linear mixed-effects model to account for the three datasets from each individual. All statistical analyses were performed using SAS version 9.4 software (SAS Institute Inc., Cary, NC, USA) by a statistician (H.S.L.). Statistical significance was defined as p value < 0.05.
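As a rough illustration of the repeatability analysis, the sketch below computes an ICC from repeated measurements, grades it with the bands listed above, and compares two ICCs with a z test. Several choices here are assumptions, because the text does not specify them: the ICC is the one-way random-effects, single-measurement form; the z test uses Fisher's transformation for intraclass correlations with approximate variance k / (2(k - 1)(n - 2)); and the ICC values in the usage example are hypothetical. It is not the SAS procedure used in the study.

```python
import math


def icc_oneway(data):
    """One-way random-effects, single-measurement ICC (assumed form).

    data: list of rows, one per subject, each with k repeated values.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    means = [sum(row) / k for row in data]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(data, means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def repeatability_band(icc):
    """Grade an ICC using the bands quoted in the text."""
    if icc > 0.8:
        return "almost perfect"
    if icc > 0.6:
        return "substantial"
    if icc > 0.4:
        return "moderate"
    if icc > 0.2:
        return "fair"
    return "slight"


def compare_iccs(icc_a, n_a, icc_b, n_b, k):
    """Two-sided z test on Fisher-transformed ICCs (assumed method)."""
    def fisher_z(icc):
        return 0.5 * math.log((1 + (k - 1) * icc) / (1 - icc))

    def variance(n):
        return k / (2 * (k - 1) * (n - 2))

    z = (fisher_z(icc_a) - fisher_z(icc_b)) / math.sqrt(variance(n_a) + variance(n_b))
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return z, p


# Hypothetical ICCs; group sizes (18 and 19) and k = 3 follow the text.
z, p = compare_iccs(0.98, 18, 0.90, 19, k=3)
print(f"z = {z:.2f}, p = {p:.4f}, band = {repeatability_band(0.90)}")
```

The example shows how two ICCs that both fall in the "almost perfect" band can still differ significantly, which is the pattern the intergroup comparisons in this study rely on.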