Introduction

The choroid is composed predominantly of blood vessels, and the choroidal thickness (CT) can be adjusted relatively quickly depending on ocular and refractive abnormalities.1 Because it is located posterior to the retina, limited studies have been performed on the choroid. In vivo estimations of CT have been performed since the development of enhanced depth imaging spectral-domain optical coherence tomography (EDI SD-OCT).2 Significant differences have been identified in CT measurements made using SD-OCT in various diseases, including central serous chorioretinopathy (CSC), age-related macular disease, and polypoidal choroidal vasculopathy (PCV), compared with measurements from a normal group. These findings are anticipated to be beneficial in clarifying the pathophysiology of various ocular diseases.3, 4

Despite advances in instrumentation and research on CT, an automated measurement method based on recognition of the choroidal boundary on SD-OCT has not yet been established. Although a few recent studies have reported on automated segmentation of the choroid,5, 6 CT is still measured manually by experienced observers with a good knowledge of the anatomical structure of the choroid. Accordingly, the reliability of manual measurements has been examined several times, and CT measurements made by experts have shown high repeatability.7, 8

As indistinct choroidal junctions decrease the reliability of the measurement of CT, studies have been conducted to determine the factors affecting manual CT measurement. Lin et al9 suggested that image inversion optimized images, while Chhablani et al10 demonstrated that high-density scanning was favorable for CT measurement. A recent study reported that measurement of CT based on 1 : 1 pixel images had a higher probability of overestimation than measurement of CT based on 1 : 1 micron images.11

The purpose of this study was to compare the reliability of manual CT measurement in normal eyes with that reported in previous studies and to examine the effect of CT on intraobserver and interobserver repeatability. Furthermore, the study investigated the influence of the image scale setting on the repeatability of the measurement.

Materials and methods

This retrospective, cross-sectional study was performed on normal eyes after approval was obtained from the institutional review board (Research Governance Committee) at Ewha Womans University Mokdong Hospital (Seoul, Korea). A normal eye was defined as one displaying continuity of the retinal pigment epithelium (RPE) without abnormal retinal lesions on slit-lamp funduscopy. The study included individuals aged between 20 and 60 years who visited the Department of Ophthalmology at Ewha Womans University Mokdong Hospital and underwent EDI-OCT from March 2012 to November 2012. Medical records were reviewed to collect data including age, gender, ocular history, and the results of 90-diopter lens slit-lamp biomicroscopy. Eyes were excluded if they had a history of laser treatment, intravitreal injection, or other interventions on the retina; retinal lesions such as retinal hemorrhage, cotton wool spots, or other macular lesions observed on slit-lamp examination; or changes in the uniformity and continuity of the RPE due to macular diseases such as drusen, choroidal neovascularization, CSC, or other diseases observed by OCT. However, patients with retinal disease in the contralateral eye were included. Eyes with a quality score of <30 points were also excluded. The quality score, as reported by the Heidelberg Eye Explorer software (Heidelberg Engineering Co., Heidelberg, Germany), ranged from 0 (poor quality) to 40 (excellent quality). Lower-scored scans, which exhibited noise across the whole image, were generally caused by poor fixation or media opacity such as a cataract. When both eyes satisfied all of the criteria, the right eye was included as the subject eye. When multiple OCT images of the subject eye satisfied all criteria, the initial image taken during the study period was used preferentially.

Two engineers experienced with the spectral-domain OCT device (SD-OCT; Spectralis, wavelength: 870 nm; Heidelberg Engineering Co.) performed the EDI-OCT scans as described previously.2 Transfoveal horizontal and vertical line sections, each composed of 100 averaged scans, were obtained. Using the manual caliper provided with the built-in Heidelberg software, the subfoveal thickness was measured at the center. The two subfoveal choroidal thickness (SFCT) measurements from the horizontal scan and the vertical scan were averaged, and the averaged values were used for statistical analysis. Two ophthalmologists measured the CT as observers (A and B). These two observers were experienced trainees from the same institution and referred to one retinal specialist for determination of the chorio-scleral junction. They had undergone intense training and had measured the CT of 2800 scans of 700 subjects over a 3-month period. Observers A and B independently measured the CT for evaluation of interobserver repeatability (A1 and B), and observer A, who was blinded to the previous measurements, performed an additional measurement on a different day for determination of intraobserver repeatability (A1 and A2). In addition, retinal thickness (RT) was manually estimated for comparison of repeatability. All measurements were conducted at both scales, that is, a 1 : 1 pixel ratio and a 1 : 1 micron ratio. In general, OCT images are displayed at a 1 : 1 pixel ratio rather than a 1 : 1 micron ratio; the latter reflects the physical dimensions and appears stretched approximately threefold horizontally relative to the former. After identifying the chorio-scleral junction under low magnification, the two observers measured the CT with the image magnified to 200% to clearly visualize the RPE. When the chorio-scleral junction was indefinite, the contrast of the image was adjusted to obtain a clear distinction.

RT was defined as the perpendicular distance from the internal limiting membrane at the deepest portion of the foveal pit to Bruch’s membrane. CT was also defined as the vertical distance from the hyperreflective line of Bruch’s membrane to the inner sclera. The SFCT was defined as the CT measured perpendicular to the hyperreflective line of subfoveal Bruch’s membrane (Figure 1).12 When the hyperreflective line of Bruch’s membrane was indistinguishable from the RPE, the CT was measured from the outermost hyperreflective line of the RPE. The inner sclera was designated as the outermost hyperreflective chorio-scleral interface (CSI).13 When the CSI was not visible on the thick choroid, the outermost hyporeflective line in the sclera was measured.7 Since a large choroidal vessel, which was immediately in front of the sclera, is usually accompanied by an invisible CSI, the outermost choroidal vascular hyporeflective line formed a boundary.

Figure 1

Representative optical coherence tomography (OCT) images. Note the choroid–sclera junction (white arrowheads). The choroidal thickness (white arrow) is defined as the vertical distance from the hyperreflective line of Bruch’s membrane to the inner sclera. (a) 1 : 1 pixel image; (b) 1 : 1 micron image of the same scan.

Subgroup analysis was performed to compare the repeatability according to the magnitude of the CT measurements. Based on the mean CT±1 standard deviation (SD) reported by Margolis and Spaide,12 the subjects were divided into three groups: thin CT ≤211 μm; intermediate CT >211 to ≤363 μm; and thick CT >363 μm. For categorization, the initial measurements from the two observers (A1 and B) of the 1 : 1 pixel image were averaged, and the subjects’ eyes were classified into the three groups based on this mean CT. These measurements (A1 and B) were also used to determine the interobserver repeatability (observers A and B), while the two blinded measurements of observer A (A1 and A2) were used to obtain the intraobserver repeatability.
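The grouping rule described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds; the function name `classify_ct` and the example values are hypothetical and are not study data.

```python
def classify_ct(a1_um: float, b_um: float) -> str:
    """Classify an eye by the mean of the two observers' initial SFCT values (micrometres).

    Thresholds follow the text: thin <=211, intermediate >211 to <=363, thick >363.
    """
    mean_ct = (a1_um + b_um) / 2.0
    if mean_ct <= 211:
        return "thin"
    if mean_ct <= 363:
        return "intermediate"
    return "thick"


# Hypothetical example values (not study data)
for a1, b in [(180, 190), (300, 310), (420, 400)]:
    print(a1, b, classify_ct(a1, b))
```

Because the classification uses the mean of A1 and B, an eye whose two readings straddle a threshold is assigned by their average rather than by either observer alone.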

Mean (±SD) values of the CT were calculated in each subgroup and for the combined subgroups. Data were tested for normality using the Kolmogorov–Smirnov test, and a paired-sample t-test was used to test for significant differences between the measurements of the two observers. The coefficient of variation (CV) was obtained to compare the SD relative to thickness. The reliability of the CT and RT measurements was evaluated by calculating intraobserver and interobserver intraclass correlation coefficients (ICCs).14 ICC values >0.75 were considered to represent excellent repeatability.15, 16 To estimate reliability, the coefficient of repeatability (CR) was also assessed.17 Interobserver gaps, defined as the differences between the measurements of the two observers, were calculated in all cases and compared with the CRs of the previous report to estimate relative errors.7 Following Rahman et al,7 a difference of >35 μm between the two observers in both the 1 : 1 pixel and 1 : 1 micron scales (denoted CR>35 μm) was considered to represent a significant interobserver gap. Bland–Altman plots were used to determine the agreement.17 Statistical analysis was performed using MedCalc for Windows, version 12.4.0.0 (MedCalc Software, Mariakerke, Belgium). P-values <0.05 were defined as statistically significant.
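As a rough illustration of the agreement statistics named above, the following Python sketch computes the Bland–Altman bias, the 95% limits of agreement, a CR, and a CV from two sets of paired measurements. It is not the MedCalc procedure used in the study. It assumes one common definition of the CR (1.96 times the SD of the paired differences) and computes the CV from the per-eye averages; the function name `agreement_stats` and the input values are hypothetical.

```python
from statistics import mean, stdev


def agreement_stats(first, second):
    """Agreement statistics for two paired measurement series (micrometres)."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)          # mean difference (Bland-Altman bias)
    sd_diff = stdev(diffs)      # sample SD of the paired differences
    per_eye = [(a + b) / 2 for a, b in zip(first, second)]
    return {
        "bias": bias,
        # 95% limits of agreement: bias +/- 1.96 SD of the differences
        "loa": (bias - 1.96 * sd_diff, bias + 1.96 * sd_diff),
        # Assumed CR definition: 1.96 x SD of the differences
        "cr": 1.96 * sd_diff,
        # CV (%): SD relative to the mean of the per-eye averages
        "cv_percent": 100 * stdev(per_eye) / mean(per_eye),
    }


# Hypothetical measurements (not study data)
observer_a = [182, 295, 410, 330, 250]
observer_b = [185, 301, 385, 342, 247]
print(agreement_stats(observer_a, observer_b))
```

Under this definition, the width of the limits of agreement is twice the CR, so both describe the same spread of paired differences.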

Results

A total of 426 subjects underwent EDI-OCT retinal scans using SD-OCT during the 9-month period. Of these, 189 eyes of 189 individuals were included in the study after exclusion of 172 eyes with macular lesions and 65 eyes with a poor quality score due to a cataract or other conditions. The subjects consisted of 102 males and 87 females, and the mean age was 45.1±11.1 years (range 20–60). In 26 subjects, the contralateral eye had a retinal disease, including CSC (14 eyes), retinal vein occlusion (8 eyes), epiretinal membrane (2 eyes), and PCV (2 eyes). For both image modes, the mean SFCT did not differ significantly between the two observers (P=0.5663 for the 1 : 1 pixel image and P=0.2839 for the 1 : 1 micron image). Based on the initial measurements (A1 and B) of the two observers, the mean SFCT was 315.3±89.2 μm in the 1 : 1 pixel images and 312.6±88.4 μm in the 1 : 1 micron images. According to the Kolmogorov–Smirnov test, both sets of values followed a normal distribution (P=0.9446 for the 1 : 1 pixel image and P=0.9238 for the 1 : 1 micron image).

The subjects’ eyes were classified based on the mean SFCT obtained by averaging the initial measurements from the two observers of the 1 : 1 pixel image. This yielded three subgroups: 26 eyes in the thin CT group, 111 eyes in the intermediate CT group, and 52 eyes in the thick CT group. For the 1 : 1 pixel image, the mean SFCT was 180.0±30.3 μm in the thin CT group, 295.4±40.0 μm in the intermediate CT group, and 425.3±53.9 μm in the thick CT group. For the 1 : 1 micron image, the mean SFCT was 180.0±32.2, 292.2±38.9, and 421.6±55.4 μm in the thin, intermediate, and thick CT groups, respectively. Regardless of the image setting, the SD showed an increasing trend as the mean thickness increased from the thin CT group to the thick CT group. When the SD was compared using the CV, a distinctly greater variance was observed in interobserver measurements as the average thickness increased (Table 1).

Table 1 Summary of the intraclass correlation coefficients (ICCs), 95% limits of agreement (LoAs), coefficients of repeatability (CRs), and coefficients of variation (CVs)

In the intraobserver analysis, all subgroups and the combined subject eyes exhibited ICC values >0.75, indicating excellent repeatability. However, relatively lower interobserver repeatability was observed for the 1 : 1 pixel images from the thick CT group (ICC=0.747; Table 1). Moreover, the ICC showed a decreasing trend as the thickness increased, regardless of the image settings. The ICCs of the two observers were >0.9 for all subject eyes combined and for the thin CT group. The ICCs for the 1 : 1 pixel and 1 : 1 micron images were 0.809 and 0.813 in the intermediate CT group, respectively, and 0.747 and 0.758 in the thick CT group, respectively. Twenty-seven subjects (14.3%) displayed a significant interobserver gap (CR>35 μm): 1 (3.8%) of 26 eyes in the thin CT group, 13 (11.7%) of 111 eyes in the intermediate CT group, and 13 (25.0%) of 52 eyes in the thick CT group. Among these 27 eyes, none in the thin CT group, 4 in the intermediate CT group, and 9 in the thick CT group were identified as eyes with indistinct CSI.
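For readers unfamiliar with the ICC, the following Python sketch shows how one form of it can be computed from a table of paired readings. The text does not state which ICC model was used, so the two-way random-effects, absolute-agreement, single-measure form (often written ICC(2,1)) is an assumption here, as are the function name `icc_2_1` and the example values.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a list of rows: one row per eye, one column per observer or session.

    Assumed model: two-way random effects, absolute agreement, single measure.
    """
    n = len(ratings)        # number of eyes
    k = len(ratings[0])     # number of observers or sessions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for a two-way layout without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical paired readings in micrometres (not study data)
pairs = [[182, 185], [295, 301], [410, 385], [330, 342], [250, 247]]
print(round(icc_2_1(pairs), 3))
```

In this form, a constant offset between observers lowers the ICC even when the readings are perfectly correlated, which is one reason the ICC and Pearson's r can diverge.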

The correlation coefficients (r) for measurements made by a single observer were >0.9 in all subgroups. In contrast, the interobserver correlation coefficients were approximately 0.9 for the overall subjects and the thin CT group, but <0.7 for the intermediate and thick CT groups in both image settings. The r values for the 1 : 1 pixel and 1 : 1 micron images were 0.896 and 0.935 in the thin CT group, respectively; 0.680 and 0.688 in the intermediate CT group, respectively; and 0.624 and 0.645 in the thick CT group, respectively. All r values were statistically significant (P<0.0001; Table 2). The intra- and interobserver ICCs of manually estimated RT showed excellent repeatability, regardless of the image settings (ICCs>0.9). Furthermore, all the correlation coefficients of RT showed a significant positive correlation (P<0.0001; Table 3).

Table 2 Correlation coefficients of the choroidal thickness measurements from two observers
Table 3 Intraclass correlation coefficients (ICCs), coefficients of repeatability (CRs), and Pearson’s correlation coefficients (r) of the retinal thickness measurements

In the Bland–Altman plot, the intermediate and thick CT groups had a greater tendency for errors compared to the thin CT group. This tendency was greater for the interobserver measurements than for the intraobserver measurements, regardless of the image settings (Figure 2).

Figure 2

Bland–Altman plots of two subfoveal choroidal thickness (CT) measurements. The graphs plot the difference against the average of two measurements, with 95% limits of agreement (broken lines) and the mean difference (black line). (a, c) Intraobserver repeatability; (b, d) interobserver repeatability. For measurements performed by a single observer, bias increased from CT≥300 μm regardless of the image mode (a, c). For measurements performed by the two observers, bias tended to increase from CT≥200 μm (b, d).

Discussion

The precise measurement of CT is crucial in understanding the pathophysiology of various ocular diseases. However, the current study suggests that an increased CT leads to a decrease in the reliability and accuracy of CT measurements. The intraobserver repeatability was also slightly better than the interobserver repeatability. Manual measurement of RT exhibited excellent repeatability, which suggests that the interobserver differences arose from the difficulty of estimating the choroidal boundary, not from errors by the observers. Previous studies excluded eyes with indistinct CSI or extrapolated the CSI.18, 19 Although exclusion of eyes with an indefinite boundary could result in higher repeatability, it might not reflect clinical practice and could introduce selection bias. Hence, the current study adopted the extrapolation method, because the indistinct CSI should be considered a significant factor in measurement error. The image quality score was used as a control.

Several recent investigations have also examined the reliability of manual measurements of CT in healthy subjects imaged by OCT using the EDI technique.7, 8 Rahman et al7 reported high repeatability for SFCT measurements using SD-OCT in healthy young individuals (mean age 38±5 years). Their study obtained a mean CT of 332±90 μm in the right eye and 332±91 μm in the left eye, which is very similar to the mean±SD in the present study (315.3±89.2 μm for 1 : 1 pixel images and 312.6±88.4 μm for 1 : 1 micron images). Their Bland–Altman plot also showed a trend towards decreasing agreement when the CT increased above average, similar to the current investigation. However, this finding was considered insignificant. Across all thicknesses, their CRs of interobserver repeatability ranged from 30.9 to 33.2 μm, which were better than those in this investigation. A discrepancy in interobserver measurement of 35 μm, based on the study by Rahman et al,7 was used to identify images for further inspection of potential factors contributing to the low repeatability. This inspection highlighted an indistinct chorio-scleral boundary.

Shao et al8 performed EDI-OCT for 3233 subjects who participated in the Beijing Eye Study 2011 and showed a relatively high intra- and interobserver reproducibility of SFCT measurements by EDI-OCT. Their study revealed intraobserver ICCs of 1.00 (P<0.001) and an interobserver correlation coefficient (r) of 0.99, which were markedly higher than those in the present study. However, their results are not directly comparable with those of the present study, because their subjects were older (mean age 64.3±9.6 years) and had a thinner SFCT (mean CT 254.6±107.3 and 253.8±107.4 μm); thin choroids also showed excellent interobserver repeatability in this study. Two previous investigations revealed high reliability and a strong correlation when comparing repeatability after scanning with three different SD-OCT devices.18, 19 Yamashita et al,18 however, only considered eyes for which a distinct choroidal boundary was seen by both observers, and Branchini et al19 averaged the measurements of two observers. Consequently, their studies had a greater tendency to overestimate repeatability than ours. In contrast, this study included subject eyes that were more difficult to measure due to old age, high myopia, and the presence of CSC in the fellow eye.

In this study, the possibility of error increased as the SFCT increased, and this tendency became apparent at thicknesses >200 μm. As the choroid thickens, an incorrect slope of the measuring line may affect repeatability. Any oblique placement of the cursors marking the perpendicular line will induce a measurement error, which is greater for longer lines across thicker choroids. Such errors in manually set vertical lines can be reduced by using imaging programs that provide automatically set lines. Chen et al20 compared the reproducibility of SFCT measurements assessed in 36 normal eyes using Heidelberg Eye Explorer software and ImageJ software (Bethesda, MD, USA), and verified that the reproducibility was greater in measurements made with ImageJ software. The difficulties presented by image transfer are expected to be reduced with the development of computerized measurement techniques. Measuring the SFCT only beneath the macula may also exaggerate the errors between observers, especially in a thicker choroid. This simple measuring method can be improved by measuring at several points. Consequently, three-dimensional mapping of the SFCT using the ETDRS layout is expected to reduce measurement errors.21 However, such multiple measurements are clinically burdensome and still have limitations, including a vague junction between the choroid and sclera.

As the CT becomes larger, the signal transmission declines. This results in a decrease in contrast and an indefinite chorio-scleral junction. Although this study established strict inclusion criteria regarding image quality, the contrast at the chorio-scleral junction declined regardless of quality, and this led to variable placement of the junction between observers, even though they had trained together over a 3-month period. In the present study, 27 (14.3%) eyes ultimately showed a significant interobserver gap. In the thick CT group, 13 eyes displayed a significant interobserver gap, and 9 (69.2%) of these eyes had a vague CSI. In contrast, among the eyes with a significant interobserver gap, none in the thin CT group and 4 of 13 in the intermediate CT group had a vague CSI. This result suggests that an indistinct CSI, which was frequent in eyes with a thick CT, may be an important impeding factor in the measurement of CT. In this respect, a recent study was remarkable in that it quantified the outer choroidal contrast using custom analysis software.9 As the investigators focused only on image direction and different OCT devices, further studies are necessary to assess contrast in the thick choroid. Furthermore, 1060 nm OCT, the latest technology, has greater specificity for the choroid and sclera through enhanced transmittance, enabling better resolution of the structure of the choroid and the chorio-scleral junction than the type of OCT used in the present study.22 Therefore, the future commercialization of 1060 nm OCT is anticipated to provide more useful information for analyzing the pathogenesis of macular abnormalities.

In the current study, the difference in reproducibility between 1 : 1 pixel and 1 : 1 micron images was also compared. The product manual does not state which scale is recommended for measuring the CT. However, most previous reports measured the CT using the 1 : 1 pixel setting. Whereas the 1 : 1 pixel ratio shows all acquired pixels, the 1 : 1 micron ratio arranges the pixels using the same scale horizontally and vertically. As a result, the 1 : 1 micron image must be vertically compressed by approximately threefold to reflect the physical dimensions. Although the 1 : 1 pixel scale setting can show a subtle structural change more clearly, a slight deviation of the measuring line from the perpendicular may induce a large error during manual measurement. As the vertical line becomes longer in the thick choroid, these errors increase. Nevertheless, in the present study the statistical evaluation indicated slightly greater reproducibility and correlations for the 1 : 1 pixel images than for the 1 : 1 micron images. Although Kim et al11 reported an overestimation in 1 : 1 pixel images, these errors did not appear to have any significant influence on the reproducibility.
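The geometric argument above can be illustrated with a simple model. The sketch below assumes that the caliper reports the physical length of the drawn line and that, in the 1 : 1 pixel display, one horizontal screen unit spans about three times the physical distance of one vertical screen unit (the "approximately threefold" factor mentioned in the text). The function name `measured_thickness`, the tilt angles, and the thickness values are hypothetical and are not taken from the study or from the Heidelberg software.

```python
import math


def measured_thickness(true_ct_um, display_tilt_deg, lateral_to_axial=3.0):
    """Physical length of a caliper line that spans the choroid but is tilted on screen.

    true_ct_um: perpendicular (true) thickness in micrometres.
    display_tilt_deg: tilt of the drawn line from vertical, as seen on the display.
    lateral_to_axial: assumed physical size of a horizontal screen unit relative to a
        vertical one (about 3 for a 1:1 pixel display, 1 for a 1:1 micron display).
    """
    tan_phi = math.tan(math.radians(display_tilt_deg))
    # The horizontal offset on screen is scaled by the display ratio in physical space
    return true_ct_um * math.sqrt(1.0 + (lateral_to_axial * tan_phi) ** 2)


# Hypothetical comparison: the same on-screen tilt for a thin and a thick choroid
for ct in (200, 400):
    pixel = measured_thickness(ct, display_tilt_deg=3, lateral_to_axial=3.0)
    micron = measured_thickness(ct, display_tilt_deg=3, lateral_to_axial=1.0)
    print(ct, round(pixel - ct, 2), round(micron - ct, 2))
```

In this model the overestimation is proportional to the true thickness and grows with the display ratio, which matches the qualitative reasoning in the text. It does not capture other sources of disagreement such as an indistinct CSI.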

The present study has several limitations. Since the data were collected retrospectively, the intervisit repeatability was not assessed. Given the diurnal variation in CT reported in recent articles,23 the intervisit repeatability might also have shown that a greater CT leads to relatively lower repeatability. In addition, all SFCT values in this study were obtained by averaging the values of two sections. Although many investigations currently obtain the CT by averaging methods, several recent works have shown new possibilities in measuring modalities such as high-penetration swept-source OCT, choroidal contrast quantification, and even automated segmentation. Those modalities might be expected to improve the repeatability of CT measurements; however, the current study suggests that further evaluations are needed to verify the repeatability of novel methods, particularly in the thick choroid.6, 9, 22, 24, 25

In conclusion, CT measurement in eyes with a very thick choroid may be inaccurate. Increased CT leads to lower interobserver repeatability than intraobserver repeatability, and this tendency was consistent despite the thorough peer-review process. Intraobserver CT measurements could be useful for investigating disease characteristics and progression. However, care should be taken when interpreting numerical CT values reported by different observers or at different facilities, especially in thick choroids.