Introduction

Conjunctival hyperaemia is caused by vasodilation of the conjunctival blood vessels, which are seen against the white background of the sclera. The vasodilation gives the white of the eye a red appearance, and so the condition is sometimes referred to as ‘red eye’, whereas an apparently healthy eye with no vasodilation is referred to as a ‘white eye’ (Figure 1).

Figure 1

Typical ‘normal’ white eye (courtesy of Dr Trefford L Simpson, Centre for Contact Lens Research, School of Optometry, University of Waterloo, Canada).

Increased conjunctival hyperaemia is a clinical sign of a wide range of ocular disease, inflammation, and irritation. The many conditions in which it has been recorded include meibomian gland dysfunction and marginal blepharitis,1 conjunctivitis,2, 3, 4 contact lens wear,5, 6 cosmetics,7 hypertension, diabetes,8 acute angle-closure glaucoma, autoimmune disease, chemical injury,9, 10 episcleritis, uveitis,9, 11 sickle cell disease,12 and pharmaceutical drug use.13

Clinical grading scales that allow the assessment of severity have been developed for many ocular conditions, including the anterior chamber angle,14 iris neovascularisation,15 retinal nerve fibre layer atrophy,16 focal narrowing of retinal arterioles in glaucoma,17 diabetic retinopathy,18 hypertensive arteriosclerosis,19 tarsal abnormalities,20 and lens opacities.21 Similar scales have been developed to grade conjunctival hyperaemia.22, 23, 24, 25, 26 These bulbar redness scales have utilised verbal descriptions, photographs, or paintings that illustrate an increasing level of conjunctival hyperaemia, and they have been used particularly in clinical studies of contact lens wear and dry eye.22, 27, 28, 29, 30, 31 The grading scales are typically divided into four or five grades, but they can be interpolated into decimal intervals to increase their sensitivity.32, 33 Papas34 showed that by decimalising the Cornea and Contact Lens Research Unit (CCLRU) grading scale for bulbar redness, the grading approximates an interval scale. The problem that an ordinal grading scale produces unequal grading divisions has also been addressed by using digitised morphing of the grading scales, or by removing the subjective input of the observer through image analysis.34, 35, 36, 37, 38, 39, 40

Although conjunctival hyperaemia is accepted as an important clinical sign of ocular disease or inflammation, and grading scales are frequently used to assess the severity or degree of change in bulbar redness, no previous studies have considered the normal, unstimulated level of conjunctival hyperaemia. An understanding of what can be considered normal is crucial when assessing any presenting conjunctival hyperaemia. In this paper, we report a cross-sectional study of the prevalence of conjunctival hyperaemia in healthy eyes not wearing contact lenses, and the inter-observer agreement of the CCLRU bulbar redness scale.

Materials and methods

Prevalence study

A total of 121 healthy subjects (male=58, female=63, median age=28 years, range=16–77) participated. No subject had any current or previous ocular disease, or any systemic disease, medication, or allergy known to affect bulbar redness. Subjects with subclinical minor ocular conditions, such as marginal blepharitis, may have been included. As such, our sample represents a typical population that may present at a clinic. Contact lens wearers were included if their contact lenses had not been worn during the previous 2 weeks. A duration of 2 weeks has been considered sufficient time for any contact lens-related conjunctival hyperaemia to resolve.38 Conjunctival hyperaemia was assessed by two trained observers (JL, MS) using the CCLRU grading scale, interpolated to 0.1-unit increments. This photographic scale was developed by the CCLRU at the University of New South Wales, Australia, and comprises four images of increasing severity, labelled as follows: 1, very slight; 2, slight; 3, moderate; 4, severe. Only the right eye of each subject was examined, using a slit-lamp bio-microscope (×10 magnification) under diffuse, white illumination. The subject's position of gaze was directed to allow grading of four quadrants: superior, nasal, inferior, and temporal. The bulbar redness score was defined as the average of the scores of the four quadrants.
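The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the averaging step, not the software used in the study; the function name `bulbar_redness`, the dictionary layout, and the example grades are hypothetical.

```python
from statistics import mean

# The four quadrants graded for each right eye.
QUADRANTS = ("superior", "nasal", "inferior", "temporal")


def bulbar_redness(grades):
    """Return the bulbar redness score: the mean of the four quadrant grades."""
    missing = [q for q in QUADRANTS if q not in grades]
    if missing:
        raise ValueError(f"missing quadrant grades: {missing}")
    return mean(grades[q] for q in QUADRANTS)


# Hypothetical grades for one eye, recorded in 0.1-unit steps (not study data).
example = {"superior": 1.5, "nasal": 2.4, "inferior": 1.7, "temporal": 2.0}
score = bulbar_redness(example)
print(f"bulbar redness = {score:.2f} units")
```

Because every quadrant carries equal weight, a score built from fewer quadrants (for example, nasal and temporal only) is not directly comparable with a four-quadrant score.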

Inter-observer study

At the completion of the prevalence study, a further 20 subjects (male=8, female=12, median age=21 years, range=19–28) were recruited to assess the agreement between the two observers. The study procedure was repeated using the same selection criteria and grading procedures. The four quadrants of the right eye of each subject were graded independently by the two observers (JL, MS), each masked from the other's observations, and the order of subject assessment by observer was randomised.

Data analysis

We used parametric statistical tests because the prevalence (Figure 2a–e) and inter-observer difference (Figure 4) data were approximately normally distributed, this grading scale approximates an interval scale,34 and Barbeito and Simpson41 have argued that parametric statistical tests can be applied to such data. Inter-observer agreement was determined as the 95% limits of agreement,42 calculated as 1.96 times the standard deviation of the inter-observer difference scores (ie grade from observer 1 minus grade from observer 2).
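The agreement calculation can be illustrated with a minimal Python sketch. It assumes paired grades from the two observers for the same eyes and uses the sample standard deviation; the function name `limits_of_agreement` and the example grades are hypothetical, and the mean difference is returned alongside the 1.96 × SD value only as a convenience for checking systematic bias.

```python
from statistics import mean, stdev


def limits_of_agreement(observer1, observer2):
    """Return (mean difference, 1.96 x SD of the paired differences)."""
    if len(observer1) != len(observer2):
        raise ValueError("both observers must grade the same eyes")
    if len(observer1) < 2:
        raise ValueError("at least two paired grades are needed")
    # Difference score: grade from observer 1 minus grade from observer 2.
    diffs = [a - b for a, b in zip(observer1, observer2)]
    return mean(diffs), 1.96 * stdev(diffs)


# Hypothetical bulbar redness grades for five eyes (not study data).
obs1 = [1.8, 2.0, 2.3, 1.6, 2.1]
obs2 = [1.9, 1.8, 2.2, 1.7, 2.3]
bias, half_width = limits_of_agreement(obs1, obs2)
print(f"mean difference = {bias:.2f}, 95% limits of agreement = ±{half_width:.2f}")
```

Under this definition, a difference between two observers that is larger than the returned half-width would fall outside the range expected from observer disagreement alone.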

Figure 2

Distribution of (a–d) the redness scores for each quadrant, (e) bulbar redness scores, and (f) the quadrant mean redness scores.

Figure 4

Inter-observer agreement in bulbar redness (n=20) was good, with a coefficient of agreement of 0.4 units.

Results

Prevalence study

As shown in Figure 2, a significant difference in redness was found between quadrants (repeated measures ANOVA, F(3,360)=281, P<0.0001). Post hoc paired t-tests found significant differences in redness between all quadrants (t(120)>4.3, P<0.0001), with the nasal (2.3±0.4 units, mean±SD) and temporal (2.1±0.4) quadrants redder than the superior (1.6±0.4) and inferior (1.7±0.4) quadrants.

The average bulbar redness was 1.93 (±0.32) units (Figure 2e). Figure 3 shows that bulbar redness appeared to increase slightly with age, by about 0.05 units per decade (r(119)=0.23, P=0.01). However, a multiple regression analysis (F(2,118)=9.4, P=0.0002) found that most of the apparent effect of age was explained by males having redder eyes than females by 0.22 units (age: t(118)=1.48, P=0.14; gender: t(118)=3.43, P=0.0008), because more of the older subjects were male and more of the younger subjects were female. One observer had a slight tendency to record higher redness scores, which was not accounted for by differences in subject ages or genders (difference in average bulbar redness 0.22 units, F(1,113)=4.1, P=0.045).

Figure 3

Bulbar redness appeared to increase with age (n=121), but the effect was explained by the greater bulbar redness of males.

Inter-observer study

No significant difference was found between the grading of the two observers, overall or for each quadrant (t(19)<1.54, P>0.14), except for the temporal quadrant (t(19)=2.54, P=0.02). Figure 4 shows the inter-observer comparison of redness scores for the average bulbar redness, and the tendency for one observer to give higher scores to redder eyes (r²=0.38, P=0.004). The 95% limits of agreement were 0.38 units for bulbar redness, and varied between quadrants from a minimum of 0.50 (nasal) to a maximum of 0.85 (inferior) units. Agreement might have been improved by controlling gaze eccentricity, which may be appropriate for a research study but is not practical in clinical practice.

Discussion

The average bulbar redness of 121 people with healthy (white) eyes was 1.9 units. As the upper 95% confidence limit was 2.6 units, a CCLRU bulbar redness grade of more than 2.6 may be considered abnormal. This average grade and upper confidence limit were higher than our a priori expectations. In similar studies of healthy eyes, the median corneal staining grade was 0.1 units and the upper confidence limit was 0.5 units,43 whereas the average upper palpebral conjunctiva grade was 1.2 units and the upper confidence limit was 2.0 units.44 Although the typical corneal staining grade is consistent with the generalised verbal grading proposed by Woods23 and implied by the written descriptions associated with the CCLRU grading scale and other grading scales (eg Efron, 1997), typical palpebral conjunctival grades and bulbar redness grades appear higher. We consider two alternative explanations for the higher than expected bulbar redness grades: either the normal ‘white eye’ appearance is redder than previously assumed, or the calibration of the grading scale is wrong.
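The relationship between the reported mean, standard deviation, and upper limit can be checked with a short Python sketch. It assumes that the limits were obtained as the mean plus or minus 1.96 standard deviations of the approximately normal distribution of scores, which reproduces the reported values; the function name `normal_range` is hypothetical.

```python
def normal_range(mean_score, sd, z=1.96):
    """Return (lower, upper) limits as mean -/+ z standard deviations.

    Assumes approximately normally distributed scores.
    """
    return mean_score - z * sd, mean_score + z * sd


# Mean and SD of bulbar redness reported for the 121 healthy eyes.
lower, upper = normal_range(1.93, 0.32)
print(f"normal range: {lower:.1f} to {upper:.1f} units")  # prints 1.3 to 2.6
```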

As McMonnies and Ho45 and McMonnies et al46 described how conjunctival hyperaemia can vary with factors such as lack of sleep, eyestrain, wind, dust, smog, smoke, and alcohol, we screened our subjects for these factors. However, 11 potential subjects who did not meet our selection criteria, mainly because of the use of medications, showed no apparent difference in bulbar redness compared with the 121 healthy eyes. Figure 1 demonstrates the effect of image magnification on the perception of conjunctival hyperaemia. In our study, the eye was observed under diffuse light, using a slit-lamp microscope at ×10 magnification. In contrast, conjunctival hyperaemia is often observed unaided at a distance of about 1 m (eg facing a patient across a desk). Even at our low slit-lamp magnification, smaller blood vessels become evident and may influence the observer's perception of the clinical grade. However, the CCLRU grading scale was designed with the expectation that the observer would use a slit lamp. Efron26 reported a similarly higher than expected hyperaemia grading, indicating that a grade of more than 2 (with the Efron Grading Scale) is abnormal. The Efron scale offers a similar grading range to the CCLRU scale, but is pictorial. It seems likely, then, that the normal ‘white’ eye is redder in appearance than is commonly judged through casual observation.

Turning to the second hypothesis, if it is assumed that ‘normal’ should be located around the lower grades on the scale (eg Woods, 1989) to provide room for progression of the condition, then the high average bulbar redness and range of 1.2–2.9 units among these 121 healthy eyes suggest that the CCLRU scale may have an inadequate or misplaced dynamic range. A good grading scale must be both sensitive to the severity of the condition and specific in determining what is normal. Although the average bulbar redness grade for our subject population was higher than we expected, the wide variance of the distribution (Figure 2e) and the relatively small inter-observer 95% limits of agreement indicate that the grade is able to distinguish between degrees of conjunctival hyperaemia. Figure 3 shows that no eye received a bulbar redness grade of 1 unit or less. The photographic image used as an example for a grade 1 (very slight) is particularly white in appearance, and may illustrate unusually low conjunctival hyperaemia. Although this grade may not be needed for grading the normal appearance, it ensures that the scale provides for the abnormal condition of a very white eye, such as that produced by anaemia. It also suggests that the CCLRU grading scale for conjunctival hyperaemia may need to be extended to values greater than its current maximum grade of 4 units.

It is interesting to note the variation in redness across the four conjunctival quadrants, and the age- and gender-related differences in average bulbar redness. The nasal and temporal quadrants have the highest redness scores, possibly reflecting their exposure to environmental conditions. The same variation was noted by McMonnies et al28 and Papas et al.31 For the purpose of analysis, the four quadrants were averaged to produce our bulbar redness score. As this score includes the results of the superior and inferior quadrants, the bulbar redness scores would have been higher if the observer had chosen to grade conjunctival hyperaemia from the exposed conjunctiva only. If bulbar redness had been based on the temporal and nasal quadrants only, the average bulbar redness would have been 2.21±0.36 (range 1.2–3.1) units, and the upper 95% confidence limit 2.92 units. When a bulbar redness score is recorded, the quadrants viewed should also be recorded, and care must be taken when comparing bulbar redness scores. In our study, males tended to have redder eyes than females by about 0.2 units, and bulbar redness increased by 0.05 units per decade. These findings are similar to those of McMonnies and Ho,45 who observed 470 non-contact lens wearers (227 males and 252 females). Using the McMonnies scale, with its six grades and no decimal interpolation, they found an average difference of 0.5 units between genders, and a grading change of 0.16 units per decade. The gradual increase in redness with age may be attributable to a reduction in arteriolar wall muscle tone, but there is no obvious explanation for the difference between genders.

A difference in bulbar redness of 0.4 units or more between observers may be considered significant, because such a difference would be greater than the inter-observer 95% limits of agreement found for our two observers. No significant difference was found between the two observers, except for the temporal quadrant. There was a tendency for observer 1 to give lower scores than observer 2 to eyes that were more red (Figure 4). In our study, the two observers were trainee optometrists. Before the study commenced, the two observers and one of the other authors (PJM, an experienced user of clinical grading scales) discussed grading strategies and compared the bulbar redness grades assigned to a series of human subjects. No measurement of the inter-observer agreement was made before data collection. Trained observers have better inter- and intra-observer agreement.32, 47 In a similar study on clinical grading of the upper palpebral conjunctiva of non-contact lens wearers,44 the inter-observer 95% limits of agreement were 0.76 units at the beginning of the study, but improved to 0.24 units by the end. In another similar study on corneal staining,43 the inter-observer 95% limits of agreement were 0.36 units, and no difference in agreement was reported between the start and end of the study. In studies on grading bulbar redness of photographs, Papas34 found inter-observer 95% limits of agreement of 0.8 units for seven experienced observers, and Chong et al48 found inter-observer 95% limits of agreement of 0.32–0.42 units for five experienced observers. Thus, our 0.4-unit inter-observer 95% limits of agreement are comparable to those of two previous studies that also used real eyes,43, 44 and similar to48 or smaller than34, 49 those of studies that used photographs to assess inter-observer agreement.

Intra-observer agreement is also important in the assessment of grading scales. Inter-observer agreement compares two (or more) independent observers, whereas intra-observer agreement describes the repeatability of an observer, that is, the ability to give the same result at each time of assessment. Both can be used to interpret changes in grading scale scores and to determine sample sizes necessary for future studies. Intra-observer 95% limits of agreement ranging from 0.78 to 1.52 units have been reported in studies using photographs,38, 47, 50, 51 values that appear to be larger than those of comparable studies of inter-observer agreement. As intra-observer agreement was not measured in our study, the significance of a change between observations of real eyes made by a single observer is not known.

In conclusion, although the bulbar conjunctival hyperaemia of a white eye may be redder than expected, this probably reflects the normal physiological detail visible by slit-lamp microscopy and not an error in the design of the grading scale. Given that normal bulbar redness can range from 1.3 to 2.6 units, it is more important that the clinician make note of the baseline appearance, as a change in bulbar redness score of 0.4 units or more may be significant.