Electrophysiological Studies in Thyroid Associated Orbitopathy: A Systematic Review

Dysthyroid optic neuropathy (DON) is the commonest cause of blindness in thyroid associated orbitopathy (TAO). While diagnosis remains clinical, objective tests for eyes with early or equivocal findings are lacking. Various electrophysiological studies (EPS) have been reported, yet which types and parameters are useful for DON remains unclear. We performed a systematic literature search in MEDLINE, EMBASE and the Cochrane databases via the OVID platform up to August 20, 2017. A total of 437 records were identified for screening and 16 original studies (1327 eyes, 787 patients) were eligible for review. Pattern visual evoked potential (pVEP) was the most frequently studied EPS. Eyes of TAO patients with DON showed delayed P100 latencies, decreased P100 amplitudes or delayed N75 latencies during pVEP, compared with eyes of TAO patients without DON or with healthy controls. Because of study heterogeneity, no quantitative analysis was possible. This review highlights the most common type (pVEP) and useful parameters (P100 latency and amplitude) of EPS, and supports further research on them using standardized testing conditions.


Pattern VEP (pVEP) in TAO & DON.
Comparison of pVEP results in DON, TAO, and normal controls. P100 latency, P100 amplitude, and N75 latency were compared between eyes with DON and normal controls in 3 studies 10,11,17 . An increase in P100 latency in patients with DON was reported by Shawkat 17 . A decrease in P100 amplitude in eyes with DON compared with controls was found by Tsaloumas et al. (3.67 ± 0.81 vs. 8.97 ± 0.59 µV, P < 0.001) 11 and Ambrosio et al. (P < 0.0001) 17 .
Five studies reported significant increases in P100 and N75 latencies when comparing eyes from TAO patients without DON with healthy eyes (Table 2) 8,12,13,16,18 . Wijngaarde et al. first reported a significant increase in P100 latency in TAO eyes compared with healthy eyes (P < 0.01) 8 . [...] 102.4 ± 2.7 ms, P < 0.01) also found increased P100 latencies in eyes from TAO subjects without clinical evidence of DON when compared with controls 18 . In addition, Pawlowski et al. found an increase in N75 latency (79.0 ± 3.7 vs. 73.9 ± 2.8 ms, P < 0.001) 19 , while Spadea et al. showed a decrease in P100 amplitude 20 . In the latter study, the correlation of P100 latency was moderate and statistically significant with the total cross-sectional area of all extraocular rectus muscles (EOM-A) (r = 0.496, P < 0.01); moderate but not significant with the ratio between EOM-A and the orbital area (r = 0.482, P > 0.05); and mild and not significant with the total error of 100-hue color sensation (r = 0.363, P > 0.05) and with the mean deviation of retinal sensitivity (MD) in perimetry (r = −0.342, P > 0.05). The correlations of P100 latency with peripapillary nerve fiber layer thickness and with the degree of exophthalmos were likewise not significant 20 . Acaroglu et al. reported a mild but significant correlation between disease activity (clinical activity score) and P100 latency (r = 0.364, P = 0.04) 16 .
The correlation between degree of exophthalmos and pVEP varied among studies. Pawlowski et al. reported a moderate and significant correlation between degree of proptosis and N75 latency (r = 0.51, P < 0.01) but not with P100 latency 18 . On the other hand, Wijngaarde et al. described a mild correlation between degree of proptosis and P100 latency (r and P value not available) 8 .

Table 2. Summary outcomes of observational case series and case-control studies on the use of VEP in DON/TAO. DON = dysthyroid optic neuropathy; fVEP = flash visual evoked potential; mfVEP = multifocal visual evoked potential; ms = millisecond; n.a. = not available; pERG = pattern electroretinography; pVEP = pattern visual evoked potential; TAO = thyroid associated orbitopathy; VEP = visual evoked potential; µV = microvolts. *P < 0.05 compared to TAO without DON, † P < 0.001 compared to TAO without DON, ‡ P < 0.05 compared to Control, § P < 0.001 compared to Control.

pVEP after treatments. Four studies reported the pVEP results before and after treatments including high-dose steroids, orbital radiotherapy and/or decompression (Table 3) 11,15,19,21 . While treatment strategies varied, an increase in P100 amplitude and/or a decrease in P100 latency post-treatment were generally observed. Greater improvements were observed in eyes with DON than in those without. Three studies reported a decrease of more than 10% in P100 latency after treatment of DON. Tsaloumas et al. reported a significant decrease (from 129.2 ± 7.13 to 114.0 ± 4.47 ms, P < 0.01) 11 , and so did Rutecka-Debniak et al. (from 126.0 ± 15.9 to 108.0 ± 5.3 ms, P = 0.01) 15 .
In TAO eyes with no clinical evidence of DON but prolonged P100 latency, Rutecka-Debniak et al. reported a significant decrease after treatment (from 114.8 ± 12.6 to 107.3 ± 13.2 ms, P = 0.05) 15 . There was no post-treatment change in TAO eyes with normal pre-treatment VEP.
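As a worked check of the "more than 10%" figure, the relative change in mean P100 latency can be recomputed from the group means quoted above. The sketch below is illustrative only: the function name and dictionary labels are our own, and it uses reported means, not patient-level data.

```python
def percent_decrease(before_ms: float, after_ms: float) -> float:
    """Relative decrease in mean P100 latency, as a percentage of the pre-treatment mean."""
    return (before_ms - after_ms) / before_ms * 100.0


# Pre- and post-treatment group means (ms) as quoted in the text.
# The labels are illustrative, not taken from the original studies.
reported_means = {
    "Tsaloumas et al., DON": (129.2, 114.0),
    "Rutecka-Debniak et al., DON": (126.0, 108.0),
    "Rutecka-Debniak et al., TAO without DON": (114.8, 107.3),
}

changes = {
    label: round(percent_decrease(before, after), 1)
    for label, (before, after) in reported_means.items()
}

for label, pct in changes.items():
    print(f"{label}: {pct}% decrease in mean P100 latency")
```

On these means, the two DON groups improve by roughly 12% and 14%, whereas the TAO group without clinical DON improves by about 6.5%, in line with the larger improvements described for eyes with DON.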

Multifocal VEP (mfVEP) in TAO.
In 2012, Perez-Rico et al. first reported the use of mfVEP in TAO patients without DON 22 . There was a significant increase in mean latency in the TAO group compared with age-matched controls (2.12 ± 1.72 vs. 6.57 ± 1.90 ms, P < 0.05) and 23 eyes (35.4%) had abnormal mfVEP amplitude and/or latency. By interocular comparison, 12.3% of TAO eyes showed decreased amplitude and 13.8% of them showed increased latency. Visual acuity was significantly related to mfVEP amplitude changes (mean difference = −0.104, P = 0.018), while intraocular pressure measured at upgaze was significantly related to mfVEP latency changes (mean difference = 2.595, P = 0.028). No statistically significant relationship was observed between mfVEP parameters and standard automated perimetry results or nerve fiber layer thickness measured on optical coherence tomography 22 .

Table 3. Summary outcomes of longitudinal case series comparing VEP changes before and after treatment for DON/TAO. DON = dysthyroid optic neuropathy; fVEP = flash visual evoked potential; ms = millisecond; No. = number; pVEP = pattern visual evoked potential; TAO = thyroid associated orbitopathy; VEP = visual evoked potential; µV = microvolts. *P < 0.05 compared to pre-treatment, † P < 0.001 compared to pre-treatment.

Electroretinography (ERG) in TAO.
[...] smaller (P < 0.0001) pERG amplitude in TAO eyes without providing numerical results 14 . They also described a negative correlation of pERG amplitude with optic nerve diameter measured by ultrasonography. Pawlowski [...]

Assessment of the quality of study and grading of clinical recommendation. The 12 studies on VEPs were assessed according to the Newcastle-Ottawa Scale (NOS) for quality assessment of case-control studies 33 (Table 5). The highest-quality study was carried out by Tsaloumas et al. in 1994 11 . The clinical recommendation of EPS in detecting and monitoring visual dysfunction in TAO was rated according to the American Academy of Ophthalmology guidance on preparing Preferred Practice Pattern (PPP) guidelines (Table 6) 34 . pVEP was given level A for importance in application and level II for strength of evidence.

Discussion
Clinical features of DON may include impaired visual acuity and color vision, visual field defects, afferent or relative afferent pupillary defect (APD/RAPD), and optic disc hyperemia or swelling 5,31,35 . In practice, these features rarely co-exist, while ocular co-morbidities often confound clinical assessment 35 . The European Group on Graves' Orbitopathy (EUGOGO) was the first to propose that the presence of optic disc swelling alone, or of any other two of the above abnormalities without an alternative explanation, suggests the presence of DON in any TAO patient 35,36 . Among the 94 eyes recruited, impaired visual acuity (<20/40), impaired color vision, visual field defects, relative afferent pupillary defect and optic disc swelling were present in only 73%, 77%, 71%, 45%, and 56%, respectively, of eyes subsequently diagnosed to have "definite DON". On the other hand, these abnormalities were also found in 32%, 7%, 13%, 0% and 5%, respectively, of eyes subsequently diagnosed to have "no DON". These results imply that no individual finding of optic nerve dysfunction is sensitive or specific enough to diagnose or exclude DON. Proptosis or increased clinical activity scores (≥3/7) were absent in more than one-third of eyes with "definite" DON 35 . Despite its serious visual consequences, no widespread consensus on the diagnostic criteria of DON is available to date. The challenge of diagnosing DON at its early stage or in patients with ocular comorbidities remains.
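The EUGOGO proposal described above is, in effect, a simple decision rule, which can be sketched as follows. This is a minimal illustration, not a validated diagnostic tool: the class and field names are hypothetical, and we assume that the "other" abnormalities are the four non-disc findings and that the "alternative explanation" exclusion applies to the whole rule.

```python
from dataclasses import dataclass


@dataclass
class OpticNerveFindings:
    """Hypothetical record of clinical findings for one eye."""
    impaired_visual_acuity: bool = False
    impaired_color_vision: bool = False
    visual_field_defect: bool = False
    pupillary_defect: bool = False         # APD/RAPD
    optic_disc_swelling: bool = False
    alternative_explanation: bool = False  # e.g. an ocular co-morbidity


def suggests_don(findings: OpticNerveFindings) -> bool:
    """Disc swelling alone, or any two other abnormalities, without an alternative explanation."""
    if findings.alternative_explanation:
        # Assumption: the exclusion applies to both branches of the rule.
        return False
    if findings.optic_disc_swelling:
        return True
    other_abnormalities = [
        findings.impaired_visual_acuity,
        findings.impaired_color_vision,
        findings.visual_field_defect,
        findings.pupillary_defect,
    ]
    return sum(other_abnormalities) >= 2
```

For example, an eye with impaired color vision and a visual field defect would meet the rule, whereas an eye with isolated impaired visual acuity would not, which illustrates why eyes with early or equivocal findings are hard to classify on clinical grounds alone.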
Electrophysiological studies (EPS), including visual evoked potential (VEP) and electroretinogram (ERG), were adopted to provide objective evaluation and correlation with the presence and/or severity of DON. VEP refers to the electrophysiological signals extracted from the visual cortex during visual stimulation of the retina 37 . Any disturbance along the visual pathway or visual cortex results in VEP abnormalities (decrease in amplitude or increase in latency). VEP was first reported in 1972 by Halliday et al. to assess optic neuritis 38 . It was subsequently used in patients with DON in 1980 by Wijngaarde et al. 8 . Three types of VEP have been used: flash VEP (fVEP), pattern VEP (pVEP), and multifocal VEP (mfVEP) (Table 7). fVEP uses a diffuse flash stimulating the entire retina for a mass response. Therefore, a localized abnormal response may be averaged out and left undetected. pVEP uses checkerboard pattern reversal stimulation covering the central 15° of the visual field. The major components of pVEP are a large positive wave at a peak latency of about 100 milliseconds (P100) and a negative wave peaking at 70 milliseconds (N70). Any delay in P100 latency or decrease in amplitude measured from N70 to P100 suggests the presence of optic neuropathy 37 . Since the first report on pVEP in assessing visual function in TAO patients by Wijngaarde et al. 8,9 , other studies have been published comparing the use of pVEP in TAO patients with or without DON (Table 2). mfVEP records signals from multiple stimuli given simultaneously across 20° to 25° of the central visual field, enabling assessment of small local defects 39 .
ERG records the electrical response of the retina to light stimulation using various types of corneal electrodes. ERG is widely used in retinal disorders but rarely in TAO 40 . Pattern electroretinogram (pERG) uses a reversing black and white checkerboard stimulus to collect signals from the inner retina and indirectly measure retinal ganglion cell function. Commonly used parameters of pERG include a prominent positive wave at approximately 50 milliseconds (P50) and a larger negative wave at about 95 milliseconds (N95) 41 . pERG has been used for evaluating early ganglion cell dysfunction in glaucoma patients since the 1980s 42,43 . pERG alteration was also reported in animal models of optic nerve transection during retrograde degeneration of retinal ganglion cells 44,45 . In clinical practice, combined interpretation of pVEP and pERG helps to differentiate retinal disorders (abnormal pVEP and pERG) from optic nerve disorders (abnormal pVEP and normal pERG) 46 .
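The combined pVEP/pERG interpretation described above can be written as a two-input lookup. The sketch below is illustrative only: the function name and return strings are our own, how "abnormal" is defined for each test (for example against laboratory reference ranges) is left open, and combinations not mentioned in the text are deliberately not interpreted.

```python
def interpret_pvep_perg(pvep_abnormal: bool, perg_abnormal: bool) -> str:
    """Map the pVEP/pERG combination to the pattern it suggests, per the rule in the text."""
    if pvep_abnormal and perg_abnormal:
        return "suggests retinal disorder"
    if pvep_abnormal and not perg_abnormal:
        return "suggests optic nerve disorder"
    # The text does not cover a normal pVEP, so no interpretation is offered here.
    return "not covered by this rule"


for pvep, perg in [(True, True), (True, False), (False, False)]:
    print(f"pVEP abnormal={pvep}, pERG abnormal={perg}: {interpret_pvep_perg(pvep, perg)}")
```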

Here we report the first systematic review on the use of EPS in DON. pVEP has been the most widely reported EPS in DON. Case-control studies reported significant differences in pVEP parameters among eyes with DON, eyes with TAO only, and control eyes 8,10-13,15-18,22 . Prolonged P100 latency was found both when comparing eyes with DON with eyes without DON from TAO patients and when comparing eyes from TAO patients with controls. P100 latency correlated with visual acuity, clinical activity score, color vision, visual field, and orbital imaging parameters 8,20 . Significant improvements in pVEP were found in patients after successful treatment of DON 11,15,19,21 .
We acknowledge that there is insufficient evidence to support the use of pVEP as part of the diagnostic criteria of DON, owing to its limited availability and inherent variability. To improve generalizability for meta-analysis, future studies should adopt testing protocols that follow the International Society for Clinical Electrophysiology of Vision (ISCEV) standards 37,41,47-49 , and include age- and/or gender-specific reference ranges, post-treatment follow-up results and all clinical parameters recommended by the EUGOGO 5,31,35,37 . Longitudinal follow-up of pVEP in TAO patients with equivocal or early clinical features of DON may shed light on the natural history, treatment response and clinical implications of the evolving entity of "subclinical" DON.
In conclusion, pVEP was the most studied EPS in DON. Latency and amplitude of P100 were shown to be promising for the diagnosis and monitoring of DON. Future studies on pVEP using standardized settings will be required to fully evaluate its diagnostic accuracy and clinical utility in the management of DON.

Table 6. Clinical recommendation of VEP or ERG in detecting visual dysfunction in TAO. A = most important application; B = moderately important application; C = relevant but not critical application; II = well-designed cohort or case-control analytic studies, preferably from more than one center, or multiple-time series with or without the intervention.

Methods
Literature search. A literature search was performed in MEDLINE, EMBASE, and the Cochrane databases via the Ovid platform. We formulated sensitive search strategies using Boolean logic and search terms with controlled vocabularies (Medical Subject Heading terms): ("thyroid associated" OR "endocrine" OR "dysthyroid" OR "Graves") AND ("orbitopathy[ies]" OR "ophthalmopathy[ies]") OR ("ophthalmic Graves' disease") in combination with "optic neuropathy(ies)".

9  (optic adj1 nerve adj1 (disease or disorder)).mp.

Assessment of the quality of study and level of evidence. The Newcastle-Ottawa Scale (NOS) 33 was adopted to evaluate the quality of the case-control studies. The clinical recommendation of VEP or ERG in detecting and monitoring visual dysfunction in TAO was rated on 2 aspects, "importance to the care process" and "the strength of evidence in the available literature", according to the American Academy of Ophthalmology guidance on preparing Preferred Practice Pattern (PPP) guidelines 34 . "Importance to the care process" represents the value of this application in improving the quality of the patient's care in a meaningful way. Level A indicates the most important, level B moderately important, and level C relevant but not critical application. "Strength of evidence" was rated in 3 levels. Level I includes evidence obtained from at least one properly conducted, well-designed, randomized, controlled trial. It also includes meta-analysis of randomized controlled trials. Level II includes well-designed controlled trials without randomization, well-designed cohort or case-control analytic studies, preferably from more than one center, or multiple-time series with or without the intervention. Level III includes evidence obtained from descriptive studies or case reports.

Table 7. Features of included studies. EPS = electrophysiological studies; fVEP = flash visual evoked potential; pVEP = pattern visual evoked potential; mfVEP = multifocal visual evoked potential; pERG = pattern electroretinography; mfERG = multifocal electroretinography; No. = number; n/a = not applicable.