## Introduction

Barrett’s esophagus (BE) is a pre-neoplastic condition that is associated with an increased risk of developing dysplasia and esophageal adenocarcinoma (EAC)1,2,3. It is defined as endoscopically visible columnar epithelium containing goblet cells (intestinal metaplasia). Although the American Gastroenterological Association has not specified a length requirement1, the American College of Gastroenterology requires extension at least 1 cm proximal to the gastroesophageal junction4. BE is a genetically unstable metaplastic epithelium that accumulates an increasing number of genetic and chromosomal abnormalities as it progresses to low-grade dysplasia (LGD), high-grade dysplasia (HGD), and eventually EAC5,6,7,8,9,10. Because dysplasia is currently the primary clinical biomarker used to identify patients who are at an increased risk for EAC, clinical guidelines recommend that BE patients undergo periodic endoscopic surveillance with biopsies to detect dysplasia4. This allows for risk stratification and management of BE patients based upon the presence and grade of dysplasia prior to the development of EAC. The detection of HGD usually prompts endoscopic therapy, generally in the form of radiofrequency ablation (RFA) with or without endoscopic mucosal resection (EMR), due to its more frequent association with EAC compared with LGD11,12,13,14,15,16. However, a significant number of BE patients with dysplasia (~20%) are resistant to endoscopic therapy, and recurrences or progression to EAC during endoscopic therapy are not uncommon12,17,18,19,20,21,22,23. As for LGD, although continued surveillance every 12 months is an acceptable approach, there has been a shift toward endoscopic ablation therapy in recent years4,14,15,16,24,25,26. Consequently, there is greater emphasis on optimization of the diagnosis of dysplasia as well as identification of patients who are more likely to progress to HGD/EAC and/or have a poor response to endoscopic therapy. 
However, considering the annual cancer risk for patients with non-dysplastic BE (NDBE) is low (0.1–0.5% per year)27,28,29, identification of reliable biomarkers of low risk to allow prolongation of surveillance intervals compared with the current recommendations of repeating endoscopy every 3–5 years in those with NDBE4 remains an important goal of biomarker research.

Even though endoscopic therapy has revolutionized the treatment of BE patients, the current surveillance protocols based on the histologic classification of BE have several shortcomings, including limited sampling of the affected BE segment (leading to false negative biopsy results), sampling error of potentially neoplastic lesions, and interobserver variability among pathologists in the diagnosis and grading of dysplasia, particularly for LGD30,31,32,33,34,35. In fact, there is evidence that the current surveillance protocols may not be effective in reducing mortality from EAC, with one study demonstrating that patients with fatal disease were nearly as likely to have received surveillance (55%) as were controls (60%)36. Furthermore, the rate of missed HGD/EAC (i.e., diagnosed within 1 year of negative endoscopy) is high (19–24%)37, suggesting that early repeat endoscopy, ideally within 1 year of an initial BE diagnosis, may be crucial, although the cost-effectiveness of this approach remains to be determined. Notably, in a recent meta-analysis of 24 cohort studies of NDBE or LGD patients followed for at least 3 years after index endoscopy, Visrodia et al. reported that ~25% of EACs and 27% of HGD/EACs were classified as missed37. When only NDBE patients were considered, the rates of missed EACs and HGD/EACs were 24% and 19%, respectively37.

Consequently, there is an increased interest in ancillary tests that could (1) improve the diagnostic accuracy of dysplasia (and its grading) in challenging situations to avoid a repeat endoscopic examination with biopsies (potentially more expensive than most ancillary tests); (2) predict which NDBE or LGD patients are at a higher risk for developing HGD/EAC (including missed lesions) so that such patients can be identified early and successfully treated with endoscopic therapy to prevent progression to EAC; (3) identify patients who are less likely to develop HGD/EAC so that the surveillance of low-risk patients can be reduced; and/or (4) predict those more likely to have a poor response to endoscopic therapy. In this regard, a variety of biomarkers and assays, such as p53 immunostaining38,39,40,41,42, Wide Area Transepithelial Sampling with Three-Dimensional Computer-Assisted Analysis (WATS3D)43,44,45,46,47,48,49,50, TissueCypher51,52,53,54,55,56,57,58, mutational load analysis (BarreGen)59,60,61,62, fluorescence in situ hybridization (FISH)7,63,64,65,66,67, and DNA content abnormalities as detected by DNA flow cytometry68,69,70,71,72,73,74,75,76,77 have been extensively evaluated. Although none of these studies have comprehensively evaluated the potential utility of these biomarkers in reducing mortality from EAC compared with the current surveillance standards, they have demonstrated a potential benefit when used in combination with histologic findings to assist in the diagnosis and/or risk stratification of BE and dysplasia. As such, this review provides an overview of these biomarkers and tests that appear most promising based on the availability of multiple published results and/or on their commercial availability.

## Dysplasia as a biomarker for risk stratification

Currently, dysplasia is the primary clinical biomarker used for risk assessment in the surveillance and management of BE patients. Morphologically, dysplasia is defined as unequivocal neoplastic epithelium that remains confined within the basement membrane of the epithelium from which it developed, and BE biopsies are classified as (1) negative for dysplasia, (2) indefinite for dysplasia (IND), (3) LGD, or (4) HGD3,30,31,78. The rationale for its use as the primary clinical biomarker is based on the premise that EAC in BE patients develops through a sequence of molecular (i.e., loss of CDKN2A followed by TP53 inactivation and aneuploidy) and morphologic changes that begin with intestinal metaplasia and then progress from LGD to HGD, and ultimately to EAC3,5,68,71,78,79,80,81,82. This is also supported by multiple outcome studies demonstrating a strong correlation of higher EAC rates with increasing levels of dysplasia. While the annual cancer risk for NDBE patients is low (0.1–0.5% per year)27,28,29, HGD is considered a key premalignant step that is associated with a greater risk of either already having EAC or developing it on follow-up (16–100%)83,84,85,86,87,88. The natural history of LGD is more controversial, with variable progression rates ranging from 0.4 to 13.4% per year89,90,91. It is worth emphasizing that it is often not possible to distinguish true progression from missed lesions in these outcome studies. In other words, a patient diagnosed with LGD may harbor an unsuspected HGD at the same site or elsewhere in the esophagus. In such a case, HGD/EAC detected on follow-up may represent either true progression or a delayed/missed diagnosis.

Unfortunately, dysplasia has a number of limitations as a biomarker. First, dysplasia is often focal and may not be endoscopically visible, so sampling error is a major issue as most surveillance techniques sample only a minority of the BE segment. Although Reid et al. reported that four-quadrant biopsies taken every 1 cm in the BE segment (also known as the “Seattle protocol”) can consistently detect early cancers arising in HGD92, most endoscopists do not adhere to this protocol and take too few biopsies, compounding the problem of sampling error. Second, consistent diagnosis and grading of dysplasia by histology is challenging, as exemplified by a relatively high degree of interobserver variability in the histologic classification of BE among pathologists, particularly toward the lower end of the spectrum30,31,32,34,93. The most pronounced variability is linked to the diagnosis of LGD, with a recent study illustrating sub-optimal interobserver agreement for LGD (kappa = 0.11) even among gastrointestinal (GI) pathologists32. In another study, up to 85% of LGD cases were downgraded to NDBE or IND following expert pathology review91. Even though an excellent interobserver agreement for HGD among GI pathologists has been reported in earlier studies30,31, a more recent study demonstrated that upon review of 485 HGD samples from both academic and private centers by experienced GI pathologists, up to 40% of these cases were reinterpreted as LGD, IND, NDBE, or no BE93. Consequently, both the American College of Gastroenterology and the American Gastroenterological Association strongly recommend that all potential dysplasia cases be confirmed by at least one experienced GI pathologist before embarking on a management plan4,94. This recommendation is further supported by several studies demonstrating a strong correlation between the number of pathologists who agree with a diagnosis of dysplasia and the rate of neoplastic progression. For instance, Skacel et al. 
showed that the rate of progression was 80% when three GI pathologists agreed on a diagnosis of LGD, while the rate was 41% when two GI pathologists agreed95. Finally, even if the issues stated above could be resolved, there are no observable histologic features in NDBE or LGD on hematoxylin and eosin (H&E) staining that can accurately identify those patients most likely to develop HGD/EAC versus remain stable for years. Indeed, recent studies have suggested that many EACs develop through a more direct, accelerated pathway in which TP53 mutation is followed by doubling of the whole genome, rapidly resulting in genomic instability, oncogenic amplifications, and EAC, rather than through the stepwise accumulation of tumor suppressor alterations96,97. This accelerated pathway to EAC might explain in part why endoscopic surveillance is sometimes unsuccessful in detecting dysplasia before the development of EAC in some BE patients36. Overall, these results suggest that additional or alternative biomarker(s) may be useful to better risk stratify BE patients.

## p53 immunostaining as a diagnostic and risk stratification biomarker

Immunohistochemistry (IHC) for p53 to confirm a dysplasia diagnosis or predict likelihood of progression to EAC is of interest but has limitations, as summarized by others38,98. The TP53 gene encodes p53, a tumor suppressor protein that guards against the accumulation of mutations. Normal cells have low levels of this protein in their nuclei, but the gene and protein are upregulated in the presence of DNA damage or stress, resulting in DNA repair, growth attenuation, and apoptosis. In dysplastic cells and EAC, mutations in TP53 lead to aberrant nuclear accumulation of abnormal p53 protein (which has a long half-life) that can be detected on immunostaining (Fig. 1A, B). Alternatively, truncating mutations/bi-allelic inactivation of TP53 lead to complete loss of nuclear expression of the protein, termed the “null” pattern (Fig. 1C, D). Light and patchy staining using p53 IHC reflects normal physiologic activity of the protein to maintain cell health and is the pattern of cells that are TP53 wild-type (Fig. 1E, F). However, in one study of p53 staining, aberrant expression was detected in ~10% of cases regarded as non-dysplastic, ~40% of LGD, ~85% of HGD, and all EACs39. Strong nuclear staining generally aligns with TP53 mutations but can still be detected in cases of LGD lacking TP53 mutations. Bian et al. reported that although 95% of cases interpreted as LGD had p53 expression on IHC, TP53 mutations were only detected in about a third40.

Pathologists in many institutions, particularly in the UK and Europe, have advocated for the use of universal p53 IHC in BE cases to detect dysplasia that might be otherwise overlooked, to the point that the British Society of Gastroenterology endorsed adding it reflexively in routine practice99. The recommendation seemed to reflect studies directed at predicting progression of NDBE to HGD/EAC rather than establishing an initial diagnosis. In one study, scoring p53 immunostaining as “significant” in the presence of strong or absent staining versus “not significant” resulted in kappa scores on the order of 0.6 (strong reproducibility), whereas scoring morphologic features as “negative for dysplasia” versus “IND” versus “LGD” versus “HGD” (4 categories) resulted in kappa scores of 0.3, an unsurprising result since grouping cases into 2 categories versus 4 produces less observer variability ab initio41. In fact, when the authors grouped the morphologic interpretation into only two categories as they had done for p53, namely “definite dysplasia” versus “no dysplasia” on H&E, they achieved a comparable kappa score of 0.55 for morphology alone, considerably weakening their conclusions concerning p53. Nonetheless, the results of many studies have supported the use of p53 IHC as a marker of the likelihood of progression to HGD/EAC in patients whose biopsies show H&E findings of negative for dysplasia, IND, or LGD. The latter studies are summarized by Srivastava et al.38.
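The effect of the number of categories on kappa can be illustrated with a small calculation. The sketch below computes Cohen's kappa for two readers, first with four diagnostic categories and then after merging them into two. The ratings are invented purely for illustration and are not data from the cited study; the names `cohen_kappa` and `collapse` are likewise assumptions of this example.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and equal in length")
    n = len(rater_a)
    # Observed agreement: fraction of cases given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal frequencies per label.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)

# Hypothetical ratings for ten biopsies (NOT taken from any study).
reader_1 = ["NEG", "NEG", "NEG", "IND", "IND", "LGD", "LGD", "HGD", "HGD", "HGD"]
reader_2 = ["NEG", "NEG", "IND", "NEG", "LGD", "LGD", "IND", "HGD", "HGD", "LGD"]

def collapse(label):
    """Merge four categories into two, as in a dysplasia / no dysplasia split."""
    return "dysplasia" if label in ("LGD", "HGD") else "no dysplasia"

kappa_4 = cohen_kappa(reader_1, reader_2)
kappa_2 = cohen_kappa([collapse(x) for x in reader_1],
                      [collapse(x) for x in reader_2])
print(f"4 categories: {kappa_4:.2f}, 2 categories: {kappa_2:.2f}")
```

In this toy data set, many of the disagreements fall between adjacent categories (e.g., NEG versus IND), so they disappear once the categories are merged and kappa rises. Real data need not behave this way, which is why the comparison is only fair when both methods are scored with the same number of categories.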

More recently, Redston et al. studied “progressors” versus “non-progressors” gleaned from a large commercial laboratory system42. The authors used a retrospective set of over 500 BE patients with or without known progression from negative for dysplasia, IND, or LGD to HGD/EAC. To establish their IHC scoring system (Table 1), the authors obtained DNA for sequencing from 92 BE samples derived from 28 progressors and 6 non-progressors. TP53 mutations were identified from 50 of the samples, specifically from 21 patients who progressed and 3 who did not. In ~90% of cases, the TP53 mutational status correlated with p53 immunostaining results. The authors validated their p53 staining criteria using 50 NDBE and 50 HGD biopsies. They found abnormal p53 staining in 4% of NDBE and 96% of HGD, thereby confirming their scoring criteria. In the testing phase, amongst 646 NDBE patients, 20 progressed to LGD, and 10 to HGD/EAC. Abnormal p53 immunostaining was detected in half of the progressors, resulting in good specificity but poor sensitivity. Essentially, amongst 646 NDBE patients, adding p53 staining offered additional information for only 15, and arguably, progression to LGD is not truly progression. The authors suggested that patients with abnormal p53 expression in NDBE have comparable rates of progression to those who have LGD. They further suggested annual endoscopy for such persons. However, the study was limited by the lack of uniformity of the screening and surveillance methods of the gastroenterologists submitting their materials to the commercial laboratory. Also, it is worth noting that although the risk of progression to HGD/EAC is reported to decline with an increased number of endoscopies showing NDBE100,101, most studies on ancillary tests, including those of p53, do not clarify the number of negative endoscopies prior to the development of HGD/EAC, complicating the interpretation of their outcome data.

In a 2018 study, Ten Kate et al. reported that simply refining histologic criteria for diagnosis of LGD identified BE patients likely to progress102. Similarly, other refined histologic criteria allowed another group that included one of us (EAM) to essentially eliminate the IND category with excellent prediction of outcome103. Ten Kate et al. also used p53 staining alone and achieved similar results to those afforded by use of H&E alone, with some synergy for the two combined but probably not enough to support reflex testing102. Years ago, two of us (GYL and EAM) were part of a group that also achieved excellent prediction of outcome using H&E alone despite imperfect interobserver agreement31,104. We would also point out that the Kaplan–Meier curves for progression of NDBE with and without abnormal p53 staining from the study by Redston et al. do not differ dramatically because so few patients without histologic dysplasia progress regardless of p53 immunostaining status42.

Incorporating reflex IHC for p53 is not terribly expensive in the individual patient, and reimbursement is readily obtained. The Medicare fee schedule listed a 2021 figure of $99.82 for global code 88342 (immunostaining; technical only $67.41), later updated to $106.07 (technical only $70.82), whereas the H&E global code (88305) affords $66.76 (technical only $32.06), updated to $71.52 (technical only $33.84). This means that adding a single p53 stain raises the cost per biopsy to roughly two and a half times that of H&E alone. This might be prohibitively expensive if p53 staining is added to every single esophageal biopsy demonstrating intestinal metaplasia. No cost analysis was provided by Redston et al.42, although pathologists might be motivated by payments to add p53 staining to all BE samples that lack dysplasia or show LGD. Overinterpretation of normal staining, however, might result in unnecessary surveillance or ablation procedures.
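The fold change follows directly from the quoted fees. The short sketch below reproduces the arithmetic under two assumptions of this example: one p53 stain is billed per biopsy, and the global fees are used. The dictionary keys and the function name `cost_ratio_with_p53` are hypothetical labels.

```python
# Global fee figures quoted in the text (US dollars).
FEES = {
    "2021": {"he_88305": 66.76, "ihc_88342": 99.82},
    "updated": {"he_88305": 71.52, "ihc_88342": 106.07},
}

def cost_ratio_with_p53(he_fee, ihc_fee):
    """Ratio of (H&E + one p53 stain) to H&E alone for a single biopsy."""
    return (he_fee + ihc_fee) / he_fee

ratios = {
    label: cost_ratio_with_p53(fees["he_88305"], fees["ihc_88342"])
    for label, fees in FEES.items()
}
for label, ratio in ratios.items():
    print(f"{label}: {ratio:.2f}x the H&E-only cost")
# 2021: 2.50x the H&E-only cost
# updated: 2.48x the H&E-only cost
```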

## TissueCypher as a diagnostic and risk stratification biomarker

The objective of TissueCypher is to evaluate samples from BE patients diagnosed as negative for dysplasia, IND, or LGD on routine histologic evaluation to identify those patients most likely to progress to HGD/EAC so that intensified screening or ablation can be offered to them. Similarly, the technique is intended to identify patients who are unlikely to progress such that their surveillance can be reduced.

TissueCypher uses immunofluorescent labeling of sections from formalin-fixed paraffin-embedded (FFPE) samples for p16, AMACR, p53, HER2, CK20, CD68, COX-2, HIF-1α, and CD45RO, together with Hoechst staining dye (Fig. 3)51,52,53,54,55,56,57,58. Hoechst dye allows fluorescent detection of DNA105, thereby permitting image analysis software to identify nuclei as discrete objects in tissue. It also allows the software to assess nuclear area, solidity, and DNA content. Some of the markers are combined on the same slide51,52. An image analysis algorithm then quantifies 15 different “image features” from the stained slides (Table 2), and these features are combined into a risk score. Samples are still reviewed in the typical manner (routine diagnosis by local pathologists), after which sections are prepared and subjected to the TissueCypher staining and algorithm.

This method offers the advantage of using a variety of markers with a consistent interpretation, thus eliminating interobserver variability, although this does not necessarily mean an accurate interpretation. Using the company’s platform, a risk score for progression is stratified as low, intermediate, or high, but there is some advantage to combining the intermediate and high risk scores. Although similar data are reported in all studies from the TissueCypher team51,52,54,55,56, the initial and some recent studies were performed in Europe, and in 2020, a US-based study from two institutions was added53. The latter study was a case-control study from patients with biopsy diagnoses of negative for dysplasia (n = 227), IND (n = 23), and LGD (n = 18). The samples were from 58 patients who progressed to HGD/EAC (median time to progression of 2.7 years; 7/58 progressed after 5 years), and from 210 patients who did not progress (median surveillance time of 7 years). In this study, the prevalence-adjusted proportions of patients scoring low, intermediate, and high risk using the TissueCypher method were 84.2%, 9.4%, and 6.4%, respectively. The sensitivity and specificity of the test at 5 years for the 3-tier TissueCypher classification (low, intermediate, and high risk) were 29% and 86%, respectively, and 40% and 86%, respectively, for the 2-tier classification (low and intermediate/high risk combined). By comparison, the sensitivity and specificity of an expert diagnosis of LGD were 19% and 88%, respectively, and the sensitivity and specificity of the initial community diagnoses of LGD (i.e., diagnosis recorded in the health records) were 26% and 66%, respectively. Of 51 patients who progressed within 5 years, 14 scored high risk, 6 scored intermediate risk, and 31 scored low risk. Among 210 patients who did not progress, 13 scored high risk, 18 scored intermediate risk, and 179 scored low risk. 
Using the TissueCypher test, the prevalence-adjusted positive predictive value (PPV) was 23%; i.e., 23% of patients who score high risk would progress to HGD/EAC within 5 years. The prevalence-adjusted negative predictive value (NPV) was 96.4%. The risk prediction test also showed improved risk stratification when compared to p53 alone using the automated scoring.
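The relationship between the quoted counts and the accuracy measures can be sketched as follows. This example computes crude sensitivity and specificity from the patient counts given above and shows how PPV and NPV depend on prevalence through Bayes' rule. The published estimates may rest on adjustments not described here, so the crude ratios are not expected to reproduce them exactly. The function names and the prevalence used in the last lines are hypothetical.

```python
# Patient counts quoted in the text, by TissueCypher risk class.
PROGRESSORS_5Y = {"high": 14, "intermediate": 6, "low": 31}      # n = 51
NON_PROGRESSORS = {"high": 13, "intermediate": 18, "low": 179}   # n = 210

def crude_sensitivity(positive_classes):
    """Fraction of 5-year progressors falling in the 'positive' risk classes."""
    total = sum(PROGRESSORS_5Y.values())
    return sum(PROGRESSORS_5Y[c] for c in positive_classes) / total

def crude_specificity(positive_classes):
    """Fraction of non-progressors falling outside the 'positive' risk classes."""
    total = sum(NON_PROGRESSORS.values())
    negatives = total - sum(NON_PROGRESSORS[c] for c in positive_classes)
    return negatives / total

def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) via Bayes' rule for an assumed prevalence."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)

# 2-tier reading: intermediate and high risk both count as a positive test.
two_tier = ("high", "intermediate")
sens = crude_sensitivity(two_tier)   # 20 / 51
spec = crude_specificity(two_tier)   # 179 / 210
print(f"crude sensitivity {sens:.1%}, crude specificity {spec:.1%}")

# Hypothetical 5-year prevalence of progression, chosen only for illustration.
ppv, npv = predictive_values(sens, spec, prevalence=0.05)
print(f"PPV {ppv:.1%}, NPV {npv:.1%} at an assumed 5% prevalence")
```

Because a case-control design enriches for progressors, PPV and NPV cannot be read directly off the sample counts, which is why the study reports prevalence-adjusted values.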

Overall, expert pathologists’ diagnosis of LGD outperformed TissueCypher in specificity and PPV, but TissueCypher was more sensitive. However, this was not the case for samples from patients with no dysplasia. Patients with expert pathologist-confirmed NDBE who scored high risk using TissueCypher were at about 5-fold increased risk of progression compared with those who scored low risk. The adjusted PPV for the test in expert pathologist-confirmed NDBE was 26%, indicating that 26% of patients without dysplasia but with a high risk score using TissueCypher will progress within 5 years, a rate similar to that associated with an expert diagnosis of LGD.

## DNA content abnormalities as detected by DNA flow cytometry as a diagnostic and risk stratification biomarker

Since the 1980s, a number of studies have consistently demonstrated the potential utility of DNA flow cytometry in the diagnosis and risk stratification of dysplasia in BE patients68,69,70,71,72,73,74,75,76,77. Although its availability has been limited to a few medical centers due to perceived technical demands and use of fresh tissue in earlier studies68,69,70,71, subsequent studies have successfully employed FFPE tissue for DNA flow cytometric analysis to generate high-quality DNA content histograms, demonstrating the feasibility of this methodology72,73,74,75,76,77. For optimal results, the computer program Multicycle (De Novo Software, Glendale, CA) should be used to analyze DNA content histograms68,69,70,71,76,77. The published consensus guidelines for clinical DNA flow cytometry should be followed107,108. Most epithelial cells are normally in the G0/G1 phase of the cell cycle and have diploid (2 N) DNA content, while less than 6% of cells have tetraploid (4 N) DNA content (G2) (Fig. 5A, B). Aneuploidy is defined as an extra G0/G1 peak that is bimodally separated from the normal diploid G0/G1 peak (Fig. 5C, D). The presence of a G2/tetraploid (4 N) fraction greater than 6% (with DNA index of 1.9–2.1) is also classified as abnormal due to its strong association with neoplasia (Fig. 5E, F)5,69,71,76,77.
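The interpretive criteria above amount to a simple decision rule, sketched below. This is an illustration of the stated thresholds only, not the logic of Multicycle or of any clinical software; the `HistogramSummary` fields and the function `classify_dna_content` are hypothetical, and peak detection itself is assumed to have been done upstream.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class HistogramSummary:
    """Hypothetical summary of one DNA content histogram."""
    has_separate_aneuploid_peak: bool      # extra G0/G1 peak distinct from the diploid peak
    g2_tetraploid_fraction: float          # fraction of cells in the 4N (G2) region, 0 to 1
    g2_dna_index: Optional[float] = None   # DNA index of the 4N peak, if measured

def classify_dna_content(h: HistogramSummary) -> str:
    """Apply the criteria stated in the text to a summarized histogram."""
    # An extra, bimodally separated G0/G1 peak defines aneuploidy.
    if h.has_separate_aneuploid_peak:
        return "aneuploid"
    # A 4N fraction above 6% with a DNA index of 1.9-2.1 is also abnormal.
    elevated = h.g2_tetraploid_fraction > 0.06
    in_window = h.g2_dna_index is not None and 1.9 <= h.g2_dna_index <= 2.1
    if elevated and in_window:
        return "elevated 4N"
    return "normal"

print(classify_dna_content(HistogramSummary(False, 0.04, 2.0)))  # normal
print(classify_dna_content(HistogramSummary(False, 0.08, 2.0)))  # elevated 4N
print(classify_dna_content(HistogramSummary(True, 0.03)))        # aneuploid
```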

A recent retrospective study analyzed 80 FFPE BE samples with HGD, 38 LGD, 21 IND, and 14 NDBE and reported that the frequency of DNA content abnormalities (aneuploidy or elevated 4 N fraction) increases with increasing histologic grade of dysplasia: 0% of NDBE, 9.5% of IND, 21.1% of LGD, and 95% of HGD77. As a diagnostic marker of HGD, the estimated sensitivity and specificity of abnormal DNA content were 95% and 85%, respectively. Interestingly, DNA flow cytometry also identified a subset of LGD and IND patients who are at higher risk for subsequent detection of HGD/EAC, with the univariate hazard ratios (HRs) of 7.0 and 20.0, respectively (p < 0.001)77. Considering that endoscopic therapy is increasingly being recommended for LGD patients26, abnormal flow cytometric results at baseline LGD or IND could potentially enable clinicians to recommend endoscopic therapy, whereas continued surveillance may be an acceptable approach in the setting of normal flow cytometric results. Furthermore, Bowman et al. recently demonstrated that abnormal DNA content in baseline HGD/IMC can serve as a predictive marker of persistent/recurrent neoplasia following endoscopic therapy, with the univariate and multivariate HRs of 3.8 (p = 0.007) and 6.0 (p = 0.003), respectively76. This suggests that the detection of DNA content abnormalities in baseline HGD/IMC may help to identify high-risk BE patients who may benefit from alternative therapeutic strategies (e.g., different ablation technique, combined endoscopic modalities, or endoscopic submucosal dissection) as well as long-term follow-up with shorter surveillance intervals following endoscopic therapy.

There are some advantages of using DNA flow cytometry. First, DNA flow cytometry is an inexpensive send-out test (\$350 at ARUP Laboratories; CPT code: 88182) that can be completed within 2–3 days. Second, DNA flow cytometric markers of dysplasia or progression (aneuploidy or elevated 4 N fraction) are usually absent in NDBE72,75,76,77,109,110, and features potentially altering the histologic interpretation (i.e., increased acute inflammation or ulceration) do not cause aneuploidy or elevated 4 N fraction, which can be very helpful in evaluating IND cases69,111. In contrast, many other genetic and chromosomal abnormalities detected in BE (including 9p LOH [site of CDKN2A], 17p LOH [site of TP53], and mutations of TP53 and CDKN2A) tend to occur early and frequently throughout large areas of BE5,6,7,8,9,10,112,113,114, even before the first histologic sign of dysplasia, limiting their utility as a diagnostic or prognostic marker of dysplasia in BE patients.

In conclusion, as the current surveillance methods based on the histologic diagnosis and classification of dysplasia imperfectly assess the risk of BE patients, especially those with IND or NDBE histology, there is an increasing demand for ancillary tests to aid in the diagnosis/grading of dysplasia and risk stratification of BE patients. In cases with equivocal histology, one may argue that a repeat endoscopic examination with biopsies may provide the answer without the need of an ancillary test. However, this approach is likely to be more expensive than most ancillary tests. In this regard, several biomarkers and assays, including p53 IHC, WATS3D, TissueCypher, mutational load assessment (BarreGen), FISH, and DNA content abnormalities as detected by DNA flow cytometry have been demonstrated as ways to support a dysplasia diagnosis and aid in risk assessment for the development of HGD/EAC (Table 3). More importantly, many of these tests are currently available in academic centers and commercial laboratories, and often utilize FFPE, obviating the need to obtain separate samples. Although none of these tools are widely used in practice, there is an increased interest among gastroenterologists to pursue ancillary tests in BE surveillance biopsies, as they have shown promising results in identifying early neoplasia and could potentially serve as adjuncts to histologic evaluation. By providing information that cannot be assessed by morphology alone, especially if the cost is reasonable (i.e., cheaper than repeat endoscopy with additional pathologic evaluation), these tests may become attractive tools, especially for patients with inconsistent diagnoses, IND, or LGD histology. 
Like many molecular tests (e.g., next-generation sequencing) currently used in the diagnosis and management of many diseases, incorporating these tools in the management of BE patients, in conjunction with histologic evaluation, may allow for more precise surveillance and/or earlier treatment in patients at higher risk of progression, while avoiding unnecessary interventions or surveillance in those at lower risk. Prospective studies on these biomarkers (including assessment of their potential utility in reducing mortality from EAC) as well as cost-effectiveness analyses compared with the current surveillance methods are notably lacking. Until such comprehensive data exist, it is impossible to fully evaluate their potential impact and to better tailor their potential roles in the care of BE patients.