Introduction

Multimodal therapy, i.e., preoperative/neoadjuvant or perioperative chemo- or radiochemotherapy followed by surgery, is currently the standard treatment of locally advanced gastrointestinal malignancies, particularly esophageal, gastric, and rectal carcinomas [1,2,3,4,5,6,7,8,9]. Regressive changes can be observed by macroscopic and histopathological investigation of the resection specimens. Assessing tumor regression can be challenging; even within one tumor entity, regressive changes may vary from patient to patient. They can also vary within comparable histologic subtypes or grades of tumor differentiation [10,11,12,13]. Various attempts have been made to categorize these changes into tumor regression grading systems, particularly for esophageal, gastric, and rectal carcinomas [12, 14,15,16,17,18,19]. The two major principles common to these systems are either the estimation of residual tumor in relation to fibrotic changes, or the estimation of residual tumor in relation to the previous tumor site, which can be expressed as a percentage or in a descriptive manner [20]. Application of tumor regression grading in clinically annotated case series has shown that it can provide highly valuable prognostic information, particularly as complete or near-complete tumor regression is usually associated with a better prognosis [21,22,23,24,25].

Although some tumor regression grading systems are widely used in clinical practice and are also used as surrogate markers for therapy response and endpoints in research [26] and clinical trials [21, 22, 27], there is still no consensus regarding which system should be used and on which tumor entity. Moreover, which of these systems are actually being used by pathologists in daily routine practice is unknown, let alone what standards of grossing and histologic work-up they implement on the resection specimens. Finally, the challenges pathologists face in applying tumor regression grading systems, as well as their concerns regarding the limitations of these systems are also unclear.

The aim of this study was therefore to create and distribute a survey about tumor regression grading among pathologists from multiple countries who have a special focus on gastrointestinal pathology. The survey covered critical issues such as their routine practices in grossing and histologic work-up of neoadjuvantly treated gastrointestinal tumor resections, the tumor regression grading systems they used or preferred, and their opinion regarding an “optimal” regression grading system.

Materials and methods

Questionnaire

The goal of the survey was to be concise but to cover critical issues that may frequently affect practicing pathologists, as raised in the literature [20, 28,29,30] and in international forums. Finally, a 23-item questionnaire (see Box 1) was developed comprising the following topics:

(a) Grossing and histologic work-up of gastrointestinal resections after neoadjuvant therapy (four questions)

(b) Usage of specific tumor regression grading systems (four questions, one with subunits)

(c) Preference with regard to the components of an “ideal” tumor regression grading system (two questions)

(d) Opinion regarding difficult issues, such as assessment of fibrosis, residual tumor, acellular mucin (two questions, both with subunits)

(e) Regression in lymph nodes (three questions)

(f) Tumor regression grading in non-luminal gastrointestinal cancers including liver metastases (two questions)

(g) Demographic data (four questions)

(h) Free comments (one question)

A table describing frequently used tumor regression grading systems was provided via a link to a Google database (Table 1). It covered the systems according to Mandard [15], Dworak [14], Ryan [31], and the American Joint Committee on Cancer (AJCC)/College of American Pathologists (CAP) [32] as examples of grading systems that refer to the relation of tumor to fibrosis, and the Becker [12], Rödel [17], and Japanese Gastric Cancer Association [33, 34] systems as examples of grading systems that use the percentage of residual tumor as the reference for regression grading.

Table 1 Overview of commonly used tumor regression grading systems (as provided in the survey)

During a pre-test period, three commercially available survey tools were tested [35], and the SurveyMonkey online tool (https://de.surveymonkey.com) was finally chosen because it offered the best handling options, including statistics.

Participants

The survey was announced at two major pathology congresses (the 107th annual meeting of the United States and Canadian Academy of Pathology, 2018, in Vancouver, and the 30th European Congress of Pathology, 2018, in Bilbao) and distributed online through several national and international communities of gastrointestinal pathologists, starting in May 2018 for North American pathologists and in September 2018 for European pathologists and pathologists from other regions. Participants were expected to have a focus on or special interest in gastrointestinal pathology. Membership in an official community of gastrointestinal pathologists was not required.

Evaluation

The survey was closed in February 2019. For descriptive statistical analysis, IBM SPSS Statistics V 24 (IBM Corporation, Armonk, USA) and the options provided by SurveyMonkey were used. Comparisons between groups were made using cross tabs and chi-square or Fisher's exact tests.
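As an illustration of the kind of group comparison described above, the following minimal sketch computes a two-sided Fisher's exact test for a 2x2 cross tab using only the Python standard library. It is not the SPSS procedure used in the study; the function name `fisher_exact_two_sided` and the toy counts are assumptions introduced purely for illustration and are not survey data.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables that are no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical toy counts (NOT survey data): rows are two regions,
# columns are "always orders deeper sections" yes / no.
toy_table = (3, 0, 0, 3)
p_value = fisher_exact_two_sided(*toy_table)
print(round(p_value, 3))  # 0.1
```

The exact test is usually preferred over the chi-square approximation when expected cell counts are small, which can happen in subgroup comparisons with few respondents per region.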

Results

A total of 203 pathologists participated in the study and 173 (85%) of them answered every question. Of the 30 participants who did not complete the entire survey, 9 did not use tumor regression grading systems, so they automatically skipped all related questions. This leaves 21 participants who did not complete the questionnaire without a specific reason. The complete results (SurveyMonkey export) can be found in Supplemental file 1. The average time for completion of the survey was 6 min and 33 s. There were three peaks of answers: two immediately after the distribution of the survey through e-mails by the two major working groups and a third after one reminder.

Demographics

Detailed demographic data were available from 182 participants. Fifty-two participants (29%) of those who answered the demographic specific questions were from North America, and 92 (50%) from Europe, among them 23 (13%) from Western Europe, 38 (21%) from Central Europe, and the remaining from South West, South East and Eastern Europe. Eighteen participants (10%) were from Australia and Oceania. The remaining participants were from Central and South America and Africa. Only six participants were from Asia.

One-hundred thirty-two participants (72%) work in an academic center, and 26 (15%) and 24 (13%) work in private practice or public non-academic centers, respectively. More than 20 years of practice was reported by 69 participants (38%), 11–20 years by 50 (27%), and 6–10 years and 1–5 years by 32 (18%) and 31 (18%) participants, respectively.

The majority (110 participants; 54%) sign out >20 post-neoadjuvant therapy gastrointestinal resection specimens per year. Forty participants (20%) deal with ten or fewer of these specimens per year.

Macroscopic and histologic work-up

Data were available from 203 participants. One-hundred eighty-seven (92%) use a standardized protocol for the work-up of resection specimens. One-hundred and nine participants (54%) embed the tumor bed completely up to a certain size (not specified; “if huge I do not submit the whole thing”), and 54 (27%) always submit the complete tumor bed. Only two participants investigate a maximum of three blocks. For standard histology, most of the pathologists use hematoxylin and eosin only (120; 59%), 55 (27%) use hematoxylin and eosin plus immunohistochemistry, and 28 (14%) use hematoxylin and eosin plus special stains (not specified). In cases where the lesion was completely embedded and no tumor is seen on first sections, 142 participants (70%) order deeper sections if the first ones were not cut adequately, and 35 participants (17%) would always order deeper sections. Twenty-six participants (13%) would not order deeper sections.

Standardized grossing was considered very important by 130 participants (75%) and standardized histology work-up by 116 (67%) (Fig. 1).

Fig. 1

Questions regarding work-up (macroscopically and histologically): a standardized protocol for grossing; b histology work-up; c blocks submitted; d approach for detection of residual tumor; e importance of several issues regarding work-up and reporting tumor regression grading

Tumor regression grading systems

One-hundred ninety-two participants (95%) use tumor regression grading systems. Most pathologists (107 people, 62%) are familiar with systems other than the one(s) they use in daily practice. Almost half of the participants (85 people, 49%) are involved in, or familiar with, clinical studies or research projects on gastrointestinal cancers in which tumor regression grading systems are a crucial factor for data generation and interpretation. Standardized reporting of tumor regression (each tumor entity separately) was considered very important by 89 participants (51%) and moderately important by 43 (25%). Seventy-one participants (41%) preferred one system for all gastrointestinal cancers; this was moderately important for 43 people (25%) and neutral or unimportant for 59 (34%).

For esophageal squamous cell carcinomas, the most frequently used TRG systems were Mandard and AJCC/CAP (62 participants; 36% each), followed by Ryan (16 participants; 9%).

For esophageal adenocarcinomas and gastroesophageal junction carcinomas, it was the AJCC/CAP system (63 participants; 36%), followed by Mandard (51 participants; 29%) and Becker (25 participants; 14%). Results were similar for gastric cancer (AJCC/CAP: 62 participants, 36%; Mandard: 43 participants, 25%; Becker: 25 participants, 14%). For rectal cancer, most participants used the AJCC/CAP system (66 participants; 38%), followed by Ryan (28 participants; 16%) and Mandard (25 participants; 14%). For all entities, a certain number of participants (ranging from 13 to 22) reported regressive changes descriptively, and others (ranging from 9 to 13) used a different system, which also includes modified versions of the tumor regression grading systems listed in the survey (Fig. 2).

Fig. 2

Usage of tumor regression grading across the world: a all participants; b illustration of the regional differences in the usage of TRGs. *Note that the information about the usage of tumor regression grading according to the Japanese classification systems is based on personal experience and information and is not supported by the survey, in which only a few participants from (East) Asia replied; c subgroup of North American participants; d subgroup of European participants

In addition, 64 participants (37%) use a tumor regression grading system for pancreatic cancer (with a high variety of TRG systems, including AJCC/CAP, Ryan [31], Le Scodan [36], and Evans [37]), and 55 (32%) use one for liver metastases (mostly according to Rubbia-Brandt [38]).

Lymph nodes

Regressive changes in lymph nodes are reported by 147 participants (85%). However, only 55 (32%) report them in every case, 64 (37%) in regressive lymph nodes without residual tumor, and 28 (16%) in regressive lymph node metastases with evidence of residual tumor. One-hundred thirty-nine participants (80%) think that it is important to mention therapy-induced regressive changes in lymph nodes; among them, 20 (12%) would grade the changes while 119 (69%) would only report presence vs. absence. One-hundred fifteen participants (66%) think that regressive changes in lymph nodes should be part of the tumor regression grade.

The “ideal” tumor regression grading system

In response to the question of how many categories are considered reasonable for a tumor regression grading system in daily practice, the predominant number was four (89 participants; 51%), followed by three (53 participants; 31%) and five (25 participants; 14%). Only six participants considered a two-tier approach reasonable. In contrast, there was no clear preference for whether the tumor regression grade should be based on (1) fibrosis/tumor ratio in percentage, (2) a descriptive assessment of residual tumor, or (3) assessment of residual tumor in percentage; all three were chosen with similar frequency (44, 45, and 48 participants; 25%, 26%, and 28%, respectively). Fibrosis/tumor ratio in a descriptive manner was preferred by 32 participants (19%).

Identification of residual tumor was considered very easy by 24 participants (14%), easy by 96 participants (55%), and difficult or very difficult by 15 (9%). Estimation of residual tumor was considered easy by 74 participants (43%) and difficult or very difficult by 45 (26%). Estimation of therapy-induced fibrosis was considered easy by only 14 participants (8%), difficult by 83 participants (48%), and very difficult by 15 (9%). Interpretation of acellular mucin was rated as easy, neutral, or difficult by around 30% of participants each (Fig. 3).

Fig. 3

Questions regarding the “ideal tumor regression grading system” and difficult issues of tumor regression grading: a tiers for the ideal tumor regression grading system; b basis for the ideal tumor regression grading system; c difficulties in various aspects of assessing residual tumor and fibrosis

Free comments

Free comments included the following issues: practicability of applying tumor regression grading given the workload in clinical practice; cost vs. benefit with general impact on clinical consequences; description and wording; need for data-driven recommendations; biology of tumor regression (fragmentation vs. shrinkage); availability of pre-therapeutic findings for comparison as the main determinant of a tumor regression grade; discrepancies between tumor regression grade and Tumor-Node-Metastasis (TNM) staging (e.g., ypT3 tumors with little residual tumor in deep layers) and how to interpret them clinically; unification of tumor regression grading systems across cancers of the luminal gut; and problems dealing with stroma-rich tumors (e.g., poorly cohesive gastric cancer).

Subgroup analysis

The amount of experience (years of practice and number of cases per year) had no differential impact on the participants' replies or on their opinions and attitudes regarding tumor regression grading systems and related issues. Participants from academic centers more frequently used defined tumor regression grading systems rather than descriptions for gastric and rectal cancer compared with those from private practice or public non-academic hospitals (p = 0.001 each), but there was no difference in practices regarding macroscopic and histologic work-up. There were, however, striking differences in practices between participants from different regions, particularly between European pathologists and North American and Australian pathologists. While there was no difference between regions in grossing practices, North American and Australian pathologists used hematoxylin & eosin alone to assess histologic sections, whereas European pathologists more frequently used special stains or immunohistochemistry in addition to hematoxylin & eosin in their histologic work-up of post-neoadjuvant treated gastrointestinal resections (p < 0.001). Moreover, ordering deeper sections to exclude residual tumor in cases where no carcinoma is seen on initial sections is more common in Europe; a higher number of European pathologists always order deeper sections in the “no residual tumor” scenario (p < 0.001). Other differences include the almost exclusive use of the AJCC/CAP or Ryan system by North American and Australian pathologists, while in Europe other systems are common (Mandard and Becker systems for upper gastrointestinal tumors; Mandard, Dworak, and Rödel systems for lower gastrointestinal tumors; p < 0.001 for all entities; Fig. 2 and Supplemental file 2).
Of note, there were also significant differences within Europe itself: the Mandard system is more commonly used in Western Europe, and the Becker and Dworak systems are more frequently used in Central Europe (p ≤ 0.001 for each entity).

The differences between Europe and North America/Australia may also have influenced the suggestions for an “ideal system”. In Europe, pathologists more often preferred a four-tiered system (p = 0.043) and favored basing the tumor regression grade on the estimated percentage of residual tumor (p = 0.011), while in North America and Australia there was no clear preference for three or four grades or for what the ideal tumor regression grading system should be based on. Another notable regional difference was that European pathologists use regression grading systems for pancreatic cancer (p = 0.035) and liver metastases (p = 0.020) more frequently than pathologists in North America and Australia. Finally, the demand for standardized work-up and reporting was more frequently stated as “very important” in Europe compared with North America and Australia, where it was considered “important” only (p = 0.038 for macroscopy; p = 0.025 for histology; p = 0.001 for homogenization along the total luminal gut).

Discussion

We present the results of a worldwide survey about practices of tumor regression grading of gastrointestinal carcinomas after neoadjuvant therapy. We received over 200 replies; over 50% of the participants had major experience in this field with a significant annual case load of respective specimens, about two-thirds had >10 years of professional activity, and over 70% were from academic centers. This critical mass supports the validity of the survey results, which not only give a comprehensive overview of the use of tumor regression grading in daily routine practice but also present opinions regarding critical issues.

The vast majority of the participants reported a standardized grossing and histological work-up, and over 90% stated that they use a regression grading system in their reports. This highlights the positive attitude of pathologists towards this issue, despite the fact that tumor regression grading is not implemented for all tumors of the gastrointestinal tract in the Union for International Cancer Control (UICC)/American Joint Committee on Cancer (AJCC) TNM classification, not even as an additional factor. The most striking result, however, was the heterogeneity in the application of different tumor regression grading systems across different regions of the world and depending on tumor type. In the United States of America and Canada, the majority of pathologists use the AJCC/CAP system, which is recommended in the CAP guidelines and which closely resembles the system proposed by Ryan for rectal cancer. In contrast, in Europe the Mandard system (originally described in esophageal squamous cell carcinomas), the Becker system (initially described in gastric cancer), and the Dworak system for rectal cancer are used in addition to, and even more frequently than, the Ryan and AJCC systems. Interestingly, the use of these tumor regression grading systems also differs within Europe: for example, in Central Europe (including the German-speaking countries) the Becker and Dworak systems are more frequently used, and in Western Europe (i.e., the UK and the Benelux countries) the Mandard system is the most popular one. Unfortunately, we did not receive many replies from East Asia, which clearly represents a major bias of the survey. According to the experience of the authors, however, the Japanese classification systems for esophageal, gastric, and rectal cancer are almost exclusively used in Japan and Korea for these entities. This situation is therefore comparable to North America, where one authority (i.e., CAP or AJCC) recommends or favors the use of one particular system.
It should also be noted that the use of a particular system does not necessarily imply exclusive application to the entity in which it was first described. For example, the Ryan system [31], which is referenced in a modified form by the CAP [39] for tumor regression grading in cancers of the anus, esophagus, pancreas, stomach, and rectum, was originally described for rectal cancer. Standardized reporting, using comprehensive datasets, has become routine practice for pathologists in many countries where national guidelines exist, such as those of the CAP or the Royal College of Pathologists. Working groups such as the International Collaboration on Cancer Reporting try to homogenize cancer reporting between East and West, and it is expected that tumor regression grading will be a core item in the forthcoming proposed datasets. Another issue that differed between Europe and North America and Australia is the histologic approach in specific situations. An extensive to complete macroscopic investigation of the tumor bed is performed by almost all pathologists independent of the region. However, participants from Europe more frequently use special stains and immunohistochemistry in addition to routine hematoxylin & eosin staining compared with North American and Australian pathologists. They also more frequently perform a more extensive work-up in cases where no tumor was found in first sections and routinely order deeper sections even when the initial blocks were adequately cut. Currently, however, there are no data to indicate whether such approaches lead to a higher detection rate of clinically meaningful foci of residual tumor, apart from a few anecdotal reports.

The results of the survey also reflect the fact that it is currently not clear which of the various tumor regression grading systems is superior in terms of reproducibility and prognostic impact. Studies comparing interobserver agreement show similar results for several description-based systems, with substantial (0.71) to excellent (0.84) agreement using kappa values in esophageal carcinomas [40], or concordance indices between 0.65 and 0.69 for the Dworak system, a simplified three-tiered Mandard system, or the AJCC system in rectal cancers [41]. Comparisons between different concepts of TRG show slightly better values for systems that are based on percentages [42, 43]. There is convincing evidence for a significant association of the tumor regression grade with patients' outcome: numerous studies have investigated the prognostic relevance of tumor regression grading. The strongest evidence for the association between tumor regression and patient outcome has been observed for upper gastrointestinal cancers, as also shown in a recent meta-analysis [44]. With some exceptions, mainly for esophageal cancers [6, 45, 46], where partial tumor regression was also associated with significantly better outcome, patients with complete or subtotal tumor regression generally have the best prognosis [6, 45,46,47,48]. For rectal cancer, complete tumor regression was consistently shown to be associated with better prognosis, including a lower risk of local and distant recurrence [8, 49, 50], but data regarding the impact of subtotal and partial tumor regression are conflicting [41, 49,50,51,52,53,54]. Studies comparing different systems in large-scale or even trial-associated case cohorts, however, are lacking.
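To make the agreement figures quoted above more concrete, the following minimal sketch shows how a kappa value can be computed for two observers grading the same cases. It assumes the unweighted two-rater Cohen's kappa; the cited studies may have used other variants (for example weighted or multi-rater kappa), and the function name `cohens_kappa` and the example grades are hypothetical, not data from any study.

```python
from collections import Counter


def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Unweighted Cohen's kappa for two raters grading the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must grade the same non-empty case list.")
    n = len(rater_a)
    # Observed agreement: share of cases with identical grades.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal grade frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical grade throughout
    return (observed - expected) / (1 - expected)


# Hypothetical regression grades (1-4) assigned by two pathologists
# to six cases; illustrative values only.
grades_a = [1, 2, 2, 3, 4, 1]
grades_b = [1, 2, 3, 3, 4, 1]
kappa = cohens_kappa(grades_a, grades_b)
print(round(kappa, 3))  # 0.778
```

Kappa corrects raw agreement for the agreement expected by chance, which is why two observers who agree on five of six cases score 0.778 here rather than 0.833.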

Overall, most of the participants would favor a four-tiered grading system as the “ideal” tumor regression grading system. There was no predilection for a concept on which a tumor regression grading system should be based, with similar results for residual tumor in percentage, descriptive assessment, and tumor/stroma relation. This somewhat reflected the tumor regression grading system used in routine practice, in particular regarding differences between European and North American and Australian pathologists, but interestingly there was no perfect correlation, i.e., some participants who use a descriptive system would favor a percentage-based one and vice versa.

In the literature, frequently stated reasons for interobserver disagreement are the precise assessment of the relative amount of fibrosis and the discrimination between therapy-induced fibrosis and intrinsic stromal desmoplasia [29, 30, 55]. This was also observed in our survey, where the estimation of therapy-induced fibrosis was considered easy by only a few and difficult or very difficult by over half of the pathologists. While the identification of residual tumor itself was considered very easy or easy by the majority of the participants, the estimation of residual tumor was considered easy by fewer participants, and more pathologists considered this part difficult or very difficult. In contrast, the interpretation of acellular mucin, which may also cause disagreement between observers, was rated as easy, neutral, or difficult with similar frequency, in line with previously published data [48, 56]. The substantial number of pathologists expressing difficulty in evaluating post-treatment tumor raises the question of whether there is a need for more educational opportunities or easier access to teaching modules to help them implement tumor regression grading. This is necessary, given the importance of post-therapy staging of resections in modern oncology. One example of the increasing role of post-therapy evaluation is the latest AJCC staging manual, where organs such as the esophagus have new post-therapy staging categories that were absent in previous editions, although tumor regression grading is not included.

Highly valuable information was also obtained from the free comments that the participants were encouraged to add. The most frequent issues were practicability, in terms of workload and cost, which also should take into account the benefit of the work and clinical consequences. It was also emphasized that the agreement on one particular tumor regression grading system should be data-driven and in dialog with clinical colleagues. Interestingly, homogenization of tumor regression grading along the total gut was considered to not be as important as standardized grossing and histology work-up in general. In Europe, however, where standardized reporting in terms of the usage of one particular system is less commonly in place than in North America and Australia, more pathologists considered the need for standardized tumor regression grading to be very important compared to these regions where it is already performed in daily practice.

Recent work also describes the impact of tumor regression in lymph node metastases. In line with data from esophageal and rectal carcinomas [57,58,59,60], most participants would suggest reporting regressive changes in lymph nodes. At the moment, however, grading of these changes was not seen as a priority, but inclusion into a general tumor regression grading system was favored by more than half of the participants. Given that the presence of lymph node metastases is one of the major adverse prognostic factors in gastrointestinal carcinomas, both in the multimodal setting and for primarily resected tumors, further studies on regression in lymph node metastases are clearly warranted. Future work should also include the comparison between imaging of lymph nodes and the actual status in the resection specimen in order to improve preoperative clinical staging. We also asked about tumor regression grading in liver metastases, which is performed by less than half of the participants, and, in view of the current therapeutic developments, in pancreatic cancer. For the latter, one-third of the participants stated that they use tumor regression grading, but the application of tumor regression grading systems is very heterogeneous. Tumor regression grading of pancreatic cancer, however, differs from that of luminal gastrointestinal carcinomas, e.g., due to the three-dimensionality of the resection specimens, the marked tumor-intrinsic stromal desmoplasia, and the lack of large-scale data on the benefit of preoperative treatment or histological regression in this entity.

In summary, this survey provides a comprehensive and world-wide overview about routine practice in reporting tumor regression of gastrointestinal carcinomas. Our data clearly show the heterogeneity in the application of grading systems but a general positive attitude towards standardization of macroscopic and histologic work-up. This survey complements other activities in this field such as meta-analyses [44], expert reviews [20, 61, 62] and expert recommendations [63], as well as critical views published along original works [29, 30].

Standardization of reporting a tumor regression grade should, however, always consider quality criteria that would apply to any other biomarker. This includes reliability, reproducibility, and the clinical impact of a potentially proposed and agreed-upon grading system, and finally its practicability in daily practice. Implementation into the AJCC and UICC TNM classification should be the aim in order to achieve a standardized evaluation concept and the opportunity to generate comparable data. This may also help to overcome uncertainties regarding the clinical impact of tumors with little residual tumor in deeper layers of the organs, which are classified, e.g., as ypT3 but have a favorable regression grade [64]. A digital image analysis-based assessment of tumor regression, possibly with the support of machine learning and artificial intelligence, could be a potential solution towards developing an optimal and convenient grading system. If validated with patient outcome data, this type of assessment tool could provide an even more precise correlation between the amount of residual tumor and patient prognosis. Such approaches need not be limited to the amount of residual tumor or fibrosis; other patterns of regression, such as comparisons of tumor size before and after treatment, patterns of tumor fragmentation, or stromal changes, may also have biologic significance. Moreover, novel therapeutic concepts, such as molecular targeting of tumoral alterations or immunotherapy [65, 66], may be associated with different patterns of tumor response. Careful visualization and comprehensive analysis of regression, in the context of classical treatment or novel therapies, may also contribute to a better understanding of tumor regression as a biological process and help to identify new approaches to overcome resistance.