To the editor: We read with great interest the work of Lugli et al. [1] on the fascinating subject of tumor budding (TB), an independent and robust prognostic factor in colorectal cancer that lacked a standardized evaluation method until the 2016 consensus. However, there is no information on interobserver variability when the consensus is applied, and we have some concerns.

We evaluated interobserver agreement on the number of buds and the TB grade in 50 consecutive samples of colon cancer by applying the consensus criteria and found some interesting results. TB was counted through a ×20 objective lens with an eyepiece field number of 20 mm in one round, and through a ×20 objective lens with a 22-mm eyepiece in another; the latter method required applying the formula proposed by Lugli et al., which yields a decimal number. The evaluation was carried out blindly by two expert gastrointestinal pathologists and two pathology residents. We assessed the agreement between observers (κ) and compared the two methods of determining the bud count (one gives a decimal number and the other an integer) to determine whether the TB grade could be altered. The general agreement for counting the exact number of buds was 52% (κ = 0.28). When the decimal bud counts were rounded up or down to the nearest whole number, the agreement was 50% (κ = 0.25). In no case did the rounding affect the TB grade. The general agreement was 54% among the expert pathologists (κ = 0.3) and 48% among the residents (κ = 0.2). The general agreement on TB grade applying the consensus was 97.3% (κ = 0.96), with 100% agreement among experts.
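For readers who wish to reproduce the field-area correction, the arithmetic can be sketched as follows. This is only an illustrative sketch, not the code used in our evaluation: the function names are hypothetical, the reference field of 0.785 mm² (×20 objective, 20-mm field number) and the grade cut-offs (0 to 4, 5 to 9, 10 or more buds) are taken from the consensus, and the choice to round half values upward before grading is our assumption.

```python
import math

# Reference field: x20 objective with a 20-mm field number,
# i.e. a 1.0-mm field diameter and an area of about 0.785 mm^2.
REFERENCE_AREA_MM2 = math.pi * (20 / 20 / 2) ** 2


def field_area_mm2(field_number_mm, objective_magnification):
    """Area of the circular field of view in mm^2."""
    diameter_mm = field_number_mm / objective_magnification
    return math.pi * (diameter_mm / 2) ** 2


def normalized_bud_count(raw_count, field_number_mm, objective_magnification=20):
    """Scale a raw bud count to the 0.785 mm^2 reference field."""
    factor = field_area_mm2(field_number_mm, objective_magnification) / REFERENCE_AREA_MM2
    return raw_count / factor


def round_half_up(value):
    """Round to the nearest integer, halves upward (assumption for illustration)."""
    return math.floor(value + 0.5)


def tb_grade(bud_count):
    """Three-tier grade: Bd1 (0-4), Bd2 (5-9), Bd3 (10 or more buds)."""
    if bud_count < 5:
        return "Bd1"
    if bud_count < 10:
        return "Bd2"
    return "Bd3"


if __name__ == "__main__":
    raw = 12  # hypothetical count through a 22-mm eyepiece
    corrected = normalized_bud_count(raw, field_number_mm=22)
    print(round(corrected, 2), tb_grade(round_half_up(corrected)))
```

With a 22-mm eyepiece the correction factor is 1.21, so a raw count is divided by 1.21 before grading; with a 20-mm eyepiece the factor is 1 and the count is unchanged.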

We found that the concordance on the number of buds in colorectal cancer was fair, regardless of the observer’s experience, the bud count, and the counting method. Rounding the number of buds obtained with the formula up or down did not affect the TB grading, for which the agreement was almost perfect. These results reveal discrepancies between pathologists in how they apply the consensus criteria, but they also suggest that the variation in the exact number of buds reported is not significant, because the TB grade (even when the formula for correcting the field area was used) was not affected by this lack of precision.
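To make the agreement statistic concrete, a minimal sketch of Cohen's κ for two observers is given below, together with the conventional Landis and Koch descriptive bands (for example, "fair" for 0.21 to 0.40 and "almost perfect" for 0.81 to 1.00). It is an illustration under stated assumptions only: the ratings are invented, the function names are hypothetical, and the sketch covers the two-rater case rather than whatever multi-rater procedure a study with four observers would require.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of cases.")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Descriptive band for a kappa value (Landis and Koch convention)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Hypothetical grades from two observers on four cases.
    observer_1 = ["Bd1", "Bd1", "Bd2", "Bd3"]
    observer_2 = ["Bd1", "Bd2", "Bd2", "Bd3"]
    kappa = cohen_kappa(observer_1, observer_2)
    print(round(kappa, 2), landis_koch_label(kappa))
```

Because κ discounts agreement expected by chance, the percentage agreement and κ can diverge, which is why both are reported above.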

Our findings and the literature on TB essentially show that rigorous methods such as those proposed by Lugli et al. are unnecessary; pathologists are good at recognizing high-grade tumor budding regardless of training level or technique used, which demonstrates the robustness of TB. Lugli et al. made broad recommendations about the method for assessing TB, and we understand that standardizing the way a phenomenon is reported is a first and very important step. However, they did not provide any evidence that their method is superior to others in predicting tumor behavior, and they should justify their recommendations. We must also emphasize that when no high-level evidence is available on a subject under study, it is perfectly valid to have experts in the subject reach an agreement. This should, however, be done in the best possible way in order to minimize bias in the results as far as possible, and the best way is a well-conducted Delphi method [2]. Although the consensus applied some aspects of the Delphi method and included outstanding pathologists who have contributed greatly to the study of TB, this does not eliminate the potential biases of a consensus.

Do the authors have any suggestions for improving the accuracy of counting the number of buds in a case of colorectal cancer without adding time and complexity to the evaluation?

In addition, there is evidence on the impact of TB in biopsies (sometimes called “intratumoral budding”) on clinical outcomes [3, 4]; however, there is no consensus on how to report TB in this context. Do the authors recommend reporting TB in biopsies even if the TB grade is not applied? Finally, how to apply the consensus to special types such as mucinous and micropapillary carcinoma is unclear. It is not uncommon for these subtypes to show expansive invasion fronts or to invade in small clusters that do not fit the definition of buds and therefore to show little or no TB. In addition, these special types are accompanied by a non-special component, and TB is evaluated in this component rather than in the mucinous or micropapillary component. What do the authors recommend in these cases?

We congratulate the authors on the work presented and on their contributions to the study of TB; however, we believe that this method of evaluating TB is not definitive and requires validation.