We were interested to read the paper by Baunsgaard CB and colleagues1 published in the May 2016 issue of Spinal Cord. The authors aimed to determine the intra- and inter-rater reliability of the International Spinal Cord Injury (SCI) Musculoskeletal Basic Data Set (ISCIMSBDS). Kappa statistics (ranging from κ=0.62 to 1.00) were used to measure reliability.1 Reliability (precision, repeatability or reproducibility) is an important methodological issue. For qualitative variables, relying on simple kappa is a common mistake in reliability analysis, and even weighted kappa should be used with caution, because kappa has its own limitations.2–8 Two important weaknesses of the kappa value as a measure of agreement for a qualitative variable are as follows. First, it depends upon the prevalence in each category, which means that different kappa values can be obtained for the same percentages of concordant and discordant cells. Figure 1 shows that in both situations (a) and (b) the concordant cells account for 80% and the discordant cells for 20%, yet the kappa values differ (0.38, fair, and 0.60, moderate to good, respectively). Second, the kappa value depends upon the number of categories: the more categories, the lower the kappa value.2–8 Therefore, reporting weighted kappa is highly recommended.

Figure 1. Comparison of two observers' diagnosis with different prevalence in the two categories.
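The prevalence effect can be checked with a short calculation. The sketch below is illustrative only: the cell counts in `table_a` and `table_b` are hypothetical, chosen so that both tables have 80% concordant cells and give kappa values close to the 0.38 and 0.60 quoted above; they are not taken from the original figure. The function name `cohen_kappa` is likewise our own.

```python
def cohen_kappa(table):
    """Return (observed agreement, Cohen's unweighted kappa) for a square
    contingency table: rows are observer 1, columns are observer 2."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: share of cases on the diagonal (concordant cells).
    p_o = sum(table[i][i] for i in range(k)) / n
    # Marginal proportions for each observer.
    row_p = [sum(table[i]) / n for i in range(k)]
    col_p = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance from the marginals.
    p_e = sum(r * c for r, c in zip(row_p, col_p))
    return p_o, (p_o - p_e) / (1 - p_e)


# Hypothetical counts (assumption): unbalanced prevalence, 80% concordant.
table_a = [[70, 10],
           [10, 10]]
# Hypothetical counts (assumption): balanced prevalence, 80% concordant.
table_b = [[40, 10],
           [10, 40]]

agreement_a, kappa_a = cohen_kappa(table_a)
agreement_b, kappa_b = cohen_kappa(table_b)

print(f"(a) agreement={agreement_a:.2f} kappa={kappa_a:.3f}")
print(f"(b) agreement={agreement_b:.2f} kappa={kappa_b:.3f}")
```

With identical observed agreement, the only thing that changes between the two tables is the chance-expected agreement derived from the marginals, which is what moves kappa.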

The authors reported that crude agreement ranged from 75 to 100% for each of the variables in the ISCIMSBDS.1 For reliability, it is crucial to use an individual-based approach rather than a group-based one (crude agreement).2–9 The reason is that reliability assessment concerns individual results, not a global average. In other words, it is quite possible to obtain exactly the same crude agreement for a variable between methods that have no reliability at all.2,3,9
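One way to see why a high crude agreement does not by itself establish reliability is the following sketch. It takes one reading of the point above: it compares crude (percent) agreement with chance-corrected agreement. The counts in `table` are hypothetical, not data from the study, and the helper name `agreement_and_kappa` is our own.

```python
def agreement_and_kappa(table):
    """Crude agreement and Cohen's unweighted kappa for a square table
    (rows: rater 1, columns: rater 2)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    p_o = sum(table[i][i] for i in range(k)) / n          # crude agreement
    row_p = [sum(table[i]) / n for i in range(k)]
    col_p = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_p, col_p))        # chance agreement
    return p_o, (p_o - p_e) / (1 - p_e)


# Hypothetical counts (assumption): a rare finding that the two raters
# never both record in the same patient.
table = [[90, 5],
         [5, 0]]

crude, kappa = agreement_and_kappa(table)
print(f"crude agreement={crude:.2f} kappa={kappa:.3f}")
```

Here the raters agree on 90% of patients, yet kappa is slightly below zero, because nearly all of the agreement is what the marginals alone would produce by chance.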

The authors concluded that, overall, the ISCIMSBDS is reliable. Such a conclusion may be misleading because of the inappropriate use of statistical tests. In conclusion, reliability analysis requires appropriate tests as well as correct interpretation. Otherwise, misdiagnosis and mismanagement of patients cannot be avoided.