Introduction

Fragility fractures of the pelvis have increased in recent years, accompanied by the loss of mobility and autonomy, with increased rates of mortality1,2,3,4,5. Variable morphology, dynamic fracture progression, and the resulting instability have led to the development of a computed tomography (CT)-based FFP classification6 (Fig. 1) which is as reliable as the OTA/Tile and Young and Burgess classifications7,8.

Figure 1

Fragility fracture of the pelvis (FFP) classification. The FFP classification is outlined according to the characteristic fracture morphology. The main lesions are in red and the less common lesions are in orange. Non-operative treatment is recommended for FFP I and FFP II. Operative stabilization is recommended for FFP III and FFP IV. FFP II with prolonged pain or restricted mobilization should be considered for operative treatment as well.

Based on the classification, the following recommendations were given: non-operative treatment for FFP I; non-operative or operative treatment, depending on the patient’s mobility, for FFP II; and operative treatment for FFP III and FFP IV9.

However, recent monocentric studies challenged the recommendations of Rommens and Hofmann6 and showed good results for operatively treated FFP II10,11,12,13 and non-operatively treated FFP III and FFP IV14,15.

This brings into question the usefulness of the FFP classification and the accompanying treatment recommendations.

Currently, there is a lack of data on the usefulness of the FFP classification for therapeutic decision-making, even though this data is essential for preventing harm that could occur due to inappropriate classification of the fracture categories that may lead to incorrect treatment decisions16.

Therefore, in the present study, we used CT with multiplanar reconstruction, without clinical information, to assess the intra-rater reliability and inter-rater reliability of therapeutic decision-making for FFP (non-operative vs. operative treatment). We investigated the treatment recommendations, their relation to the FFP classification6, and the effects of classification disagreement on treatment decisions.

Using this approach, we aimed to determine the reliability of treatment strategies derived from CT scans and resulting FFP classification, and thus the usefulness of FFP classification in relation to clinically relevant decisions (operative vs. non-operative). Furthermore, we investigated the favored operative procedures in relation to the FFP classification.

Methods

Study design

The study design used to evaluate the FFP classification, sample size calculation, patient demographics, anonymization, and rating procedures was reported by Pieroh et al.8 The patients from this study8 were used to analyze the association between the FFP classification and the resulting treatment decision as well as the favored operative procedure. Each observer was familiar with the FFP classification and no additional training was performed before the study. At least two weeks lay between the classification cycles and observers had no access to the previous ratings and classification cycles.

In addition to classification, recommended treatment options are also presented in Fig. 2.

Figure 2

Treatment decisions for FFP. At first, the rater had to decide between non-operative and operative treatment. No further specific treatment data were obtained when non-operative treatment was performed. For operative treatment, the rater had to decide whether to use anterior and/or posterior stabilization. For anterior stabilization, the rater could choose between procedures; no combinations were possible. For posterior stabilization, the rater could choose unilateral or bilateral stabilization and further specified the operative method; combinations were possible.

Statistical analyses

Intra-rater reliability and Inter-rater reliability

We assessed the intra-rater reliability and inter-rater reliability of the therapeutic decisions (non-operative vs. operative) as previously reported8 using specific MATLAB scripts (MATLAB, version 2013b; MathWorks, Natick, MA, USA) to calculate the Fleiss kappa coefficients17,18 and presented them as means and 95% confidence intervals (CI). The 95% CI was generated using the bootstrap method of resampling the pelves17,18.

Treatment decisions were collected during the classification process from each of the 6 experienced and inexperienced surgeons and the surgeon trained by the creator of the FFP classification (“gold standard”), all from Level-1 trauma centers8. Inexperienced raters were included as part of this pragmatic multicenter agreement study to assess the generalizability of the classification related decisions in raters with differing experience16.

The “gold standard” was included due to his adherence to the prescribed treatment recommendations6. We generated the intra-rater reliability based on the three classification cycles, separate for each rater. For the inter-rater reliability, we calculated one mean vote for each rater out of the three classification cycles and used this for further analysis. Using this data, we determined the inter-rater reliability for each classification cycle and for the overall cycles.
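The reliability computation described above can be sketched as follows. This is a minimal Python illustration and not the MATLAB scripts used in the study; the function names, the percentile method for the interval, and the default of 1,000 resamples are assumptions made for this example.

```python
import random
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of category labels.

    Every case must carry the same number of ratings (at least two).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")

    category_totals = Counter()
    observed = 0.0
    for case in ratings:
        counts = Counter(case)
        category_totals.update(counts)
        # Share of agreeing rating pairs within this case.
        pairs = sum(c * (c - 1) for c in counts.values())
        observed += pairs / (n_raters * (n_raters - 1))
    observed /= n_cases

    total = n_cases * n_raters
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        # Only one category was ever used, so kappa is undefined (0/0).
        return float("nan")
    return (observed - expected) / (1.0 - expected)


def bootstrap_kappa(ratings, n_boot=1000, seed=0):
    """Mean kappa and percentile 95% CI, resampling cases with replacement."""
    rng = random.Random(seed)
    n_cases = len(ratings)
    estimates = []
    for _ in range(n_boot):
        sample = [ratings[rng.randrange(n_cases)] for _ in range(n_cases)]
        kappa = fleiss_kappa(sample)
        if kappa == kappa:  # skip undefined (NaN) resamples
            estimates.append(kappa)
    if not estimates:
        raise ValueError("kappa was undefined in every resample")
    estimates.sort()
    lower = estimates[int(0.025 * (len(estimates) - 1))]
    upper = estimates[int(0.975 * (len(estimates) - 1))]
    return sum(estimates) / len(estimates), lower, upper
```

In this sketch, an intra-rater analysis would pass one list per pelvis containing the three decisions of a single rater, whereas an inter-rater analysis would pass one list per pelvis containing one decision per rater. Resampling whole cases keeps all ratings of a pelvis together, which mirrors the bootstrap over pelves mentioned in the text.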

We graded the intra-rater reliability and inter-rater reliability using the following categories determined by Landis and Koch19: Fleiss kappa coefficient of 1, perfect reliability; ≥ 0.81, almost perfect reliability; 0.61–0.80, substantial reliability; 0.41–0.60, moderate reliability; 0.21–0.40, fair reliability; and ≤ 0.20, poor reliability.
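The grading can be written as a small lookup. The following Python sketch is illustrative only; the function name is hypothetical, and assigning values that fall between two reported bands (for example 0.805) to the lower band is an assumption of this example, not a rule stated in the study.

```python
def grade_reliability(kappa):
    """Map a kappa coefficient to the reliability label used in the text."""
    # Assumption: values lying between two reported bands go to the lower one.
    if kappa >= 1.0:
        return "perfect"
    if kappa >= 0.81:
        return "almost perfect"
    if kappa >= 0.61:
        return "substantial"
    if kappa >= 0.41:
        return "moderate"
    if kappa >= 0.21:
        return "fair"
    return "poor"
```

For example, a coefficient of 0.5 would be labeled moderate and a coefficient of 0.3 fair under this mapping.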

Agreement analyses

We used the classifications and therapeutic decisions determined by the references, “gold standard”, submitting hospitals, and majority vote. We investigated the agreement between the therapeutic decisions of the raters and the therapeutic decisions indicated by the references for FFP. We generated the majority vote, similar to the mean vote, for calculating the inter-rater reliability and included the gold standard and submitting hospital votes.

Classification and therapeutic decision agreement

Case 48 was excluded because its FFP classification was not possible according to the gold standard and 42.9% of raters8. We generated the majority vote based on the rater votes for the classification and therapeutic decision for each case (n = 59). The gold standard and submitting hospital votes were excluded from this vote. Using the therapeutic decision of that majority vote, we separated cases according to the recommended non-operative and operative treatments. Subsequently, we allocated cases to the FFP classification based on the gold standard classification. We examined the agreement and disagreement (“gold standard” vs. raters) between the classification and therapeutic decision for each case to assess the impact of classification disagreement on the therapeutic decision.
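The majority vote and the percentage agreement with a reference can be illustrated with a short Python sketch. The function names are hypothetical, and returning `None` for a tied vote is an assumption of this example; the handling of ties is not described in the text.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent decision, or None if the top two are tied."""
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # assumption: ties are left undecided
    return counts[0][0]


def percent_agreement(rater_votes, reference):
    """Percentage of rater votes that match the reference decision."""
    if not rater_votes:
        raise ValueError("at least one vote is required")
    matches = sum(1 for vote in rater_votes if vote == reference)
    return 100.0 * matches / len(rater_votes)
```

Applied per case, `majority_vote` would yield the therapeutic decision used to separate cases, and `percent_agreement` would compare the rater votes with a reference such as the gold standard decision.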

Preferred operative therapy

Cases for which operative therapy was recommended were analyzed to assess the operative therapy (Fig. 2) preferred by all the raters, the gold standard, and the submitting hospital. For anterior stabilization, raters had to choose a single procedure. For posterior stabilization, unilateral or bilateral procedures could be chosen, and the available procedures could be combined.

Ethical statement and study registration

The following ethics committees approved the study: Ethics Commission at the Medical Faculty of the University of Leipzig, Universitäts Klinikum Jena Ethics Commission, Ethics Committee Charité-Universitätsmedizin Berlin, Ethics Commission Medical Association of Saarland, Ethics Committee at the Medical Faculty of the Eberhard Karls University and at the University Hospital Tübingen, Ethics Committee of the University of Ulm.

All of these ethics committees waived the need for informed consent because of the retrospective nature of the study and because patients consented to the use of de-identified CT scans for research when signing the hospital admission contract. Separate informed consent was therefore not obtained. Afterwards, the study was registered in the German Clinical Trials Register (DRKS00014248). CT scans with multiplanar reconstruction were performed for clinical reasons. The study was performed in accordance with the Declaration of Helsinki.

Results

Classifications and treatment decisions based on the gold standard, submitting hospital, and majority vote for each patient, are summarized in Supplementary Table S1. The patient demographics are available in Appendix II of Pieroh et al.8.

Intra- and Inter-rater reliability

The gold standard and both raters of hospital 1 had almost perfect intra-rater reliability (Table 1). The decisions of two experienced and three inexperienced raters had substantial intra-rater reliability. The overall intra-rater reliability (Table 1), overall inter-rater reliability, and inter-rater reliability of the experienced raters were moderate (Table 2). The overall inter-rater reliability of the decisions of the inexperienced raters was fair (Table 2).

Table 1 Intra-rater reliability of the therapeutic decision (non-operative vs. operative) for the three classification cycles.
Table 2 Inter-rater reliability of the therapeutic decision (non-operative vs. operative) for all classification cycles and for each separate cycle.

Agreement analyses

The highest therapeutic agreement (> 90%) was found for FFP I and the lowest was found for FFP II (minimum compared to the gold standard, 66.0%) (Table 3). For FFP I and FFP II, the majority voted for non-operative therapy (Table 3). The agreement for FFP III and FFP IV was > 75%, and the majority recommended operative treatment.

Table 3 Percentage of agreement between raters and references (“Gold Standard,” submitting hospital, and majority vote).

Classification and therapeutic decision agreement

For FFP I, FFP III, and FFP IV, the classifications and resulting treatment decisions were largely in agreement (Table 4). Pronounced disagreement was found for FFP II, both in classification and treatment recommendation. Although the raters and gold standard agreed on the classification of one FFP IIb case and four FFP IIc cases, the raters recommended surgery (Fig. 3).

Table 4 Case-based (dis)agreement analysis of therapy (separation based on the majority vote) and classification (separation based on mean vote of the "Gold Standard").
Figure 3

FFP II cases with classification agreement but differences in treatment recommendations. One FFP IIb case (non-displaced fracture of the sacral ala; anterior fracture not shown) was recommended to undergo non-operative treatment by the gold standard, submitting hospital, and raters. A bilateral non-displaced fracture of the sacral ala without horizontal communication (FFP IIb) was recommended to undergo operative treatment by the raters only. A unilateral, multi-fragmentary, non-displaced fracture of the sacral ala (FFP IIc) was recommended to undergo surgery by the raters and the submitting hospital. Fracture lines are indicated by white arrows.

Surgical treatment preferences

One FFP IIc case (case 39) was excluded from further analyses because < 50% recommended anterior stabilization and/or posterior stabilization. The FFP classification, agreement regarding anterior stabilization and unilateral or bilateral stabilization, and the procedure frequencies are summarized in Supplementary Table S2. Anterior stabilization was recommended for 18 cases (Table 5, Supplementary Table S2). The external fixator was the favored anterior procedure. Posterior instrumentation was recommended for all cases with surgical stabilization. Bilateral fixation was recommended for FFP IIa, FFP IIb, FFP IIIb, and FFP IV cases, which corresponded to 58% (n = 17) of all cases recommended for surgery (Table 5).

Table 5 Preferred surgical therapy (anterior and posterior) in relation to the FFP classification.

Unilateral stabilization was predominantly recommended for FFP IIc, FFP IIIa, and FFP IIIc. For sacral fractures (FFP II and FFP IIIc), the raters preferred stabilization with sacroiliac screws/transsacral bar or, as second choice, with a trans-iliac fixator or spinopelvic fixation. Transiliac fractures (FFP IIIa) were treated with an iliac plate through the lateral window using the ilioinguinal approach. Sacroiliac screws/transsacral bar and an iliac plate were recommended for the FFP IIIb case. Half of the FFP IVb cases were recommended to undergo treatment with sacroiliac screws/transsacral bar, and the other half of the FFP IVb cases were recommended to undergo spinopelvic fixation (Fig. 4). For FFP IVb, if sacroiliac screws/transsacral bar fixation was chosen, then the second choice was spinopelvic fixation, and vice versa.

Figure 4

FFP IVb examples with differing recommended posterior operative stabilization methods. Approximately half of the raters recommended that the presented fractures required sacroiliac screws (SIS) or spinopelvic fixation (SPF) (maximum rating difference, 2 votes). The fracture recommended for SIS was a bilateral non-displaced fracture of the sacral ala with vertical communication below S2 and minimal anterior displacement. The fractures recommended for SPF were a displaced trans-foraminal fracture (Denis zone II), a non-displaced fracture of the sacral ala, and a central fracture through S1. In the sagittal view, the vertical fracture through S1 without anterior or posterior displacement is revealed.

Percutaneous procedures were the preferred choice for both anterior and posterior operative treatment of FFP.

Discussion

The surgeons agreed to treat isolated anterior lesions (FFP I) non-operatively and to recommend operative treatment for posteriorly displaced fractures (FFP III and FFP IV). There was some disagreement regarding the therapeutic decisions for non-displaced posterior fractures (FFP II). For cases with indications for operative treatment, a combination of anterior and posterior surgery was recommended.

Classification systems should distinguish between patients who require non-operative treatment and those who require operative treatment. Despite a high reported agreement in treating LC-1 fractures/Tile B fractures20 (comparable to FFP II), which represent the most common fracture type in the elderly21, the disagreement found in the survey analysis of Beckmann et al.22 highlights the differing treatment for a similar fracture type. To improve the treatment, the examination under anesthesia (EUA) for lateral compression fractures was introduced, but its interpretation shows relevant disagreement23. Furthermore, the deduced treatment and the consensus might change with newer data24,25,26.

The moderate intra-rater reliability and inter-rater reliability for therapeutic decisions observed during our study might be the result of still-conflicting information about mortality after non-operative or operative treatment for FFP, and some evidence of lower mortality for operated cases8,27,28. Here, we would like to emphasize that the grading was done using the arbitrary, unvalidated but widely used benchmarks of Landis and Koch29. Using the stricter and probably more practical29 but also arbitrarily set ranges of Svanholm et al.30 would lead to good intra-rater reliabilities for all raters but good inter-rater reliabilities only for experienced observers. The missing validation of these criteria should be considered when weighing the presented Fleiss kappa values.

The inexperienced raters showed only fair inter-rater reliability. One reason might be their lack of experience in treating pelvic ring injuries on their own. However, in our view, the differences most probably result from classification differences, especially between non-displaced and displaced posterior pelvic ring lesions (FFP II vs. FFP III), and from the missed detection of the horizontal connection of bilateral sacral lesions, which changes the treatment from non-operative to operative.

Therapeutic decisions for FFP based on morphology or classification have not yet been studied, although fracture severity (Tile A vs. B) influenced the survival after FFP31. Similar to a survey analysis on the treatment of high-energy pelvic fractures, we determined a high agreement for stable (Tile A—FFP I) and completely unstable fractures (Tile C—FFP III, FFP IV) but a low agreement for partially stable injuries (Tile B—FFP II)20.

Our study showed a consensus for the treatment of FFP I, FFP III, and FFP IV by applying the consensus criteria (agreement ≥ 75%) for Delphi studies32. The therapeutic strategy chosen by the raters generally followed the recommendations of Rommens and Hofmann6,33. Fractures classified as FFP II showed the lowest agreement for the treatment strategy, probably because of the impaired differentiation between FFP II and FFP III8. However, the classification disagreement itself was not responsible for the differences in therapeutic decisions for FFP. The observers tended to recommend operative treatment for FFP II more often, probably due to the successful operative treatment of these injuries in terms of preserved autonomy, lower mortality3, and faster pain relief10,11,12,34. Furthermore, even incomplete, simple sacral fractures (FFP II) displace in approximately 30% of elderly patients to FFP III or FFP IV, especially in patients treated non-operatively, leading to a secondary operative procedure35. Even after 6 months of failed non-operative therapy, patients with FFP II benefited from operative therapy10. However, late surgical therapies might be complicated by deformities in contrast to early performed percutaneous procedures6,36.

Thus, a close follow-up especially for non-operatively treated cases is necessary. Additional factors such as general health21 and a radiographic rating system37 may help with decision-making, especially for FFP II.

Although our study indicated equal use of spinopelvic fixation6 and sacroiliac screws/transsacral implant fixation38,39 for posterior bilateral displaced sacral fractures (FFP IVb), discriminating radiological factors were not found. Significant pain reduction without complications after operative treatment using percutaneous transiliac-transsacral screws or bilateral sacroiliac screws in the upper sacral segment has been reported38,40,41. Interestingly, bilateral stabilization was recommended even for unilateral injuries, probably to avoid fracture progression35,42.

The external fixator was preferred for the anterior ring by four of the seven hospitals. The external pelvic fixator has been reported as a valuable tool for the elderly43, but it may be complicated by frequent loosening or pin tract infections44,45. Retrograde transpubic screw fixation has yielded good results in terms of fracture reduction and healing, with only a small number of adverse events (17 of 128 cases) reported by a large retrospective series46. Anterior plate osteosynthesis is complicated by loosening and consecutive non-union as well as excessive blood loss in elderly patients47,48. It should be noted that two hospitals and the gold standard preferred retrograde pubic screws, whereas four hospitals preferred the external fixator.

Currently, it remains unclear which patients require anterior stabilization. A recent systematic review highlights the low number of cases treated anteriorly but also emphasizes the low quality of studies49. However, all three anterior stabilization procedures showed a relevant complication rate, with up to one quarter of patients requiring revision surgery46,47,50. The need for anterior stabilization as well as the type of osteosynthesis should be investigated in prospective studies.

Clinical data regarding mobility, pain, perioperative risks, and expectations, especially for patients with FFP II, are needed to make appropriate treatment decisions. The influences of these factors on the patient’s prognosis need to be elaborated21.

Besides incorrectly classifying FFP II, the observer might decide not to follow the recommendations of Rommens and Hofmann9. This might be the result of current studies showing a decreased rate of mortality51, a decreased rate of general complications for percutaneous procedures52, and improved mobility following operative therapy53. These data underline the need to introduce additional modifiers for treatment decisions (e.g., previous mobility level, pain level) and probably a more progressive operative treatment to avoid immobility-associated complications54. On the other hand, data from Saito et al. challenge the more progressive operative treatment55. As data continue to accumulate, an adaptation of the recommendations should be considered.

Non-operative therapy should be standardized and the time to failure of non-operative therapy should be defined to avoid immobility-associated complications such as pressure ulcers54.

The FFP classification has sufficient intra-rater and inter-rater reliability; however, the choice of fixation among the available options, especially for posterior ring injuries, is somewhat unclear and depends more on the physician's experience and training.

Although we determined a high agreement of the raters with the therapy proposed by the FFP classification, the clinical course and treatment success in patients remain to be proven.

The FFP classification might become useful for guiding non-operative and operative therapy strategies, but the inclusion of non-radiological data and recent studies is required to improve therapy guidance. Recommendations for treating FFP have not yet been finalized and controversy exists, especially concerning FFP II. Here, additional factors, e.g., secondary displacement and/or unrelenting pain on ambulation, should be considered when changing from non-operative treatment to operative treatment in patients with FFP II.

Most differences in procedure were attributable to the individual preferences of surgeons and to missing evidence from comparative studies. Based on these findings, the relevance of the FFP classification criteria, operative procedures, and their outcomes should be evaluated further. Here, the patients with FFP II who require operative treatment, the concept for non-operative therapy, the least invasive type of stabilization that provides the required stability, fractures at risk of progression, and the patients requiring anterior fixation should be identified.