A computed tomography based survey study investigating the agreement of the therapeutic strategy for fragility fractures of the pelvis

Treatment recommendations for fragility fractures of the pelvis (FFP) have been provided along with the FFP classification, which has good reliability, but they have not been validated in large studies, and recent reports challenge these recommendations. Thus, we aimed to determine the usefulness of the FFP classification for determining the treatment strategy and the favored procedures in six level 1 trauma centers. Sixty cases of FFP were evaluated by six experienced pelvic surgeons, six inexperienced surgeons in training, and one surgeon trained by the originator of the FFP classification during three repeated sessions using computed tomography scans with multiplanar reconstruction. The intra-rater reliability and inter-rater reliability for therapeutic decisions (non-operative treatment vs. operative treatment) were moderate, with Fleiss kappa coefficients of 0.54 (95% confidence interval [CI] 0.44–0.62) and 0.42 (95% CI 0.34–0.49), respectively. Therapeutic disagreement was found predominantly for FFP II, for which raters tended to prefer operative therapy. Operatively treated cases were generally treated with anterior–posterior fixation. Despite the consensus on anterior–posterior fixation, the chosen procedures were highly variable and most plausibly based on the surgeon's preference.

This brings into question the usefulness of the FFP classification and the accompanying treatment recommendations.
Currently, there is a lack of data on the usefulness of the FFP classification for therapeutic decision-making, even though such data are essential for preventing harm from misclassified fractures that lead to incorrect treatment decisions 16 .
Therefore, in the present study, we used CT with multiplanar reconstruction, without clinical information, to assess the intra-rater reliability and inter-rater reliability of therapeutic decision-making for FFP (non-operative vs. operative treatment). We investigated the treatment recommendations, their relation to the FFP classification 6 , and the effects of classification disagreement on treatment decisions.
Using this approach, we aimed to determine the reliability of treatment strategies derived from CT scans and resulting FFP classification, and thus the usefulness of FFP classification in relation to clinically relevant decisions (operative vs. non-operative). Furthermore, we investigated the favored operative procedures in relation to the FFP classification.

Methods
Study design. The study design used to evaluate the FFP classification, the sample size calculation, patient demographics, anonymization, and rating procedures have been reported by Pieroh et al. 8 . The patients from that study 8 were used to analyze the association between the FFP classification and the resulting treatment decision as well as the favored operative procedure. Each observer was familiar with the FFP classification, and no additional training was performed before the study. At least two weeks elapsed between the classification cycles, and observers had no access to the ratings of previous classification cycles.
In addition to classification, recommended treatment options are also presented in Fig. 2.
Statistical analyses. Intra-rater reliability and inter-rater reliability. We assessed the intra-rater reliability and inter-rater reliability of the therapeutic decisions (non-operative vs. operative) as previously reported 8 using specific MATLAB scripts (MATLAB, version 2013b; MathWorks, Natick, MA, USA) to calculate the Fleiss kappa coefficients 17,18 and presented them as means and 95% confidence intervals (CI). The 95% CI was generated using the bootstrap method of resampling the pelves 17,18 . Treatment decisions were collected during the classification process from each of the six experienced and six inexperienced surgeons and from the surgeon trained by the creator of the FFP classification ("gold standard"), all from Level-1 trauma centers 8 . Inexperienced raters were included as part of this pragmatic multicenter agreement study to assess the generalizability of the classification-related decisions in raters with differing experience 16 .
The "gold standard" was included due to his adherence to the prescribed treatment recommendations 6 . We calculated the intra-rater reliability from the three classification cycles, separately for each rater. For the inter-rater reliability, we calculated one mean vote per rater across the three classification cycles and used it for further analysis. Using these data, we determined the inter-rater reliability for each classification cycle and across all cycles.
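The reliability computation described above can be illustrated with a minimal Python sketch. It is not the MATLAB code used in the study; the function names, the percentile form of the bootstrap interval, the number of resamples, and the coding of decisions as 0 (non-operative) and 1 (operative) are assumptions made for illustration only.

```python
import numpy as np


def fleiss_kappa(ratings, n_categories=2):
    """Fleiss kappa for an (n_cases, n_raters) array of labels 0..k-1."""
    ratings = np.asarray(ratings)
    n_cases, n_raters = ratings.shape
    # counts[i, j] = number of raters who assigned case i to category j
    counts = np.stack(
        [(ratings == j).sum(axis=1) for j in range(n_categories)], axis=1
    )
    # Overall proportion of all assignments falling into each category
    p_j = counts.sum(axis=0) / (n_cases * n_raters)
    # Observed agreement per case, then averaged over cases
    p_i = ((counts ** 2).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()
    # Agreement expected by chance
    p_e = (p_j ** 2).sum()
    if np.isclose(p_e, 1.0):
        return float("nan")  # undefined when only one category is ever used
    return float((p_bar - p_e) / (1.0 - p_e))


def bootstrap_ci(ratings, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling cases (pelves) with replacement.

    Assumption: a simple percentile interval; the study's scripts may differ.
    """
    ratings = np.asarray(ratings)
    rng = np.random.default_rng(seed)
    n_cases = ratings.shape[0]
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n_cases, size=n_cases)
        stats.append(fleiss_kappa(ratings[idx]))
    stats = np.array(stats)
    stats = stats[~np.isnan(stats)]  # drop degenerate resamples
    lower, upper = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lower), float(upper)
```

Called on a cases-by-raters matrix of decisions, `fleiss_kappa` returns the point estimate and `bootstrap_ci` an approximate 95% interval; resampling whole cases keeps all rater votes for one pelvis together.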
Agreement analyses. We used the classifications and therapeutic decisions determined by the references: the "gold standard", the submitting hospitals, and the majority vote. We investigated the agreement between the therapeutic decisions of the raters and the therapeutic decisions indicated by the references for FFP.
Classification and therapeutic decision agreement. Case 48 was excluded because its FFP classification was not possible according to the gold standard and 42.9% of raters 8 . We generated the majority vote based on the rater votes for the classification and therapeutic decision for each case (n = 59). The gold standard and submitting hospital votes were excluded from this vote. Using the therapeutic decision of that majority vote, we separated cases according to the recommended non-operative and operative treatments. Subsequently, we allocated cases to the FFP classification based on the gold standard classification. We examined the agreement and disagreement ("gold standard" vs. raters) between the classification and therapeutic decision for each case to assess the impact of classification disagreement on the therapeutic decision.
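The majority vote and the per-case agreement with a reference decision can be sketched as follows. This is an illustrative Python example, not the study's code: the function names, the string labels, and the handling of ties (returned as `None`) are assumptions, since the text does not state how ties were resolved.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label among the rater votes.

    Assumption: a tie for first place yields None (no majority).
    """
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def percent_agreement(rater_votes, reference):
    """Percentage of rater votes that match the reference decision."""
    matches = sum(vote == reference for vote in rater_votes)
    return 100.0 * matches / len(rater_votes)
```

For one case, `majority_vote` would be applied to the rater decisions only (excluding the gold standard and submitting hospital, as described above), and `percent_agreement` compares the same votes against any one of the references.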
Preferred operative therapy. Cases for which operative therapy was recommended were analyzed to assess the operative therapy (Fig. 2) preferred by all the raters, the gold standard, and the submitting hospital. Raters had to choose an anterior procedure. For posterior stabilization, unilateral or bilateral procedures could be chosen. The available procedures could be combined.
Ethical statement and study registration. The responsible ethics committees waived the need for informed consent of the patients because of the retrospective nature of the study and because patients consented to the use of de-identified CT scans for research when signing the hospital admission contract. Separate informed consent was not obtained since the data were collected retrospectively. The study was subsequently registered in the German Clinical Trials Register (DRKS00014248). CT scans with multiplanar reconstruction were performed for clinical reasons. The study was performed in accordance with the Declaration of Helsinki.

Results
Classifications and treatment decisions based on the gold standard, submitting hospital, and majority vote for each patient are summarized in Supplementary Table S1. The patient demographics are available in Appendix II of Pieroh et al. 8 .
www.nature.com/scientificreports/
Intra- and inter-rater reliability. The gold standard and both raters of hospital 1 had almost perfect intra-rater reliability (Table 1). The decisions of two experienced and three inexperienced raters had substantial intra-rater reliability. The overall intra-rater reliability (Table 1), the overall inter-rater reliability, and the inter-rater reliability of the experienced raters were moderate (Table 2). The overall inter-rater reliability of the decisions of the inexperienced raters was fair (Table 2).

Agreement analyses.
The highest therapeutic agreement (> 90%) was found for FFP I and the lowest was found for FFP II (minimum compared to the gold standard, 66.0%) (Table 3). For FFP I and FFP II, the majority voted for non-operative therapy (Table 3). The agreement for FFP III and FFP IV was > 75%, and the majority recommended operative treatment. Pronounced disagreement regarding therapy was found for FFP II, both in classification and treatment recommendation (Table 4). Although the raters and gold standard agreed on the classification of one FFP IIb case and four FFP IIc cases, the raters recommended surgery (Fig. 3).
Surgical treatment preferences. One FFP IIc case (case 39) was excluded from further analyses because < 50% recommended anterior stabilization and/or posterior stabilization. The FFP classification, agreement regarding anterior stabilization and unilateral or bilateral stabilization, and the procedure frequencies are summarized in Supplementary Table S2. Anterior stabilization was recommended for 18 cases (Table 5, Supplementary Table S2). The external fixator was the favored anterior procedure. Posterior instrumentation was recommended for all cases with surgical stabilization. Bilateral fixation was recommended for FFP IIa, FFP IIb, FFP IIIb, and FFP IV cases, which corresponded to 58% (n = 17) of all cases recommended for surgery (Table 5). Unilateral stabilization was predominantly recommended for FFP IIc, FFP IIIa, and FFP IIIc. For sacral fractures (FFP II and FFP IIIc), the raters preferred stabilization with sacroiliac screws/transsacral bar or, as second choice, with a trans-iliac fixator or spinopelvic fixation. Transiliac fractures (FFP IIIa) were treated with an iliac plate through the lateral window using the ilioinguinal approach. Sacroiliac screws/transsacral bar and an iliac plate were recommended for the FFP IIIb case. Half of the FFP IVb cases were recommended to undergo treatment with sacroiliac screws/transsacral bar, and the other half of the FFP IVb cases were recommended to undergo spinopelvic fixation (Fig. 4). For FFP IVb, if sacroiliac screws/transsacral bar fixation was chosen, then the second choice was spinopelvic fixation, and vice versa.
Percutaneous anterior and posterior procedures were the preferred choice for operatively treating FFP.

Discussion
The surgeons agreed to treat isolated anterior lesions (FFP I) non-operatively and to recommend operative treatment for posteriorly displaced fractures (FFP III and FFP IV). There was some disagreement regarding the therapeutic decisions for non-displaced posterior fractures (FFP II). For cases with indications for operative treatment, a combination of anterior and posterior surgery was recommended. Classification systems should distinguish between patients who need non-operative treatment and those who need operative treatment. Despite the high reported agreement in treating LC-1/Tile B fractures 20 (comparable to FFP II), which represent the most common fracture type in the elderly 21 , the disagreement found in the survey analysis of Beckmann et al. 22 highlights how differently a similar fracture type is treated. To improve treatment, examination under anesthesia (EUA) was introduced for lateral compression fractures, but its interpretation shows relevant disagreement 23 . Furthermore, the deduced treatment and the consensus might change as newer data emerge 24–26 .
The moderate intra-rater reliability and inter-rater reliability for therapeutic decisions observed in our study might result from the still conflicting information about mortality after non-operative or operative treatment for FFP and from some evidence of lower mortality for operated cases 8,27,28 . We would like to emphasize that the grading was done using the arbitrary, unvalidated, but widely used benchmarks of Landis and Koch 29 . Using the stricter and probably more practical 29 but also arbitrarily set ranges of Svanholm et al. 30 would lead to good intra-rater reliabilities for all raters but good inter-rater reliabilities only for experienced observers. The missing validation of these criteria should be considered when weighing the presented Fleiss kappa values.
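To make the grading step concrete, the following Python sketch maps a kappa value to the verbal labels of Landis and Koch. The function name is hypothetical, and the cut-offs are the commonly cited bands of that scheme rather than values stated in this article; as noted above, these bands are conventions and not validated thresholds.

```python
def landis_koch_label(kappa):
    """Map a kappa coefficient to the Landis and Koch verbal grade.

    Commonly cited bands: < 0 poor, 0-0.20 slight, 0.21-0.40 fair,
    0.41-0.60 moderate, 0.61-0.80 substantial, 0.81-1.00 almost perfect.
    """
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"
```

Under these bands, the overall coefficients reported in this study (0.54 and 0.42) both fall into the "moderate" range.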
The inexperienced raters showed only fair inter-rater reliability. One reason might be their lack of experience in treating pelvic ring injuries on their own. However, in our view, the differences most probably result from classification differences, especially between non-displaced and displaced posterior pelvic ring lesions (FFP II vs. FFP III), and from the missed detection of the horizontal connection of bilateral sacral lesions, which changes the treatment from non-operative to operative. Therapeutic decisions for FFP based on morphology or classification have not yet been studied, although fracture severity (Tile A vs. B) influenced the survival after FFP 31 . Similar to a survey analysis on the treatment of high-energy pelvic fractures, we determined a high agreement for stable (Tile A/FFP I) and completely unstable fractures (Tile C/FFP III, FFP IV) but a low agreement for partially stable injuries (Tile B/FFP II) 20 .
Our study showed a consensus for the treatment of FFP I, FFP III, and FFP IV by applying the consensus criteria (agreement ≥ 75%) for Delphi studies 32 . The therapeutic strategy chosen by the raters generally followed the recommendations of Rommens and Hofmann 6,33 . Fractures classified as FFP II showed the lowest agreement for the treatment strategy, probably because of the impaired differentiation between FFP II and FFP III 8 . However, the classification disagreement itself was not responsible for the differences in therapeutic decisions for FFP. The observers tended to recommend operative treatment for FFP II more often, probably because operative treatment of these injuries has been successful in terms of preserved autonomy, lower mortality 3 , and faster pain relief 10–12,34 . Furthermore, even incomplete, simple sacral fractures (FFP II) displace to FFP III or FFP IV in approximately 30% of elderly patients, especially in patients treated non-operatively, leading to a secondary operative procedure 35 . Even after 6 months of failed non-operative therapy, patients with FFP II benefited from operative treatment 10 . However, late surgical therapies might be complicated by deformities, in contrast to early percutaneous procedures 6,36 . Thus, close follow-up is necessary, especially for non-operatively treated cases.
Figure. Approximately half of the raters recommended that the presented fractures required sacroiliac screws (SIS) or spinopelvic fixation (SPF) (maximum rating difference, 2 votes). The fracture recommended for SIS was a bilateral non-displaced fracture of the sacral ala with vertical communication below S2 and minimal anterior displacement. The fractures recommended for SPF were a displaced trans-foraminal fracture (Denis zone II), a non-displaced fracture of the sacral ala, and a central fracture through S1. The sagittal view reveals the vertical fracture through S1 without anterior or posterior displacement.
Additional factors such as general health 21 and a radiographic rating system 37 may help with decision-making, especially for FFP II.
Although our study indicated equal use of spinopelvic fixation 6 and sacroiliac screws/transsacral implant fixation 38,39 for posterior bilateral displaced sacral fractures (FFP IVb), discriminating radiological factors were not found. Significant pain reduction without complications after operative treatment using percutaneous transiliac-transsacral screws or bilateral sacroiliac screws in the upper sacral segment has been reported 38,40,41 . Interestingly, bilateral stabilization was recommended even for unilateral injuries, probably to avoid fracture progression 35,42 .
The external fixator was preferred for the anterior ring by four of the seven hospitals. The external pelvic fixator has been reported as a valuable tool for the elderly 43 , but it may be complicated by frequent loosening or pin tract infections 44,45 . Retrograde transpubic screw fixation has yielded good results in terms of fracture reduction and healing, with only a small number of adverse events (17 of 128 cases) in a large retrospective series 46 . Anterior plate osteosynthesis is complicated by loosening and consecutive non-union as well as excessive blood loss in elderly patients 47,48 . It should be noted that two hospitals and the gold standard preferred retrograde pubic screws, whereas four hospitals preferred the external fixator.
Currently, it remains unclear which patients require anterior stabilization. A recent systematic review highlights the low number of cases treated anteriorly but also emphasizes the low quality of studies 49 . However, all three anterior stabilization procedures showed a relevant complication rate, with up to one quarter of patients requiring revision surgery 46,47,50 . The need for anterior stabilization as well as the type of osteosynthesis should be investigated in prospective studies.
Clinical data regarding mobility, pain, perioperative risks, and expectations, especially for patients with FFP II, are needed to make appropriate treatment decisions. The influences of these factors on the patient's prognosis need to be elaborated 21 .
Besides incorrectly classifying FFP II, the observer might decide not to follow the recommendations of Rommens and Hofmann 9 . This might be the result of current studies showing a decreased rate of mortality 51 , a decreased rate of general complications for percutaneous procedures 52 , and improved mobility following operative therapy 53 . These data underline the need to introduce additional modifiers for the treatment decision (e.g. previous mobility level, pain level) and probably the need for a more progressive operative treatment to avoid immobility-associated complications 54 . On the other hand, data from Saito et al. challenge the more progressive operative treatment 55 . As data continue to accumulate, an adaptation of the recommendations should be considered.
Non-operative therapy should be standardized, and the time to failure of non-operative therapy should be defined to avoid immobility-associated complications such as pressure ulcers 54 .
The FFP classification has sufficient intra-rater and inter-rater reliability; however, the choice of fixation among the available options, especially for posterior ring injuries, is somewhat unclear and depends more on the physician's experience and training.
Although we determined a high agreement of the raters with the therapy proposed by the FFP classification, the clinical course and treatment success in patients remain to be proven.
The FFP classification might become useful for guiding non-operative and operative therapy strategies, but the inclusion of non-radiological data and recent studies is required to improve therapy guidance. Recommendations for treating FFP have not yet been finalized, and controversy exists, especially concerning FFP II. Here, additional factors, e.g. secondary displacement and/or unrelenting pain on ambulation, should be considered when changing from non-operative to operative treatment in patients with FFP II.
Most differences in procedure were attributable to the individual preference of the surgeon and to missing evidence from comparative studies. Based on these findings, the relevance of the FFP classification criteria, operative procedures, and their outcomes should be evaluated further. In particular, the following should be identified: patients with an FFP II who require operative treatment, the concept for non-operative therapy, the least invasive type of stabilization that provides the required stability, fractures at risk for fracture progression, and patients requiring anterior fixation.

Data availability
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.