Introduction

The anterior cruciate ligament (ACL) is the primary passive constraint for internal tibial rotation and anterior tibial translation over the femur1,2,3,4. ACL injury is one of the most common knee injuries in the young athletic population5,6, most commonly in those performing jumping, twisting and cutting movements7. Its estimated incidence worldwide is about 70 per 100,000 people per year8,9,10,11,12. ACL rupture alters knee kinematics13,14,15, resulting in joint instability, articular cartilage injury, and meniscal damage14,15,16,17,18,19,20,21,22,23,24,25,26,27. The optimal management of ACL injuries is still debated28,29. Likewise, despite thousands of clinical articles on ACL surgical treatment, controversy remains regarding the optimal choice of graft30,31,32,33,34. Bone-patellar tendon-bone (BPTB) and hamstring tendon (HT) autografts are the most common options for primary ACL reconstruction35,36. The BPTB autograft was introduced in the 1980s37 and is still one of the most commonly used38. BPTB autografts achieve high patient satisfaction, quick return to sport and bone-to-bone healing39,40. However, concerns have been raised about donor site complications after BPTB autograft, such as anterior knee pain, discomfort, crepitus, loss of sensation, patellar fractures, contracture of the lower patella, and loss of extension strength41,42,43. To reduce damage to the extensor apparatus and the rates of anterior knee pain and patellar fractures, the HT autograft has been advocated44,45,46,47. However, ACL reconstruction using HT autograft may lead to greater tunnel widening, flexor weakness, and knee laxity compared to BPTB42,48,49. In addition, the lack of bone blocks at the ends of the HT graft may promote greater laxity, leading to a higher frequency of rupture50. Several clinical studies have compared the autografts mentioned above, but the results are inconclusive35,51.
In this Bayesian network meta-analysis, BPTB, two- and four-strand HT (2SHT and 4SHT, respectively) autografts for ACL reconstruction in young adults were compared. Joint laxity, patient reported outcome measures (PROMs), rate of failure, and anterior knee pain (AKP) were compared between the autografts, as were the time to return to sport and the peak torque. A multivariate analysis was conducted to investigate possible prognostic factors leading to worse outcomes. It was hypothesized that all grafts yield similar results in terms of joint laxity, PROMs, and rate of failure, but that the BPTB autograft causes a greater rate of AKP.

Material and methods

Search strategy

The present Bayesian network meta-analysis followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) extension statement for reporting of systematic reviews incorporating network meta-analyses of health care interventions52. A PICO-guided protocol was preliminarily drafted:

  • P (population): ACL tears in young adults;

  • I (intervention): primary ACL reconstruction;

  • C (comparison): BPTB, 4SHT, 2SHT;

  • O (outcomes): laxity, PROMs, failure, AKP.

Data source and extraction

Two reviewers (**;**) separately performed the literature search in February 2023. PubMed, Google Scholar, Embase, and Scopus databases were accessed. The following keywords were combined using the Boolean operators AND/OR: anterior cruciate ligament, ACL, pain, knee, tear, rupture, injury, damage, reconstruction, management, treatment, arthroscopy, surgery, autografts, bone patellar tendon bone, hamstring, strands, patient reported outcome measures, PROMs, laxity, stability, instability, complication, anterior knee pain, failure. The resulting titles were screened by the same authors independently. If the title and the abstract matched the topic, the article’s full text was accessed. If the full text was not accessible, the article was excluded from the present study. The bibliographies of the full-text articles were also cross-referenced. Disagreements were debated and the final decision was made by a third author (**).

Eligibility criteria

All clinical investigations comparing BPTB, and/or 4SHT, and/or 2SHT were accessed. Articles in English, German, Italian, French, and Spanish were eligible. Levels I to III of evidence, according to the Oxford Centre of Evidence-Based Medicine (OCEBM)53, were considered. Grafts other than BPTB and/or 4SHT and/or 2SHT were not eligible. Studies which reported data on skeletally immature patients were not considered. Articles reporting outcomes from allograft or synthetic graft reconstructions were not eligible, nor were those concerning revision settings. Articles reporting ACL reconstruction in patients with multi-ligament damage were not eligible. Letters, comments, reviews, opinions, and editorials were not included. Animal and biomechanical studies were also not considered. Only articles reporting quantitative data under the outcomes of interest were considered for inclusion. Missing data under the outcomes of interest warranted exclusion from this study.

Data extraction

Two authors (**;**) independently examined the resulting articles for inclusion. Generalities and patient demographics were retrieved: author, year, journal, study design, length of the follow-up, type of graft, number of included patients, mean age, BMI, sex, time span from injury to surgery, and size of the graft. To investigate knee stability, data from the manual (Pivot shift and Lachman tests) and instrumental laxity were extracted. The instrumental laxity was evaluated using the arthrometers KT-1000 and KT-2000 (MEDmetric Corp, San Diego, California). Both devices applied an anteriorly directed force of 134 N to the tibial plateau relative to the femoral condyles. Concerning PROMs, data from the Tegner activity scale and Lysholm score at the last follow-up were extracted. The Lysholm score and Tegner activity scale have been validated for knee ligament surgery54,55,56. Data concerning the peak torque and the return to sport were also retrieved. The rates of failure and AKP were also investigated.

Methodological quality assessment

The methodological quality assessment was made using the risk of bias graph of the Review Manager Software (The Nordic Cochrane Collaboration, Copenhagen). The following risks of bias were evaluated: selection, detection, reporting, attrition, and other sources of bias.

Statistical analysis

The statistical analyses were performed by the main author (FM) using STATA Software/MP, Version 14.1 (StataCorp, College Station, Texas, USA). For descriptive statistics, mean and standard deviation were calculated. The analysis of variance (ANOVA) was performed to evaluate the baseline comparability, with P values > 0.1 considered satisfactory.

To assess the return to sport, the ANOVA test with the Tukey honestly significant difference (HSD) post-hoc test was performed, with values of P < 0.05 considered statistically significant. The confidence interval (CI) was set at 95%.
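As an illustration of the one-way ANOVA underlying this comparison, the following minimal sketch computes the F statistic from raw group samples. It is not the STATA routine used in this study; the function name `one_way_anova_f` and the sample values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on raw samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical months to return to sport per graft group (illustrative only,
# not data from the included studies).
bptb = [8.0, 9.0, 10.0]
four_sht = [6.0, 7.0, 8.0]
two_sht = [10.0, 11.0, 12.0]

f_stat, df_b, df_w = one_way_anova_f([bptb, four_sht, two_sht])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Converting F to a P value requires the F distribution, and the pairwise Tukey HSD comparisons require the studentized range distribution; both are available in standard statistical packages (for example, SciPy provides `scipy.stats.f_oneway` and `scipy.stats.tukey_hsd`).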

The network meta-analysis (NMA) was performed through the STATA routine for Bayesian hierarchical random-effects model analysis. The inverse variance method was used for the analysis of continuous variables, with the standardized mean difference (STD) effect measure. The log odds ratio (LOR) effect measure was used for binary data. The overall inconsistency was evaluated through the equation for global linearity via the Wald test. If the P value was > 0.05, the null hypothesis could not be rejected, and the consistency assumption could be accepted at the overall level of each treatment. Both confidence (CI) and percentile (PrI) intervals were set at 95%. Edge plots, interval plots, and funnel plots were obtained and evaluated.
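The two effect measures named above can be sketched for a single pairwise comparison as follows. This is a simplified illustration under stated assumptions: the standardized mean difference is computed as Cohen's d with a pooled standard deviation, the pooling step is a fixed-effect inverse-variance average rather than the Bayesian hierarchical random-effects model used in this study, and all function names are hypothetical.

```python
import math


def standardized_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d (pooled SD) and its large-sample approximate variance."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    d = (m1 - m2) / pooled_sd
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return d, var_d


def log_odds_ratio(events1, n1, events2, n2):
    """Log odds ratio and its variance from a 2x2 table.

    No continuity correction is applied, so every cell must be non-zero.
    """
    a, b = events1, n1 - events1
    c, d = events2, n2 - events2
    lor = math.log((a * d) / (b * c))
    var_lor = 1 / a + 1 / b + 1 / c + 1 / d
    return lor, var_lor


def inverse_variance_pool(estimates):
    """Fixed-effect pooled effect from [(effect, variance), ...]."""
    weights = [1 / var for _, var in estimates]
    pooled = sum(w * eff for w, (eff, _) in zip(weights, estimates)) / sum(weights)
    return pooled, 1 / sum(weights)


# Hypothetical inputs (not data from the included studies).
d, var_d = standardized_mean_difference(10.0, 2.0, 20, 8.0, 2.0, 20)
lor, var_lor = log_odds_ratio(10, 20, 5, 20)
print(f"SMD = {d:.2f} (variance {var_d:.4f})")
print(f"LOR = {lor:.3f} (variance {var_lor:.4f})")
```

Studies with smaller variances receive larger weights, which is the defining property of the inverse variance method; a random-effects model additionally adds a between-study variance term to each study's variance before weighting.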

For the multivariate analysis, a multiple linear regression model with the Pearson product-moment correlation coefficient (\(r\)) was used to establish whether patient characteristics (age, BMI, female sex, time from injury to surgery, and graft size) were associated with the outcomes (Pivot shift and Lachman tests, instrumental laxity, Lysholm score, Tegner scale, return to sport, failure, and anterior knee pain). By the Cauchy–Schwarz inequality, \(r\) is bounded between −1 and +1: a value of +1 was considered a perfect positive linear correlation, and a value of −1 a perfect negative one. Values of \(0.1 < |r| < 0.3\), \(0.3 < |r| < 0.5\), and \(|r| > 0.5\) were considered to indicate small, medium, and strong association, respectively. The overall significance was assessed through the χ2 test, with values of P < 0.05 considered statistically significant.
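The correlation coefficient and the strength categories described above can be sketched as follows. This is an illustrative implementation, not the STATA procedure used in this study. Two details are assumptions: the label "negligible" for \(|r| \le 0.1\), and the assignment of the exact boundary values 0.3 and 0.5 to the lower category, since the text leaves both unspecified.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    # Cauchy-Schwarz guarantees |covariance| <= spread_x * spread_y, so -1 <= r <= 1.
    return covariance / (spread_x * spread_y)


def association_strength(r):
    """Map r to the categories used in the text (boundary handling is an assumption)."""
    magnitude = abs(r)
    if magnitude > 0.5:
        return "strong"
    if magnitude > 0.3:
        return "medium"
    if magnitude > 0.1:
        return "small"
    return "negligible"  # assumed label; not defined in the text


# Hypothetical paired observations (illustrative only).
months_to_surgery = [2.0, 6.0, 12.0, 24.0]
lysholm_score = [95.0, 92.0, 88.0, 80.0]
r = pearson_r(months_to_surgery, lysholm_score)
print(f"r = {r:.2f} ({association_strength(r)})")
```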

Ethical approval

This study complies with ethical standards.

Results

Search result

The literature search resulted in 1035 articles. Of them, 306 were excluded as they were duplicates. Furthermore, 636 articles were not eligible: not matching the topic (N = 403), reporting data on allografts or synthetic grafts (N = 41), study type (N = 154), revision or multi-ligament settings (N = 37), language limitation (N = 1). Additionally, 32 articles were excluded as they did not report quantitative data under the outcomes of interest. This left 61 clinical trials for the present study. The literature search results are shown in Fig. 1.

Figure 1

Flow chart of the literature search.
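The counts reported for the literature search can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in the search results; the variable names are illustrative.

```python
# Counts as reported in the search results.
identified = 1035
duplicates = 306
not_eligible = {
    "not matching the topic": 403,
    "allografts or synthetic grafts": 41,
    "study type": 154,
    "revision or multi-ligament settings": 37,
    "language limitation": 1,
}
no_quantitative_data = 32

# Articles remaining after each exclusion step.
screened = identified - duplicates
excluded_ineligible = sum(not_eligible.values())
included = screened - excluded_ineligible - no_quantitative_data

print(f"Screened after duplicate removal: {screened}")
print(f"Excluded as not eligible: {excluded_ineligible}")
print(f"Included studies: {included}")
```

The subtotals reproduce the 636 articles reported as not eligible and the 61 studies included in the analysis.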

Methodological quality assessment

The prospective design of 85% (52 of 61) of the included investigations was an important strength of the present study. Of them, 62% (32 of 52) performed randomisation. Since most of the studies performed assessor blinding, the risk of detection bias was moderate to low. The proper analyses of most of the included studies, along with the intention to treat, clear definition of the timing of assessing outcomes, as well as the use of validated tools for assessing outcomes, led to a low risk of reporting and attrition bias. The risk of other biases was moderate to low. In conclusion, the methodological quality assessment demonstrated a moderate to low risk of bias (Fig. 2).

Figure 2

Methodological quality assessment.

Patient demographics

Data from 102,573 procedures were retrieved. The mean length of follow-up was 51.5 ± 49.4 months. The mean age of the patients was 27.9 ± 4.2 years. The mean time span from injury to surgery was 14.4 ± 11.2 months. The mean BMI was 24.6 ± 1.6. The mean size of the graft was 9.7 ± 0.7 mm. The ANOVA test found moderate baseline comparability among age, length of the follow-up, time span from injury to surgery, BMI, and graft size (P > 0.05). Patient demographics are shown in Table 1.

Table 1 Generalities and patient baselines from the included studies.

Network comparisons

With regard to joint laxity, the Lachman and Pivot shift tests were similar among all three autografts. The BPTB demonstrated the greatest stability in terms of instrumental laxity. The equation for global linearity found no statistically significant inconsistency (P = 0.06, P = 0.08, and P = 0.1, respectively). These results are shown in greater detail in Fig. 3.

Figure 3

Edge, funnel, and interval plots of the network comparisons: joint laxity.

Concerning PROMs, BPTB demonstrated the greatest Lysholm score and Tegner activity scale, followed by 2SHT and 4SHT, which scored similarly (Fig. 4). The equation for global linearity found no statistically significant inconsistency (P = 0.3 and P = 0.5, respectively).

Figure 4

Edge, funnel, and interval plots of the network comparisons: PROMs.

Patients who underwent reconstruction of the ACL using a BPTB graft demonstrated the greatest rate of anterior knee pain, while both 2SHT and 4SHT ranked similarly. No statistically significant inconsistency was found (P = 0.2). The equation for global linearity found statistically significant inconsistency for the comparison of graft failure (P = 0.008), thus no further conclusion could be inferred. The network comparisons of complications are shown in greater detail in Fig. 5.

Figure 5

Edge, funnel, and interval plots of the network comparisons: complication.

Peak torque

Given the lack of quantitative data concerning the 2SHT group, only BPTB and 4SHT were considered for analysis of peak torque. BPTB demonstrated greater peak flexion torque at 60° (P < 0.0001) and 180° (P < 0.0001). No difference was found at 120° (P = 0.06). BPTB demonstrated lower peak extension torque at 60° (P = 0.01), 120° (P = 0.008), and 180° (P = 0.006). These results are shown in greater detail in Table 2.

Table 2 Peak torque.

Return to sport

The 4SHT demonstrated the quickest return to sport, followed by BPTB, and 2SHT (Table 3).

Table 3 Return to sport.

Multivariate analysis

There was evidence of a negative association between the time span from injury to surgery and the Lysholm score (r = − 0.50; P = 0.04) and Tegner scale (r = − 0.26; P = 0.04). Furthermore, there was evidence of a weakly positive association between the time span from injury to surgery and return to sport (r = − 0.06; P = 0.01). The results of the multivariate analysis are shown in Table 4.

Table 4 Results of the multivariate analysis (AKP = Anterior knee pain).

Discussion

According to the main findings of the present study, BPTB may promote lower joint laxity, greater PROMs, and greater peak flexion torque compared to 2SHT and 4SHT autografts in young adults. The ACL is one of the most important constraints against anteroposterior translation of the knee113,114. In the present study, BPTB was associated with the lowest peak extension torque and the greatest rate of AKP. Peak flexion torque is a quantitative outcome measure of knee flexor strength after reconstruction, particularly when comparing hamstring autografts with alternative graft options. Knee flexor weakness is relevant in certain sports such as gymnastics, judo, or wrestling, and its assessment is useful when evaluating the return to sport115. Knee torque is significantly affected after ACL injury. Both extension and flexion isokinetic strength are important outcomes to evaluate after surgical reconstruction116. AKP remains a major complication after ACL reconstruction and may have several aetiologies, including bone-harvesting pain, neuroma of the infrapatellar branch of the saphenous nerve following its injury, and, rarely, patellar tendinopathy33. Finally, a longer time span between ACL rupture and reconstruction may negatively influence the outcome.

Concerning joint laxity, the Lachman and Pivot shift tests were similar among all three autografts. The Lachman test evaluates the anterior translation of the tibia in relation to the femur with the knee in static flexion117. The Pivot shift test instead assesses the rotatory instability of the joint during its dynamic flexion118. Similarly, a previous meta-analysis found no difference in IKDC score, Lachman and Pivot shift tests between BPTB and hamstring autografts119. However, BPTB autograft resulted in a higher incidence of AKP, kneeling pain, and osteoarthritis119. The literature on osteoarthritis in patients undergoing reconstruction with BPTB or HT autografts is controversial120,121. In the present study, patients receiving a BPTB graft demonstrated the lowest instrumental laxity and the greatest Lysholm score and Tegner activity scale, followed by 2SHT and 4SHT, which scored similarly. The Lysholm score and Tegner activity scale are subjective outcome measures that evaluate performance and activity restrictions both before and after surgery122. These PROMs have been validated for knee ligament surgery54,55,56.

On the other hand, BPTB demonstrated the greatest rate of AKP compared to both the 2SHT and 4SHT autografts, which showed a similar rate. Concerning AKP, no statistically significant inconsistency was found. The equation for global linearity found statistically significant inconsistency for the comparison of graft failure; thus, no further conclusion could be inferred. In a study on 5462 patients with primary ACL reconstruction, HT autografts resulted in greater anterior knee laxity and more failures compared with BPTB autografts65. In a previous meta-analysis including 25 studies (47,613 ACL reconstructions), HT autografts failed at a higher rate than BPTB autografts123. Similar results were reported in another meta-analysis involving 15 RCTs (1298 patients)121. A further meta-analysis including 20 RCTs compared BPTB versus 4SHT. The BPTB cohort showed lower laxity and a lower rate of failure, but a greater risk of kneeling pain and AKP124.

Given the lack of quantitative data concerning the 2SHT group, only BPTB and 4SHT were considered in our study for analysis of peak torque. BPTB demonstrated greater peak flexion torque at 60° and 180°. No difference was found at 120°. BPTB demonstrated lower peak extension torque at 60°, 120°, and 180°. While BPTB exhibits some better outcome measures, it should be noted that BPTB also demonstrated the greatest rate of AKP. These findings agree with previous studies comparing HT and BPTB, which stated that the latter restores greater knee stability, but also results in more postoperative complications121,125,126. AKP is common following ACL reconstruction and can persist for a long time in athletes. The removal of the central third of the patellar tendon and its subsequent repair might cause a lowering of the patella and lead to increased sensitivity and pain during kneeling or squatting127. In this regard, in our study, the 4SHT graft demonstrated the quickest return to sport, followed by BPTB, and 2SHT. This should be considered when making a decision with athletes whose goal is to return to play as soon as possible. Lastly, results from the multivariate analysis demonstrated that a longer time span between initial injury and surgery was associated with lower Lysholm and Tegner scores and a longer return to sport. This worse outcome associated with a longer time from injury to surgery should be considered when planning the reconstruction. It should also be noted that some insurance companies currently require a dedicated physiotherapy trial for ACL injuries before surgery is authorized128. This delay in treatment can lead to suboptimal results129,130.

This study certainly has limitations. The retrospective nature of some studies is an important limitation which increases the risk of selection bias. Demographic data of the patients were collected, but further information regarding their general health was seldom reported in the included studies. Most of the authors did not specify whether the surgeon who performed the procedure was the investigator himself, and whether the assessor was blinded to the procedure performed. Many studies did not clearly specify the surgical technique (arthroscopic, open, or both) or postoperative management. Rehabilitation protocols following ACL reconstruction are associated with significant differences in outcome131. Several new modalities of rehabilitation after ACL reconstruction, such as strengthening and functional exercises, resistance training, neuromuscular exercise, high-level dynamic functional tasks and sport-specific training, have been proposed132,133,134,135,136. However, given the lack of quantitative data, the various rehabilitation protocols could not be analysed separately. Most authors did not specify whether patients had undergone MRI preoperatively, thus providing poor information on preoperative diagnostic methods. Most authors did not report information on the sporting activity and level of the patients; therefore, further subgroup analyses were not possible. Given the lack of quantitative data, it was not possible to investigate additional autografts137,138. Allografts have been advocated as they avoid donor site morbidity139,140,141. However, the greater risk of graft-versus-host reaction, disease transmission, and delayed graft incorporation limits the use of allografts142,143,144. There is also a growing trend of using quadriceps tendon grafts, which may provide another viable and safe alternative among autograft options145,146,147,148. This autograft may result in a lower rate of failure compared to both BPTB and HT grafts, as well as a reduced rate of AKP compared to the BPTB autograft149. Further high-quality investigations should validate the present results also in skeletally immature patients. Furthermore, the aetiology of AKP following ACL surgery remains debated, and international recommendations on the management and classification of this condition are required.

Conclusion

BPTB may promote lower joint laxity, greater PROMs, and greater peak flexion torque compared to 2SHT and 4SHT autografts. On the other hand, BPTB resulted in the lowest peak knee extension torque and the greatest rate of AKP. Concerning PROMs and AKP, similar results were obtained in the comparison between 2SHT and 4SHT. However, the 4SHT demonstrated the quickest return to sport, followed by BPTB, and 2SHT. Finally, a longer time span between injury and ACL reconstruction negatively influences the outcomes.