Introduction

Sleeve gastrectomy (SG), Roux-en-Y gastric bypass (RYGB), and one anastomosis gastric bypass (OAGB) are the three commonest bariatric procedures worldwide [1]. There is currently no randomised controlled trial (RCT) comparing these three procedures in the scientific literature. There are several RCTs comparing two of these three procedures [2, 3] but they were not powered to evaluate differences in morbidity or mortality.

Thirty-day morbidity and mortality is a recognised outcome measure for the evaluation of surgical safety and has been used in the surgical literature for several decades [4]. There are large studies comparing 30-day morbidity and mortality of RYGB and SG. Alizadeh et al. [5] reported from an analysis of Metabolic and Bariatric Surgery Accreditation and Quality Improvement Program (MBSAQIP) data in the United States that RYGB was associated with higher 30-day morbidity (4.4% vs 2.3%; adjusted odds ratio (AOR) 0.53; p < 0.01) and 30-day mortality (0.2% vs 0.1%; AOR 0.58; p = 0.07) in comparison with SG. However, there are no large datasets in the scientific literature comparing 30-day morbidity and mortality of SG and OAGB or RYGB and OAGB.

Notwithstanding the lack of such large datasets, direct database comparisons are often flawed due to significant differences in the baseline population. Propensity score matching is a valid tool for comparing non-randomised populations by matching them for confounding variables [6]. To the best of our knowledge, there is only one published study comparing 30-day morbidity and mortality of SG and RYGB [7], and one comparing RYGB and OAGB [8], in propensity score-matched populations. Both of these studies emanate from the MBSAQIP database. In their study, Kapur et al. [7] found fewer adverse events with SG in comparison with RYGB. However, the study by Docimo et al. [8] comparing the 30-day morbidity of OAGB and RYGB had too few patients to be meaningful. This is probably because the MBSAQIP database is unlikely to contain large numbers of OAGB, a procedure not endorsed by the American Society for Metabolic and Bariatric Surgery.

The Global 30-day outcomes after bariatric surgEry duriNg thE COVID-19 pAndemic (GENEVA) study [9] is a large, multinational, observational study evaluating 30-day morbidity and mortality of bariatric and metabolic surgery (BMS) during the Coronavirus Disease-2019 (COVID-19) pandemic. The global reach of the study, its large number of patients, and the significant number of OAGB procedures submitted to it present a unique opportunity to compare 30-day morbidity and mortality of SG, OAGB, and RYGB in propensity score-matched cohorts.

Methods

Study design and population

The GENEVA study is an international, multicentre, observational cohort study of BMS performed between 1/05/2020 and 31/10/2020 [9]. The current study included all consecutive patients who underwent a primary SG or RYGB or OAGB during this period. Detailed methods have been published previously [9,10,11]. Data collection included patients’ demographics, details of surgery performed, and in-hospital as well as 30-day morbidity and mortality. Complications were categorised using the Clavien–Dindo (CD) Classification system for reporting surgical complications [12].

Statistical methods

Only patients with a complete data entry were included in the analysis. Continuous data were presented as median and interquartile range. Frequencies were used to summarise categorical variables. To examine differences between the three individual procedure types, the Fisher’s exact test was used for categorical variables and Kruskal–Wallis analysis of variance testing for continuous variables.

Propensity score matching was completed in a step-wise fashion. Pairwise propensity matching was performed to robustly assess the quality of matching. The standardised mean difference (SMD) was the statistic used to examine the balance of covariate distribution between treatment groups. Patients were matched using the following features: sex, Type 2 diabetes mellitus (T2DM) status (no diabetes; diet controlled; oral hypoglycaemics; insulin therapy), hypertension, hypercholesterolaemia, obstructive sleep apnoea, smoking status, age, and baseline body mass index (BMI).
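As a rough illustration of the balance statistic described above, the following Python sketch computes an SMD for a continuous and a binary covariate. It assumes the commonly used pooled-standard-deviation definition. The function names and toy values are hypothetical; they are not taken from the GENEVA data or from the R code used in the study.

```python
import math
import statistics


def smd_continuous(group_a, group_b):
    """SMD for a continuous covariate: mean difference over the pooled SD."""
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    pooled_sd = math.sqrt(
        (statistics.variance(group_a) + statistics.variance(group_b)) / 2
    )
    return mean_diff / pooled_sd


def smd_binary(group_a, group_b):
    """SMD for a binary (0/1) covariate, using the variance of a proportion."""
    p_a = sum(group_a) / len(group_a)
    p_b = sum(group_b) / len(group_b)
    pooled_sd = math.sqrt((p_a * (1 - p_a) + p_b * (1 - p_b)) / 2)
    return (p_a - p_b) / pooled_sd


# Hypothetical toy data: age in years and T2DM status (1 = present).
age_a = [45, 52, 38, 60, 41]
age_b = [39, 44, 36, 50, 35]
t2dm_a = [1, 0, 1, 1, 0]
t2dm_b = [0, 0, 1, 0, 0]

print(round(smd_continuous(age_a, age_b), 3))
print(round(smd_binary(t2dm_a, t2dm_b), 3))
```

An absolute SMD that is smaller after matching than before indicates that the two groups have become more similar on that covariate. Both functions assume some variation in the covariate; a pooled SD of zero would need separate handling.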

The patients were matched against individuals who had undergone one of the other procedures using the “nearest” method, which uses a greedy search to match each patient with their nearest neighbour. The distance between patients was calculated as the Mahalanobis distance, which measures how far apart two covariate vectors are after scaling by the covariance of the covariates [13]. This procedure was performed in R (R Core Team 2021) using the MatchIt package [14, 15]. The outcome variable was the presence of a complication at 30-day follow-up.
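The matching step can be sketched in a few lines of Python. This is a minimal illustration only, not the MatchIt implementation used in the study: it assumes 1:1 matching without replacement, just two covariates (age and BMI), a covariance matrix estimated from the pooled sample, and processing in list order. All function names and data values are hypothetical.

```python
import statistics


def inverse_covariance_2d(points):
    """Inverse of the 2x2 sample covariance matrix of (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    return ((syy / det, -sxy / det), (-sxy / det, sxx / det))


def mahalanobis_sq(a, b, inv_cov):
    """Squared Mahalanobis distance between two 2-D points."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    return (
        dx * (inv_cov[0][0] * dx + inv_cov[0][1] * dy)
        + dy * (inv_cov[1][0] * dx + inv_cov[1][1] * dy)
    )


def greedy_match(treated, controls):
    """Pair each treated unit with its nearest unused control (1:1)."""
    inv_cov = inverse_covariance_2d(treated + controls)
    available = set(range(len(controls)))
    pairs = []
    for i, t in enumerate(treated):
        if not available:
            break  # no controls left to match
        j = min(available, key=lambda k: mahalanobis_sq(t, controls[k], inv_cov))
        pairs.append((i, j))
        available.remove(j)  # matching without replacement
    return pairs


# Hypothetical (age, BMI) values for a smaller and a larger procedure group.
smaller_group = [(45, 42.0), (60, 38.5), (33, 50.1)]
larger_group = [(59, 39.0), (30, 49.0), (47, 41.0), (52, 45.0)]

for i, j in greedy_match(smaller_group, larger_group):
    print(smaller_group[i], "matched with", larger_group[j])
```

Because the search is greedy, each patient takes the closest control still available, so the result can depend on the order in which patients are processed and is not guaranteed to minimise the total distance across all pairs.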

Multivariate analysis was performed to strengthen the resulting statistics from univariate analysis, correcting the influence of each variable on the outcome measured. Multivariate models were created using all the variables used for propensity score matching plus ethnicity (white ethnicity vs other ethnic groups), presence of any co-morbidity, and other unspecified co-morbidity (other than those listed above). Patients were then analysed using a generalised linear model in R (R Core Team 2021) [14].

Results

A total of 470 surgeons from 179 centres in 42 countries submitted data on 7092 adult patients who underwent primary BMS between 1st May 2020 and 31st October 2020 at the participating centres. Of these, complete 30-day morbidity and mortality data were available for 7084 (99.88%) by the 10th of December 2020.

Basic demographics

Of the 7084 patients, 300 underwent other procedures and were excluded. A further 14 patients were excluded due to missing values. Complete data were available for a total of 6770 patients who underwent a primary SG or RYGB or OAGB (SG n = 3983; RYGB n = 2085; OAGB n = 702). Demographic details for all the patients who underwent any of these three primary procedures are included in Table 1. There were multiple significant differences in baseline demographics between the three groups. RYGB patients were significantly older than patients in the other two cohorts, while patients receiving OAGB were more likely to suffer from co-morbidities. Patients undergoing SG had the lowest rate of each of these co-morbidities.
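The cohort flow reported above can be checked with simple arithmetic. The short Python sketch below uses only the counts stated in the text; the variable names are illustrative.

```python
# Counts as reported in the text.
with_30_day_data = 7084
other_procedures = 300
missing_values = 14
by_procedure = {"SG": 3983, "RYGB": 2085, "OAGB": 702}

# Patients remaining after both exclusions.
analysed = with_30_day_data - other_procedures - missing_values

# The per-procedure counts should add up to the analysed cohort.
assert analysed == sum(by_procedure.values())

# Share of the analysed cohort contributed by each procedure.
for name, n in by_procedure.items():
    print(f"{name}: {n} ({100 * n / analysed:.1f}%)")
```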

Table 1 Baseline demographics of all patients undergoing a primary SG or RYGB or OAGB (unmatched cohort—14 patients excluded due to incomplete data).

30-day morbidity and mortality in the full cohort (unmatched; Table 2)

The overall complication rate was 6.7% (452/6770) (Table 2). RYGB patients had the highest rate of any complication during the 30-day follow-up (8.0% with RYGB vs 7.5% for OAGB and 5.8% for SG (p = 0.006)). There were seven post-operative mortalities (0.1%) (4 with SG, 3 with OAGB, and nil with RYGB; p = 0.016; Fisher’s exact test).

Table 2 Complications according to primary procedure and CD (Clavien–Dindo) classification system in the full unmatched cohort.

Multivariate analysis of unmatched cohort (Table 3)

On multivariate regression modelling, insulin-dependent T2DM (OR 1.047; 95% CI 1.011–1.083), and hypercholesterolaemia (OR 1.024; 95% CI 1.009–1.040) (Table 3 and Fig. 1) were associated with increased 30-day complications. Being a non-smoker was associated with reduced complication rates (OR 0.984; 95% CI 0.971–0.998). When compared to SG as the reference category, RYGB, but not OAGB, was associated with an increased rate of 30-day complications (OR 1.018; 95% CI 1.005–1.032 for RYGB and OR 1.009; 95% CI 0.989–1.030 for OAGB).

Table 3 Results of multivariate logistic regression on full unmatched cohort.
Fig. 1 Multivariate regression results prior to patient matching.

30-day morbidity and mortality in the propensity score-matched cohort (Tables 4–6)

SG vs OAGB

In total, 702 pairs were matched with a reduction in SMDs for all matched variables (8/8) (Table 4). The overall complication rate in the SG group was 51 (7.3%) as compared to 53 (7.5%) in the OAGB group (Table 7). The difference was not significant (p = 0.68).

Table 4 A comparison of sleeve gastrectomy (SG) and one anastomosis gastric bypass (OAGB) before and after propensity score matching.
Table 5 A comparison of sleeve gastrectomy (SG) and Roux-en-Y gastric bypass (RYGB) before and after propensity score matching.
Table 6 A comparison of one anastomosis gastric bypass (OAGB) and Roux-en-Y gastric bypass (RYGB) before and after propensity score matching.
Table 7 Complications in propensity score-matched populations.

SG vs RYGB

In total, 2085 pairs were matched with a reduction in SMDs for seven of the eight matched variables (Table 5). The overall complication rate in the SG group was 127 (6.1%) as compared to 166 (7.9%) in the RYGB group (Table 7). The difference was not significant (p = 0.09).

OAGB vs RYGB

In total, 702 pairs were matched with a reduction in SMDs for four of the eight matched variables (4/8) (Table 6). The overall complication rate was the same in both groups at 53 (7.5%) (p = 0.07; Table 7).

Discussion

This study shows that there is no significant difference in 30-day morbidity and mortality of SG, RYGB, and OAGB in propensity score-matched cohorts from a large, global dataset collected during the COVID-19 pandemic. Though RYGB was associated with higher 30-day morbidity in comparison with reference SG (OR 1.018; 95% CI 1.005–1.032) in the unmatched cohort on multivariate analysis, the difference disappeared after propensity score matching (p = 0.09). In comparison, OAGB was not associated with higher 30-day morbidity in comparison with SG on either multivariate analysis (OR 1.009; 95% CI 0.989–1.030) or propensity score-matched comparison (p = 0.68).

RCTs comparing different bariatric procedures often have weight loss [2] or diabetes control [16] as their endpoints. Some [16] do not even clearly report 30-day morbidity and mortality with different bariatric procedures let alone classifying surgical complications adequately according to the widely used and accepted CD Classification [12]. We cannot, therefore, derive any scientifically valid conclusions regarding complication rates of different procedures from these RCTs.

Outside of RCTs, perceptions regarding the relative safety and efficacy of different procedures for different patient groups may introduce bias. For example, in this study, we found that 33.0% of patients undergoing OAGB were suffering from T2DM compared to 23.0% undergoing RYGB and 16.0% undergoing SG in the unmatched cohorts. This selection bias may be partly accounted for by the fact that randomised studies have shown superior (non-significant as they were not powered to evaluate these) outcomes in terms of diabetes improvement with OAGB in comparison with RYGB [2] and with RYGB in comparison with SG [17]. This is important because T2DM is known to be associated with complications after bariatric surgery [18, 19] and, in our study, insulin-dependent T2DM was independently associated with 30-day morbidity on multivariate analysis of the unmatched cohort.

Similarly, in the unmatched cohort, hypercholesterolaemia was present in 19.0% of patients undergoing SG in comparison with 32.0% of those undergoing OAGB and 23.0% of those undergoing RYGB. However, after matching, the hypercholesterolaemia rates in the two groups were the same at 32.0% in the analysis of SG and OAGB and at 23.0% in the analysis of SG and RYGB. This is also important because hypercholesterolaemia was independently associated with 30-day morbidity on multivariate analysis of the unmatched data, and differences in hypercholesterolaemia rates in the unmatched cohort may well have accounted for some of the observed differences in 30-day morbidity. Others [20] have also found dyslipidaemia to be a predictor of complications after bariatric surgery.

Standardised mean differences in age, BMI, sex, smoking status, and hypertension rates were all reduced after matching in both of the matched comparisons involving SG in this study. Given that all of these characteristics are known to be associated with increased morbidity after bariatric surgery [21,22,23,24,25,26,27,28,29], differences in these baseline characteristics may partly explain why the observed difference in morbidity between SG and RYGB or OAGB disappeared after matching. At the same time, and probably because of the smaller number of bypass procedures in the GENEVA database, matching failed to reduce SMDs for age, sex, smoking status, and hypertension in the comparison between OAGB and RYGB in this study. This may partly account for the observed lack of difference in 30-day morbidity between the two procedures. Future studies on this topic need to be mindful of this.

There is no published study comparing the 30-day morbidity of SG with that of OAGB in propensity score-matched cohorts. This may be due to continued reservations [30] amongst some surgeons about OAGB. Furthermore, and as mentioned above in the Introduction section, there is only one propensity score-matched study [8] in the scientific literature comparing the 30-day safety of OAGB with that of any other procedure (RYGB in this case) and that study only had 279 pairs of OAGB and RYGB. One could argue this is not a large enough sample to study differences in morbidity.

There is only one study [7] in the published literature comparing the 30-day morbidity of SG with that of RYGB in propensity score-matched cohorts. Interestingly, that study showed lower complication rates with SG, in contrast to our findings. It is, however, worth noting that these authors do not report standardised mean differences in their propensity score-matched populations. Given the large numbers matched, it was inevitable that their matching was not perfect, with significant differences between the matched populations with regard to important confounding variables such as age, BMI, smoking, and insulin-dependent T2DM.

This study represents the first large propensity-matched comparison of 30-day morbidity and mortality of SG, RYGB, and OAGB. The data were collected from a large worldwide collaborative study of real-world bariatric surgical practice. Data completion rates were extremely high, with 30-day follow-up data available for 97.9% of patients across the entire cohort, and this represents a significant strength of this study.

The non-randomised design and self-reported complication rates are two major weaknesses of this study. However, it is not easy to randomise patients to different procedures with 30-day morbidity as an endpoint, and the anonymous data collection strategy used in this study may have diminished the desire to under-report complications. Another weakness of this study is that differences in complication rates, though not statistically significant, may be clinically relevant. Indeed, larger studies may even find statistical significance.

Conclusion

The present analysis shows that there is no significant difference in 30-day morbidity and mortality of SG, OAGB, and RYGB in propensity-matched cohorts.