Cross-training has grown markedly in popularity for more than a dozen years. It combines ten key physical parameters to improve overall fitness: cardiovascular/respiratory endurance, stamina, strength, flexibility, power, speed, coordination, agility, balance, and accuracy1,2,3. Cross-training can be described as high-intensity functional training (HIFT) that incorporates exercises with barbells, kettlebells, dumbbells, cardiovascular tasks (e.g., running, rowing), and gymnastics (e.g., pull-ups, muscle-ups)4. Its varied nature appeals strongly to many people around the world.

In a systematic review, Haddock et al.5 suggest that HIFT increases aerobic fitness, strength, musculature, and endurance. Cosgrove et al.6 studied the influence of six months of HIFT on multiple fitness characteristics. The participants completed three separate days of assessments across 10 fitness domains before and after participating in the program for 6 months. They found that HIFT improved several fitness parameters, including flexibility, power, muscular endurance, and strength6. Murawska-Cialowicz et al.7 also showed an improvement in aerobic fitness after 3 months of HIFT in women but not in men. However, no changes in Wingate anaerobic power were observed.

Other investigators focused on finding the connection between HIFT performance and aerobic and anaerobic fitness as well as strength capabilities8,9,10,11. Butcher et al.9 studied whether physiological measures can predict selected cross-training benchmark performance. Aerobic fitness was measured as maximum oxygen uptake (VO2max) in a treadmill ramp test9. Anaerobic power was measured in a classical Wingate test on a cycloergometer. Cross-training performance was evaluated in benchmark workouts: ‘Fran’ (three rounds of thrusters (a thruster is a combination of a front squat and an overhead press) and pull-ups for 21, 15, and 9 repetitions), ‘Cindy’ (20 min of rounds of five pull-ups, ten push-ups, and 15 bodyweight squats), ‘Grace’ (30 clean and jerks for time) and ‘CrossFit Total’ (1 repetition maximum (1RM) back squat, overhead press, and deadlift). They found no correlation between HIFT performance and aerobic fitness. Bellar et al.8 investigated the relationship of aerobic fitness and anaerobic power with performance in two representative cross-training workouts in cross-training experienced and naïve adults8. The first was a 12-min workout (as many rounds as possible of 12 medicine ball throws, 12 kettlebell swings, and 12 burpee pull-ups), and the second was scored as the total time to complete the prescribed exercise (3 rounds of 21–15–9 repetitions of a sumo deadlift high pull, box jumps, and farmer walk with bumper plates). The cross-training experienced group performed significantly better in both performance tests. In the experienced group, the results of both cross-training tests were correlated with VO2max, but in the naïve group, only those of the 21–15–9 test were. Dexheimer et al.10 aimed to determine which physiological performance measure could serve as the best indicator of cross-training workout performance.
Participants completed a graded exercise test on a treadmill to determine VO2max, a 3-min all-out test, a Wingate test, ‘CrossFit Total’, and three HIFT benchmark workouts: ‘Grace’, ‘Fran’, and ‘Nancy’ (5 rounds of a 400 m run and 15 overhead squats). In contrast to Butcher and colleagues9, they found that ‘Fran’, ‘Nancy’, and ‘CrossFit Total’ were significantly correlated with VO2max, but only the scores in ‘Nancy’ and ‘CrossFit Total’ were correlated with peak and average anaerobic power10. Interestingly, they showed that critical speed and total body strength are not crucial predictors of HIFT performance. Finally, Martínez-Gómez et al.11 hypothesized that a full squat might be used as a predictor of cross-training performance, at least for exercises involving lower-limb muscles. The participants performed a squat test to measure 1RM and five cross-training workouts from the 2019 CrossFit Open. In the first workout of the day (WOD1), participants had to perform 225 repetitions of dumbbell snatches and burpee box jump-overs for time. In WOD2, the athletes had to perform as many repetitions as possible of 50-ft (15.24 m) weighted walking lunges, toes-to-bar, bar muscle-ups, and power cleans in 12 min. In WOD3, contestants had to perform the maximum possible number of repeated circuits, each including chest-to-bar pull-ups and squat snatches (with weight progressively increasing from 95 to 265 lb (43–120 kg)) in 8 min. In WOD4, participants had to perform as many repetitions as possible of deadlifts (225 lb (102 kg)), wall-ball shots (20 lb (9 kg) to a 10-ft (3 m) high target), rowing, and handstand push-ups in 13 min. In WOD5, participants had to perform 440 repetitions of thrusters (95 lb (43 kg)) and double-unders in the shortest time possible. Cross-training performance was then calculated from all five WODs. They found moderate to strong correlations between squat variables and all WODs11. The same held for overall cross-training performance.

Because HIFT is becoming more popular every year, many studies have been carried out to measure the influence of different dietary regimens and supplementation on cross-training performance12,13. However, it is difficult to compare their results, because authors use different cross-training tests. The reason is that there are no validated and well-described tests of cross-training performance. In the present paper, we propose one of the cross-training benchmark workouts, Fight Gone Bad (FGB), as a discipline-specific test to measure HIFT performance. FGB comprises 3 rounds of 5 different cross-training specific exercises, so there are 15 min of exercise with two 1-min breaks, one between the 1st and 2nd rounds and one between the 2nd and 3rd rounds. It requires most of the HIFT key physical traits such as power, strength, speed/strength endurance, balance, and stamina1,2. In addition, it can be performed by both experienced and inexperienced athletes, which makes it useful to track progress over time. The purpose of this study was therefore to evaluate the repeatability and reliability of FGB over repeated measurements and to assess its relation to aerobic performance. The secondary aim was to assess changes in hematological and biochemical parameters caused by FGB in order to gain better insight into the organism’s response to such a workout.
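The timing and scoring structure of FGB described above can be written down as a minimal sketch. The round count, station count, and rest length come from the text; the 1-min station length is an assumption implied by 15 min of work over 15 stations, and the function names and repetition counts in the usage note are hypothetical.

```python
ROUNDS = 3        # from the text
STATIONS = 5      # from the text
STATION_MIN = 1   # assumption: 1 min per station (15 stations -> 15 min of work)
REST_MIN = 1      # from the text: 1-min break between consecutive rounds


def total_duration_min() -> int:
    """Work time plus the rests that separate consecutive rounds."""
    work = ROUNDS * STATIONS * STATION_MIN
    rest = (ROUNDS - 1) * REST_MIN
    return work + rest


def score(reps: list[list[int]]) -> dict[str, int]:
    """Sum repetitions per round and overall.

    reps[r][s] is the repetition count at station s of round r.
    """
    if len(reps) != ROUNDS or any(len(r) != STATIONS for r in reps):
        raise ValueError("expected a 3 x 5 grid of repetition counts")
    result = {f"FGBR-{i + 1}": sum(r) for i, r in enumerate(reps)}
    result["FGBTOTAL"] = sum(result.values())
    return result
```

For example, `score([[20, 18, 15, 17, 12], [18, 16, 14, 15, 11], [17, 15, 13, 14, 10]])` gives round scores of 82, 74, and 69 and a total of 225, and `total_duration_min()` gives the 17 min quoted for the whole workout.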


Body composition

There were no significant differences in body mass and composition between the three measurements (baseline and after 10 and 20 days; Table 1). Relative reliability (intraclass correlation coefficient, ICC) was excellent (ICC > 0.9) for all measures by Bod Pod and for FFM and body water (kg) by BIA. ICC was good for FM (kg and %) by BIA and low for body water (%). For Bod Pod body mass and FFM, the standard error of measurement (SEM) was 1%, and it was 6% for FM (both in kg and %). For BIA, the SEM was 1% for FFM, 5% for body water (kg), 10% for body water (%), 11% for FM (kg), and 12% for FM (%). For all Bod Pod measurements, SEM was less than the smallest worthwhile change (SWC), indicating good ability of the test to detect small and meaningful changes. In BIA, SEM < SWC was observed only for FFM (kg) and body water (kg).

Table 1 Repeatability of body composition measurements.
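As an illustration of how relative reliability can be quantified, the sketch below computes an intraclass correlation coefficient from a participants-by-sessions table. The text does not state which ICC model was used, so the two-way, consistency, single-measurement form (often written ICC(3,1)) is an assumption here, and the function name is hypothetical.

```python
def icc_consistency(data: list[list[float]]) -> float:
    """ICC(3,1): two-way model, consistency, single measurement.

    data[i][j] is the score of participant i in session j.
    Assumed ICC form; the study may have used a different one.
    """
    n, k = len(data), len(data[0])
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need a rectangular table with >= 2 rows and columns")

    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    # Two-way ANOVA decomposition without replication.
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
```

In this form a constant shift between sessions does not lower the coefficient, because only the ordering and spacing of participants matter; a value above 0.9 would fall in the range the text rates as excellent.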

Fight Gone Bad performance

There were no significant differences in FGB performance between the three measurements (Table 2). ICC was excellent for FGB Round-1 (FGBR-1) and FGBTOTAL and good for FGB Round-2 (FGBR-2) and FGB Round-3 (FGBR-3). SEM was 6, 8, 9 and 6% for FGBR-1, FGBR-2, FGBR-3, and FGBTOTAL, respectively. SEM was lower than SWC for FGBR-1 and FGBTOTAL, indicating good ability of the test to detect small and meaningful changes. However, for FGBR-2 and FGBR-3, SEM was higher than SWC. MDC values are given in Table 2.

Table 2 Repeatability of Fight Gone Bad performance.

There were no significant differences in HR during FGB between the three measurements (Table 3). ICC was good for FGBR-2, Rest2, Mean15min, and Mean17min. ICC was acceptable for Rest1 and FGBR-3 and low for FGBR-1. SEMs were low, between 2 and 5%. SEM was higher than SWC for all HR measurements in FGB. MDC values are given in Table 3.

Table 3 Repeatability of heart rate during Fight Gone Bad.

Aerobic fitness

Furthermore, there were no significant differences in ICT variables between the three measurements (Table 4). Relative reliability (intraclass correlation coefficient) was excellent for Texh, Wmax, TGET, WGET, VO2peak, VCO2peak, and EE, good for HRGET and VO2GET, and acceptable for HRmax. SEM was low (< 6%) for HRmax, HRGET, and VO2peak, and moderate (6–10%) for Texh, Wmax, TGET, WGET, and VO2GET. SEM for EE was 11%. SEM < SWC was observed for Texh, Wmax, TGET, WGET, VO2peak, VCO2peak, and EE. However, for HRmax, HRGET, and VO2GET, SEM was higher than SWC.

Table 4 Repeatability of aerobic variables measured in incremental cycling test.

FGB and ICT relationship

Correlations between FGB scores and ICT parameters were significant for all measurements except HRGET (Table 5). Strong correlations (> 0.7) were observed for FGBR-1 with Texh, Wmax, TGET, and EE, and for FGBTOTAL with Texh and EE. Moderate (0.5–0.7) correlations were found for FGBR-1 with WGET, VO2GET, VO2peak, and VCO2peak; for FGBR-2 with Texh, Wmax, TGET, WGET, VO2peak, VCO2peak, and EE; for FGBR-3 with Texh, Wmax, TGET, WGET, VO2peak, and EE; and for FGBTOTAL with Wmax, TGET, WGET, VO2GET, VO2peak, and VCO2peak. In addition, correlations were significant but low for FGBR-1 with HRmax; FGBR-2 with HRmax and VO2GET; FGBR-3 with HRmax, HRGET, VO2GET, and VCO2peak; and FGBTOTAL with HRmax.

Table 5 Correlations between Fight Gone Bad performance and aerobic capacity.
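The strength bands used above (strong > 0.7, moderate 0.5–0.7, low below 0.5) can be applied mechanically, as in the sketch below. The use of Pearson's coefficient is an assumption, since the correlation method is not named in this section; significance testing is left out, and the function names are hypothetical.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation (assumed method)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def strength(r: float) -> str:
    """Label a coefficient with the bands quoted in the text."""
    magnitude = abs(r)
    if magnitude > 0.7:
        return "strong"
    if magnitude >= 0.5:
        return "moderate"
    return "low"
```

A pair such as FGBTOTAL and Texh would be passed as two equal-length lists, one value per participant, and the resulting coefficient labeled with `strength`.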

The agreement of the two methods was also assessed using the Bland–Altman method (Fig. 1). Bias between standardized FGBTOTAL and ICT performance measured as standardized Texh, VO2peak, Wmax, and HRmax was 0.0 ± 0.70, 0.0 ± 1.53, 0.0 ± 0.74 and 0.0 ± 0.10, respectively (Fig. 1A–D). Bias between standardized FGBTOTAL and GET measured as standardized TGET and WGET was 0.0 ± 0.76 and 0.0 ± 0.81, respectively (Fig. 1E,F).

Figure 1 Bland–Altman plots for standardized measures of Fight Gone Bad performance and aerobic capacity. FGBTOTAL, Fight Gone Bad total number of repetitions; HRmax, maximal heart rate; Texh, time to exhaustion; TGET, time to gas exchange threshold; Wmax, maximal workload; WGET, workload at gas exchange threshold; VO2peak, peak oxygen uptake.
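A Bland–Altman comparison of two variables measured in different units requires standardizing both first. The sketch below assumes z-scores based on the sample mean and SD and 95% limits of agreement at bias ± 1.96 SD of the differences; whether the "±" values reported above are SDs or limits is not stated here, and the function names are hypothetical.

```python
import statistics


def zscores(values: list[float]) -> list[float]:
    """Standardize to mean 0 and sample SD 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def bland_altman(a: list[float], b: list[float]) -> dict:
    """Bias and limits of agreement between two standardized measures."""
    if len(a) != len(b):
        raise ValueError("both measures need one value per participant")
    za, zb = zscores(a), zscores(b)
    diffs = [x - y for x, y in zip(za, zb)]
    means = [(x + y) / 2 for x, y in zip(za, zb)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return {
        "bias": bias,
        "sd_diff": sd,
        "limits": (bias - 1.96 * sd, bias + 1.96 * sd),
        "points": list(zip(means, diffs)),  # x = mean, y = difference
    }
```

Because both standardized series have a mean of zero, the bias is zero by construction, which matches the 0.0 values reported; the informative quantity is the spread of the differences.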

Blood sample analysis

Biochemical marker analysis in blood revealed that MON count and HGB concentration were significantly different among the three measurements in both FGBPRE and FGBPOST, whereas Pa concentrations were significantly different only in FGBPRE (Table 6). ICC was rated as excellent in FGBPOST RBC; as good in FGBPRE RBC, PLT, and HTC and in FGBPOST HTC and GLU; and as acceptable in FGBPRE WBC, LYM, GRA, and CK and in FGBPOST WBC, LYM, GRA, PLT, La, Pa, CK, and LDH. ICC was significant but low in FGBPRE HGB and LDH, and non-significant in FGBPRE MON, GLU, La, and Pa and in FGBPOST MON and HGB. Apart from FGBPRE WBC and HTC and FGBPOST RBC, SEM > SWC was observed for all blood measures in both FGBPRE and FGBPOST.

Table 6 Repeatability of hematological and biochemical parameters measured before and after Fight Gone Bad.

Blood parameters were also measured ICTPRE and ICTPOST. HGB concentrations were significantly different among the three measurements in both ICTPRE and ICTPOST, whereas GLU concentrations were significantly different only in ICTPRE (Table 7). ICC was excellent for RBC in ICTPRE and ICTPOST; good in ICTPRE PLT and in ICTPOST PLT and HTC; and acceptable in ICTPRE WBC, LYM, GRA, HTC, La, and LDH and in ICTPOST WBC, LYM, MON, GRA, GLU, and La. Low but significant ICC was observed in ICTPRE MON, HGB, and CK, and in ICTPOST HGB, Pa, CK, and LDH. Non-significant ICCs were found in ICTPRE GLU and Pa. SEM < SWC was observed only in ICTPRE and ICTPOST RBC.

Table 7 Repeatability of hematological and biochemical parameters measured before and after incremental cycling test.

In the present study we also evaluated the differences in means (T1, T2, and T3) of blood parameters between FGBPRE, FGBPOST, ICTPRE, and ICTPOST. We found that WBC, LYM, MON, La, and Pa were significantly higher FGBPOST than FGBPRE and ICTPOST than ICTPRE (Table 8). GRA were higher ICTPOST than ICTPRE. GLU was higher FGBPOST than FGBPRE. LYM were significantly different between FGBPRE vs ICTPRE and between ICTPOST and FGBPOST. No significant differences were observed for RBC, HGB, PLT, CK and LDH.

Table 8 Differences in hematological and biochemical parameters between Fight Gone Bad and incremental cycling test (means of three measurements T1–T3).


Cross-training continues to grow in popularity. A whole range of athletes, as well as sedentary people, can benefit from cross-training, because it gives multiple stimuli to the muscles, and all exercises can be scaled to meet an individual’s abilities and needs. Thus, there is a need to find a validated test to measure its performance. In this paper, we proposed a benchmark workout, Fight Gone Bad, as such a test. FGB incorporates several of the physiological traits that are most crucial for HIFT performance, i.e., stamina, speed, strength, endurance, and power. The main findings showed that FGB gives reliable and repeatable results when performed three times, with consecutive measurements separated by 10 days. Moreover, we found that FGB results were strongly correlated with aerobic fitness. When the results were standardized, we also found that the agreement of FGB with aerobic performance indices such as Texh, VO2peak, Wmax, HRmax, TGET, and WGET was high.

In practical and scientific respects, the reproducibility of a test is essential to determine whether an individual has experienced a training response. Moreover, reliability estimates the extent to which the change in the measured score is due to a change in the true score14. The present study is the first to investigate whether FGB performance is reproducible across repeated measurements. There were no differences in body composition, FGB, HRFGB, and ICT between T1, T2, and T3. Relative reliability was measured as the intraclass correlation coefficient (ICC). The ICC reflects a test's ability to differentiate between participants and, hence, the position of the individual relative to others in the group15. Relative reliability was found to be excellent for FGBR-1 and FGBTOTAL and good for FGBR-2 and FGBR-3, showing the linearity of the relationship between the repeated measures. However, the ICC does not provide information about the accuracy of the scores for an individual. Therefore, absolute reliability was calculated as SEM. A lower SEM means the method is more precise16. SEM was 6% each for FGBR-1 and FGBTOTAL, 8% for FGBR-2, and 9% for FGBR-3. The smallest worthwhile changes (SWC) were higher than SEM for FGBR-1 and FGBTOTAL, indicating the ability of the test to detect small and meaningful performance changes. Interestingly, the repeatability of body composition measurements in our study indicates that the participants did not implement any lifestyle changes during the study (e.g., body mass reduction) that could have influenced performance.

Moreover, relative reliability was even better for ICT, where for most of the measured parameters (Texh, Wmax, TGET, WGET, VO2peak, VCO2peak, and EE), ICC was close to 1.0 and SEM was low. This is in accordance with previous studies. Dideriksen and Mikkelsen17 showed excellent ICC (> 0.9) for VO2max, Wmax, and HRmax and good ICC (0.7–0.9) for VO2VT in recreationally trained triathletes (n = 13). Weston and Gabbett18 found ICC > 0.9 for VO2max, VE, VCO2, HR, and W, with measurement errors below 5% in trained cyclists (n = 16). Graded exercise testing is a reliable tool that is widely used for the determination of VO2peak in sports performance, research, and clinical diagnostics19. However, this is beyond the scope of the present paper.

Considering speed and strength efforts, Fight Gone Bad (FGB) is a high-intensity workout of moderate duration (17 min in total). It is performed very fast and demands a high level of muscular endurance. The present study compared the results in FGB to aerobic fitness measured in an incremental cycling test. We found strong and moderate correlations between FGB performance and time to exhaustion, maximum workload, VO2peak, VCO2peak, time to GET, and workload at GET. This suggests that aerobic fitness is crucial to FGB performance. This might seem counterintuitive, since the HR values observed during each round of FGB were very close to the HRmax measured in ICT, showing that the effort put forth by the participants in FGB was extremely high. What is more, given that HR at the gas exchange threshold was 162 ± 11 bpm to 164 ± 10 bpm, whereas the mean HR over the 3 rounds was between 169 ± 11 bpm and 173 ± 10 bpm, we can assume that work done in FGB rounds was mainly of an anaerobic nature. One explanation, though not a sufficient one, may be that the duration of FGB forces the engagement of the aerobic energy system. Another may lie in the main characteristic of FGB, the 1-min rest periods between the rounds. It seems that the ability to recover between bouts of exercise depends on oxidative capacity20. One study showed that better recovery between repeated bouts of Wingate sprints was associated with better cross-training performance20. Therefore, even if rounds in FGB are to some extent anaerobic in nature, the ability to sustain effort throughout the entire FGB may rely on the efficiency of aerobic recovery during breaks between the rounds. This is in accordance with literature suggesting that aerobic capacity enhances recovery from high-intensity intermittent exercise through increased aerobic response, improved lactate removal, and enhanced phosphocreatine regeneration21.

Some investigators aimed to compare different physiological variables with HIFT performance. Bellar et al.8 found that cross-training performance was correlated with aerobic power (VO2max) but only in experienced athletes (r = 0.453, p = 0.03) and not in naïve ones (r = 0.168, p = 0.64). The cross-training workout they used consisted of 21–15–9 repetitions of (1) sumo deadlift high pull, (2) box jump (50 cm), and (3) 40-m farmer's walk gripping two 20-kg bumper plates. The score was the time for workout completion. In another paper, Dexheimer et al.10 showed that the higher the VO2max, the better the result in ‘Fran’, ‘Nancy’, and ‘CrossFit Total’. However, in the regression model, VO2max explained 68% of the variance only in ‘Nancy’10. ‘Fran’ consisted of 3 rounds of 21–15–9 repetitions of thrusters and pull-ups. ‘Nancy’ was a workout of 5 rounds of a 400 m run and 15 overhead squats with a barbell (95 lb men/65 lb women). ‘CrossFit Total’ included 1-repetition maximum (1RM) back squat, strict shoulder press, and deadlift. Interestingly, they did not find any significant association between VO2max and the ‘Grace’ workout (30 clean and jerks (135 lb men/95 lb women)). The authors thus hypothesized that those with higher VO2max may perform better in longer workouts that require running (e.g., ‘Nancy’) than in shorter ones (e.g., ‘Grace’). In contrast to Dexheimer et al., Butcher et al.9 found no correlation between VO2max and ‘Fran’, ‘Grace’, ‘Cindy’ (as many rounds as possible of 5 repetitions of pull-ups, 10 repetitions of push-ups, and 15 repetitions of bodyweight squats performed in 20 min), or ‘CrossFit Total’. However, they found significant correlations between VO2 at anaerobic threshold and ‘Fran’, ‘Grace’, and ‘CrossFit Total’. Furthermore, Martínez-Gómez et al.11 aimed to determine which physiological variables could predict performance during a cross-training competition (The Open, 2019).
They found that the combination of lower-body muscle power (squat jump performance), reactive strength (reactive strength index during a drop jump), and aerobic power (as measured with the VO2max) together explained 81% of the cross-training performance variance, showing that HIFT performance is associated with a variety of fitness markers related to both aerobic and anaerobic/power capabilities. This seems to confirm that HIFT demands high adaptation to both aerobic and anaerobic types of exercise. Therefore, the improvement in cardiorespiratory fitness may enhance the performance in longer workouts like FGB or ‘Nancy’. Thus, it would be reasonable to include aerobic training in cross-training programming. Moreover, it would be also beneficial to include training like FGB in sports using varying energy systems like combat and team sports. In these sports, all three energy systems are used according to the intensity, rhythm, and duration of the competition. That is why they can benefit the most from FGB, which is of high intensity yet strongly correlated with aerobic fitness (i.e., Texh).

Also, the present study is the first to provide extensive data on the biochemical response after each exercise test among HIFT-trained participants. La and Pa concentrations significantly increased FGBPOST and ICTPOST compared to FGBPRE and ICTPRE, respectively, but there were no differences in La or Pa between FGBPOST (13.41 ± 3.52 mmol/L and 0.79 ± 0.15 mmol/L, respectively) and ICTPOST (12.74 ± 2.93 mmol/L and 0.77 ± 0.12 mmol/L, respectively). GLU concentration increased significantly only FGBPOST versus FGBPRE (153.0 ± 44.6 mg/dL vs 117.2 ± 22.3 mg/dL). In a previous study, Feito et al.20 observed similar concentrations of La in the last round of a 15-min cross-training AMRAP (13.89 ± 2.23 mmol/L in males and 11.53 ± 1.69 mmol/L in females). The 15-min AMRAP circuit consisted of 250 m of rowing, 20 kettlebell swings (16 kg for men and 12 kg for women), and 15 dumbbell thrusters (16 kg for men and 9 kg for women). In another study, La and GLU concentrations were significantly different between WOD1 and WOD2 (13.30 ± 1.87 mmol/L La after WOD1 vs 18.38 ± 2.02 mmol/L La after WOD2, 135.4 ± 19.6 mg/dL GLU after WOD1 vs 167.4 ± 19.6 mg/dL GLU after WOD2)22. WOD1 (AMRAP) consisted of as many rounds as possible of burpees and toes-to-bar with increasing repetitions (1–1, 2–2, 3–3…) in five minutes. WOD2 was a rounds-for-time (RFT) workout consisting of three rounds of 20 repetitions of wall ball (9 kg) and 20 repetitions of power clean (a load of 40% 1RM) in the shortest possible time. HR measurement showed that during WOD1 the majority of time was spent in the intensity zone of 50–59% HRmax, whereas during WOD2 it was in the zone of 90–100% HRmax. This may explain the difference in La and GLU concentrations. Fernandez-Fernandez et al.23 observed that La concentration increased from 4.0 ± 1.3 mmol/L to 14.5 ± 3.2 mmol/L in ‘Cindy’ and from 4.0 ± 1.3 mmol/L to 14.0 ± 3.3 mmol/L in ‘Fran’.
Moreover, Tibana et al.24 observed lower concentrations of La after two different cross-training workouts, 11.84 ± 1.34 mmol/L after WOD1 and 9.05 ± 2.56 mmol/L after WOD2. WOD1 was 10 min of AMRAP of 30 double-unders and 15 power snatches (34 kg)24. WOD2 was 12 min AMRAP of rowing 250 m and 25 target burpees. They also observed an increase in GLU concentration (81.59 ± 10.27 mg/dL to 114.99 ± 12.52 mg/dL in WOD1 vs. 69.47 ± 6.97 mg/dL to 89.95 ± 19.26 mg/dL in WOD2).

In the present study, we did not observe any significant differences in CK and LDH activities. Timon et al.22 noted a significant increase in CK activity after two WODs (from 566.4 ± 159.1 IU/L to 689.6 ± 281.9 IU/L in WOD1 (5 min AMRAP) and from 406.8 ± 201.0 IU/L to 492.2 ± 203.8 IU/L in WOD2 (RFT)). CK remained elevated 24 h post exercise (864.0 ± 369.5 IU/L after WOD1 and 673.8 ± 444.1 IU/L after WOD2) and returned to baseline 48 h post exercise. However, they did not note any differences in LDH.

Finally, it has been generally shown that acute exercise significantly increases erythrocyte, leucocyte, and platelet counts, hematocrit values, and hemoglobin concentration compared to pre-exercise values, and these increments depend on fluid shifts (plasma volume contraction) caused by the exercise25. For this reason, we standardized post-exercise biochemical and hematological parameters in relation to the hematocrit value. We observed significant differences in HGB and MON between T1, T2, and T3, yet they were not clinically important. WBC, LYM, and MON counts increased after exercise (both FGB and ICT). GRA increased significantly only after ICT. Interestingly, LYM count was higher FGBPOST than ICTPOST, but pre-exercise values were also higher before FGB than before ICT. The increase in WBC can be attributed to increased blood flow that recruits the leukocytes from the marginal pool and/or hormonal changes which are likely to be mediated by beta-2 adrenergic receptors25. In contrast to our study, in sedentary adults acute high-intensity interval training (HIIT) increased HGB concentration from 15.75 ± 0.76 g/dL pre-exercise to 16.59 ± 0.81 g/dL post-exercise and RBC count from 5.44 ± 0.22 × 10¹²/L to 5.92 ± 0.22 × 10¹²/L26. In addition, that study reported a significant increase in WBC (from 7.32 ± 1.83 × 10⁹/L to 12.84 ± 3.37 × 10⁹/L) and LYM (from 3.11 ± 1.59 × 10⁹/L to 5.22 ± 1.99 × 10⁹/L) but not in MON. The differences between individual studies most often result from not using the hematocrit conversion formula that takes into account post-exercise dehydration and the transfer of fluids from the bloodstream to the tissues27. This can explain the differences in blood cell counts observed by Belviranli et al.26.
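The text does not give the conversion formula, so the following is only a sketch of one hematocrit-based correction. It assumes that total red cell volume is unchanged by the exercise bout, from which the blood and plasma volume ratios follow algebraically; the function names are hypothetical, and the formula actually applied in the study may differ.

```python
def _check(hct: float) -> None:
    if not 0.0 < hct < 1.0:
        raise ValueError("hematocrit must be a fraction between 0 and 1")


def blood_volume_ratio(hct_pre: float, hct_post: float) -> float:
    """Post/pre blood volume, assuming constant total red cell volume."""
    _check(hct_pre)
    _check(hct_post)
    return hct_pre / hct_post


def plasma_volume_change(hct_pre: float, hct_post: float) -> float:
    """Fractional plasma volume change (negative = contraction).

    Plasma volume is (1 - hct) * blood volume, so
    PV_post / PV_pre - 1 = (hct_pre - hct_post) / (hct_post * (1 - hct_pre)).
    """
    _check(hct_pre)
    _check(hct_post)
    return (hct_pre - hct_post) / (hct_post * (1.0 - hct_pre))


def correct_plasma_concentration(post_value: float,
                                 hct_pre: float, hct_post: float) -> float:
    """Rescale a post-exercise plasma concentration to pre-exercise plasma volume."""
    return post_value * (1.0 + plasma_volume_change(hct_pre, hct_post))


def correct_cell_count(post_count: float,
                       hct_pre: float, hct_post: float) -> float:
    """Rescale a post-exercise whole-blood cell count to pre-exercise blood volume."""
    return post_count * blood_volume_ratio(hct_pre, hct_post)
```

With a hematocrit rising from 0.45 to 0.48, this sketch gives a plasma volume contraction of about 11%, so an uncorrected post-exercise concentration would overstate the true change by roughly that amount.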

Our study is the first to evaluate the repeatability of a cross-training test. The strength of our study is that the participants performed FGB and ICT three times in the same conditions, which allowed more precise assessment of reproducibility. Moreover, we controlled whether the participants implemented any significant changes in their lifestyles, i.e., reduction diet, by measuring body composition. We also controlled the HR during the test to evaluate the intensity of exercise. By comparing average FGB HR to HRmax measured in ICT, we could assume that FGB was very intense and that every time the participants put in a lot of effort. We also controlled the biochemical and hematological parameters throughout the study, which gave us insight into metabolic changes in relation to both exercise tests.

The biggest limitation of our study is that we only measured the correlation of FGB with aerobic fitness. HIFT is varied in its nature, because it combines strength, power, speed, agility, and cardiovascular fitness. For this reason, we suggest that future studies should evaluate the relationship between FGB performance and other physiological parameters such as anaerobic power or strength. Even though our study shows that FGB gives reliable scores, it seems important to evaluate its connection with other physical traits. It should also be taken into account that we used a cycloergometer for aerobic fitness evaluation in this study, which, to some extent, could affect the obtained aerobic fitness results. It is well known that aerobic power measured on a treadmill is higher than on a cycloergometer, because running engages the whole body, whereas cycling engages mostly the lower body. However, the choice of a cycloergometer was motivated by the reluctance of our study group towards running. Thus, there was a justified concern that the participants would not fully engage in a graded running test. It is also worth noting that the participants were well adapted to performing lower-body movements (like squats, deadlifts, and lunges), which could be beneficial for cycling performance.


Our study showed that Fight Gone Bad is a reliable and repeatable test to measure cross-training performance. Moreover, FGB is strongly correlated with aerobic fitness. FGB can be used as a tool in interventional studies to evaluate the changes in cross-training scores. Furthermore, given that FGB is a non-invasive, easy to perform, and accessible test, it can be regularly used by coaches throughout the training season.



Thirty-one participants were initially enrolled in this study. However, twenty-one (9 women and 12 men; mean ± SD age 31.5 ± 5.5 years, body height 174 ± 8 cm, baseline body mass 73.0 ± 14.0 kg, fat-free mass (FFM) 58.7 ± 13.9 kg, fat mass (FM) 19.1 ± 7.1%, and VO2max 3191 ± 823 mL/min) completed the entire study protocol and were included in the analyses (Fig. 2). The participants were at a similar, moderate athletic level. They had been regularly doing HIFT at Rankor Athletics, Reebok CrossFit Poznań, and Caffeine Barbell clubs in Poznań, Poland. The criteria to qualify for the study included the following: age between 20 and 40 years, the absence of injury and/or any other issues, good health with a valid and up-to-date medical certificate confirming the athlete’s ability to practice sports, at least 2 years of regular cross-training experience, and a minimum of 4 workout sessions (cross-training) per week for at least six months. We included both males and females because both genders participate equally in HIFT training; the gender-related impact was considered negligible given the purpose and scope of this work. Exclusion criteria included the following: being a current smoker, illicit drug use, alcohol consumption greater than the equivalent of 1–2 alcoholic drinks per week, and dietary supplement use or being on any special diet within 3 weeks of the study’s commencement. For females, additional exclusion criteria were being pregnant or planning to become pregnant during the study. The cross-training box coaches confirmed that the participants met the declared inclusion criteria. They also helped monitor training adherence. The dropouts were predominantly unrelated to the study protocol (Fig. 2).
The reasons for dropouts were as follows: personal reasons, infections, minor injuries during customary training, and/or the inability to participate in the time frame of the planned protocol. The studies were conducted during the 2015 and 2016 off-seasons. All subjects declared that they had not introduced any changes in their lifestyles, elements of training, and/or customary nutrition.

Figure 2 A flow chart of the study design.

The study protocol was reviewed and approved by the local ethical committee (Bioethics Committee at Poznan University of Medical Sciences, Poznan, Poland). Each subject was informed of the testing procedure, its purpose, and the risks of the study. Each participant submitted her/his written consent to participate. All procedures were conducted in accordance with the ethical standards of the 1975 Helsinki Declaration.

Study design and protocol

The primary outcomes in this paper were the repeatability and reliability of FGB performance and its relation to aerobic performance. The study protocol included three visits to the Exercise Tests Laboratory at the Department of Human Nutrition and Dietetics (DHND) at the Poznan University of Life Sciences and to selected “Cross-training Boxes” in Poznan at baseline (T1) and after 10 (T2) and 20 (T3) days (Fig. 2). Subjects were instructed not to participate in any high-intensity or long-duration training sessions for at least 24 h before testing. All measurements at the DHND were performed in the morning (7:30–10:00 AM) in a fasting state (water intake was recommended; a standardized meal was eaten the previous night immediately before going to sleep (about 1.2 g of carbohydrates per kg of body mass and 40 g of protein)). First, subjects underwent body composition analysis. Afterward, an incremental cycling test until volitional exhaustion was performed. During all of these measurements, the ambient temperature remained at 20‒22 °C. In the afternoon of the next day, three hours after a standardized small meal (about 0.6 g of carbohydrates per kg of body mass and 15 g of protein), the discipline-specific cross-training test was performed. Enrolled participants were familiar with the tests and procedures used, as they had participated in previous research projects.

Anthropometry and body composition

Body mass (kg) and height (cm) were measured using a professional medical scale with a stadiometer (WPT 60/150 OW, RADWAG, Radom, Poland) with an accuracy of 0.1 cm and 0.1 kg for height and body mass, respectively. FM and FFM were assessed by air displacement plethysmography using the Bod Pod (Cosmed, Rome, Italy) as described previously12,13. Total body water, hydration level, and additional FM and FFM estimates were assessed by bioelectric impedance with the Bodystat 1500 (Bodystat Inc, Douglas, UK) based on the previously mentioned recommended procedures28.

Exercise tests

The study protocol consisted of the incremental cycling test (ICT) and the FGB workout, each performed 3 times (T1, T2, and T3). A recovery break of at least 30 h was implemented between the ICT and FGB tests. Prior to each test (ICT and FGB), participants were given instructions on the procedure, and they completed a brief warm-up (a 5-min effort on a cycloergometer (Kettler-X1, Kettler, Ense-Parsit, Germany) at approximately 50 W power and ~ 70 rpm cadence, followed by 5 min of light stretching and a 5-min break). All tests were performed in proper workout clothing and shoes, and the tests were supervised by an experienced researcher. Heart rate was continuously monitored during exercise using a telemetric system (Polar, Kempele, Finland). Furthermore, capillary blood samples were obtained for analysis before and after each test. During exercise, all test participants were verbally encouraged to maximize their efforts.

Aerobic fitness test

An exercise test on the Kettler X1 cycloergometer (Kettler, Ense-Parsit, Germany) was performed to determine peak oxygen uptake (VO2peak) and gas exchange threshold (GET). VO2peak was defined as the highest individual oxygen uptake (VO2) recorded during the ICT29. To determine the GET during the ICT, the V-slope method was applied, based on linear regression analysis of the increase in CO2 output plotted against the increase in O2 uptake30,31,32. The initial load was set at 50 W for women and 75 W for men and increased every 1.5 min by 25 W until volitional exhaustion. Respiratory parameters and heart rate (HR) were measured (breath by breath) by the Quark CPET ergospirometer (Cosmed, Rome, Italy). Measured variables included time to exhaustion (Texh), maximal workload (Wmax), maximum heart rate (HRmax), time to GET (TGET), workload at GET (WGET), heart rate at GET (HRGET), oxygen uptake at GET (VO2GET), VO2peak, peak carbon dioxide production (VCO2peak), and energy expenditure (EE).
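
To illustrate the idea behind the V-slope approach, the following minimal sketch fits two straight lines to VCO2 plotted against VO2 and takes their intersection as the threshold. It is a simplified two-segment regression written for illustration only, not necessarily the exact procedure of the cited references; the function names, the `min_points` parameter, and the synthetic data are all assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def v_slope_breakpoint(vo2, vco2, min_points=3):
    """Return the VO2 at which the two best-fitting lines intersect.

    Every admissible split of the data into a lower and an upper segment
    is tried; the split with the smallest total squared error is kept.
    """
    best = None
    for k in range(min_points, len(vo2) - min_points + 1):
        s1, i1, e1 = fit_line(vo2[:k], vco2[:k])
        s2, i2, e2 = fit_line(vo2[k:], vco2[k:])
        if s2 <= s1:
            continue  # above the threshold VCO2 must rise faster than below it
        total = e1 + e2
        if best is None or total < best[0]:
            best = (total, (i1 - i2) / (s2 - s1))  # intersection of both lines
    if best is None:
        raise ValueError("no split with a steeper upper segment was found")
    return best[1]


# Synthetic, noise-free data (L/min) with a slope change placed at VO2 = 2.5.
vo2 = [1.0 + 0.25 * i for i in range(13)]
vco2 = [0.9 * v if v <= 2.5 else 0.9 * 2.5 + 1.4 * (v - 2.5) for v in vo2]

print(f"Estimated VO2 at GET: {v_slope_breakpoint(vo2, vco2):.2f} L/min")
```

With real breath-by-breath data, the signal would typically be averaged or filtered first, and the result checked visually; those steps are omitted here.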

Fight Gone Bad

Fight Gone Bad comprised three rounds of five exercises: wall ball, sumo deadlift high pull, box jump, push press, and rowing13,33,34. Participants were instructed to complete as many repetitions as possible in one minute at each station before moving to the next station. After completing all five stations, participants had one minute of rest (Rest1 between the 1st and 2nd rounds and Rest2 between the 2nd and 3rd rounds) before beginning the next round13,34. Wall balls combined a front squat holding a medicine ball (6 kg for females, 9 kg for males) with a push press-like throw of the ball to a target located at 2.75 m for females and 3.0 m for males. At the bottom of the squat, the hips had to be lower than the knees. In the sumo deadlift high pull, the feet were wider than the hips, and the grip was inside the knees. The exercise started with lifting the bar (25 kg for females, 35 kg for males) from the ground as in a classical deadlift, but then the bar was pulled to the chest. At the end, the elbows had to be higher than the shoulders. The box jump started with both feet on the ground. Athletes jumped onto a box that was 50 cm tall for females and 65 cm for males, landing on both feet. The exercise ended when the shoulders, hips, and knees were extended in one line. The push press started with lifting the bar (25 kg for females and 35 kg for males) from the ground to the front rack. Then the bar was pushed overhead using leg power. After the shoulders were straight, the bar was dropped back to the shoulders. Rowing was performed on an ergometer. Feet were fastened to the footplates with straps. The handle was pulled towards the chest, using the push from the knees. The test was video recorded to allow an accurate count of all properly performed repetitions. For a repetition to be valid, a participant needed to complete the full range of motion required for the specific exercise.
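
The structure of the workout described above can be summarized in a short sketch. It assumes that the FGB score is the plain sum of valid counts over all stations and rounds; how the rowing station is counted is not stated in this section, so it is treated here as a generic count. All names and the example numbers are hypothetical.

```python
STATIONS = ("wall ball", "sumo deadlift high pull", "box jump", "push press", "rowing")
ROUNDS = 3
STATION_MINUTES = 1  # one minute of work per station
REST_MINUTES = 1     # one minute of rest between rounds (Rest1, Rest2)


def session_duration_min():
    """Total duration of the workout in minutes, including both rest periods."""
    work = ROUNDS * len(STATIONS) * STATION_MINUTES
    rest = (ROUNDS - 1) * REST_MINUTES
    return work + rest


def fgb_score(valid_counts):
    """Sum the valid counts over all rounds and stations.

    `valid_counts` holds one dict per round, mapping each station name to
    the number of repetitions judged valid on the video recording.
    """
    if len(valid_counts) != ROUNDS:
        raise ValueError(f"expected {ROUNDS} rounds, got {len(valid_counts)}")
    total = 0
    for round_counts in valid_counts:
        if set(round_counts) != set(STATIONS):
            raise ValueError("each round must contain exactly the five stations")
        total += sum(round_counts.values())
    return total


# Hypothetical counts for one athlete, listed in station order for each round.
example = [
    dict(zip(STATIONS, (25, 20, 18, 22, 15))),
    dict(zip(STATIONS, (20, 18, 15, 18, 13))),
    dict(zip(STATIONS, (18, 16, 14, 16, 12))),
]

print(f"Duration: {session_duration_min()} min, score: {fgb_score(example)}")
```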

Blood samples analysis

Blood was collected by qualified medical personnel in accordance with applicable procedures. Before (ICTPRE and FGBPRE) and 3 min after the exercise tests (ICTPOST and FGBPOST), capillary blood was collected from a fingertip of the nondominant hand using a disposable lancet-spike Medlance Red (HTL-STREFA, Łódź, Poland) with a 1.5 mm blade and 2.0 mm penetration depth as described previously13. Approximately 300 μL of blood was collected into a Microvette CB 300 tube (Sarstedt, Nümbrecht, Germany) containing K2-EDTA (EDTA dipotassium salt) as an anticoagulant for hematological measurements. Blood sample tests were carried out with an 18-parameter automated hematology analyzer, Mythic 18 (Orphée, Geneva, Switzerland). The counts of white blood cells (WBC), lymphocytes (LYM), monocytes (MON), granulocytes (GRA), red blood cells (RBC), and platelets (PLT), as well as hemoglobin (HGB) concentration and hematocrit (HTC) value, were considered in the study. Furthermore, another 300 μL of capillary blood was collected in a Microvette CB 300 Z tube (Sarstedt, Nümbrecht, Germany) with a clotting activator, in which the activities of creatine kinase (CK; EC; Liquick Cor-CK, Cormay, Cat No. 1-219, Łomianki, Poland) and lactate dehydrogenase (LDH; EC; Liquick Cor-LDH, Cormay, Cat No. 1-239, Łomianki, Poland) were measured using optimized kinetic methods. In addition, 50 μL of capillary blood was collected into a neutral (without anticoagulant) glass capillary (Vitrex, Medlab, Raszyn, Poland). The blood samples were deproteinized in 0.6 mol/L of perchloric acid (HClO4). After centrifuging (4000 g/10 min/4 °C), the supernatant was isolated. The enzymatic measurements of lactate (La) and pyruvate (Pa) concentrations were based on the methodology proposed by Maughan35. Glucose concentration was determined with an enzymatic, colorimetric PZ Cormay test (Liquick Cor-GLUCOSE, Cat No. 2-201, Łomianki, Poland).
All biochemical measurements were conducted using a multi-mode microplate reader (Synergy 2 SIAFRT, BioTek, Winooski, USA). To avoid the influence on biochemical and hematological parameters caused by changes in plasma volume during physical effort, an appropriate hematocrit converter formula was used27,34.

Statistical analysis

Normal distribution was examined using the Shapiro–Wilk test. Differences between T1, T2, and T3 were analyzed using repeated-measures ANOVA. Relative reliability was assessed using the intraclass correlation coefficient (ICC) between T1, T2, and T3. The ICC expresses the proportion of the total variance that is due to differences between subjects. ICC < 0.40 was considered low, between 0.40 and 0.70 acceptable, between 0.70 and 0.90 good, and > 0.90 excellent. However, the ICC does not give an indication of the accuracy of individual measurements. Absolute reliability was calculated as the standard error of measurement (SEM), which quantifies the precision of the individual measurements. The usefulness of the test was assessed by calculating the smallest worthwhile change (SWC). The ability of the test to detect small and meaningful changes was rated as good if SEM < SWC, satisfactory if SEM = SWC, and marginal if SEM > SWC. The minimal detectable change (MDC), which is the minimal amount of change that a measurement must show to exceed the within-subject variability and measurement error, also referred to as the sensitivity to change, was also calculated. Associations between the FGB score and aerobic capacity were measured using the Pearson correlation coefficient. The following criteria were adopted for the interpretation of the magnitude of the correlation: trivial (r < 0.1), small (0.1 ≤ r < 0.3), moderate (0.3 ≤ r < 0.5), large (0.5 ≤ r < 0.7), very large (0.7 ≤ r < 0.9), nearly perfect (0.9 ≤ r < 1), and perfect (r = 1). The agreement of two methods was evaluated using the Bland–Altman method after data normalization36. Normalization was done by subtracting the mean and dividing by the standard deviation.
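
The reliability indices named above can be sketched in a few lines. The exact formulas are not given in this section, so the sketch relies on commonly used definitions that should be read as assumptions: a two-way, single-measure, consistency ICC, i.e. ICC(3,1); SEM = SD × √(1 − ICC); SWC = 0.2 × SD; and MDC at the 95% level = SEM × 1.96 × √2, with SD taken as the standard deviation of all scores. The function names and the example scores are hypothetical.

```python
import math
import statistics


def icc_consistency(scores):
    """ICC(3,1) from a subjects x trials table (list of equal-length rows)."""
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    trial_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_trials = n * sum((m - grand) ** 2 for m in trial_means)
    ss_error = ss_total - ss_subjects - ss_trials
    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)


def reliability_summary(scores):
    """Return ICC, SEM, SWC, MDC95 and the SEM-versus-SWC rating."""
    icc = icc_consistency(scores)
    overall_sd = statistics.stdev([x for row in scores for x in row])
    # Guard against tiny negative values caused by floating-point rounding.
    sem = overall_sd * math.sqrt(max(0.0, 1.0 - icc))
    swc = 0.2 * overall_sd            # assumed definition of the SWC
    mdc95 = sem * 1.96 * math.sqrt(2)  # assumed 95% confidence level
    if math.isclose(sem, swc):
        rating = "satisfactory"
    elif sem < swc:
        rating = "good"
    else:
        rating = "marginal"
    return {"icc": icc, "sem": sem, "swc": swc, "mdc95": mdc95, "rating": rating}


# Hypothetical FGB scores: one row per athlete, columns are T1, T2, T3.
fgb_scores = [
    [262, 268, 265],
    [310, 305, 312],
    [284, 290, 287],
    [335, 331, 338],
    [247, 252, 250],
]

for name, value in reliability_summary(fgb_scores).items():
    print(name, value if isinstance(value, str) else round(value, 3))
```

Other ICC forms (for example, absolute agreement) and other SD choices would give somewhat different values, so the form used should always be reported alongside the result.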

Ethics approval

All procedures performed were in accordance with the ethical standards of the institutional and national research committee (Bioethics Committee at Poznan University of Medical Sciences, Poznan, Poland (Decision no. 173/15 of 5 February 2015)) and with the 1975 Helsinki declaration and its later amendments or comparable ethical standards.

Consent to participate

All participants signed an informed consent form.

Practical applications

This work presents the first evaluation of the reliability and validity of a specific test for measuring HIFT performance. Our study indicated that FGB is a reliable test that can be used to measure changes in cross-training performance caused by an intervention. We also showed that cross-training performance is correlated with aerobic fitness, which gives more insight into the physiology of the test. This suggests that aerobic fitness, although underestimated by most cross-training athletes, can be an important contributor to success. Our findings could serve as guidance for scientists, as well as coaches and athletes who pursue their scientific and/or training goals using the cross-training-specific Fight Gone Bad workout.