Evaluation of the repeatability and reliability of the cross-training specific Fight Gone Bad workout and its relation to aerobic fitness

Cross-training is a form of high-intensity functional training (HIFT) with multiple workout modalities. Despite the increasing number of studies on HIFT, there is still no validated test to measure its specific performance. It would also be advisable to determine whether selected cross-training workouts can provide a stimulus that corresponds to maximal aerobic work. For these reasons, the purpose of our study was to evaluate the repeatability and reliability of the Fight Gone Bad (FGB) workout and to assess its relationship with aerobic fitness. Twenty-one cross-training participants (9 females) finished the study protocol, which included three two-day measurement sessions separated by 10 days. During each session, participants had their body composition measured, and they performed two exercise tests. The first test was an incremental cycling test to measure aerobic fitness, and the second was a cross-training specific FGB workout performed the next day. Reliability and repeatability were calculated from the three measurements. The total FGB Score (FGBTOTAL) showed excellent reliability (ICC 0.9, SEM 6%). Moreover, FGBTOTAL was strongly correlated with aerobic fitness (i.e., time to exhaustion (Texh, R2 = 0.72), maximal workload (Wmax, R2 = 0.69), time to gas exchange threshold (TGET, R2 = 0.68), and peak oxygen uptake (VO2peak, R2 = 0.59)). We also found that agreement between standardized FGB and standardized aerobic performance indices such as Texh, VO2peak, Wmax, maximum heart rate, TGET, and workload at gas exchange threshold was high according to the Bland-Altman method. In conclusion, FGB is a reliable test that can be used to measure changes in cross-training performance caused by an intervention. Moreover, FGB is strongly correlated with aerobic fitness.

Cross-training has grown markedly in popularity over more than a dozen years. It combines ten key physical parameters to improve overall fitness: cardiovascular/respiratory endurance, stamina, strength, flexibility, power, speed, coordination, agility, balance, and accuracy [1][2][3]. Cross-training can be described as high-intensity functional training (HIFT) that incorporates exercises with barbells, kettlebells, and dumbbells, cardiovascular tasks (e.g., running, rowing), and gymnastics (e.g., pull-ups, muscle-ups) 4. Its varied nature appeals strongly to many people around the world.
In a systematic review, Haddock et al. 5 suggest that HIFT increases aerobic fitness, strength, musculature, and endurance. Cosgrove et al. 6 studied the influence of six months of HIFT on multiple fitness characteristics. The participants completed three separate days of assessments across 10 fitness domains before and after participating in the program for 6 months. They found that HIFT improved several fitness parameters, including flexibility, power, muscular endurance, and strength 6. Murawska-Cialowicz et al. 7 also showed that there was an improvement in aerobic fitness after 3 months of HIFT training in women but not in men. However, no changes in Wingate anaerobic power were observed.
Other investigators focused on finding the connection between HIFT performance and aerobic and anaerobic fitness as well as strength capabilities [8][9][10][11]. Butcher et al. 9 studied whether physiological measures can predict Fight Gone Bad performance.

There were no significant differences in FGB performance between the three measurements (Table 2). ICC was excellent for FGB Round-1 (FGB R-1) and FGB TOTAL and good for FGB Round-2 (FGB R-2) and FGB Round-3 (FGB R-3). SEM was 6, 8, 9 and 6% for FGB R-1, FGB R-2, FGB R-3, and FGB TOTAL, respectively. SEM was lower than SWC for FGB R-1 and FGB TOTAL, indicating good ability of the test to detect small and meaningful changes. However, for FGB R-2 and FGB R-3, SEM was higher than SWC. MDC values are given in Table 2.
There were no significant differences in HR during FGB between the three measurements (Table 3). ICC was good for FGB R-2 , Rest 2 , Mean 15min , and Mean 17min . ICC was acceptable for Rest 1 and FGB R-3 and low for FGB R-1 . SEMs were low, between 2 and 5%. SEM was higher than SWC for all HR measurements in FGB. MDC values are given in Table 3.
ICC (intraclass correlation coefficient) was excellent for T exh, W max, T GET, W GET, VO 2peak, VCO 2peak, and EE, good for HR GET and VO 2GET, and acceptable for HR max. SEM was low (< 5.99%) for HR max, HR GET, and VO 2peak, and moderate (6-10%) for T exh, W max, T GET, W GET, and VO 2GET. SEM for EE was 11%. SEM < SWC was observed for T exh, W max, T GET, W GET, VO 2peak, VCO 2peak, and EE. However, for HR max, HR GET, and VO 2GET, SEM was higher than SWC.
FGB and ICT relationship. Correlations between FGB scores and ICT parameters were significant for all measurements besides HR GET (Table 5). Strong correlations (> 0.7) were observed for FGB R-1 with T exh, W max, T GET, and EE; and for FGB TOTAL with T exh and EE. Moderate (0.5-0.7) correlations were found for FGB R-1 with W GET, VO 2GET, VO 2peak, and VCO 2peak, for FGB R-2 with T exh, W max, T GET, W GET, VO 2peak, VCO 2peak, and EE, for

The agreement of two methods was also assessed using the Bland-Altman method (Fig. 1). Bias for standardized FGB TOTAL and performance in ICT measured as standardized T exh, VO 2peak, W max, and HR max were 0.0 ± 0.70, 0.0 ± 1.53, 0.0 ± 0.74 and 0.0 ± 0.10, respectively (Fig. 1A-D). Bias for standardized FGB TOTAL and GET measured as standardized T GET and W GET were 0.0 ± 0.76 and 0.0 ± 0.81, respectively (Fig. 1E,F).
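The grading of correlations used in this paragraph can be illustrated with a minimal sketch. It computes a Pearson coefficient by hand and labels it with the cut-offs quoted above (> 0.7 strong, 0.5-0.7 moderate). The function names, the label for values below 0.5, and the sample numbers are assumptions made for illustration only; they are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def grade(r):
    """Label |r| with the cut-offs quoted in the paragraph above."""
    a = abs(r)
    if a > 0.7:
        return "strong"
    if a >= 0.5:
        return "moderate"
    # The paragraph does not name this band; the label is an assumption.
    return "below moderate"


# Hypothetical values (not study data): FGB total score and time to exhaustion (s).
fgb_total = [250, 270, 300, 310, 330, 360]
t_exh = [600, 640, 690, 700, 780, 800]
r = pearson_r(fgb_total, t_exh)
label = grade(r)
```

Squaring `r` gives the coefficient of determination, which is the form (R2) in which the abstract reports the same relationships.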
Blood sample analysis. Biochemical marker analysis in blood revealed that MON count and HGB concentration were significantly different among the three measurements in both FGB PRE and FGB POST, whereas Pa concentrations were significantly different only in FGB PRE (Table 6). ICC was rated as excellent in FGB POST RBC, as good in FGB PRE RBC, PLT, and HTC and in FGB POST HTC and GLU, acceptable in FGB PRE WBC, LYM, GRA and CK, and in FGB POST WBC, LYM, GRA, PLT, La, Pa, CK, and LDH. ICC was significant but low in FGB PRE HGB and LDH, and non-significant in FGB PRE MON, GLU, La, Pa, and FGB POST MON, and HGB. Apart from FGB PRE WBC, HTC and FGB POST RBC, SEM was higher than SWC for all blood measures in both FGB PRE and FGB POST.
Blood parameters were also measured at ICT PRE and ICT POST. HGB concentrations were significantly different among the three measurements in both ICT PRE and ICT POST, whereas GLU concentrations were significantly different only in ICT PRE (Table 7). ICC was excellent for RBC ICT PRE and ICT POST, good in ICT PRE PLT and ICT POST PLT, and HTC, acceptable in ICT PRE WBC, LYM, GRA, HTC, La, and LDH, and in ICT POST WBC, LYM, MON, GRA, GLU, and La. Low but significant ICC was observed in ICT PRE MON, HGB and CK, and in ICT POST HGB,

Table 4. Repeatability of aerobic variables measured in incremental cycling test. EE, energy expenditure; HR GET, heart rate at gas exchange threshold; HR max, maximal heart rate; ICC, intraclass correlation coefficient; MDC, minimal detectable change; SEM, standard error of measurement; SWC, smallest worthwhile change; T exh, time to exhaustion; T GET, time to gas exchange threshold; VCO 2peak, maximal carbon dioxide production; VO 2peak, peak oxygen uptake; VO 2GET, oxygen uptake at gas exchange threshold; W max, maximal workload; W GET, workload at gas exchange threshold.

In the present study we also evaluated the differences in means (T 1, T 2, and T 3) of blood parameters between FGB PRE, FGB POST, ICT PRE, and ICT POST. We found that WBC, LYM, MON, La, and Pa were significantly higher in FGB POST than in FGB PRE and in ICT POST than in ICT PRE (Table 8). GRA were higher in ICT POST than in ICT PRE. GLU was higher in FGB POST than in FGB PRE. LYM were significantly different between FGB PRE and ICT PRE and between ICT POST and FGB POST. No significant differences were observed for RBC, HGB, PLT, CK and LDH.

Discussion
Cross-training continues to grow in popularity. A whole range of athletes, as well as sedentary people, can benefit from cross-training, because it gives multiple stimuli to the muscles, and all exercises can be scaled to meet an individual's abilities and needs. Thus, there is a need to find a validated test to measure its performance. In this paper, we proposed a benchmark workout, Fight Gone Bad, to be such a test. FGB incorporates several of the physiological traits that are most crucial for HIFT performance, i.e., stamina, speed, strength, endurance, and power. The main findings showed that FGB gives reliable and repeatable results when performed three times with each measurement separated by 10 days from the others. Moreover, we revealed that FGB results were strongly correlated with aerobic fitness. When the results were standardized, we also found that the agreement of FGB with aerobic performance indices such as T exh, VO 2peak, W max, HR max, T GET, and W GET was high.

In practical and scientific respects, the reproducibility of a test is essential to determine whether an individual has experienced a training response. Moreover, reliability estimates the extent to which the change in the measured score is due to a change in the true score 14. The present study is the first to investigate whether FGB performance is reproducible across repeated measurements. There were no differences in body composition, FGB, HR FGB, and ICT between T 1, T 2, and T 3. Relative reliability was measured as the intraclass correlation coefficient. The ICC reflects a test's ability to differentiate between participants and, hence, the position of the individual relative to others in the group 15. Relative reliability was found to be excellent for FGB R-1 and FGB TOTAL and good for FGB R-2 and FGB R-3, showing the linearity of the relationship between the repeated measures.
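To make the notion of relative reliability concrete, the following is a minimal sketch of one common ICC form, ICC(2,1) (two-way random effects, absolute agreement, single measure), computed from a participants-by-sessions table. The paper does not state which ICC form was used, so this choice, the function name, and the sample scores are assumptions for illustration rather than the authors' procedure.

```python
def icc_2_1(scores):
    """ICC(2,1) from a table with one row per participant, one column per session."""
    n = len(scores)      # number of participants
    k = len(scores[0])   # number of sessions (T1, T2, T3 gives k = 3)
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a rectangular table with n >= 2 and k >= 2")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Mean squares of the two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical scores (not study data): four participants, three sessions each.
example = [[10, 11, 10], [20, 19, 21], [30, 31, 29], [40, 40, 41]]
example_icc = icc_2_1(example)
```

In this invented table the participants differ far more from each other than from themselves across sessions, so the coefficient is close to 1, which is the situation the text describes as excellent relative reliability.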
However, the ICC does not provide information about the accuracy of the scores for an individual. Therefore, absolute reliability was calculated as SEM. A lower SEM means the method is more precise 16. SEM was 6% for both FGB R-1 and FGB TOTAL, 8% for FGB R-2, and 9% for FGB R-3. The smallest worthwhile changes (SWC) were higher than SEM for FGB R-1 and FGB TOTAL, indicating the ability of the test to detect small and meaningful performance changes. Interestingly, the repeatability of body composition measurements in our study indicates that the participants did not implement any changes in their lifestyles throughout the study (e.g., body mass reduction) that could influence performance.
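The comparison of SEM with SWC can be sketched numerically. The formulas below are widely used conventions (SEM = SD x sqrt(1 - ICC), SWC = 0.2 x between-subject SD, MDC95 = 1.96 x sqrt(2) x SEM); the paper does not state which formulas were applied, so they are assumptions here and may differ from the authors' calculations. All names and input values are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement (assumed convention: SD * sqrt(1 - ICC))."""
    return sd * math.sqrt(1.0 - icc)


def smallest_worthwhile_change(sd, factor=0.2):
    """SWC as a fraction of the between-subject SD (0.2 is an assumed default)."""
    return factor * sd


def mdc95(sem):
    """Minimal detectable change at 95% confidence (assumed convention)."""
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical inputs (not study data): between-subject SD of 40 repetitions, ICC 0.97.
sd_between = 40.0
icc = 0.97

sem = sem_from_icc(sd_between, icc)
swc = smallest_worthwhile_change(sd_between)
mdc = mdc95(sem)

# The test is considered sensitive to small changes when SEM is below SWC.
detects_small_changes = sem < swc
```

Under these particular conventions SEM falls below SWC only when sqrt(1 - ICC) < 0.2, i.e., when ICC exceeds 0.96, which is why the example uses a high ICC.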
Moreover, relative reliability was even better for ICT, where for most of the measured parameters (T exh, W max, T GET, W GET, VO 2peak, VCO 2peak and EE), ICC was close to 1.0 and SEM was low. This is in accordance with previous studies. Dideriksen and Mikkelsen 17 showed excellent ICC (> 0.9) for VO 2max, W max, and HR max and good ICC (0.7-0.9) for VO 2VT in recreationally trained triathletes (n = 13). Weston and Gabbett 18 found ICC > 0.9 for VO 2max, VE, VCO 2, HR, and W, with measurement errors below 5% in trained cyclists (n = 16). Graded exercise testing is a reliable tool that is widely used for the determination of VO 2peak in sports performance, research, and clinical diagnostics 19. However, this is beyond the scope of the present paper.
Considering speed and strength efforts, Fight Gone Bad (FGB) is a high-intensity workout of moderate duration (17 min in total). It is performed very fast and demands a high level of muscular endurance. The present study compared the results in FGB with aerobic fitness measured in an incremental cycling test. We found strong and moderate correlations between FGB performance and time to exhaustion, maximum workload, VO 2peak, VCO 2peak, time to GET, and workload at GET. This suggests that aerobic fitness is crucial to FGB performance. This might seem counterintuitive, since the HR observed during each round of FGB was very close to the HR max measured in ICT, showing that the effort put forth by the participants in FGB was extremely high. What is more, given that HR at gas exchange threshold was at the level of 162 ± 11 bpm to 164 ± 10 bpm, we can assume that the work done in FGB rounds was mainly of an anaerobic nature (mean HR of the 3 rounds was between 169 ± 11 bpm and 173 ± 10 bpm). One explanation for this phenomenon, though not a sufficient one, may be that the duration of FGB forces the engagement of the aerobic energy system. Another reason may lie in a main characteristic of FGB, namely the 1-min rest periods between the rounds. It seems that the ability to recover between bouts of exercise is dependent upon oxidative capacity 20. One study showed that better recovery between repeated bouts of Wingate sprints was associated with better cross-training performance 20. This suggests that even if rounds in FGB are to some extent anaerobic in nature, the ability to sustain effort throughout the entire FGB may rely on the efficiency of aerobic recovery during breaks between the rounds.
This is in accordance with literature suggesting that aerobic capacity enhances recovery from high-intensity intermittent exercise through increased aerobic response, improved lactate removal, and enhanced phosphocreatine regeneration 21. Some investigators aimed to compare different physiological variables with HIFT performance. Bellar et al. 8 found that cross-training performance was correlated with aerobic power (VO 2max) but only in experienced athletes (r = 0.453, p = 0.03) and not in naïve ones (r = 0.168, p = 0.64). The cross-training workout they used consisted of 21-15-9 repetitions of (1) sumo deadlift high pull, (2) box jump (50 cm), and (3) 40-m farmer's walk gripping two 20-kg bumper plates. The score was the time for workout completion. In yet another paper, Dexheimer et al. 10 showed that the higher the VO 2max, the better the result in 'Fran', 'Nancy', and 'CrossFit Total'. However, in the regression model, VO 2max explained 68% of the variance only in 'Nancy' 10. 'Fran' consisted of 3 rounds of 21-15-9 repetitions of thrusters and pull-ups. 'Nancy' was a workout of 5 rounds of a 400 m run and 15 overhead squats with a barbell (95 lb men/65 lb women). 'CrossFit Total' included 1-repetition maximum (RM) back squat, strict shoulder-press, and deadlift. Interestingly, they did not find any significant association between VO 2max and the 'Grace' workout (30 clean and jerks (135 lb men/95 lb women)). The authors thus hypothesized that those with higher VO 2max may perform better in longer workouts that require running (e.g., 'Nancy') compared to shorter ones (e.g., 'Grace'). In contrast to Dexheimer et al., Butcher et al. 9 found no correlation between VO 2max and 'Fran', 'Grace', 'Cindy' (as many rounds as possible of 5 repetitions of pull-ups, 10 repetitions of push-ups, and 15 repetitions of bodyweight squats performed in 20 min), or 'CrossFit Total'.
However, they found significant correlations between VO 2 at anaerobic threshold and 'Fran', 'Grace', and 'CrossFit Total'. Furthermore, Martinez-Gomez et al. 11 aimed to determine which physiological variables could predict performance during a cross-training competition (The Open, 2019). They found that the combination of lower-body muscle power (squat jump performance), reactive strength (reactive strength index during a drop jump), and aerobic power (as measured with the VO 2max) together explained 81% of the cross-training performance variance, showing that HIFT performance is associated with a variety of fitness markers related to both aerobic and anaerobic/power capabilities. This seems to confirm that HIFT demands high adaptation to both aerobic and anaerobic types of exercise. Therefore, an improvement in cardiorespiratory fitness may enhance performance in longer workouts like FGB or 'Nancy'. Thus, it would be reasonable to include aerobic training in cross-training programming. Moreover, it would also be beneficial to include training like FGB in sports that use varying energy systems, such as combat and team sports. In these sports, all three energy systems are used according to the intensity, rhythm, and duration of the competition. That is why they can benefit the most from FGB, which is of high intensity yet strongly correlated with aerobic fitness (i.e., T exh).
Also, the present study is the first to provide extensive data on the biochemical response after each exercise test among HIFT-trained participants. La and Pa concentrations significantly increased in FGB POST and ICT POST compared to FGB PRE and ICT PRE, respectively, but there were no differences in La or Pa between FGB POST (13.41 ± 3.52 mmol/L and 0.79 ± 0.15 mmol/L, respectively) and ICT POST (12.74 ± 2.93 mmol/L and 0.77 ± 0.12 mmol/L, respectively). GLU concentration increased significantly only in FGB POST versus FGB PRE (153.0 ± 44.6 mg/dL vs. 117.2 ± 22.3 mg/dL). In a previous study, Feito et al. 20 observed similar concentrations of La in the last round of a 15-min cross-training AMRAP (13.89 ± 2.23 mmol/L in males and 11.53 ± 1.69 mmol/L in females). The 15-min AMRAP circuit consisted of 250 m of rowing, 20 kettlebell swings (16 kg for men and 12 kg for women), and 15 dumbbell thrusters (16 kg for men and 9 kg for women). In another study, La and GLU concentrations were significantly different between WOD 1 and WOD 2 (13.30 ± 1.87 mmol/L La after WOD 1 vs 18.38 ± 2.02 mmol/L La after WOD 2, 135.4 ± 19.6 mg/dL GLU after WOD 1 vs 167.4 ± 19.6 mg/dL GLU after WOD 2) 22. WOD 1 (AMRAP) consisted of as many rounds as possible of burpees and toes to bar with increasing repetitions (1-1, 2-2, 3-3…) in five minutes. WOD 2 was a rounds-for-time (RFT) workout consisting of three rounds of 20 repetitions of wall ball (9 kg) and 20 repetitions of power clean (a load of 40% 1RM) in the shortest possible time. HR measurement showed that during WOD 1 the majority of time was spent in the intensity zone of 50-59% HR max, whereas during WOD 2 it was in the zone of 90-100% HR max. This explains the difference in La and GLU concentrations. Fernandez-Fernandez et al. 23 observed that La concentration increased from 4.0 ± 1.3 mmol/L to 14.5 ± 3.2 mmol/L in 'Cindy' and from 4.0 ± 1.3 mmol/L to 14.0 ± 3.3 mmol/L in 'Fran'. Moreover, Tibana et al. 24 observed lower concentrations of La after two different cross-training workouts, 11.84 ± 1.34 mmol/L after WOD 1 and 9.05 ± 2.56 mmol/L after WOD 2. WOD 1 was a 10-min AMRAP of 30 double-unders and 15 power snatches (34 kg) 24. WOD 2 was a 12-min AMRAP of rowing 250 m and 25 target burpees. They also observed an increase in GLU concentration (81.59 ± 10.27 mg/dL to 114.99 ± 12.52 mg/dL in WOD 1 vs. 69.47 ± 6.97 mg/dL to 89.95 ± 19.26 mg/dL in WOD 2).
In the present study, we did not observe any significant differences in CK and LDH activities. Timon et al. 22 observed that CK activity was elevated 24 h post exercise (864.0 ± 369.5 IU/L after WOD 1 and 673.8 ± 444.1 IU/L after WOD 2) and returned to baseline 48 h post exercise. However, they did not note any differences in LDH. Finally, it has been generally shown that acute exercise increases erythrocyte, leucocyte, and platelet counts, hematocrit values, and hemoglobin concentration significantly as compared to pre-exercise values, and these increments depend on fluid shifts (plasma volume contraction) caused by the exercise 25. For this reason, we standardized post-exercise biochemical and hematological parameters in relation to the hematocrit value. We observed significant differences in HGB and MON between T 1, T 2, and T 3, yet they were not clinically important. WBC, LYM, and MON counts increased after exercise (both FGB and ICT). GRA increased significantly only after ICT. Interestingly, LYM count was higher in FGB POST than in ICT POST, but pre-exercise values were also higher before FGB than before ICT. The increase in WBC can be attributed to increased blood flow that recruits the leukocytes from the marginal pool and/or hormonal changes, which are likely to be mediated by beta-2 adrenergic receptors 25. In contrast to our study, in sedentary adults acute high-intensity interval training (HIIT) increased HGB concentration from 15.75 ± 0.76 g/dL pre-exercise to 16.59 ± 0.81 g/dL post-exercise and RBC count from 5.44 ± 0.22 × 10^12/L to 5.92 ± 0.22 × 10^12/L 26. In addition, they observed a significant increase in WBC (from 7.32 ± 1.83 × 10^9/L to 12.84 ± 3.37 × 10^9/L) and LYM (3.11 ± 1.59 × 10^9/L to 5.22 ± 1.99 × 10^9/L) but not in MON.
The differences between individual studies most often result from not using the hematocrit conversion formula that takes into account post-exercise dehydration and the transfer of fluids from the bloodstream to the tissues 27 . This can explain the differences in blood cell counts observed by Belviranli et al. 26 .
Our study is the first to evaluate the repeatability of a cross-training test. The strength of our study is that the participants performed FGB and ICT three times under the same conditions, which allowed a more precise assessment of reproducibility. Moreover, by measuring body composition, we checked whether the participants had implemented any significant changes in their lifestyles, e.g., a reduction diet. We also monitored HR during the test to evaluate the intensity of exercise. By comparing average FGB HR to HR max measured in ICT, we could assume that FGB was very intense and that the participants put in a lot of effort every time. We also monitored the biochemical and hematological parameters throughout the study, which gave us insight into metabolic changes in relation to both exercise tests.
The biggest limitation of our study is that we only measured the correlation of FGB with aerobic fitness. HIFT is varied in its nature, because it combines strength, power, speed, agility, and cardiovascular fitness. For this reason, we suggest that future studies should evaluate the relationship between FGB performance and other physiological parameters such as anaerobic power or strength. Even though our study shows that FGB gives reliable scores, it seems important to evaluate its connection with other physical traits. It should also be taken into account that we used a cycloergometer for aerobic fitness evaluation in this study, which, to some extent, could affect the obtained aerobic fitness results. It is well known that aerobic power measured on a treadmill is higher than on a cycloergometer, because running engages the whole body, whereas cycling engages mostly the lower body. However, the choice of a cycloergometer was motivated by the reluctance of our study group towards running. Thus, there was a justified concern that the participants would not fully engage in a graded running test. It is also worth noting that the participants were well adapted to performing lower-body movements (like squats, deadlifts and lunges), which could be beneficial for cycling performance.

Conclusions
Our study showed that Fight Gone Bad is a reliable and repeatable test to measure cross-training performance. Moreover, FGB is strongly correlated with aerobic fitness. FGB can be used as a tool in interventional studies to evaluate the changes in cross-training scores. Furthermore, given that FGB is a non-invasive, easy to perform, and accessible test, it can be regularly used by coaches throughout the training season.

Methods
Participants. Thirty-one participants were initially enrolled in this study. However, twenty-one (9 women and 12 men, with mean ± SD age of 31.5 ± 5.5 years, body height 174 ± 8 cm, baseline values of body mass 73.0 ± 14.0 kg, fat-free mass (FFM) 58.7 ± 13.9 kg, fat mass (FM) 19.1 ± 7.1%, and VO 2max 3191 ± 823 mL/min) completed the entire study protocol and were included in the analyses (Fig. 2). The participants were at a similar moderate athletic level. They had been regularly doing HIFT at Rankor Athletics, Reebok CrossFit Poznań, and Caffeine Barbell clubs in Poznań, Poland. The criteria to qualify for the study included the following: age between 20 and 40 years, the absence of injury and/or any other issues, good health with a valid and up-to-date medical certificate confirming the athlete's ability to practice sports, at least 2 years of regular cross-training experience, and a minimum of 4 workout sessions (cross-training) per week for at least six months. We included both males and females to reflect the equal participation of both genders in HIFT training; the gender-related impact was considered negligible given the purpose and scope of this work. Exclusion criteria included the following: being a current smoker, participating in illicit drug use, alcohol consumption greater than the equivalent of 1-2 alcoholic drinks per week, and dietary supplement use or being on any special diet within 3 weeks of the study's commencement. For females, additional exclusion criteria were being pregnant or planning to become pregnant during the study. The cross-training box coaches confirmed that the participants met the declared inclusion criteria. They also helped to monitor training adherence. The drop-outs were predominantly independent of the study protocol (Fig. 2).
The reasons for dropouts were as follows: personal reasons, infections, minor injuries during customary training, and/or the inability to participate in the time frame of the planned protocol. The studies were conducted in the 2015 and 2016 off-seasons. All subjects declared that they had not introduced any changes in their lifestyles, elements of training, and/or customary nutrition.

The study protocol was reviewed and approved by the local ethical committee (Bioethics Committee at Poznan University of Medical Sciences, Poznan, Poland). Each subject was informed of the testing procedure, its purpose, and the risks of the study. Each participant submitted her/his written consent to participate. All procedures were conducted in accordance with the ethical standards of the 1975 Helsinki Declaration.
Study design and protocol. The primary outcomes in this paper were the repeatability and reliability of FGB performance and its relation to aerobic performance. The study protocol included three visits to the Exercise Tests Laboratory at the Department of Human Nutrition and Dietetics (DHND) at the Poznan University of Life Sciences and selected "Cross-training Boxes" in Poznan at baseline (T 1), and after 10 (T 2) and 20 (T 3) days, respectively (Fig. 2). Subjects were instructed not to participate in any high-intensity or long-duration training sessions for at least 24 h before testing. All measurements at the DHND were performed in the morning (7:30-10:00 AM) and in a fasting state (water intake was recommended; a standardized meal was eaten the previous night immediately before going to sleep (about 1.2 g of carbohydrates per kg of body mass and 40 g of protein)). First, subjects underwent body composition analysis. Afterward, an incremental cycling test until volitional exhaustion was performed. During all of these measurements, the ambient temperature remained at 20-22 °C. In the afternoon of the next day and three hours after a standardized small meal (about 0.6 g of carbohydrates per kg of body mass and 15 g of protein), the discipline-specific cross-training test was performed. Enrolled participants were familiar with the tests and procedures used, as they had participated in some previous research projects.
Anthropometry and body composition. Body mass (kg) and height (cm) were measured using a professional medical scale with a stadiometer (WPT 60/150 OW, RADWAG, Radom, Poland) at an accuracy of 0.1 cm and 0.1 kg for height and body mass, respectively. FM and FFM were assessed by air displacement plethysmography using the Bod Pod (Cosmed, Rome, Italy) as described previously 12,13. Total body water and hydration level, as well as an additional evaluation of FM and FFM, were assessed by bioelectric impedance with Bodystat 1500 (Bodystat Inc, Douglas, UK) based on the previously mentioned recommended procedures 28.
Exercise tests. The study protocol consisted of the incremental cycling test (ICT) and the FGB workout, each performed 3 times (T 1, T 2, and T 3). A recovery break of at least 30 h was implemented between the ICT and FGB tests. Prior to each test (ICT and FGB), participants were given instructions on the procedure, and they completed a brief warm-up period (a 5-min effort on a cycloergometer (Kettler-X1, Kettler, Ense-Parsit, Germany) of approximately 50 W power and ~ 70 rpm cadence, followed by 5 min of light stretching and a 5-min break). All tests were performed in proper workout clothing and shoes, and the tests were supervised by an experienced researcher. Heart rate was continuously monitored during exercise using a telemetric system (Polar, Kempele, Finland). Furthermore, capillary blood samples were obtained for analysis before and after each test. During exercise, all test participants were verbally encouraged to maximize their efforts.

Aerobic fitness test. An exercise test on the Kettler X1 cycloergometer (Kettler, Ense-Parsit, Germany) was performed to determine peak oxygen uptake (VO 2peak) and gas exchange threshold (GET). We considered the VO 2peak to be the moment when the individual oxygen uptake (VO 2) recorded during the ICT reached the highest point 29. To determine the GET during the ICT, the V-slope method was applied based on an analysis of the linear regression for the curve of increasing CO 2 exhalation in comparison to the curve of increasing O 2 uptake [30][31][32]. The initial load was set at 50 W for women and 75 W for men and increased every 1.5 min by 25 W until volitional exhaustion. Respiratory parameters and heart rate (HR) were measured (breath by breath) by the Quark CPET ergospirometer (Cosmed, Rome, Italy). Measured variables included time to exhaustion (T exh), maximal workload (W max), maximum heart rate (HR max), time to GET (T GET), workload at GET (W GET), heart rate at GET (HR GET), oxygen uptake at GET (VO 2GET), VO 2peak, peak carbon dioxide production (VCO 2peak), and energy expenditure (EE).
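The idea behind the V-slope analysis can be illustrated with a simplified sketch: CO 2 output is plotted against O 2 uptake, two regression lines are fitted, and the threshold is taken where the slope steepens. The code below is not the procedure applied in this study; it is a brute-force two-segment least-squares fit on synthetic data, and the function names, the `min_points` parameter, and the sample values are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sum of squared errors)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def v_slope_breakpoint(vo2, vco2, min_points=3):
    """Estimate the VO2 at which the VCO2-vs-VO2 slope steepens.

    Every admissible split of the data into a lower and an upper segment is
    tried; the split with the smallest total squared error wins, and the
    intersection of the two fitted lines is returned.
    """
    pairs = sorted(zip(vo2, vco2))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    best = None
    for i in range(min_points, len(xs) - min_points + 1):
        s1, b1, e1 = fit_line(xs[:i], ys[:i])
        s2, b2, e2 = fit_line(xs[i:], ys[i:])
        if s2 <= s1:
            continue  # the upper segment is expected to be steeper
        total = e1 + e2
        if best is None or total < best[0]:
            best = (total, (b1 - b2) / (s2 - s1))
    if best is None:
        raise ValueError("no breakpoint with a steeper upper segment found")
    return best[1]


# Synthetic data (not study data), in L/min: the slope changes from 0.9 to 1.3 at VO2 = 2.0.
vo2_demo = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0]
vco2_demo = [0.9 * x if x <= 2.0 else 1.3 * x - 0.8 for x in vo2_demo]
get_vo2 = v_slope_breakpoint(vo2_demo, vco2_demo)
```

Real breath-by-breath data are noisy, so a practical analysis would typically smooth or average the signal before fitting; that step is omitted here.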

Fight Gone Bad. Fight Gone Bad comprised three rounds of five exercises: wall ball, sumo deadlift high pull, box jump, push press, and rowing 13,33,34. Participants were instructed to complete as many repetitions as possible in one minute at each station prior to moving to the next station. After completing each of the five stations, participants had one minute of rest (Rest 1 between the 1st and 2nd and Rest 2 between the 2nd and 3rd rounds) before beginning the next round 13,34. Wall balls combined a front squat with a medicine ball (6 kg for females, 9 kg for males) and a push press-like throw of the ball to a target located at 2.75 m for females and 3.0 m for males. At the bottom of the squat, the hips should be lower than the knees. In the sumo deadlift high pull, the feet were wider than the hips, and the grip was inside the knees. The exercise started with lifting the bar (25 kg for females, 35 kg for males) from the ground as in a classical deadlift, but then the bar was pulled to the chest. At the end, the elbows should be higher than the shoulders. The box jump started with both feet on the ground. Athletes jumped onto a box that was 50 cm tall for females and 65 cm for males, landing on both feet. The exercise ended when shoulders, hips, and knees were extended in one line. The push press started with lifting the bar (25 kg for females and 35 kg for males) from the ground to the front rack. Then the bar was pushed overhead using leg power. After the shoulders were straight, the bar was dropped back to the shoulders. Rowing was performed on an ergometer. Feet were secured to the foot plates with special straps. The handle was pulled towards the chest, using the push from the knees. The test was video recorded in order to allow an accurate count of all properly done repetitions. For each repetition to be valid, a participant needed to complete the full range of motion required for the specific exercise.
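The structure of the workout described above can be summarized in a short sketch that checks the total duration and tallies round and total scores. The scoring rule is an assumption: the text does not spell out how FGB R-1, FGB R-2, FGB R-3, and FGB TOTAL are computed or how rowing is converted to points, so the sketch simply sums the valid repetitions (one point per counted unit) per round. All names and repetition counts are hypothetical.

```python
STATIONS = (
    "wall ball",
    "sumo deadlift high pull",
    "box jump",
    "push press",
    "rowing",
)
ROUNDS = 3
STATION_MINUTES = 1  # one minute of work at each station
REST_MINUTES = 1     # one minute of rest between rounds


def total_duration_minutes():
    """Work time for all stations plus the rests between rounds."""
    return ROUNDS * len(STATIONS) * STATION_MINUTES + (ROUNDS - 1) * REST_MINUTES


def fgb_scores(reps_by_round):
    """Return ([round scores], total score) from per-station valid repetitions.

    reps_by_round: one dict per round, mapping station name -> valid repetitions.
    Assumed scoring rule: each valid repetition (or counted rowing unit) is one point.
    """
    if len(reps_by_round) != ROUNDS:
        raise ValueError("expected exactly three rounds")
    round_scores = [sum(r[s] for s in STATIONS) for r in reps_by_round]
    return round_scores, sum(round_scores)


# Hypothetical repetition counts (not study data).
demo = [
    dict(zip(STATIONS, (25, 20, 18, 17, 12))),
    dict(zip(STATIONS, (22, 18, 16, 15, 11))),
    dict(zip(STATIONS, (20, 17, 15, 14, 10))),
]
round_scores, fgb_total = fgb_scores(demo)
```

The computed duration of 17 min matches the total workout length quoted in the Discussion.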
Blood samples analysis. Blood was collected by qualified medical personnel in accordance with applicable procedures. Before (ICT PRE and FGB PRE) and 3 min after exercise tests (ICT POST and FGB POST), capillary blood was collected from a fingertip of the nondominant hand using a disposable lancet-spike Medlance Red (HTL-STREFA, Łódź, Poland) with a 1.5 mm blade and 2.0 mm penetration depth as described previously 13

Statistical analysis. Normal distribution was examined using the Shapiro-Wilk test. Differences between T 1, T 2, and T 3 were analyzed using repeated-measures ANOVA. Relative reliability was assessed using the intraclass correlation coefficient (ICC) between T 1, T 2, and T 3. The ICC gives the ratio of variances due to differences between subjects. ICC < 0.40 was considered low, between 0.40 and 0.70 acceptable, between 0.70 and 0.90 good, and > 0.90 excellent. However, ICC does not give an indication of the accuracy of individual measurements. Absolute reliability was calculated as the standard error of measurement (SEM), which quantifies the precision of the individual measurements. The minimal detectable change (MDC), i.e., the minimal amount of change that a measurement must show to be greater than the within-subject variability and measurement error, also referred to as the sensitivity to change, was also calculated. Associations between the FGB score and aerobic capacity were measured using the Pearson correlation coefficient. The following criteria were adopted for the interpretation of the magnitude of the correlation: trivial (r < 0.1), small (0.1 ≤ r < 0.3), moderate (0.3 ≤ r < 0.5), large (0.5 ≤ r < 0.7), very large (0.7 ≤ r < 0.9), nearly perfect (0.9 ≤ r < 1), and perfect (r = 1). The agreement of two methods was evaluated using the Bland-Altman method after data normalization 36. Normalization was done by subtracting the mean and dividing by the standard deviation.
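The normalization and agreement analysis described at the end of this paragraph can be sketched as follows. Each variable is converted to z-scores (mean subtracted, divided by the standard deviation), and the Bland-Altman bias and limits of agreement are computed from the paired differences. The use of the sample standard deviation and of the conventional bias ± 1.96 SD limits are assumptions, as are the function names and the sample values.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-scores: subtract the mean and divide by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def bland_altman(a, b):
    """Bias and limits of agreement for two standardized measurement series.

    Returns (bias, lower limit, upper limit), with limits taken as
    bias +/- 1.96 times the SD of the paired differences.
    """
    za = standardize(a)
    zb = standardize(b)
    diffs = [x - y for x, y in zip(za, zb)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired values (not study data): FGB total score and time to exhaustion (s).
fgb_total_demo = [250, 270, 300, 310, 330, 360]
t_exh_demo = [600, 640, 690, 700, 780, 800]
bias, lower, upper = bland_altman(fgb_total_demo, t_exh_demo)
```

Because both series are standardized to a mean of zero, the bias is zero by construction, so the informative quantity in such an analysis is the spread of the differences rather than the bias itself.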
Ethics approval. All procedures performed were in accordance with the ethical standards of the institutional and national research committee (Bioethics Committee at Poznan University of Medical Sciences, Poznan, Poland (Decision no. 173/15 of 5 February 2015)) and with the 1975 Helsinki declaration and its later amendments or comparable ethical standards.
Consent to participate. All participants signed an informed consent.

Practical applications
This work proposes the first evaluation of the reliability and validation of a specific test to measure HIFT performance. Our study indicated that FGB is a reliable test that can be used to measure changes in cross-training performance caused by an intervention. We also showed that cross-training performance is correlated with aerobic fitness, which gives more insight into the physiology of the test. It shows that aerobic fitness, even though underestimated by most cross-training athletes, can be an important contributor to success. Our findings could serve as guidance for scientists, as well as for coaches and athletes who consider pursuing their scientific and/or training goals based on the cross-training specific Fight Gone Bad workout.

Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author on request.