Linking structural and compositional changes in archaeological human bone collagen: an FTIR-ATR approach

Collagen is the main structural and most abundant protein in the human body, and it is routinely extracted and analysed in scientific archaeology. Its degree of preservation is, therefore, crucial and several approaches are used to determine it. Spectroscopic techniques provide a cost-effective, non-destructive method to investigate the molecular structure, especially when combined with multivariate statistics (chemometric approach). In this study, we used FTIR-ATR spectroscopy to characterise collagen extracted from skeletons recovered from necropoleis in NW Spain spanning from the Bronze Age to eighteenth century AD. Principal components analysis was performed on a selection of bands and structural equation models (SEM) were developed to relate the collagen quality indicators to collagen structural change. Four principal components represented: (i) Cp1, transformations of the backbone protein with a residual increase in proteoglycans; (ii) Cp2, protein transformations not accompanied by changes in proteoglycans abundance; (iii) Cp3, variations in aliphatic side chains and (iv) Cp4, absorption of the OH of carbohydrates and amide. Highly explanatory SEM models were obtained for the traditional collagen quality indicators (collagen yield, C, N, C:N), but no relationship was found between quality and δ13C and δ15N ratios. The observed decrease in C and N content and increase in C:N ratios is controlled by the degradation of protein backbone components and the relative preservation of carbon-rich compounds, proteoglycans and, to a lesser extent, aliphatic moieties. Our results suggest that FTIR-ATR is an ideal technique for collagen characterization/pre-screening for palaeodiet, mobility and radiocarbon research.

In contrast to models of the in vivo molecule, degradation models for archaeological collagen are complex because they must account for changes that occurred post-mortem. Some authors have used collagen quality 18,19 or modelled linear structure 20 as an indicator of bone degradation with time but, in contrast to the medical sciences, less attention has been paid to structural changes. The mechanisms and processes that influence the degradation of collagen extracted from archaeological bone samples are still poorly understood. To redress this, the changes at the structural level must be unravelled to achieve a good understanding of archaeological collagen preservation.
Despite its potential, few studies have used spectroscopic techniques to determine collagen preservation in archaeological bone [21][22][23][24][25] . Fourier Transform Infrared (FTIR) spectroscopy has been regarded as a suitable method to explore the structure of collagen [26][27][28][29][30][31][32] , by relating FTIR absorption bands (of the amide I, II and III) to specific chemical bonds and secondary structural features (α-helix, β-sheets, β-turns and random coils), even in the most recent investigations 32 . But, as early as the mid-twentieth century, there was a fundamental change in the comprehension of the collagen structure led by X-ray diffraction investigations, which showed that the traditional model was incorrect and the polyproline II (PPII) model was introduced and backed by later investigations, becoming the accepted model 4,[33][34][35][36] . Although FTIR does not provide the same level of detail of the molecular composition compared with X-ray, Nuclear Magnetic Resonance (NMR) or Pyrolysis GC-MS, it can provide nonetheless valuable insights about the structure of complex molecules such as proteins 37 .
FTIR has many advantages when compared with the conventional methods commonly used to study collagen. It is a quick, cost-effective and non-invasive method 21,26 . Most studies that have used FTIR on ancient skeletons focus on the characterisation of the bone mineral component among others [38][39][40][41] or on taphonomic processes such as cremation [42][43][44] . The collagenous portion of bone has been analysed with relatively less frequency using FTIR 15,22,[45][46][47] , and Raman spectroscopy 21,[48][49][50] . Studies of bulk bone have also demonstrated that it is difficult to detect collagen content in poorly preserved bones 47 , whereas extracted archaeological collagen has only been directly analysed in few studies 24,25,48 . Therefore, previous research focused upon establishing criteria or parameters for collagen preservation screening while the changes in the structure of the molecule have received much less attention.
The objective of our study is to characterise collagen extracted from archaeological human bone of different ages, funerary contexts and burial environments, using FTIR-ATR in the mid infrared region (4000-400 cm −1 ). By using a combination of principal components analysis (PCA) and partial least squares-structural equation modelling (PLS-SEM), we discuss (i) the possible mechanisms of archaeological bone collagen structural transformation, (ii) the potential of FTIR-ATR to predict collagen quality indicators (i.e. C, N, C:N, collagen yield) and (iii) whether collagen quality affects its isotopic (δ 13 C and δ 15 N) composition, which is key for the study of human palaeodiet and radiocarbon dating.

Results
Collagen properties. Of the fifty samples analysed, collagen yield ranged between 25% (similar to intact bone) and 2% (above the proposed limit of 1% 17 ), whereas the C:N ratio was between 3.18 and 3.57. Carbon and nitrogen contents showed a larger range (C: 17.9-44.3%; N: 6.1-16.1%). None of the samples analysed in this study exceeded the C and N values of fresh collagen (43% and 16%, respectively 17,19 ) by more than 3%. Eight samples had C contents, and twelve had N contents, below 80% of those of fresh collagen, and two samples (424 and 705) were below 50%. Only one sample (424) showed a C:N ratio (3.57) slightly above the range (3.02-3.56) proposed as representative for well-preserved collagen 17 .
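As an illustration of how these screening criteria can be applied, the following minimal sketch computes a C:N ratio from weight-percent C and N and compares a sample with the fresh-collagen values quoted above (43% C, 16% N) and the 3.02-3.56 range. It assumes that the C:N ratio is an atomic ratio, which is the usual convention but is not stated explicitly in the text; the function names and the two example samples are hypothetical and are not data from this study.

```python
# Atomic masses (g/mol) used to convert a weight ratio into an atomic ratio.
# ASSUMPTION: the C:N ratios in the text are atomic ratios.
ATOMIC_MASS_C = 12.011
ATOMIC_MASS_N = 14.007

# Fresh-collagen reference values quoted in the text (wt%).
FRESH_C = 43.0
FRESH_N = 16.0


def atomic_cn_ratio(c_wt_pct, n_wt_pct):
    """Atomic C:N ratio from weight-percent carbon and nitrogen."""
    return (c_wt_pct / ATOMIC_MASS_C) / (n_wt_pct / ATOMIC_MASS_N)


def screen_sample(c_wt_pct, n_wt_pct, cn_range=(3.02, 3.56)):
    """Summarise one sample against the criteria discussed in the text."""
    cn = atomic_cn_ratio(c_wt_pct, n_wt_pct)
    return {
        "cn": cn,
        # C and N expressed as a fraction of the fresh-collagen values.
        "c_fraction": c_wt_pct / FRESH_C,
        "n_fraction": n_wt_pct / FRESH_N,
        "cn_in_range": cn_range[0] <= cn <= cn_range[1],
    }


# Hypothetical samples (not taken from the study).
well_preserved = screen_sample(41.0, 15.0)
degraded = screen_sample(20.0, 6.5)
print(round(well_preserved["cn"], 2), well_preserved["cn_in_range"])
print(round(degraded["cn"], 2), degraded["cn_in_range"])
```

In this sketch the second hypothetical sample falls below 50% of the fresh-collagen C and N contents and its C:N ratio lies above the upper limit, which is the pattern described for the most degraded samples.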
A wide distribution of isotopic results has been found in this study, especially for δ 13 C, which is interpreted as the result of palaeodietary preferences. For example, the observed differences in δ 13 C can be related to geographical location, whether coastal or inland, and δ 13 C was found to be influenced by the consumption of marine resources. A preference for C 4 plants in human and domestic animal diets and, on the coast, a strong reliance on seafood and fish occur in North-Western Spain 51 . Historical and archaeological data agree with the obtained isotopic signatures and were discussed in detail for the analysed populations 52 .
For the samples used in this study, collagen yield shows significant, although low, correlations only with C, N and the C:N ratio (r 0.44, 0.49 and − 0.48, respectively; P < 0.01). Carbon and nitrogen contents are highly correlated (r 0.99; P < 0.01) with each other and are negatively correlated with the C:N ratio (− 0.76 and − 0.79, respectively; P < 0.01). Collagen compositional properties are not significantly correlated with the isotope ratios. The two isotope ratios are, however, moderately correlated with each other (r 0.55; P < 0.01), which reflects the input of marine resources in some of the samples (see 51 ).
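A short sketch may help to show why a correlation as low as r 0.44 can still be significant at P < 0.01. It computes Pearson's r and the associated t statistic, t = r·sqrt((n − 2)/(1 − r²)). The text does not state which significance test was used, so the t-test and the use of n = 50 (all samples) are assumptions; the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# ASSUMPTION: n = 50 samples, two-tailed t-test.
# With 48 degrees of freedom the two-tailed critical value at P = 0.01
# is approximately 2.68 (from standard t tables).
print(round(t_statistic(0.44, 50), 2))
```

Under these assumptions the t statistic for r 0.44 is well above the critical value, so the correlation is significant even though it explains less than 20% of the variance.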
Collagen FTIR-ATR spectra. The average spectrum of the samples shows the characteristic band distribution of collagen, with high absorbance in the regions 1500-1700 cm −1 and 2800-3500 cm −1 , moderate absorbance at 1300-1500 cm −1 and relatively low average absorbance at 800-1200 cm −1 (Fig. 1a). The standard deviation spectrum is similar to the average one but shows a relatively large variation between samples in the region 800-1200 cm −1 , despite its low average absorbance (Fig. 1a); whereas the 2800-3500 cm −1 region only presents a peak around 3300 cm −1 .
The most relevant peaks obtained from the second derivative spectra, in the region 800-1800 cm −1 , are shown in Fig. 1b.
Main spectroscopic signals of collagen. We selected 24 bands, which are representative of the different spectral regions of the type I collagen spectrum (carbohydrates, amide III, miscellaneous (mainly aliphatics) region, amide II, amide I, aliphatics, amide B, amide A/OH; for a definition of these regions see for example 27,31 ), to perform the PCA. Four principal components accounted for 95.5% of the variance (Table 1). The first component, Cp1, explains 45.5% of the total variance and it is characterised by large positive loadings (0.73-0.94) of absorptions of carbohydrates (i.e. collagen proteoglycans) and large negative loadings (− 0.86 to − 0.72) of absorptions of the amides (I, II and III) and the miscellaneous region (Table 1). Of the collagen absorption bands, Cp2 accounts for a large percentage of the variance of the amide I band at 1690 cm −1 (76%) and of the amide III band at 1200 cm −1 (86%), and for a moderate percentage of the variance of the amide I band at 1624 cm −1 (45%). It also contains a low percentage (24%) of the variance of the 1655 cm −1 absorption.
Components Cp3 and Cp4 account for a minor part of the total variance, 6.7 and 5.2%, respectively (Table 1). Absorptions related to aliphatics (2874, 2930 and 2982 cm −1 ) have the largest (albeit moderate to low) loadings in Cp3, while absorptions of the amide A/OH region (3320 and 3458 cm −1 ) and one of the carbohydrate bands (1030 cm −1 ) show moderate and opposite (negative and positive, respectively) loadings in Cp4 (Table 1).
Cp1 is highly correlated (P < 0.01) to the PGI, C and N content, and the C:N ratio (Table 2). Collagen yield is significantly correlated to Cp1 and Cp3, and the CI is negatively correlated with Cp3, although the correlation coefficients are low.
Modelling collagen quality and isotopic composition. The PCA results suggest that the spectroscopic nature of extracted bone collagen can provide insights into the main transformations of its composition and structure, which may be related to collagen preservation. To test this, we applied PLS-SEM modelling to determine (i) whether transformations of the collagen structure are coupled to changes in collagen quality (i.e. C, N, C:N, collagen yield), and (ii) whether changes in collagen quality affect the isotopic (δ 13 C and δ 15 N) composition. The model was initially designed with four predictor LV (amides, backbone lipids, side-chain lipids, and carbohydrates; SI_Figure 3), one primary response LV (collagen quality) and a secondary response LV (collagen isotopic composition, depending exclusively on collagen quality). As indicators, we used representative absorption bands for the predictor LV, analysed properties and indices (C, N, C:N, collagen yield, CI and PGI) and isotopic ratios (δ 13 C, δ 15 N). Although the model predicted 92% of the collagen quality variance (SI_Figure 3), the lipids LV failed to pass the collinearity tests as it shared 88% of its variance with the amide LV and its total effect coefficient on collagen quality was very low (− 0.04). As a result, for the final model we merged this LV with the amides into one LV, named "structural components". Of the 24 absorption bands used in the PCA, 17 met the criteria for good indicators (absolute value of the loading > 0.7, Table 3) and were kept in the model.
It is worth remembering that, in PLS-SEM reflective mode, the square of the outer loading gives the proportion of the indicator's variance that is captured by the LV. The loadings of the FTIR absorbances, with only one exception (1200 cm −1 , in the structural components LV), show that almost all their variance is captured by the modelled LV. Carbon, N, C:N and PGI also meet the criteria of good indicators of collagen quality, but collagen yield has a moderate loading and the CI a very low one (Table 3). While the PGI co-varies strongly with the common collagen quality parameters and may be a valid indicator, collagen yield and the CI are not. For this specific model, collagen quality is thus related to the former. Collagen yield has some dependence on operator processing (inaccuracy in pipetting, filtering, etc.).
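The relationship between an outer loading and the variance it captures can be made concrete with a small sketch. It squares a loading to obtain the captured proportion of variance and applies the > 0.7 absolute-loading criterion mentioned above. The function names and the example loadings are hypothetical and are not the values reported in Table 3.

```python
def variance_captured(loading):
    """Proportion of an indicator's variance captured by its LV (reflective mode)."""
    return loading ** 2


def is_good_indicator(loading, threshold=0.7):
    """Criterion used in the text: absolute value of the loading above 0.7."""
    return abs(loading) > threshold


# Hypothetical loadings, for illustration only.
for name, loading in [("band A", 0.95), ("band B", -0.85), ("index C", 0.40)]:
    share = 100.0 * variance_captured(loading)
    print(f"{name}: loading {loading:+.2f}, "
          f"{share:.0f}% of variance, kept: {is_good_indicator(loading)}")
```

A loading just at the 0.7 threshold corresponds to about half (49%) of the indicator's variance, which is why indicators below it are usually dropped.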
The total effect coefficients (Fig. 2) show that the structural components have the strongest, positive effect (0.79) on collagen quality, while carbohydrates and side-chain lipids have negative total effects (− 0.43 and − 0.22 respectively). The weight of the structural components on collagen quality is almost two and four times higher, respectively, than the weights of the other two LVs. This simple PLS-SEM model explains 92% of the variation in collagen quality (Fig. 2), involving as much as 92-94% of the C and N, 85% of the PGI and 70% of the C:N variance. Figure 3 shows the relationship between observed and expected values for the collagen quality indicators obtained with the PLS-SEM model. Total C and N contents and the PGI are accurately estimated, C:N ratios also show a good albeit lower performance, estimation of collagen yield is moderate and that of the CI is not significant.
Additionally, at this level, collagen quality seems to have no significant effect on the isotopic composition: its total effect coefficient on the isotopic composition is low and the explained variance is almost negligible (4%).

Discussion
The results of the PCA are in agreement with previous investigations that use FTIR spectra to provide additional insights on protein, in particular collagen, composition and structure 29,31,32,37,[53][54][55] . Different collagen types can be identified/discriminated efficiently using absorbances from selected regions of the spectrum 27 .
In the samples analysed here, Cp1 and Cp2 seem to reflect a loss of protein backbone components. As most of the variation of the characteristic absorptions of the aliphatic bonds (at 1337, 1450, 2874, 2934, and 2982 cm −1 ) is also contained in Cp1 and Cp2, and only a smaller proportion is captured by Cp3 (Table 2), it is likely that vibrations in the first two components are related to the methylene present in the backbone peptide structure whereas Cp3 may correspond to the aliphatic side chains. Cp4 seems to discriminate between the OH absorption of carbohydrates and that of the amide A. Figure 4 represents a projection of samples' scores for Cp1 and Cp2. Most samples (28 out of 50) show negative Cp1 scores and positive or slightly negative Cp2 scores. These may represent collagen with a more intact, PPII-like, molecular structure. Twelve samples show positive Cp1 scores and positive or slightly negative Cp2 scores, suggesting some degree of collagen transformation not affecting the main protein backbone structures. Samples with positive Cp1 and negative Cp2 scores may correspond to those with more intense structural modifications. Collagen quality parameters (C, N and C:N) with the most pronounced departure from those of fresh collagen occur in the two samples with the largest Cp1 values (424 from Ouvigo and 705 from Capela do Pilar; Fig. 4). No evidence of soil contamination (i.e. humic acids) was detected. Our results are consistent with findings in a previous molecular study which used pyrolysis-GC-MS on 28 of the samples analysed here 16 . Although a detailed comparison with the molecular data cannot be done, there is an overall agreement in the classification of collagen as well or poorly preserved (20 samples out of 28).
The PLS-SEM model (Fig. 2) suggests that the more intact the collagen backbone structure (reflected by LVst), the higher collagen quality (higher C and N contents and, to some extent, collagen yield), while lower quality (higher C:N ratios and PGI values) is characterised by the relative abundance of carbohydrates (LVcb) and, to a limited extent, lipidic side chains (LVsc). Collagen transformation results in an overall decrease in C and N, and   56 , is negatively correlated to collagen quality (LVcq; r − 0.77, P < 0.01) and positively correlated to carbohydrates (LVcb) and side-chain lipids (LVsc) (r 0.67 and 0.79, P < 0.01, respectively), also consistent with the PCA results. It has been proposed that the loss of spectral intensity of collagen backbone structures is most likely related to the fragmentation of the molecule due to bacterial preference for the relatively high-energy amide bonds 21 . Altogether, this reinforces the idea that the main collagen transformation in the samples analysed here is controlled by the degradation of the amide backbone structure. However, it is not possible to assess whether bacterial degradation occurred during body putrefaction or later soil contact. Raman analysis of collagen has shown that decreasing yield is accompanied by disappearance of amide peaks but not necessarily of aliphatic (C-H) components, since poorly preserved collagen samples produced spectra with well-defined aliphatic peaks 21 . Another study found that changes in amino acid composition alone could not account for the elevated C:N ratios in low collagen bone from experimentally aged human bones 18 . Moreover, low-collagen samples are more likely to show elevated ratios than contaminated samples 17 . Our results are in line with these observations since the less intact collagen samples are enriched in C-rich compounds (carbohydrates from proteoglycans and side chain lipids) and thus the C:N is expected to increase as degradation progresses. 
Although the presence of small amounts of non-carbon and non-nitrogen rich contaminants, as detected in other studies 57 , cannot be dismissed, their quantity was not deemed large enough to produce a detectable signal in the spectra.
Another interesting feature is that the best-preserved samples characterised by negative Cp1 scores (Fig. 4) show a high correlation (r 0.91; P < 0.01) between the CI and the PGI (Fig. 5): the relative abundance of aliphatics and carbohydrates to the amide component tends to remain constant. In our opinion, this result has potential for the assessment of collagen transformation and integrity using FTIR-ATR; the larger the departure from the trend the more degraded the collagen structure.
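One way to quantify the "departure from the trend" is sketched below: an ordinary least-squares line is fitted to the CI and PGI values of a reference set of well-preserved samples, and the residual of any other sample from that line is taken as its departure. This is only a sketch under stated assumptions: the choice of CI as the x variable and PGI as the y variable, the use of a vertical residual, the function names and all numerical values are hypothetical and are not taken from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def departure(ci, pgi, slope, intercept):
    """Vertical residual: observed PGI minus the PGI expected from the trend."""
    return pgi - (slope * ci + intercept)


# Hypothetical reference samples (negative Cp1 scores in the text's terms).
reference_ci = [0.10, 0.12, 0.14, 0.16]
reference_pgi = [0.20, 0.24, 0.28, 0.32]
slope, intercept = fit_line(reference_ci, reference_pgi)

# A hypothetical sample with more carbohydrate signal than the trend predicts.
print(round(departure(0.13, 0.40, slope, intercept), 3))
```

A threshold on the residual would be needed before such a measure could be used for screening; the text does not propose one, so none is given here.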
The model also suggests that collagen quality (i.e. C, N, C:N and collagen yield) has no significant effect on the isotopic composition of the collagen. This is also consistent with the PCA and correlation results obtained here and in previous investigations, since no correlation was found between molecular indicators of collagen diagenesis and isotopic composition 16 . Other research also found that the isotopic values (δ 13 C and δ 15 N) and C:N ratios of the insoluble fraction remained almost stable until collagen yield represented less than 1% 18 .
We performed ANOVA tests on the LV scores of the PLS-SEM model, using the necropoleis, archaeological period (Bronze Age to Modern period), burial environment (acidic or alkaline), sex (male or female), type of bone and age-at-death (< 19, 20-39, 40-59, > 60 estimated years old) as grouping variables. No significant differences were found for any of the LV scores (structural components, carbohydrates, side-chain lipids, and  In the latter case, the good macroscopic preservation of the skeletons does not agree with that suggested by the degree of integrity of the collagen structure. As for the burial context, the alkaline environments (the cave on limestone and the palaeodunes with biogenic carbonates) showed better collagen preservation than the acidic ones as found in previous research e.g. 58,59 . Although not significant at P < 0.05, structural components and carbohydrates were higher and lower (P < 0.10) respectively in the alkaline environments. Thus, alkaline conditions seem to be the main reason for the good quality of the collagen of samples from Cova do Santo (limestone cave) and those of Rúa Real and A Lanzada (burials on palaeodunes). This is perhaps surprising given the sensitivity of collagen to hydrolysis under alkaline conditions 20 . The reasons for this apparent disagreement may be explained by (i) relatively low alkalinity in the burial contexts (pH < 9), the rate of collagen hydrolysis largely increasing above pH 11 20 ; (ii) well-drained/aerated conditions predominate; (iii) low decomposition of collagen matrix preventing post-mortem alteration in bone mineral crystal 60,61 ; and (iv) the dissolution of the bone mineral phase is retarded, limiting collagen exposition to enzymatic attack.
Recent research at A Lanzada concluded that the intensity of bone diagenesis was greater in burials in acidic soils than in those on palaeodunes, regardless of the period (Roman or post-Roman) 59 . The confined environment of Cova do Santo cave could have had a larger effect than the high pH, as observed in research on catacombs 62 . However, the particular mineral content of groundwater in this cave could also have promoted collagen preservation 63 . In our previous study of collagen molecular composition 16 , we identified a depolymerization process that differed depending on burial environment: acidic environments (soils/sediments) showed a higher degree of depolymerization than alkaline ones (sand dunes and limestone cave). Acidic conditions, which have been found to be the main cause of bioapatite alteration 41,59 and to promote collagen dissolution 64 , also seem to be important for the preservation of the protein structure, regardless of chronological age. The oldest bones were the ones with the best preservation in our study. Finally, pH has been considered as part of "the site hydrology" (which also includes the mineral content of groundwater), a much more general factor that controls bone preservation 65 . In our study, well-drained sites (e.g. palaeodunes, such as the ones from Calle Real and A Lanzada) and places with constrained water movement (caves, such as Cova do Santo) provided the best conditions for preservation. In both areas, groundwater is probably oversaturated with respect to calcium phosphate, which would explain the good preservation of the mineral and organic phases of the bone. The humidity of the soil can also promote bone degradation through microbial and fungal attack since alteration by microorganisms seems to dominate in temperate regions 63: p.114 . Humid conditions in NW Spain favour fungi in those soils that are neither well-drained nor anoxic.
In addition, bones from Cova do Santo were exposed (not buried), which may have resulted in different postmortem changes 61,63 .
Despite these reservations, we conclude that there is no single factor to explain the changes in collagen structure. All necropoleis presented relatively large variations in their samples' collagen structural components (Fig. 3); i.e. we found a range of preservation within populations rather than between populations of well/poorly preserved collagen. This may indicate that, within any given geochemical environment, conditions occurring at microscale may determine the intensity of degradation of collagen, an idea that has been suggested for the alteration of the mineral part of the bone 59,66-68 . Microorganism attack on bone is also a complex process 69 with alterations caused by bacteria and fungi occurring on different scales and dependent on the perimortem and postmortem characteristics of the specific inhumation, which are difficult to fully appreciate in the current study. Raman studies also found spectral heterogeneity on cross-sectioned bone surfaces, which was interpreted to indicate heterogeneous preservation of the collagen within a single bone 21 .

Conclusions
Chemical transformation of archaeological human skeletons is a topic approached from different perspectives. Despite this intense work, some authors have remarked that improving the evaluation of collagen preservation is key to understanding the interaction between bone and burial environment 65 . As far as we know, ours is the first study to use FTIR-ATR to analyse collagen extracted from human archaeological bone, instead of bulk bone. Our findings indicate that there is a continuous change in C, N, and C:N ratios that is coupled to the integrity of the collagen structure: C and N decrease and the C:N ratio increases as the protein structures are degraded and carbohydrates (and aliphatic side chains) are preserved, resulting in a relative increase in C-rich compounds. This transformation may explain why collagen samples discarded in isotopic studies tend to have high C:N values. However, the observed structural and compositional changes did not significantly affect the δ 13 C and δ 15 N values, thus supporting their use for palaeodiet reconstruction and radiocarbon dating. Additionally, we found that the carbohydrates/amide I index (PGI) is a potentially reliable indicator of the compositional change of the collagen; the combination of the PGI with the CI may be of use to identify well-structured (i.e. preserved) collagen using FTIR-ATR. Thus, FTIR-ATR is an ideal technique for characterizing/pre-screening extracted collagen that is to be used for other destructive, more time-consuming and expensive techniques in palaeodiet, mobility and radiocarbon research. For a full understanding of the link between structural and compositional changes in collagen, more research should be done, for example by including samples not fulfilling all the "good-quality" criteria.
There is a risk of biasing the results by analysing only those samples fulfilling the criteria 48,70 , as these are the ones expected to show fewer transformations of the molecular structure.

Materials and methods
Sample selection, collagen extraction and collagen properties. Collagen was obtained from fifty human skeletons recovered from eight necropoleis located in NW Iberia (SI_Figure 1, Table 1) 16,52 . These necropoleis represent different archaeological/cultural periods (Bronze Age to post-Medieval) but also cover different geochemical environments (ranging from acidic soils and palaeodunes with biogenic carbonates to a cave formed in limestone) and different types of funerary contexts with human remains (SI_Table 1). The analyzed samples were selected according to bone surface preservation and available skeletal pieces, mainly ribs and long bones. Pathological bones were avoided. The individuals were estimated to be adults (18-60 years old) of both sexes (23 males, 20 females and 7 undetermined). More archaeological, palaeodietary and osteological information about the necropoleis can be found elsewhere 51 . The climate of the area is temperate and moderately humid, providing good conditions for collagen preservation, with only slow losses expected to have taken place 17 . The collagen extraction procedure followed 11 , with modifications by 73 . Small pieces of cortical bone (100-200 mg) were cleaned by removing 1-2 mm of the outer surface and demineralized in HCl (0.5 M) at low temperature (4 ºC) over approximately a week, in order to limit protein alteration. Samples were then heated (48 h at 70 ºC) in a weak (pH 3) HCl solution in order to gelatinize the collagen. The resulting solution was filtered (Ezee-filter™) and freeze-dried. Recent FTIR research on collagen type I 32 has shown that heating between 20 and 80 ºC affects the relative intensity of some of the amide I and amide III vibrations. The intensity reduction/enhancement was lower than 5% for most of the bands and much of the change occurred between 40 and 50 ºC, stabilizing thereafter. Thus, the protocol we used to extract collagen was likely to produce a slight reduction in the absorbance of some bands.
Since all samples were treated equally, this reduction is not considered to have had a significant effect on the statistical associations and modeling.
Collagen properties (% C, % N, C:N ratio), often used to evaluate its degree of preservation, and stable isotope ratios (δ 13 C, δ 15 N) were determined (in duplicate) using a Europa 20-20 isotope ratio mass spectrometer coupled to a Sercon elemental analyzer, in the Department of Archaeology of the University of Reading (UK). Collagen yield was calculated as the wt% of collagen in archaeological bone. The results and discussion of these analyses have been described elsewhere 51,52,71,72 . All selected samples were considered to meet the criteria to be suitable for isotopic (δ 13 C and δ 15 N) study.
Scientific Reports | (2020) 10:17888 | https://doi.org/10.1038/s41598-020-74993-y www.nature.com/scientificreports/
Infrared measurements and peak selection, IR indices. FTIR spectra (4000-400 cm −1 ) were acquired at 4 cm −1 resolution by using a Gladi-ATR (Pike Technologies) spectrometer at the IR-Raman facility of the RIAIDT (Universidade de Santiago de Compostela, Spain). All spectra were background corrected and smoothed with the Savitzky-Golay filter. Both processes were carried out in Resolutions Pro FTIR Software (Agilent Technologies, USA) (a figure with all 50 spectra can be found in the supporting information, SI_Figure 2). For the sake of representation, and given that all spectra showed the same vibrational features, the average spectrum and the standard deviation spectrum were computed. In this way the average spectrum provides an overall figure for the whole set of samples analysed, while the standard deviation spectrum highlights which regions of the mid infrared spectrum showed the greatest variability between samples (that is, where most of the information on the differences between the samples is located). Additionally, two indices, the collagen index (CI) and carbohydrate/amide I index (PGI) (similar to the proteoglycan/amide I index, previously proposed as markers of cartilage degeneration 8,74-76 ), were calculated from the IR spectra to check their validity for determining collagen compositional change.
The second derivative of infrared spectra was used for a more detailed structural characterisation of the collagen 77,78 of all samples. This is a highly suitable method for peak identification as it enhances sharp bands, allowing the detection of peaks that are barely visible in the raw spectra 27,79,80 , as well as providing information on the structure of proteins 31 . Peak selection was done by locating minima in the second derivative as described in 81 .
When evaluating the position of the relevant peaks in the second derivative spectra, we allowed for a ± 4 cm −1 interval.
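The peak-selection step can be illustrated with a minimal sketch: compute a second derivative, locate its negative local minima, and accept a minimum as a given band if it lies within ± 4 cm −1 of the expected position. The sketch is not the procedure of 81 and does not reproduce the software used in this study: it uses a plain central finite difference instead of a Savitzky-Golay derivative, and a noise-free synthetic spectrum made of two Gaussian bands. All function names, band positions, heights and widths are hypothetical.

```python
import math


def gaussian_band(x, centre, height, width):
    """Absorbance of one synthetic Gaussian band at wavenumber x."""
    return height * math.exp(-((x - centre) ** 2) / (2.0 * width ** 2))


def second_derivative(absorbance, step):
    """Central finite-difference second derivative (end points are dropped)."""
    return [
        (absorbance[i - 1] - 2.0 * absorbance[i] + absorbance[i + 1]) / step ** 2
        for i in range(1, len(absorbance) - 1)
    ]


def negative_minima(wavenumbers, d2):
    """Wavenumbers where the second derivative has a negative local minimum.

    d2[i] belongs to wavenumbers[i + 1] because the end points were dropped.
    """
    found = []
    for i in range(1, len(d2) - 1):
        if d2[i] < 0 and d2[i] < d2[i - 1] and d2[i] < d2[i + 1]:
            found.append(wavenumbers[i + 1])
    return found


def match_band(found, expected, tolerance=4.0):
    """Closest detected minimum within +/- tolerance of an expected band, else None."""
    close = [w for w in found if abs(w - expected) <= tolerance]
    return min(close, key=lambda w: abs(w - expected)) if close else None


# Synthetic two-band spectrum on a 2 cm^-1 grid (illustrative values only).
wavenumbers = list(range(1600, 1722, 2))
spectrum = [
    gaussian_band(w, 1655, 1.0, 10.0) + gaussian_band(w, 1690, 0.4, 8.0)
    for w in wavenumbers
]
minima = negative_minima(wavenumbers, second_derivative(spectrum, 2.0))
print(match_band(minima, 1655), match_band(minima, 1690), match_band(minima, 1620))
```

On real, noisy spectra a smoothed derivative would be needed, since a finite difference amplifies noise; the tolerance window then absorbs the small shifts in band position between samples.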
Statistical methods. The amount of information contained in each IR spectrum is rather large, which makes it difficult to identify the spectral regions that play a decisive role in the differences between collagen samples. We applied principal components analysis (PCA) to 24 characteristic collagen vibrations detected with the second derivative to determine the main spectroscopic signatures and their variation for the set of samples analysed. PCA was carried out in correlation mode, with varimax rotation (i.e. maximizing the loadings of the variables), after all variables were standardized to avoid scaling effects 82 . Standardization used Z-scores, Z = (xi − avg)/std, where xi is the absorbance value at any wavenumber, avg the average absorbance of the spectrum and std the standard deviation of the spectrum.
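The standardization step can be sketched as follows, applying Z = (xi − avg)/std to the absorbances of a single spectrum. The text does not say whether the sample or the population standard deviation was used; the sketch assumes the sample standard deviation. The function name and the absorbance values are hypothetical.

```python
import statistics


def standardise_spectrum(absorbances):
    """Z-scores of one spectrum: (x_i - avg) / std over that spectrum.

    ASSUMPTION: std is the sample standard deviation (n - 1 denominator).
    """
    avg = statistics.mean(absorbances)
    std = statistics.stdev(absorbances)
    return [(x - avg) / std for x in absorbances]


# Hypothetical absorbances at a few selected bands of one spectrum.
z_scores = standardise_spectrum([0.12, 0.35, 0.80, 0.64, 0.22])
print([round(z, 2) for z in z_scores])
```

After this transformation every spectrum has zero mean and unit standard deviation, so differences in overall signal intensity between samples do not dominate the PCA.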
With the insights gained in the PCA we developed a PLS-SEM model. This technique was chosen because, in comparison with other multivariate fitting/predicting techniques, it reduces the dimension of predicting variables (only a few latent variables, LV, are used), avoids multicollinearity (the LV are orthogonal), deals robustly with fat matrices (low-moderate number of cases in relation to the number of variables) and allows direct and indirect effects to be calculated 83 . In PLS-SEM, predictor LV are defined to maximize the explanation of the variance of the response LV 83 .
In our model, collagen components (amides, lipids, carbohydrates) and collagen quality were defined as latent variables (LV). As indicators of the latent variables we used the characteristic vibrations of the collagen components (see below) and C, N, C:N, collagen yield, the CI and the PGI for collagen quality. We aimed to test whether transformations of the collagen structure were coupled to changes in collagen quality (i.e. in C, N, C:N, collagen yield, CI and PGI). A second objective was to assess whether changes in collagen quality affected the isotopic composition, so an additional latent variable was included, with δ 13 C and δ 15 N as its indicators. The model was run in reflective mode (i.e. indicators as proxies of the latent variables) using SmartPLS 84 , a software package specific to PLS-SEM modelling.