Exploring the potential of TEM analysis for understanding cooking at prehistoric feasting sites

This study explores the utility of transmission electron microscopy (TEM) analysis of bone collagen for investigating prehistoric cooking. Approaches to cooking practices have relied principally on artefactual evidence, macroscopic bone modification, and organic residue analysis. However, direct evidence for cooking of bone has been limited. Richter and Koon successfully applied TEM analysis of collagen to identify heating in modern and medieval bones, but this method has yet to be experimentally tested using prehistoric remains. Collagen denatures at relatively low temperatures, such as those reached during roasting, boiling, or baking. This denaturation causes predictable structural changes that can be viewed through TEM. Zooarchaeological remains of sheep and pig with minimal taphonomic modifications were analysed from four later prehistoric (c. 800–500 BC) sites in Britain (n = 33). Humeri and phalanges were selected to compare elements with high and low meat yields. Samples were classified into ‘Heated’ and ‘Unheated’ groups consistent with previous studies, and variable patterns were observed between different sites and taxa. Analytical limitations have hindered the study of cooking in the past, but this study demonstrates the potential of this taphonomic method for exploring prehistoric cooking practices and provides a springboard for wider studies.

www.nature.com/scientificreports/ In both cases, TEM preparation followed a different protocol from that of Koon et al. [15][16][17] and analysis was undertaken on a small sample (n = 2 in both studies) to confirm the results of the other methods; conclusions within those studies were therefore limited. Discerning heated bone was possible by Trujillo-Mederos et al. 23 despite poorer image resolution due to the different preparation method. Both Solari et al. 22 and Trujillo-Mederos et al. 23 demonstrated the potential of TEM analysis on ancient American samples dating to 2400-1500 BC and 700-500 BC, respectively. Chadefaux and Reiche 24 used TEM alongside other methodologies (e.g. SEM-EDX, micro-PIXE/PIGE, micro-ATR-FT-IR) to investigate heat-induced modifications. Unlike the other studies, which all used the TEM to view isolated collagen, Chadefaux and Reiche investigated the mineral-organic relationship using ultrathin sections of modern and archaeological bone (n = 6). Despite the difference in preparation and the inclusion of the hydroxyapatite mineral structure, the collagen was noticeably 'melted' following heating at 200 °C and 220 °C; however, it is possible that the retention of the mineral structure obscured the more subtle organic changes at the lower temperature tested (150 °C). Furthermore, the experimentally heated bone in their study was heated for 1 h, so the effect of lower temperatures for longer periods of time on the mineral-organic relationship is uncertain. Evidence for structural changes in bone collagen when exposed to heat has also been observed in studies on living animals 25 . Molecular changes in archaeological bone collagen resulting from heating have been observed when assessing aspartic acid racemisation (AAR). A study by Bada et al. 26 demonstrated that bones defleshed by boiling have higher D/L amino acid ratios. AAR ratios in collagen have been shown to be consistent even if the majority of collagen has been lost 27 , highlighting the effect of heat on bone collagen.

Aims and objectives
The overarching aim of this study is to extend previous TEM research to explore its validity on prehistoric material and, if possible, to investigate the nature of cooking at these transitional feasting sites. Precise objectives are as follows:
• Investigate the utility of this method for examining cooking at prehistoric sites and whether different approaches to analysis are required to account for naturally more degraded samples of greater age.
• Assess inter-observer reliability.
• Explore patterns of cooking at later prehistoric feasting sites on this limited sample.
• Discuss the practicalities and limitations of the method's application to archaeological material and make suggestions for future work.

Materials and methods
An initial study was conducted on three Late Neolithic pig ulnae from the earlier (c. 2500 BC) feasting site of Durrington Walls to assess if fibrils were observable on material of this age. After establishing preservation potential, the study was expanded. Pig (Sus scrofa) and sheep/goat (Ovis aries/Capra hircus) samples were selected from four Late Bronze Age to Early Iron Age sites: Whitchurch, Llanmaes, Potterne, and East Chisenbury. In terms of number of identified specimens (NISP), Whitchurch, Potterne, and East Chisenbury were predominantly sheep/goat (49%, 41% and 56%, respectively), while Llanmaes was predominantly pig (70%) 28 . From each site, five pig and five sheep/goat elements were selected, three elements of which were long bones and two elements which were 'short' bones (e.g. phalanges) (Table 1). Different elements were selected because it was hypothesised that short bones were less likely to be cooked, given their smaller nutritional yield meaning they are often discarded as butchery waste when optimal exploitation is not required (as would be expected at feasting sites). However, the authors acknowledge there are multiple cooking and butchery processes which could result in the heating of   29,30 and are listed for the samples in Table 1. Sampling strategy targeted elements with well-preserved surface texture and no macroscopic evidence for burning or charring. Samples exhibiting weathering (WH02, LLM12, and ECH38) did not exceed Behrensmeyer stage 1 31 . Where available, % collagen yields from isotope sample preparation 3 are given. Although most samples in this study were not subject to isotope analysis 3 , overall % collagen yields for each of the later prehistoric sites are as follows: the mean % collagen yield for Whitchurch is 11.9% (n = 26), for Llanmaes is 7% (n = 90), for Potterne is 10% (n = 60), and for East Chisenbury is 17.4% (n = 17).
Collagen extraction and TEM analysis methods followed the EDTA method described in Koon 15 and are only summarised here. A transverse cross section (c. 10 mm × 10 mm) of each element was extracted using a water-cooled diamond-bladed saw, then crushed with a hammer to minimize the introduction of heat. A sample of c. 60 mg was demineralised slowly over two weeks in Pyrex test tubes using 10 ml of 0.5 M EDTA (ethylenediamine tetraacetic acid), which was changed every three days. Although demineralisation is quicker using hydrochloric acid, EDTA is less likely to damage collagen 15 . After demineralisation, the samples were washed thoroughly using deionized water and left in a phosphate buffer solution (pH 7) overnight. Samples were homogenised using a laboratory homogeniser to release the collagen while simultaneously being stained in 3 ml of phosphotungstic acid (PTA). They were then centrifuged at 4 °C for fifteen minutes at 3000×g. The supernatant was discarded before the samples were re-suspended in 1 ml of PTA. Next, the homogenised collagen solution was pipetted onto carbon-coated Formvar grids, two per sample, which were allowed to settle for five minutes for the collagen to attach to the grid. Excess liquid was wicked away using filter paper, and the grids were left to dry overnight. The next day, each grid was floated on a drop of uranyl acetate dye, covered to protect the uranyl acetate from light, and left for thirty minutes as the collagen absorbed the stain. After thirty minutes, each grid was rinsed thoroughly with an ethanol and water solution, then with water alone. Excess liquid was again removed with filter paper, and the samples were left to dry overnight. At this point, the grids were ready to be analysed using the transmission electron microscope.
Transmission electron microscopy (TEM) was performed on a JEOL JEM-2100 operating at 200 kV, and images were taken of all identified collagen fibrils. Quantification differed from Koon 15 . One grid per sample was extensively imaged, with each visible fibril being quantified. Collagen was then classified as Unaltered, Beaded, or Dumbbell from the images (Fig. 3). A fourth category of 'Slightly Beaded' was recorded for methodological consideration, and areas of material that were most probably amorphous collagen were noted during recording but not included in the analysis. Samples with fewer than 30 discrete identifiable collagen fibrils were excluded from further analysis. This figure was chosen as a minimum for statistical reliability with the assumption that the collagen that was successfully suspended on the grids was a truly random (representative) sample of the collagen available from the bone sample. This assumption is implicit in the method.
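The quantification step above can be illustrated with a minimal Python sketch. It assumes per-sample counts of Unaltered, Beaded, and Dumbbell fibrils have already been recorded from the micrographs; the function name `fibril_proportions` and the example counts are hypothetical and are not taken from the study's data. Only the 30-fibril exclusion threshold comes from the text.

```python
from typing import Dict, Optional

MIN_FIBRILS = 30  # exclusion threshold stated in the text
CATEGORIES = ("Unaltered", "Beaded", "Dumbbell")


def fibril_proportions(counts: Dict[str, int]) -> Optional[Dict[str, float]]:
    """Return category proportions for one sample, or None if it is excluded.

    A sample is excluded when fewer than MIN_FIBRILS discrete fibrils
    were identified across the three categories.
    """
    total = sum(counts.get(c, 0) for c in CATEGORIES)
    if total < MIN_FIBRILS:
        return None
    return {c: counts.get(c, 0) / total for c in CATEGORIES}


# Made-up counts for illustration only (not data from this study).
example = {"Unaltered": 6, "Beaded": 30, "Dumbbell": 24}
print(fibril_proportions(example))
```

The three proportions sum to one, so each retained sample corresponds to a single point on a ternary plot.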

Results and discussion
Preservation. Discrete, identifiable collagen was present in all but two samples, where only amorphous collagen that could not be reliably examined remained. However, a total of ten samples had to be excluded from further analysis, as fewer than 30 collagen fibrils were observable (Table 2), leaving 33 samples. Preservation varied by site. Llanmaes samples were most degraded, with five failing to meet the threshold. Durrington Walls was also greatly affected, with two out of three samples having too few observable fibrils.
Inter-observer error. Inter-observer error was assessed between a practised observer (KEF) and a novice observer (MGBF). Agreement was measured using Spearman's rho correlation (Table 3) and significance was set at p < 0.01. Spearman's rho correlation was used because it is nonparametric and applies to ordinal data, and was calculated using IBM SPSS Statistics 25. Agreement between observers was significant in all classification categories. Correlation was strong for the Dumbbell category, but moderate for the other two categories. This could be influenced by the fact that the Dumbbell category is the only one with a quantitative measurement in its definition, where Dumbbell fibrils are less than 3 µm, making them empirically easier to identify. Observers reviewed the classifications together and found that the main reason for discrepancies in Unaltered fibrils was the visualising method. Quantifying collagen from still images rather than within the TEM resulted in Unaltered collagen fibrils being counted multiple times by the second observer, as it was sometimes difficult to identify overlapping regions across multiple images. Therefore, observer experience is thought to be the main reason for the overall discrepancies. Despite these differences, quantified results between observers were significantly correlated; therefore, method reliability was considered sufficient to continue with analysis.
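For readers unfamiliar with the statistic, the following is a minimal pure-Python sketch of Spearman's rho, computed as the Pearson correlation of average ranks. It is not the SPSS procedure used in the study, it does not compute a p-value, and the helper names `average_ranks` and `spearman_rho` are hypothetical. The inputs are assumed to be the per-sample counts of one category as recorded by each observer.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rho is undefined when one sequence is constant")
    return cov / (vx * vy) ** 0.5


# Made-up Dumbbell counts from two observers (illustration only).
observer_a = [12, 3, 25, 8, 17]
observer_b = [10, 4, 22, 9, 15]
print(spearman_rho(observer_a, observer_b))
```

Testing significance against a threshold such as p < 0.01 would require a statistics package in addition to this sketch.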
Sites. Three samples from Late Neolithic Durrington Walls were analysed first to assess the preservation of collagen structures in prehistoric material of an earlier date. Durrington Walls was chosen for the preliminary assessment due to prior success extracting collagen for isotope analysis 32 . Presence of collagen was therefore expected, but it was unclear whether the fibril structures would be identifiable to the categories used for TEM analysis. Though Richter 21 had success with Neolithic fish remains, the sample size was small and collagen survival would not be expected to be the same in mammalian remains. Sample sizes in Solari et al. 22 and Trujillo-Mederos et al. 23 were also small. Remains from Durrington Walls revealed discrete collagen fibrils in all samples, including a range of Unaltered, Beaded, and Dumbbell fibrils in sample DW1 (Table 2). Despite preserved collagen fibrils in all samples, only one from Durrington Walls achieved the threshold of 30 observable fibrils. Nonetheless, this demonstrates the potential of the approach for assessing the cooking of prehistoric mammalian remains. The more substantial samples from the midden sites allow tentative statements to be made surrounding patterns of cooking.
Whitchurch. When plotted, the samples from Whitchurch formed two groups (Fig. 4). Pig and sheep/goat were represented in both groups, as were long and short bones. The breakpoint occurred along the Beaded-Dumbbell axis, with one group clustering near the Beaded point (indicating mostly Beaded collagen with some other types), and one group clustering along the Beaded-Dumbbell axis (indicating few/no Unaltered fibrils). These clusters mimic the plot of Koon's 17 'Uncooked' (A) and 'Cooked' (B) groups, respectively. No pattern was identifiable by species or element, as both element and species were split between groups.
Llanmaes. The samples from Llanmaes were much reduced by poor preservation. When plotted, there was greater distance between points, but two groups emerge in broadly the same location as the Whitchurch sample (Fig. 5). The group along the Beaded-Dumbbell axis was represented by a single pig long bone, while the group near the Beaded point was mostly sheep/goat of mixed elements and one pig phalanx. Unfortunately, the reduction in this site's sample size makes interpreting the pattern difficult, particularly as four of the five excluded elements were long bones, skewing the remaining sample towards short bones.
Potterne. The samples from Potterne were unevenly split but followed the same natural separation. The cluster near the Beaded point comprised all the pig long bones and a mix of sheep/goat elements. The cluster along the Beaded-Dumbbell axis included one sheep long bone and one pig short bone (Fig. 6).
East Chisenbury. The samples from East Chisenbury were the best preserved, with all bones included in analysis. These samples also exhibited the greatest variation in plot location (Fig. 7). Though two groups were discernible along the same axes, their spatial separation was more limited. The cluster along the Beaded-Dumbbell axis was exclusively long bones, mostly pig, and the cluster near the Beaded point was mostly sheep/goat with a mix of elements.
Site trends. In Figure 8a-c all samples have been plotted together, and classification of samples into two groups was consistent with their groupings in intra-site analysis. Viewing the samples together resulted in the same natural break points as seen in the site-specific graphs. Again, the overall pattern is consistent with Koon's 17 plot in regards to cluster regions and shape, with a cluster along the Beaded-Dumbbell axis which is close to the axis indicating few to no Unaltered fibrils, and a cluster nearer the 'Beaded' point which has greater spatial separation from the axes, indicating greater proportions of Unaltered fibrils. Quantitatively, the breakpoint in this study and Koon's 17 study occurs along the Beaded-Dumbbell axis at the point representing samples greater or less than 40% Dumbbell. Given the consistency in natural breakpoints between sites and compared to Koon's study, we can reasonably conclude that these groups are 'Heated' and 'Unheated' respectively. Figure 8a presents the samples by element and species, while Fig. 8b and c compare species and element to site, respectively. Across all sites, long bones dominate the Heated group showing that these elements were more likely to be cooked than short bones. This is unsurprising given the difference in calorific yield from meat, grease, and marrow between element classes. It is perhaps unsurprising that food preparation at these feasting sites, which surely saw conspicuous consumption involving a great deal of surplus, less frequently involved the stewing of low-yield extremities. However, there is a greater diversity of elements in the Unheated group, with almost half of the long bones analysed within the Unheated group. This is true for both species, meaning that for both pigs and sheep/goats, long bones were equally likely to be cooked or uncooked. Pig short bones are slightly more common than sheep/goat in the Heated group, while nearly all sheep/goat short bones are in the Unheated group. 
Sheep/goat are therefore less likely to be heated (cooked) in articulation (or less likely to have short bones added to a stew pot) than pig, despite their greater representation on most sites. This may indicate socially circumscribed practices relating to the processing of different taxa (perhaps relating to greater grease/fat yields in pigs), something that is evidenced in the butchery patterns at Llanmaes 1 , although basing interpretation on such small samples is ill-advised. Despite long bones being processed in similar ways, with a near equal likelihood of cooked or not cooked, the variation in short bone representation in each group indicates greater processing of sheep/goat prior to cooking (Table 4). While this could be influenced by processing for non-food products (e.g. skins), given the specialized focus on feasting at these sites, the difference is more likely to relate to preferential consumption.
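The breakpoint described under site trends (samples above or below 40% Dumbbell fibrils) can be expressed as a simple rule. The sketch below is only an illustration of that rule: the function name `classify_sample` is hypothetical, the treatment of a sample at exactly 40% is an assumption, and in the study the groups were identified from natural breaks on ternary plots rather than from a fixed cut-off applied in code.

```python
DUMBBELL_BREAKPOINT = 0.40  # breakpoint reported in the text (40% Dumbbell)


def classify_sample(unaltered: int, beaded: int, dumbbell: int) -> str:
    """Label a sample 'Heated' or 'Unheated' from its fibril counts.

    Assumption: a sample at exactly the breakpoint is treated as Unheated;
    the text does not say how such a case would be handled.
    """
    total = unaltered + beaded + dumbbell
    if total == 0:
        raise ValueError("sample has no classified fibrils")
    dumbbell_share = dumbbell / total
    return "Heated" if dumbbell_share > DUMBBELL_BREAKPOINT else "Unheated"


# Made-up counts for illustration only (not data from this study).
print(classify_sample(unaltered=2, beaded=10, dumbbell=28))  # 70% Dumbbell
print(classify_sample(unaltered=10, beaded=25, dumbbell=5))  # 12.5% Dumbbell
```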
Additional 'slightly beaded' category. During initial research on samples from Durrington Walls, it was apparent that there was a wide range of appearances for Beaded fibrils, with some fibrils having small, discrete expansions along an otherwise tightly structured fibril, and others having numerous, large expansions along a broadly structured fibril (Fig. 3). With consideration to Koon's 16 results demonstrating subterranean taphonomic influence on collagen degradation, it was hypothesized that fibrils that were Slightly Beaded (i.e. mostly Unaltered in appearance except for a few small areas of expansion) would be more representative of taphonomic degradation, while more Beaded fibrils would be representative of heating changes before burial. Therefore, during the wider analysis, observers classified Beaded collagen as either "classically" Beaded or Slightly Beaded to test this hypothesis (see Fig. 3). The data were analysed and compared to the results above. To compare on the same scale, the Slightly Beaded fibrils were subsumed under the Unaltered category and plotted as such (Fig. 9). Although the graph indicates clustering of the Cooked and Uncooked groups, the points within each group are more spread out and the separation between the two clusters is not as obvious. Though it is reassuring that the results are consistent, the difficulty in distinguishing the clusters in this format suggests the need for additional study to understand the complexity of distinguishing pre- and post-depositional denaturation. Targeted sampling could specifically investigate sites with differing geologies. Inter-observer error was calculated for these categories using Spearman's rho correlation. Agreement for the Slightly Beaded classification was moderate and significant, but agreement for "classically" Beaded was low and not significant (Table 5).
The poor agreement between observers would suggest the definitions need to be revised in future studies investigating the difference between subterranean taphonomic denaturation and thermal denaturation.

Methodological considerations
One objective of this study was to examine the practicalities of employing the method and assess ways that might enhance future uptake. The potential of the technique is clear, and it remains one of the only direct methods for reconstructing cooking practices. However, its application has been very limited with only a handful of studies published. The temporal and monetary investment of investigation is likely to be one factor, as this means large sample sets are difficult to achieve. However, this has not prevented the widespread uptake of other high-resolution analyses of faunal remains (e.g. sequential δ18O isotope analysis of enamel). Sample preparation is relatively laborious, but this has also not precluded the wider uptake of other molecular analyses. The greatest cost incurred in the application of TEM analysis of collagen is the operation of the TEM itself. It is exceptionally rare for archaeology departments to have sole use of a TEM and therefore full training and long-term use (often with an hourly charge) can be challenging to achieve. Ideally, grids could be scrutinised at length during live TEM analysis. However, the potential of a more streamlined approach was tested in this study. Grids were scanned by a professional TEM operator and micrographs taken for review. Quantifying fibrils could then be undertaken slowly and systematically. Classifications could easily be revisited and it was possible to conduct an independent inter-observer error analysis.
In comparison to Koon's [15][16][17] results, the grids in this study contained fewer collagen fibrils overall, highlighting a drawback of quantification from micrographs rather than live TEM analysis. As quantification was not undertaken during live TEM analysis, the number of observable fibrils was unknown at the point of micrograph retrieval, but every effort was made to provide a representative range of micrographs. Consistency between spatial patterning on ternary plots from this study and previous studies 17 demonstrates that despite fewer fibrils present, similar patterns emerge. This study suggests that, as long as the assumption of random sampling is not violated (e.g. recording all available fibrils, or up to 100), a smaller fibril sample size is informative. To test this assumption with the available material, samples with fibril counts over or equal to 50 were revisited (n = 14). Four random sub-samples of approximately 30 fibrils from each sample were compared to the total count, as well as sub-samples of 50 and 70 where possible (see Supplementary Material for more details). Multiple pairwise chi-square comparisons with Bonferroni correction were conducted between observed groups. Only one sample (ECH31) showed significant differences between groups at the Bonferroni-corrected level, though the 'Cooked' classification remained unchanged across groups. In all except one sample (ECH33), classification as cooked or uncooked was consistent with the initial analysis for each group. These are promising preliminary results to support the assumption that 30 fibrils can be representative of the overall sample, but it is recommended that future work investigates this assumption more systematically, such as by conducting a bootstrap resampling analysis on larger datasets of 30, 50, 70, and 100 fibrils and beyond. This approach to analysis requires further testing to demonstrate its validity and reliability.
However, it has the potential to widen uptake and thus enhance the understanding of cooking practices in wide-ranging contexts. Although observer agreement was acceptable, there was stronger correlation between observers for the quantifiable Dumbbell category. Reliability could therefore be improved by refining the Unaltered and Beaded categories to have objective definitions or to be classified by objective technology (e.g. machine learning, elliptical Fourier analysis, or geometric morphometrics), but larger reference samples are needed.
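The bootstrap resampling recommended above for future work could take roughly the following form. This is a sketch under stated assumptions and not the analysis performed in this study: it assumes each sample is available as a list of per-fibril category labels, it reuses the 40% Dumbbell breakpoint as the classification rule, and the names `is_heated` and `bootstrap_agreement` are hypothetical. It reports how often a resample of a given size receives the same classification as the full set of fibrils.

```python
import random
from typing import Sequence


def is_heated(labels: Sequence[str], cut: float = 0.40) -> bool:
    """Apply the 40% Dumbbell breakpoint to a list of fibril labels."""
    if not labels:
        raise ValueError("no fibril labels supplied")
    dumbbell = sum(1 for label in labels if label == "Dumbbell")
    return dumbbell / len(labels) > cut


def bootstrap_agreement(labels: Sequence[str], n_fibrils: int = 30,
                        n_reps: int = 1000, seed: int = 0) -> float:
    """Fraction of bootstrap resamples classified the same as the full sample.

    Each replicate draws n_fibrils labels with replacement from the
    observed fibrils and is classified with the same rule.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    full_result = is_heated(labels)
    agree = 0
    for _ in range(n_reps):
        resample = rng.choices(labels, k=n_fibrils)
        if is_heated(resample) == full_result:
            agree += 1
    return agree / n_reps


# Made-up sample of 100 fibrils for illustration only.
sample = ["Dumbbell"] * 60 + ["Beaded"] * 30 + ["Unaltered"] * 10
for size in (30, 50, 70, 100):
    print(size, bootstrap_agreement(sample, n_fibrils=size))
```

An agreement fraction close to one at 30 fibrils would be consistent with the assumption that 30 fibrils are representative, while a sample near the breakpoint would be expected to show lower agreement.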
Finally, the interaction between the organic and mineral components could be explored further in future studies. A comprehensive, experimental study comparing ultrathin bone sections as described by Chadefaux and Reiche 24 and extracted collagen from the same samples would improve our understanding of the microscopic processes of cooking. A study by Bada et al. 26 analysed the effects of amino acid racemisation upon archaeological samples. A further potentially productive future direction would be to explore the use of the discarded supernatant from the process to examine circular dichroism spectra as a possible further proxy for archaeological cooking.

Conclusions
This study demonstrates the potential for TEM analysis to enhance understanding of prehistoric cooking practices. The main aim of the study was to assess the application of the method on prehistoric material and positive results have been produced. Tentative distinctions between heated and unheated bone were possible at all Bronze Age and Iron Age sites, though this sample size does not fully represent the diversity of prehistoric sites. This sample represents a tiny fraction of the faunal assemblages from each site and therefore it has little interpretative potential for reconstructing cooking and feasting practices on a site- or period-wide scale. However, it demonstrates the efficacy of the approach and the potential of wider study. Preliminary indications from the data suggest that meat from pigs was more often cooked in articulation than meat from sheep/goats and that the practice of cooking sheep/goat meat potentially included more processing (e.g. filleting) prior to heat exposure. The presence of short bones with evidence for cooking suggests that complete limbs or whole animals were sometimes cooked at once, as one might expect from a spit-roast, but with so few short bones within the Heated group, this is unlikely to have been the most common method of cooking. These bones appear to have frequently been discarded with no cooking, suggesting (as expected) that optimal exploitation of calories from the carcass was not required at these feasting sites.
At present, this research represents the largest TEM study conducted on prehistoric material. It has extended the understanding of TEM analysis of collagen from archaeological bone and identified further avenues for enquiry. Application of this method to larger sample sizes from these feasting sites is needed to make site-specific conclusions and would allow further comparisons between regions or over time to be made. Improved understanding of the representativeness of each grid of collagen is needed, and further studies using this method should also engage with the evidence from organic residues in pottery to identify practices relating to food preparation in different vessel types. Machine learning also has the potential to enhance the efficacy of fibril quantification and may prove a profitable avenue for future research.
Archaeologists have moved on from the processualist perspective that observes food and diet as merely a biological necessity; cooking too must be viewed through a cultural lens. This method provides an alternative approach to understanding cooking activities in the past.

Data availability
All data needed to evaluate the conclusions in this paper are present in the paper and its Supplementary Information file. Additional data generated and/or analysed during this study are available from the corresponding author on request.