Independent origin of large labyrinth size in turtles

The labyrinth of the vertebrate inner ear is a sensory system that governs the perception of head rotations. Central hypotheses predict that labyrinth shape and size are related to ecological adaptations, but this is under debate and has rarely been tested outside of mammals. We analyze the evolution of labyrinth morphology and its ecological drivers in living and fossil turtles, an understudied group that underwent multiple locomotory transitions during 230 million years of evolution. We show that turtles have unexpectedly large labyrinths that evolved during the origin of aquatic habits. Turtle labyrinths are relatively larger than those of mammals, and comparable to many birds, undermining the hypothesis that labyrinth size correlates directly with agility across vertebrates. We also find that labyrinth shape variation does not correlate with ecology in turtles, undermining the widespread expectation that reptilian labyrinth shapes convey behavioral signal, and demonstrating the importance of understudied groups, like turtles.


CT and 3D data deposition
To generate 3D model data for endosseous labyrinths and crania, we used high-resolution microcomputed tomography (CT) scans (for 167 specimens) and magnetic resonance imaging (MRI) scans (for 1 specimen). The tomographic datasets were largely collected by our team, but some scans were passed on to us by other researchers, or downloaded from online repositories or data supplements of published papers. We deposited all CT scans collected by us in the online repository MorphoSource (www.morphosource.org/), and the majority of scans not produced by us are also available online on MorphoSource or other repositories (Supplementary Data 1). Some museums do not currently allow the public deposition of CT scans and require that the CT data are stored and curated by the museum, so some CT scans are only available upon request from the respective institution. Details for each scan and where to find it are listed in Supplementary Data 1. Scanning facilities, scanning parameters, and scanning procedures varied among specimens, and the respective information is provided alongside the CT slice data and derivative models in their respective repository.
Within the MorphoSource repository, CT scans can be curated either by the uploader or by the institution to which the respective specimen belongs, and access to many scans and digital specimens is subject to curatorial permission online. All scans of specimens from museums without download-restrictive policies were uploaded by us for direct download (i.e., not requiring download permission); others were uploaded with the respective restrictions implemented.
We also uploaded 3D model data of endosseous labyrinths that were used for landmarking as well as cranial models that were used to obtain cranial measurements to MorphoSource. Links to all 3D labyrinth models can be found in Supplementary Data 1. For those specimens for which tomographic data is also deposited on MorphoSource, model and tomographic data are crosslinked. This is possible because each file within MorphoSource has a unique Media ID, and tomographic media were selected as parent media for respective models. For specimens for which CT data are not on MorphoSource, model data were uploaded without cross-linked parent tomographic media. Thus, all model data for this project are in one place, organized in a MorphoSource Media Collection 000372533 called "Evers. 2021. Extant and fossil amniote labyrinth and cranial models" and accessible here: www.morphosource.org/projects/000372533.

Segmentation of 3D models
3D models were generated by manual segmentation in Materialise Mimics 18.0-19.0. Specimen-specific thresholds were used, and some specimens, specifically but not exclusively fossils, were segmented without applying thresholds by using the manual mask addition tool, for instance when the threshold difference between the semicircular canal walls and infilled rock matrix was low. We segmented only one endosseous labyrinth for each specimen, preferentially from the left side, but the right side was chosen when the skull was damaged on the left side. In addition to labyrinth models, models of the cranium were produced to be used for specimen-specific cranial measurements. For some fossils, matrix surrounding the crania was digitally removed to enable landmarking. All resulting models were exported as PLY files, and labyrinth and cranial models from the same specimen preserve their spatial relationships with one another as well as their digital scale. some gaps remain along the palate when both parts are articulated (see Gaffney et al. 4 ). Both parts were CT-scanned separately, and the segmented models were digitally articulated in Blender.

Landmarking concept using midline skeletons on reconstructed semicircular duct trajectories
Due to the intersection of the posterior part of the lateral semicircular duct and the ventral parts of the posterior semicircular duct within the endosseous labyrinth cavity of turtles, endosseous labyrinth models of turtles provide only a rough approximation of the shape of the underlying tissue organ. As it is the endolymphatic flowpath within this duct-tissue system that determines function, we used the recommendations by Evers et al. 5 to reconstruct semicircular duct trajectories from 3D endosseous labyrinth models, guided by comparisons with actual membranous labyrinths of turtles. Reconstructed semicircular models were then skeletonized and landmarked. This reconstruction method has also recently been used by some of us in a comparative analysis of archosaur labyrinths 6 .
To arrive at midline skeletons, the three semicircular canals needed to be isolated from their original endosseous model. This model modification, as well as skeletonization and landmarking, was done in Avizo lite 9.2. Models of the endosseous labyrinths and crania were imported as PLY files into Avizo. We cut the endosseous labyrinth models using 3D mask-editing in the segmentation window of Avizo to isolate the anterior, posterior, and lateral semicircular canals. We isolated the semicircular canals by extending the posterior and lateral semicircular canals through the secondary common crus. The lateral semicircular canal was extended posteriorly through the secondary common crus in the same horizontal plane as the exposed anterior portion of this canal (see Evers et al. 5 ). In some turtle species, the posterior part of the semicircular duct also leaves a partial impression in the secondary common crus, further aiding the accurate reconstruction of the full length of the lateral semicircular canal. The visible portion of the posterior semicircular canal was extended ventrally to curve laterally around the lateral semicircular canal, as predicted by membranous duct morphology. To do this, the position of the posterior ampulla had to be approximated, and we followed the protocol of Evers et al. 5 for this reconstruction. Our canal reconstructions are taken to be closer approximations of the underlying soft tissue anatomy than unmodified models of the endosseous cavities housing this organ. Because midline skeletons produced in Avizo terminate slightly before the end of segmented canals, we extended our canal models on either side to achieve sufficiently long skeletons (i.e., the skeletons extend beyond the inferred position of the ampullae and canal-common crus intersections).
Isolated semicircular canals were skeletonized using the autoskeleton function in Avizo with a 'smooth' coefficient of 0.5 and an 'attach to data' parameter value of 0.5 to create streamlined skeletons without artifacts.
Midline skeletons of the semicircular canals were landmarked using six conventional fixed landmarks, which describe the starting points of canals at their intersection with their respective ampulla, and the endpoints of canals at their intersection with the common crus (see Supplementary Fig. 1i). For each semicircular canal, an open semilandmark curve was placed on the midline skeleton between the start and endpoint landmarks (i.e., from ampullae to common crus). In addition to the landmarks on the midline skeleton, a closed semilandmark loop was placed around the inner perimeter of the ASC (Supplementary Fig. 1h), whereby the landmarks were placed as extending laterally from the dorsal arch of the ASC, continuing ventrally around the canal and dorsally up the common crus to close the loop. During landmarking, each semilandmark loop was sampled with an arbitrary number of semilandmarks, densely packed around the curvature of the loop. At the analytical stage, semilandmarks within each loop were resampled to the mean number of semilandmarks used across specimens, so that each specimen had the same number of semilandmarks.
We placed a further semilandmark loop around the cross-section of the ASC near its ampulla, defining the circumference of the ASC. These landmarks were not used for geometric morphometric analyses, and were only used to calculate ASC circumferences as the sum of distances between landmark positions. These data are not used further here, but will be explored elsewhere.
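The resampling step described above can be illustrated with a minimal sketch. It assumes that resampling means placing points at equal arc-length intervals along the digitized curve and that the mean count is rounded to the nearest integer; neither detail is stated in the text, and the function names (resample_curve, common_count) are hypothetical and not taken from the scripts provided with this paper.

```python
import math
from statistics import mean


def common_count(counts):
    """Target number of semilandmarks: the rounded mean across specimens (assumed rounding)."""
    return round(mean(counts))


def resample_curve(points, n_out, closed=False):
    """Resample a 3D polyline to n_out points equally spaced by arc length.

    Assumption: linear interpolation between the digitized semilandmarks.
    For a closed loop the first point is not repeated at the end.
    """
    pts = [tuple(float(c) for c in p) for p in points]
    if len(pts) < 2 or n_out < 2:
        raise ValueError("need at least two input points and two output points")
    if closed:
        pts.append(pts[0])  # add the segment that closes the loop
    # cumulative arc length at every input vertex
    cum = [0.0]
    for a, b in zip(pts, pts[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    total = cum[-1]
    if closed:
        targets = [total * i / n_out for i in range(n_out)]
    else:
        targets = [total * i / (n_out - 1) for i in range(n_out)]
    out, j = [], 0
    for t in targets:
        # advance to the segment that contains arc-length position t
        while j < len(cum) - 2 and cum[j + 1] < t:
            j += 1
        seg = cum[j + 1] - cum[j]
        w = 0.0 if seg == 0 else (t - cum[j]) / seg
        a, b = pts[j], pts[j + 1]
        out.append(tuple(x + w * (y - x) for x, y in zip(a, b)))
    return out
```

With this sketch, every specimen's loop would be passed through resample_curve with n_out set to common_count of the per-specimen counts, so that all configurations share the same number of semilandmarks before Procrustes alignment.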
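As an illustration of the circumference calculation, the following minimal sketch sums the distances between consecutive landmark positions. It assumes that the segment from the last landmark back to the first is included to close the loop; the function name loop_circumference is hypothetical.

```python
import math


def loop_circumference(landmarks):
    """Sum of straight-line distances between consecutive 3D landmarks.

    Assumption: the loop is closed, so the distance from the last
    landmark back to the first one is part of the sum.
    """
    n = len(landmarks)
    if n < 3:
        raise ValueError("a closed loop needs at least three landmarks")
    return sum(
        math.dist(landmarks[i], landmarks[(i + 1) % n]) for i in range(n)
    )
```

Because the landmarks are joined by straight segments, the result slightly underestimates the true circumference of a curved cross-section; denser sampling reduces this difference.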

Variables for hypothesis testing
We determined several size-related, behavioural, ecological, and taxonomic variables to be used in regression analyses/statistical hypothesis testing. All variable scorings for all taxa are summarized in Supplementary Data 2, and the different variables are discussed below.

Cranial box volumes
Cranial box volumes are the volumes of the smallest virtual box to contain the full cranium of a specimen, and were measured as a proxy of skull size for each complete or near-complete specimen by multiplying the maximum anteroposterior length (between the premaxilla and occipital condyle) by the maximum dorsoventral height (between the skull roof and the ventral surface of the basicranium) and the maximum mediolateral width (between the temporal skull bones, usually near the antrum postoticum perimeter). Measurements were taken as straight-line measurements in Mimics, either on slices directly, or on the 3D model in the model viewing panel of Mimics. For specimens with minor damage to the respective areas, measurements were approximated when possible. However, for specimens in which the preservation does not allow approximation of one of the measurements (for instance, when only the basicranium is preserved), no box volume parameters were measured. Linear measurements and box volumes for all specimens are included in Supplementary Data 2.
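The box volume calculation can be sketched as follows. The function name cranial_box_volume and the use of None for a measurement that could not be taken are illustrative assumptions; units are whatever the linear measurements were recorded in (e.g., mm giving mm³).

```python
def cranial_box_volume(length, height, width):
    """Box volume as length x height x width.

    Returns None when any of the three measurements is missing,
    mirroring specimens for which no box volume was measured.
    """
    dims = (length, height, width)
    if any(d is None for d in dims):
        return None
    if any(d <= 0 for d in dims):
        raise ValueError("measurements must be positive")
    return length * height * width
```

For example, a cranium measuring 50 by 20 by 40 units has a box volume of 40,000 cubic units, whereas an isolated basicranium without a measurable length yields no value.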

Braincase aspect ratio
To test spatial constraint hypotheses in our model tests, we computed a braincase aspect ratio variable by dividing skull height by skull width, following previous authors such as Bronzati et al. 6 . This was done directly in R, so that the respective ratios are not part of Supplementary Data 2.

Relative neck lengths
Turtle head movements are facilitated by their necks, and neck length and mobility are important for different head movement behaviours (e.g., refs. 7, 8). We established relative neck length categories by contrasting absolute neck length with absolute carapacial length, following the procedure of Joyce et al. 8 . We calculated neck length as a percentage of carapacial length, and created relative neck length categories according to those percentages, with a neck-to-carapace proportion of >70% being categorized as 'extreme'; proportions of 50-69% as 'long'; proportions of 35-49% as 'intermediate'; and proportions of <35% as 'short'. The length of the carapace was measured from the anterior margin of the nuchal to the posterior margin of the pygal, or topological equivalents in turtles that lack a pygal. Absolute neck length was measured as the cumulative length of the centra of all eight cervical vertebrae. Measurements were either taken on 3D models in Mimics (for specimens for which we had full body scans available), or on osteological pictures in ImageJ, using the 'set scale' function in that software to calibrate measurements against the scale bars included in photographs. The same method was used on published images of turtle specimens that preserve the neck and carapace for taxa for which we did not have photographs. Although neck and carapace measurements were always taken from a single individual, these individuals were not the ones we used for generating the endosseous labyrinth models, with the exception of those specimens for which we had full body scans. Details of specimen identification, absolute neck and carapace lengths, as well as resulting ratios are summarized in Supplementary Table 1, and ratios are also listed in Supplementary Data 2.
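The categorization can be sketched as below. The stated bins leave some values unassigned (for instance exactly 70%, or 69.5%), so this sketch assumes half-open intervals with lower bounds at 35%, 50%, and 70%; that boundary handling, and the function name relative_neck_length, are assumptions made for illustration only.

```python
def relative_neck_length(centrum_lengths, carapace_length):
    """Return (percentage, category) for one individual.

    centrum_lengths: lengths of the eight cervical centra.
    carapace_length: nuchal-to-pygal carapace length, same units.
    Assumed boundaries: >=70 extreme, >=50 long, >=35 intermediate, else short.
    """
    if len(centrum_lengths) != 8:
        raise ValueError("expected the centra of all eight cervical vertebrae")
    if carapace_length <= 0:
        raise ValueError("carapace length must be positive")
    neck_length = sum(centrum_lengths)  # cumulative centrum length
    pct = 100.0 * neck_length / carapace_length
    if pct >= 70:
        category = "extreme"
    elif pct >= 50:
        category = "long"
    elif pct >= 35:
        category = "intermediate"
    else:
        category = "short"
    return pct, category
```

Because neck and carapace must come from the same individual, the function takes both measurements together rather than accepting precomputed species means.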

Neck retraction ability and type
Extant turtles are generally able to retract their necks underneath the carapace. Neck retraction is facilitated by cervical vertebra anatomy 9-10 , but the two major lineages of crown turtles, pleurodires and cryptodires, have evolved different neck retraction mechanisms. In cryptodires, necks are retracted along a vertical plane and the neck is folded in a sigmoidal fashion to hide the skull underneath the carapace (hence, "hidden-necked turtles"). In pleurodires, the principal plane for neck tucking is horizontal (hence, "side-necked turtles"). It is unclear when exactly turtles evolved neck retraction, but studies have argued that early stem turtles (including Proganochelys quenstedtii), meiolaniforms (including Meiolania platyceps), helochelydrids (including Naomichelys speciosa), paracryptodires, as well as thalassochelydians had only an incomplete neck retraction ability in which most of the head would remain exposed (e.g., refs. 10-12). For all stem turtles, cervical vertebra anatomy also does not suggest that either the cryptodiran or the pleurodiran mechanism was developed. The evolution of full neck retraction has been proposed as important for labyrinth size evolution 13 , but this hypothesis has so far not been tested. We coded two morphofunctional variables related to head retraction. The first variable, neck retraction ability, differentiates between turtles that incompletely or fully retract their head underneath the shell. Besides all stem turtles, several macrocephalic crown turtles that have lost the ability to withdraw the head fully (e.g., ref. 14 ) were coded as incomplete: chelonioids, Platysternon megacephalum, Macrochelys temminckii, Phosphatochelys tedfordi, Ummulisani rutgersensis, Peltocephalus dumerilianus. As a second variable, we recorded the principal plane for head movements during neck retraction (none, vertical, or horizontal).
For this variable, cryptodires or pleurodires without the ability to (fully) retract their necks are scored according to their phylogenetic expectation as vertical (cryptodires) or horizontal (pleurodires), as these turtles still share general cervical vertebral features of their respective clades, but neck retraction is prohibited by their large head sizes. Neck classifications for all species are listed in Supplementary Data 2.

Habitat ecology of extant and fossil taxa
Three general habitat ecology distinctions were made: 'marine', 'freshwater', 'terrestrial'. Although many continental turtles are somewhat amphibious, resulting in a difficult distinction between 'freshwater' and 'terrestrial', turtle species were herein scored according to their predominant habitat preference in which they forage: all extant chelonioids were scored as 'marine'; all extant pleurodires, trionychians, kinosternids, chelydrids were scored as 'freshwater'; all extant testudinids were scored as 'terrestrial'. Geoemydids and emydids were generally scored as 'freshwater', with the exception of the following species, which were scored as 'terrestrial': Cuora flavomarginata, Cuora mouhotii, Cyclemys dentata, Geoemyda spengleri, Rhinoclemmys pulcherrima, Glyptemys insculpta, Terrapene carolina, Terrapene ornata. Ecological classifications for each species are listed in Supplementary Data 2.
The ecology of fossil turtles was assessed on a taxon-by-taxon basis, with reference to the available published literature about the ecology of the extinct taxa. We classified all fossil chelonioid turtles, including protostegids, as marine. This is supported by the depositional environments as well as marine adaptations for all of the species considered in our study (see Evers & Benson 15 for comments). For pleurodires, we followed Ferreira et al. 16 , Gaffney et al. 2 , and Gaffney et al. 4 to distinguish between marine and freshwater pelomedusoids. The following fossil turtles were scored as 'terrestrial': the early stem turtles Proganochelys quenstedtii, Australochelys africanus and Kayentachelys aprix based on depositional environments and postcranial anatomy (e.g., refs. 17-20); the meiolaniform Meiolania platyceps based on postcranial anatomy including armour, shell histology and general habitat preferences of the group 20-23 ; the stem trionychian Basilemys sp. based on the presence of extensive limb armour, foot anatomy, skeletal robusticity, and cranial adaptations to high-fiber herbivory (e.g., ref. 24 ); the testudinoid Stylemys nebrascensis based on general anatomical considerations and depositional environment 25 . Ecological classifications for all fossil taxa are listed in Supplementary Data 2.

Habitual habitat
Although turtle habitats can grossly be divided into marine, freshwater aquatic, and terrestrial categories, extant turtle species show different behaviours within their preferred environments. Therefore, we classified whether extant turtles show extensive burrowing behavior, are terrestrial walkers, aquatic bottom dwelling species, or open water swimmers. Open water swimmers can be marine or non-marine aquatic. Classifications for each species are listed in Supplementary Data 2.

Forelimb webbing
The limbs of turtles are highly specialized to their environments. This is evident from ratio measurements of individual forelimb parts 19 , but also from the extent of the webbing between finger digits. Forelimb element ratios distinguish terrestrial turtles from semiaquatic ones, and semiaquatic turtles from marine turtles. However, the degree of hand webbing shows further nuances in semiaquatic turtles, so that we decided to use webbing as a proxy for 'aquaticness', and therefore habitat ecology for extant turtles of our sample. Our forelimb webbing categories are the same as proposed by Foth et al. 26 , expanded to our taxon sample, and are listed for each species in Supplementary Data 2.

Composite tree and time calibration
To perform phylogenetic comparative methods on the full dataset including fossils, we required a phylogenetic tree that includes all focal taxa of our sampling. As no single published 'conventional' phylogeny existed that fulfilled this criterion, we constructed a composite tree that includes all of the taxa for which we have labyrinth data by informally combining topologies from several published phylogenies. We used the tree topology of Pereira et al. 27 , which is a time-calibrated molecular phylogeny of extant turtles based on 13 loci for 294 living turtle species, as a constraint on the relationships of extant turtles. Fossil pleurodires were added to the tree following the topology of Ferreira et al. 16 , using the analysis from that paper which included a molecular backbone constraint for extant pleurodires. On the turtle stem lineage, the positions of basal stem-turtles (i.e., Australochelys africanus, Kayentachelys aprix, Eileanchelys waldmani), sinemydids (represented by Ordosemys sp.), xinjiangchelyids, thalassochelydians, and sandownids follow Evers & Benson 15 . The ingroup relationships of plesiochelyids follow Evers et al. 28 , but Solnhofia parsonsi was constrained to be the sister taxon of Sandownia harrisi, following Evers and Joyce 29 and Joyce et al. 30 . Paracryptodire relationships follow Lyson & Joyce 31 , but Naomichelys speciosa was additionally constrained as the earliest branching paracryptodire, following unpublished phylogenetic results of some of us (SWE, WGJ), which support recent comparative anatomical considerations 32, 33 . Kallokibotion bajazidi was constrained as a meiolaniform, following Sterli et al. 34 . Adocus lineolatus and Basilemys gaffneyi were placed as sister taxa to each other on the stem of Trionychia, following most recent phylogenies that include Adocidae and Nanhsiungchelyidae as a monophyletic clade 25-39 of stem-trionychians 15,35,39 .
Within Trionychia, Allaeochelys libyca was treated as the sister to Carettochelys insculpta 15 , and Petrochelys kyrgyzensis was constrained to be a stem-trionychid, following the implied weighting topology published by Brinkman et al. 40 . Axestemys infernalis was included as the sister taxon to the Apalone group, following Vitek 41 . Stylemys nebrascensis was placed as the sister group to the Gopherus group within testudinids. This placement follows biogeographic considerations and the traditional hypothesis that all North American testudinids share a more recent common ancestor than other testudinids (e.g., ref. 42 ). However, in recent phylogenetic analyses, the position of Stylemys nebrascensis is not stable, and a relationship with gopher tortoises has often not been supported 43, 44 . Protostegids were placed as stem-chelonioids, following Raselli 45 and Evers et al. 28 . Non-protostegid chelonioid relationships are as follows: the Eocene sea turtles Argillochelys antiqua, Eochelone brabantica, and Puppigerus camperi were included as stem-cheloniids (see also Evers et al. 28 ); Nichollsemys baieri was constrained as a stem-chelonioid in a more crownward position than protostegids; and Allopleuron hofmanni was included as the sister taxon of Dermochelys coriacea.
The phylogeny including fossils was primarily used for the evolutionary analysis of labyrinth size. Key results from this analysis (ancestrally small labyrinth size for turtles that increases from the node joining Eileanchelys waldmani and more crownward turtles; secondary size reduction in testudinids) are not affected by contentious parts of our phylogeny, which primarily concern the placement of several crownward stem-turtle clades (e.g., sinemydids) alternatively as stem-cryptodires, and the placement of the stem-chelonioid protostegids alternatively as closely related to Jurassic thalassochelydians. Thus, we have not explored the effect of alternative phylogenetic topologies on our analyses.
For the time-calibration of our composite tree, we used two different a posteriori methods: the stochastic cal3 method of Bapst 46 , and the minimum branch length (mbl) method 47 , by using commands from the paleotree 48 , strap 49 , Claddis 50 and ape 51 packages. Both methods use temporal ranges of taxa, which were compiled from the literature (Supplementary Data 6), to calibrate internal nodes. As the primary source phylogenies 16,27 already included calibrations from most internal nodes, we only calibrated nodes that resulted from the inclusion of fossil taxa not originally included in any of the source phylogenies. For the cal3 calibration, we extended the calibration procedure to also include deep nodes of the phylogeny (Testudines, Cryptodira, Pleurodira, Trionychia, Pelomedusoides), as these have very deep ages in the data of Pereira et al. 27 , resulting in conflicts with the fossil record. As the cal3 method is a stochastic calibration, we ran 100 calibrations, from which one calibrated tree was chosen at random for our comparative phylogenetic analyses. To assess the impact of different time-calibrations, we performed comparative phylogenetic analyses both on one randomly-sampled cal3 tree (

Additional notes on analyses
The landmark data and R scripts used for analyses in this paper are provided at Zenodo.

Data 5.
Csv file containing colour codes for landmarks, used for deformation plots along PC axes. This is read by scripts provided as Datasets 10-13 & 16.

Data 6.
Csv file containing age data for fossil turtle species, alongside information about the fossil provenance and museum staff responsible for curation and/or collection management.

Data 7.
Text file containing cal3-calibrated phylogenetic tree in nexus syntax. This is read by scripts provided as Datasets 10-13 & 16.

Data 8.
Text file containing mbl-calibrated phylogenetic tree in nexus syntax. This is read by scripts provided as Datasets 10-13 & 16.

Data 9.
Text file with R script to load landmark data, variable data, phylogenetic data from Datasets 2-5 & 7-8 and to perform 2B-PLS analysis for the verification of our landmarking scheme.

Data 10.
Text file with R script to load landmark data, variable data, phylogenetic data from Datasets 2-5 & 7-8 and to perform GPA and PCA analysis. Script creates Figure 1 from main text.

Data 11.
Text file with R script to load landmark data, variable data, phylogenetic data from Datasets 2-5 & 7-8 and to perform GPA analysis and labyrinth shape regressions.

Data 12.
Text file with R script to perform size- and braincase aspect ratio-corrected PCA and regression analyses, and to get deformation plots. Script creates Figure 2 from main text.

Data 13.
Text file with R script to load landmark data, variable data, phylogenetic data from Datasets 2-5 & 7-8 and to perform GPA analysis and labyrinth size regressions and model comparisons.

Data 16.
Text file with R script to load landmark data, variable data, phylogenetic data from Datasets 2-5 & 7-8 and to perform GPA analysis, labyrinth size regression analysis used for Figure 3 in main text, and ancestral state reconstructions also used in Figure 3.

Data 17.
Spreadsheet including specimen data and cranial measurements for amniote labyrinth GPA and labyrinth size plot. This is read by the script provided as Dataset 20.

Data 18.
Collection of turtle landmark data, as individual csv files. This is the data file for script provided as Dataset 20.

Data 19.
Csv file containing information about sliding semilandmarks (for GPA analysis). This is read by the script provided as Dataset 20.

Data 20.
Text file with R script to load amniote landmark data, amniote measurements, and sliders from Datasets 17-19 and to perform GPA analysis and produce the plots shown as Figure 4 of main text.

Supplementary notes 1 Sensitivity tests for phylogenetic comparative analyses using the mbl-tree
Our Procrustes distance shape regressions are only marginally affected by using a differently calibrated tree. Using the minimum branch length tree results in nearly the same relative importance of variables and models, and R² values and significance levels for variables are nearly identical (compare Table 1 from main text with Supplementary Table 2). As for the analysis presented in the main text, the best model tested according to R² and in which all individual variables are significant takes the form labyrinth shape ~ skull box volume * braincase aspect ratio + labyrinth centroid size. As with the cal3-tree-based analyses presented in the main text, ecological habitat variables and morphofunctional neck variables are non-significant when included in bivariate or multiple regression analyses.
Our pGLS regressions on labyrinth centroid size using the alternative mbl-calibrated tree also show that tree calibration has no major effect on our analyses or their interpretations. Model comparison of the mbl-based analyses retrieves ten models with non-negligible AICc values, all of which are among the set of 13 models with non-negligible AICc in the main-text analysis, which uses a cal3-calibrated tree (compare Table 2 from main text with Supplementary Table 3). The sequence of models (i.e., from best to worst according to AICc) is not identical, but very similar, and models with non-negligible AICc values should all be considered anyway. The same set of variables is returned as part of the non-negligible models, and models have near-identical R². Thus, the analysis using the mbl tree supports the results presented based on the cal3 tree in the main text.
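For readers unfamiliar with AICc-based model comparison, the following minimal sketch shows the standard small-sample correction and Akaike weights. It is a generic illustration using textbook formulas, not the procedure of the R scripts provided with this paper; in particular, how 'non-negligible' was defined is not stated here, and the function names are hypothetical.

```python
import math


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for a model with k parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def akaike_weights(aicc_values):
    """Relative support for each model in a candidate set (weights sum to 1)."""
    best = min(aicc_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aicc_values]
    total = sum(rel)
    return [r / total for r in rel]
```

Comparing two calibrations then amounts to checking whether the same models receive appreciable weight under both trees, rather than whether their rank order is identical.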

Short additional notes to PCA analysis
Although we comment on the position of specific species and higher clades in the morphospace of our PCA of landmark data, we only show a version of the morphospace in which specimens are colour coded according to habitat ecology. Supplementary Figure 4 additionally shows a taxonomic colour coding of specimens, in which specimens are also identified via numbers plotted onto their point symbols. Supplementary Figures 5-8 show deformations of the labyrinth landmark configurations at extreme points of PC axes 1-6. Disparity analysis and further comments on the taxonomic distribution of specimens in the morphospace will be provided elsewhere.

Labyrinth shape regressions excluding marine taxa and/or chelonioids
Among our sampled turtles, chelonioid sea turtles have unusually derived vestibular morphology, with unusually high aspect ratios (i.e., dorsoventrally tall and anteroposteriorly short labyrinths) and unusually thick semicircular canals. To test if any of our detected patterns, particularly the absence of independently significant effects of ecological variables, may be influenced by the unusual labyrinth shape of sea turtles, we performed two additional procD.pgls analyses that excluded marine species (i.e., chelonioid sea turtles, thalassochelydians, marine pleurodires; N = 109) or excluded only chelonioids (N = 123). This was implemented based on a reviewer's comment.
When only chelonioid sea turtles are excluded, the best model (i.e., including only significant terms, and simultaneously maximising R²) takes the same form as the best model reported in the main text (labyrinth shape ~ skull box volume * braincase aspect ratio + labyrinth centroid size; Supplementary Table 4). The ecological effects 'terrestrial', 'freshwater', and 'marine.all', as well as the functional neck parameters, are not included in this model, and also do not have significant relationships with labyrinth shape when analysed individually in bivariate regressions (Supplementary Table 4) and most multivariate regressions. The only exception is 'terrestriality', which becomes marginally significant (p = 0.045) when included in a model with skull box volume, braincase aspect ratio and labyrinth centroid size. However, this effect is redundant with the strong and consistently significant effect of the interaction term between skull box volume and braincase aspect ratio, both of which are themselves significant and well supported (Supplementary Table 4). 'Terrestriality' becomes non-significant when included alongside an interaction term between skull box volume and braincase aspect ratio in multiple regressions (Supplementary Table 4). Thus, our results from the full analysis, and particularly the absence of significant ecological effects or neck-related morphofunctional effects in explaining turtle labyrinth shape variation, are upheld when chelonioids are excluded.
When all marine taxa are excluded (i.e., chelonioids, bothremydids, stereogyines, thalassochelydians), the best model (i.e., including only significant terms, and simultaneously maximising R²) is still identical to that of the main analysis including all taxa (labyrinth shape ~ skull box volume * braincase aspect ratio + labyrinth centroid size). This strongly supports that our results and interpretations presented in the main text are not driven by the inclusion of marine taxa.
Nevertheless, the effect of excluding all marine taxa can be seen when comparing the results of bivariate regressions with those from the full analysis using all taxa. For example, most allometry-related variables are only significant when included alongside the braincase aspect ratio variable, except for skull height, which remains significant even in a bivariate regression (p = 0.033) (Supplementary Table 5). This differs from the analysis using the full taxonomic dataset, in which all allometry-related variables are highly significant on their own. The braincase aspect ratio is significant in bivariate models using all datasets. The interaction term between skull box volume and braincase aspect ratio is also significant when marine turtles are excluded (Supplementary Table 5), as in the full analysis. Morphofunctional neck variables are always non-significant (Supplementary Table 5), as in the full analysis. When analyzed by themselves, the ecological variables 'terrestrial' and 'freshwater' are significant when excluding marine turtles (Supplementary Table 5), whereas they were not using the full dataset. However, their effects are redundant with the allometric effects in more complex multivariate models (Supplementary Table 5). Thus, although the results from the analysis excluding all marine turtles differ more strongly from the main analysis including all taxa than those from the analysis excluding only chelonioids, they also support the hypothesis that ecology has no independently significant effect in explaining turtle labyrinth shape variation.
We performed an additional labyrinth shape regression in response to a reviewer's comment suggesting that we test whether the ASC inner loop has a specific influence on our analyses, as the loop ultimately captures variation related to the thickness of the semicircular canal, but also variation of the saccule, vestibule and ampulla. To perform this test, we excluded the ASC landmarks from the dataset and ran the procD.pgls regression (N = 138 taxa, including fossils) with the same ecological and functional parameters as the analysis presented in the main text. The results suggest the same conclusions, as the best model (including only significant terms, and simultaneously maximising R²) is identical to that of the main analysis using all landmarks (i.e., labyrinth shape ~ skull box volume * braincase aspect ratio + labyrinth centroid size). As in the main analysis, neither the morphofunctional neck variables nor the ecological variables analyzed are significantly related to labyrinth shape (Supplementary Table 6). Instead, the braincase aspect ratio variable as well as the allometric variables receive significant support in bivariate and multivariate analyses (Supplementary Table 6), similar to our main text analyses.
Figure 5. Maximum shape deformations of the landmark data in anterior view (i.e., looking from anterior onto loop of PSC) along PC1-6 (colored) against mean shape (grey).
Figure 6. Maximum shape deformations of the landmark data in dorsal view along PC1-6 (colored) against mean shape (grey).
Figure 7. Maximum shape deformations of the landmark data in lateral view along PC1-6 (colored) against mean shape (grey).
Figure 8. Maximum shape deformations of the landmark data in anterior view (i.e., looking from posterior onto loop of ASC) along PC1-6 (colored) against mean shape (grey).
Figure 9. Morphospaces of size-corrected labyrinth shape (N = 138) and allometric and braincase aspect ratio effects on labyrinth shape. (a) PC1 vs. PC2 of labyrinth shape corrected for skull size allometry and braincase aspect ratio. (b) PC1 vs. PC3 of labyrinth shape corrected for skull size allometry and braincase aspect ratio. Proportions of total shape variance explained by PC axes are given in brackets. (c) Plots of skull box volume regression scores against skull box volume, using the formula 'shape ~ skull box volume + braincase aspect ratio'. Data points are colour coded by ecology. (d) Plots of braincase aspect ratio regression scores against skull box volume. (e) Deformation plots at large (red) and small (grey) skull sizes. (f) Deformation plots at large (red) and small (grey) braincase aspect ratios. Source data are provided with this paper.

Supplementary Tables
Supplementary Table 1
Table 2. Results of selected phylogenetic Procrustes distance regressions of labyrinth shape ~ independent variables including fossils, using the alternative tree calibrated with the mbl method. N = 138 for all models. Hypothesis testing used a Procrustes ANOVA, in which statistical significance (P-values) is calculated by comparison of sum-of-squared Procrustes distances with sums of squares distributions generated from residual randomization permutation procedure (RRPP 52 ), using 1000 permutations. F-statistic is the ratio between the sum of squares of the regression and the sum of squares of the error. Effect sizes (Z-scores) were computed as standard deviations of F-distributions using residual degrees of freedom (re-df). Models presented in same sequence as in Table 1 of main text, which uses a cal3-calibrated tree. Note similarity between results using different trees.
53 and was estimated during model fitting. R² is the generalized coefficient of determination described by Nagelkerke 54 . Coefficients are estimated using pGLS restricted maximum likelihood. The t-statistics are coefficient estimates divided by their standard error. P-values are two-sided, and are calculated using the coefficient value and a t-distribution with the number of residual degrees of freedom (re-df) of the model. Note similarity between results using different trees when comparing to Table 2
Table 5. Results of selected phylogenetic Procrustes distance regressions of labyrinth shape ~ independent variables excluding all marine turtles, using the cal3 tree. N = 109 for all models. Hypothesis testing used a Procrustes ANOVA, in which statistical significance (P-values) is calculated by comparison of sum-of-squared Procrustes distances with sums of squares distributions generated from residual randomization permutation procedure (RRPP 52 ), using 1000 permutations. F-statistic is the ratio between the sum of squares of the regression and the sum of squares of the error. Effect sizes (Z-scores) were computed as standard deviations of F-distributions using residual degrees of freedom (re-df). Selected models demonstrate absence of independently significant ecological effects.
Table 6. Results of selected phylogenetic Procrustes distance regressions of labyrinth shape ~ independent variables including fossils, using the cal3 tree and a reduced landmark scheme that excludes the internal ASC loop. N = 138 for all models. Hypothesis testing used a Procrustes ANOVA, in which statistical significance (P-values) is calculated by comparison of sum-of-squared Procrustes distances with sums of squares distributions generated from residual randomization permutation procedure (RRPP 52 ), using 1000 permutations. F-statistic is the ratio between the sum of squares of the regression and the sum of squares of the error. Effect sizes (Z-scores) were computed as standard deviations of F-distributions using residual degrees of freedom (re-df). Models presented in same sequence as in Table 1