Auditory thresholds compatible with optimal speech reception likely evolved before the human-chimpanzee split

The anatomy of the auditory region of fossil hominins may shed light on the emergence of human spoken language. Humans differ from other great apes in several features of the external, middle and inner ear (e.g., short external ear canal, small tympanic membrane, large oval window). However, the functional implications of these differences remain poorly understood as comparative audiometric data from great apes are scarce and conflicting. Here, we measure the sound transfer function of the external and middle ears of humans, chimpanzees and bonobos, using laser-Doppler vibrometry and finite element analysis. This sound transfer function affects auditory thresholds, which relate to speech reception thresholds in humans. Unexpectedly we find that external and middle ears of chimpanzees and bonobos transfer sound better than human ones in the frequency range of spoken language. Our results suggest that auditory thresholds of the last common ancestor of Homo and Pan were already compatible with speech reception as observed in humans. Therefore, it seems unlikely that the morphological evolution observed in the bony auditory region of fossil hominins was driven by the emergence of spoken language. Instead, the peculiar human configuration may be a by-product of morpho-functional constraints linked to brain expansion.

cavity with exposure of the stapes footplate. Neither the ossicular chain nor its ligaments were damaged, thus retaining the mechanical and dynamical properties of the middle ear. A small square of reflective foil (0.5 mm²) was placed on the stapes footplate for laser Doppler vibrometer (LDV) measurements. A small hole was drilled into the anterior wall of the ear canal in order to place a probe microphone in front of the tympanic membrane above the umbo. To maintain physiological compliance of the middle ear, the specimens were continuously moistened during the measurements.
Experimental Setup. The complete head specimens were gently placed on the experimental table and the temporal bone specimens were mounted with articulated clamps. An audiometric insert earphone (eartone 3A) was placed into the remaining cartilaginous part of the ear canal. The applied sound pressure was measured with a probe microphone (ER-7c) in front of the tympanic membrane. The velocity of the stapes footplate was measured using a laser Doppler vibrometer (LDV; CLV 700 laser head and CLV 1000 controller unit, Polytec, Waldbronn, Germany); for the setup, see Supplementary Information SI Fig. S6. The laser head was mounted on a micromanipulator and this assembly was then connected to a standard surgical microscope. The laser beam was focused with the micromanipulator onto the reflective foil at the center of the stapes footplate. Due to the morphology of the temporal bone, it is not possible to measure the velocity of the stapes footplate in the direction of the piston-like axis of stapes motion. The angle of measurement between the laser beam and the normal direction of the stapes footplate was estimated to lie between 30° and 50° for all measurements. This corresponds to a bias of 1 to 4 dB in the METF.
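The stated bias can be checked with a short calculation. The sketch below assumes purely piston-like stapes motion, so that the LDV records only the cosine projection of the footplate velocity onto the laser beam; the function name is illustrative and not part of the original analysis.

```python
import math

def cosine_bias_db(angle_deg: float) -> float:
    """Underestimate of the piston-like velocity, in dB, when the laser beam
    is tilted by angle_deg from the footplate normal.
    Assumption: pure piston motion, so measured = true * cos(angle)."""
    return -20.0 * math.log10(math.cos(math.radians(angle_deg)))

# Range of measurement angles reported in the text
for angle in (30, 40, 50):
    print(f"{angle} deg: {cosine_bias_db(angle):.2f} dB")
```

Under this assumption the bias is about 1.2 dB at 30° and about 3.8 dB at 50°, in line with the 1 to 4 dB range given above.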
Measurements and data preparation. Measurements were done with data acquisition boards (NI PXI4496, input channels; NI PXI6281, output channel; NI PXI1033 chassis) and software based on LabVIEW (both National Instruments, Austin, TX, USA). The excitation signal for the insert earphone was a multisine signal in the frequency range of 0.1 to 10 kHz with a resolution of about 50 Hz, generating a sound pressure of approximately 94 dB SPL.
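To make the excitation concrete, the following minimal sketch builds a frequency grid with the stated range and spacing and converts 94 dB SPL to pascal. The exact 50 Hz spacing, the equal component amplitudes and the Schroeder phases are assumptions for illustration; the source does not state how the phases were chosen.

```python
import math

P_REF = 20e-6  # Pa; standard reference pressure for dB SPL in air

def spl_to_pascal_rms(spl_db):
    """Convert a level in dB SPL to RMS sound pressure in pascal."""
    return P_REF * 10 ** (spl_db / 20)

def multisine_frequencies(f_min=100.0, f_max=10_000.0, df=50.0):
    """Component frequencies from f_min to f_max (inclusive) in steps of df."""
    n = int(round((f_max - f_min) / df)) + 1
    return [f_min + k * df for k in range(n)]

def schroeder_phases(n):
    """Schroeder phases (an assumed choice) that keep the crest factor low."""
    return [-math.pi * k * (k - 1) / n for k in range(1, n + 1)]

def multisine_sample(t, freqs, phases):
    """Value at time t (seconds) of an equal-amplitude, unscaled multisine."""
    return sum(math.sin(2 * math.pi * f * t + p) for f, p in zip(freqs, phases))

freqs = multisine_frequencies()
phases = schroeder_phases(len(freqs))
print(len(freqs), "components")
print(round(spl_to_pascal_rms(94.0), 3), "Pa RMS at 94 dB SPL")
```

A level of 94 dB SPL corresponds to roughly 1 Pa RMS, which is why it is a common calibration level.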
The middle ear transfer function (METF) was obtained as the complex transfer function H(jω) in the frequency domain, calculated from two measured signals x(t) and y(t), where x(t) is the reference signal (the sound pressure in front of the tympanic membrane) and y(t) is the response signal (the stapes footplate velocity). The measured time signals x(t) and y(t) were Fourier transformed to obtain the complex functions of frequency X(jω) and Y(jω); X*(jω) and Y*(jω) are the corresponding complex conjugates. Mean auto power density and mean cross power density were calculated from n = 20 measurement frames, and the transfer function was then calculated from these mean densities. Only the magnitude |H| of the transfer function H(jω) is displayed in the diagrams and used in further data preparation.
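One common estimator that fits these definitions is the H1 estimator, H = S_xy / S_xx, where S_xx is the frame average of X*·X and S_xy is the frame average of X*·Y. Whether the original analysis used H1 rather than another estimator is an assumption here. The sketch below evaluates it at a single DFT bin in plain Python; the function names and the synthetic test signal are illustrative only.

```python
import cmath
import math

def dft_bin(frame, k):
    """Complex DFT coefficient of a real-valued frame at bin index k."""
    n = len(frame)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(frame))

def h1_estimate(x_frames, y_frames, k):
    """H1 transfer function estimate at bin k from paired frames.
    x_frames: reference signal frames (sound pressure)
    y_frames: response signal frames (stapes velocity)"""
    s_xx = 0.0  # accumulated auto power density of x
    s_xy = 0j   # accumulated cross power density between x and y
    for x, y in zip(x_frames, y_frames):
        X, Y = dft_bin(x, k), dft_bin(y, k)
        s_xx += (X.conjugate() * X).real
        s_xy += X.conjugate() * Y
    n_frames = len(x_frames)
    return (s_xy / n_frames) / (s_xx / n_frames)

# Synthetic check: the response has twice the amplitude and lags by 90 degrees
N, K = 64, 4
x = [math.sin(2 * math.pi * K * i / N) for i in range(N)]
y = [2.0 * math.sin(2 * math.pi * K * i / N - math.pi / 2) for i in range(N)]
H = h1_estimate([x] * 20, [y] * 20, K)
print(abs(H), math.degrees(cmath.phase(H)))
```

With noise-free identical frames the average has no effect; with real data, averaging over frames suppresses noise that is uncorrelated with the reference signal.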
In some cases, the frequency response had to be concatenated from consecutive measurements over different overlapping frequency ranges.
Before averaging the METFs of different specimens, the respective METFs were resampled to a common logarithmic frequency scale and converted to decibels with 1 mm s⁻¹/Pa as reference. Means were first calculated per individual in cases where both ears were measured, and species means were subsequently computed.
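The two-stage averaging can be sketched as follows. This is a minimal illustration, assuming linear interpolation in log-frequency and averaging of dB values; the interpolation scheme, the grid density and all data values are hypothetical and not taken from the study.

```python
import bisect
import math

def to_db(magnitude_mm_s_per_pa):
    """Magnitude in mm/s per Pa converted to dB re 1 mm/s/Pa."""
    return 20.0 * math.log10(magnitude_mm_s_per_pa / 1.0)

def log_grid(f_min, f_max, points_per_octave):
    """Logarithmically spaced frequencies from f_min up to at most f_max."""
    n = int(math.floor(math.log2(f_max / f_min) * points_per_octave)) + 1
    return [f_min * 2 ** (i / points_per_octave) for i in range(n)]

def resample_log(freqs, values_db, grid):
    """Interpolate values_db (given at ascending freqs) onto grid,
    linearly in log-frequency. The grid should lie within the data range."""
    out = []
    for f in grid:
        j = min(max(bisect.bisect_right(freqs, f), 1), len(freqs) - 1)
        f0, f1 = freqs[j - 1], freqs[j]
        w = math.log(f / f0) / math.log(f1 / f0)
        out.append(values_db[j - 1] + w * (values_db[j] - values_db[j - 1]))
    return out

def mean_curves(curves):
    """Pointwise mean of equally sampled curves."""
    return [sum(col) / len(col) for col in zip(*curves)]

def species_mean(ears_by_individual):
    """Average ears within each individual first, then across individuals."""
    per_individual = [mean_curves(ears) for ears in ears_by_individual.values()]
    return mean_curves(per_individual)

# Hypothetical data: individual A has two ears, individual B has one
example = {"A": [[0.0, 2.0], [2.0, 4.0]], "B": [[4.0, 6.0]]}
print(species_mean(example))
```

Averaging per individual first prevents individuals with two measured ears from counting twice in the species mean.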
Since the volume of the tympanic cavity and surrounding spaces has an effect on the METF, particularly in the low frequencies (3, 4), opening the middle ear cavity during temporal bone preparation is expected to have affected the recorded METFs. However, since chimpanzees and humans have largely similar middle ear volumes (5), interspecific comparisons should not be hindered by these procedures.

Modelling pressure gain in the external acoustic meatus (EEC)
Model simulations. We used a finite element model of the human middle ear and external acoustic canal (6,7) for the simulations (SI Fig. S7). The middle ear part of the model served as a realistic terminating impedance for the calculations of pressure gain in the EEC. The EEC of the model was subsequently adapted to chimpanzee and bonobo anatomy to obtain simulation data for all three species.
Model structure and parameters. The model is implemented in ANSYS® (Ansys Inc., Canonsburg, Pennsylvania, USA) and consists of the EEC, the tympanic membrane, the ossicles (malleus, incus and stapes), the incudomallear and incudostapedial joints, ligaments/tendons and a simplified model of the cochlea. The EEC is modelled as an acoustic fluid that can be described by the Helmholtz equation. The geometry of the EEC is a 3D volume adapted from published data (8) and corresponds to an average human ear canal. A surface impedance was added to the outer EEC surface to account for the damping of the canal wall. Parameters of the EEC model were adapted such that its pressure gain transfer function matches the average experimental data from the literature (8)(9)(10).
The tympanic membrane (TM) is modelled as a three-layer shell with orthotropic material behavior. This represents its histological structure consisting of the epidermal layer, the lamina propria and the mucosal epithelial layer (11). A strong coupling is established between the structural elements of the TM and the acoustical elements of the EEC.
The ossicles, the joint capsules and the stapedial annular ligament (SAL) are modelled as isotropic elastic bodies. The connection between the tympanic membrane and the malleus handle is modelled by rigid kinematic constraints, as is the connection between the ossicles and the joints.
The ligaments and tendons (except the SAL) are represented by cylindrical beam elements with isotropic elastic material behavior. Additionally, structural damping was applied to all soft tissue components, i.e. the tympanic membrane, joints, ligaments and tendons. For the cochlea, a simplified mass-spring-damper model was used based on (10). As boundary conditions, the circumference of the tympanic membrane is simply supported, i.e. the translational degrees of freedom (DOF) are fixed, and the ends of the ligaments and tendons are clamped.
The stapedial annular ligament is simply supported on the outer circumferential surface corresponding to the oval window.
The model parameters (mechanical properties, length and diameter of the ligaments and joints) are listed in the Supplementary Data Table S9.
Model adaptation to chimpanzee and bonobo simulations. To estimate the pressure gain of the chimpanzee and the bonobo, the length of the EEC and its diameter were scaled according to the data in Table S7. The middle ear morphology was not altered. The damping on the canal wall (ear canal impedance) was adapted for bonobos and chimpanzees to match the magnitude of the pressure gain in the human ear canal model. This assumption was drawn from (12), in which the pressure gain of a chimpanzee ear canal was shown to have a magnitude comparable to that of human subjects.
Pressure gain was calculated between 0.2 and 7 kHz (humans) or 5 kHz (panins). A pressure magnitude of 1 Pa was applied at the entrance of the external auditory canal and the pressure in front of the middle of the tympanic membrane was calculated. The ratio between the two pressures represents the EEC pressure gain.
As the model has been validated only up to the first resonance of the EEC, the calculations were terminated before the second EEC resonance.
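For orientation, the pressure gain as a ratio in dB and the approximate location of the canal resonances can be sketched with a textbook idealization: a uniform, rigid-walled tube that is open at the entrance and closed at the tympanic membrane. This is not the finite element model described above, and the canal lengths used here are hypothetical placeholders rather than the values in Table S7.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 °C (assumed)

def pressure_gain_db(p_tympanic_pa, p_entrance_pa=1.0):
    """EEC pressure gain in dB: pressure at the tympanic membrane
    relative to the pressure applied at the canal entrance."""
    return 20.0 * math.log10(abs(p_tympanic_pa) / abs(p_entrance_pa))

def tube_resonances_hz(length_m, n_modes=2):
    """Resonances of an idealized open-closed tube: odd multiples of c/(4L)."""
    return [(2 * m - 1) * SPEED_OF_SOUND / (4.0 * length_m)
            for m in range(1, n_modes + 1)]

# Hypothetical canal lengths in metres, for illustration only
for label, length in (("shorter canal", 0.025), ("longer canal", 0.035)):
    f1, f2 = tube_resonances_hz(length)
    print(f"{label}: first resonance {f1:.0f} Hz, second resonance {f2:.0f} Hz")
```

In this idealization a longer canal shifts both resonances downward, which is consistent with ending the panin calculations at a lower frequency than the human ones.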

Supplementary Text 2 Characterization/description of the human and panin middle ear transfer function (METF)
Average magnitudes of the METF of humans, chimpanzees and bonobos were plotted against sound frequency from 0.2 to 10 kHz (SI Figs. S1 and S2, Tables S1 and S2). To test for statistically significant differences between means of peak magnitude (first maximum), frequency of the peak magnitude and magnitude of the METF along its entire progression, Student's t-tests were used (after testing for equal variances), utilizing the Independent Two-Sample T-Test Calculator provided at https://www.icalcu.com/. In the case of unequal variances, Welch's test was chosen. Data points for means of peak magnitude and frequency of the peak magnitude of the METF were collected by choosing the peak magnitude of every single individual (average of left and right ear, if present). To test the statistical significance of differences between means of the METF along its entire progression, 15 frequencies describing the entire progression (comprising 5.5 octaves, based on frequencies often depicted in audiograms) were compared. Here, too, individuals (average of left and right ear, if present) were analyzed.
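The difference between the two tests lies in the standard error and the degrees of freedom. The sketch below computes the t statistic and degrees of freedom for both variants in plain Python; it is not the online calculator used in the study, the sample values are hypothetical, and the p-value (which requires the t distribution) is left out.

```python
import math
from statistics import mean, variance

def two_sample_t(a, b, equal_var=True):
    """Return (t, df) for an independent two-sample t-test.
    equal_var=True: Student's test with pooled variance.
    equal_var=False: Welch's test with Welch-Satterthwaite df."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    diff = mean(a) - mean(b)
    if equal_var:
        df = na + nb - 2
        pooled = ((na - 1) * va + (nb - 1) * vb) / df
        se = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    else:
        se = math.sqrt(va / na + vb / nb)
        df = (va / na + vb / nb) ** 2 / (
            (va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1)
        )
    return diff / se, df

# Hypothetical peak magnitudes (dB) for two groups of individuals
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0]
print(two_sample_t(group_a, group_b, equal_var=True))
print(two_sample_t(group_a, group_b, equal_var=False))
```

When both groups have the same size and variance, as in this example, the two variants give the same result; they diverge as the variances or sample sizes become unequal.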
Up to approximately 4 kHz the overall pattern is similar between the three species: the METF increases up to a peak at approximately 1 kHz, with a slope between 0.2 and 0.7 kHz of 7 dB/octave in humans and 6 dB/octave in chimpanzees and bonobos, and subsequently decreases until 4 kHz with a slope of -9 dB/octave in bonobos and -8 dB/octave in humans and chimpanzees. From 4 kHz onwards, the METF of humans continues its decrease, whereas a second peak in the METF appears at 7.2 kHz (max. -24 dB) in chimpanzees and at 5.4 kHz (max. -24 dB) in bonobos. The small peak around 3 kHz arises from the averaging process applied to very different METFs. The shift in pattern of the METF after 4 kHz results in distinct differences in magnitude between panins and humans (up to 15 dB) in the high frequencies of the measuring range.
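A slope in dB/octave is the level difference between two points divided by the number of octaves separating them. The minimal sketch below shows the calculation; the levels in the example are hypothetical and not measured values.

```python
import math

def octaves_between(f1_hz, f2_hz):
    """Number of octaves from f1_hz up to f2_hz."""
    return math.log2(f2_hz / f1_hz)

def slope_db_per_octave(f1_hz, level1_db, f2_hz, level2_db):
    """Average slope of a magnitude curve between two points, in dB/octave."""
    return (level2_db - level1_db) / octaves_between(f1_hz, f2_hz)

# Hypothetical example: a 14 dB rise over two octaves
print(slope_db_per_octave(200.0, -40.0, 800.0, -26.0))
# Span of the rising flank described in the text (0.2 to 0.7 kHz), in octaves
print(round(octaves_between(200.0, 700.0), 2))
```

The interval from 0.2 to 0.7 kHz spans about 1.8 octaves, so a slope of 7 dB/octave corresponds to a rise of roughly 12 to 13 dB over that range.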
In addition to variations in peak magnitude of the METF, significant differences also exist between humans, chimpanzees and bonobos across the entire measured frequency range, with panins showing significantly higher magnitudes than humans in the low frequencies, between 0.2 and 1 kHz, and in the upper mid to high frequencies, between 5 and 9.5 kHz, while magnitudes between 1 and 5 kHz are not statistically different.
In the low frequencies, differences in magnitude increase with decreasing frequency, reaching 10 dB at 0.25 kHz. In the high frequencies, differences remain high over the frequency range, with magnitudes about 13 dB higher in panins than in humans, reflecting the characteristic increase seen in panins above 4 kHz.
Supplementary Fig. S1. Trees reflecting the best models for the evolution of the average auditory threshold between 1-8 kHz in primates. (A-B) Branch length reflects divergence time, multiplied by evolutionary rate along the branch. (A) Best evolutionary model when the Kojima chimpanzee audiogram is considered. (B) Best evolutionary model when the Elder chimpanzee audiogram is considered.

Supplementary Fig. S5. Reduced major axis regression of species' average body mass against the area enclosed by the tympanic sulcus (TSA) of extant hominoid species. Raw data of measurements can be found in the Supplementary Data Table S7; R² = 0.93, p < 0.001. (Explanation: to determine the phylogenetic polarity of TSA, we measured additional hominoid species and looked at the data relative to body mass. Despite being closest extant relatives, humans and panins differ distinctively in relative tympanic membrane area. Indeed, along with lar gibbons (Hylobates lar), chimpanzees and bonobos show the largest relative tympanic membrane areas among all extant hominoids, whereas together with siamangs (Symphalangus syndactylus), humans show the smallest. This result points to a strong and opposing selective pressure on tympanic membrane size within the Pan/Homo clade.)

Supplementary Fig. S6. Reduced major axis regression of species' average body mass against the estimated area enclosed by the circumference of the stapes footplate (eSFA) of extant hominoid species. Raw data of measurements can be found in the Supplementary Data Table S7; R² = 0.95, p < 0.001. (Explanation: to determine the phylogenetic polarity of eSFA, we measured additional hominoid species and looked at the data relative to body mass. In contrast to TSA, humans share a relatively large stapes footplate area with chimpanzees, but not bonobos, a character state that seems plesiomorphic for the clade.)

Supplementary Fig. S8. Simulation model for estimating the pressure gain of the external acoustic canal (EEC). IMJ: incudomallear joint, ISJ: incudostapedial joint, SAL: stapedial annular ligament.

Supplementary Table S7. Summary of morphological parameters of the middle and outer ear.
Values reported correspond to the average and the range observed in the studied specimens. CL: Cochlea length, CMV: Volume of the endolymphatic and perilymphatic spaces of the cochlea, TSA: Area enclosed by the tympanic sulcus, eSFA: Estimated stapes footplate area, FL: Functional length, AAF: Surface area of the articular facet, ITR: Impedance transformer ratio, bEECL: Bony external ear canal length, cEECL: Inferred cartilaginous external ear canal length, EECL: Inferred external ear canal length, EECCS: External ear canal cross-section at intermediate position. External ear canal lengths of gorillas and orangutans come from Masalli 1992. * Measured on Pongo abelii.

Table S8. Summary of specimen IDs, image spatial resolutions of microCT scans and the morphological middle and outer ear parameters of humans, chimpanzees, bonobos and additional hominoid species used in this study.
CMV: Volume of the endolymphatic and perilymphatic spaces of the cochlea, TSA: Area enclosed by the tympanic sulcus, eSFA: Estimated stapes footplate area, FL: Functional length, AAF: Surface area of the articular facet, ITR: Impedance transformer ratio, bEECL: Bony external ear canal length, cEECL: Inferred cartilaginous external ear canal length, EECL: Inferred external ear canal length, EECCS: Average external ear canal cross-section. AMNH, American Museum of Natural History; CEB, Comparative Ear Bank collection housed at Max Planck Institute for Evolutionary Anthropology, Leipzig; Greding, collection of medieval graves from Greding, Germany housed at University Hildesheim; MAM, Mammal Collection Phyletisches Museum, Friedrich-Schiller-Universität Jena; MPI EVA, Max Planck Institute for Evolutionary Anthropology; TAI, Taï chimpanzee collection housed at Max Planck Institute for Evolutionary Anthropology; ULAC, University of Leipzig Anatomy collection; UVAC, University of Vienna anatomical collection; WOC, Werner Ossicle Collection; WZS, Wilhelma Zoological Garden Stuttgart; ZMB, Zoological collection of the Museum für Naturkunde Berlin; * value has changed from previously published. If two values are present in cells giving voxel sizes, then the ossicles and the temporal bone/skull were scanned separately: the first value is the resolution for the ossicles, the second value the resolution for the temporal bone/skull.

Table S9. Species, method for obtaining data and original source of primate audiograms used in this study
Beecher, M.D. (1974a) Hearing in the owl monkey (Aotus trivirgatus). Journal of Comparative and Physiological Psychology, 86, 898-901.
Beecher, M.D. (1974b) Pure tone thresholds of the squirrel monkey (Saimiri sciureus). Journal of the Acoustical Society of America