Introduction

Evolutionary change relies on the existence of genetic variance in phenotypic traits (Fisher 1930; Lande and Arnold 1983; Lande and Shannon 1996). According to the general theorem of selection, evolutionary change in a phenotypic trait is equal to the genetic covariance between the trait and fitness (Price 1970; Lynch and Walsh 1998; Teplitsky et al. 2014; Walsh and Lynch 2018). Most of the available evidence for the role of genetic variance in trait evolution comes from laboratory populations and planned breeding studies (including agricultural and artificial selection for specific, desirable properties of organisms) (Lynch and Walsh 1998; Drobniak and Cichoń 2016), which may bias genetic parameters by exposing organisms to conditions unlike those experienced in nature. Far fewer estimates come from natural populations, despite the growing number of heritability estimates from wild populations (Postma 2014).

Genetic variation arises through mutations and gene interactions (Lynch and Walsh 1998; Lai et al. 2019), and large portions of it remain cryptic (Masel 2006). However, under certain conditions specific components of this variance may increase or decrease. Levels of genetic variance observed in nature may vary with respect to a number of factors (Merilä and Fry 1998; Charmantier and Garant 2005; Oltman et al. 2005; Brommer et al. 2008; Pitala et al. 2009; Galloway et al. 2009; Gunay et al. 2011; Schroeder et al. 2012; Rowiński and Rogell 2017). For instance, some studies demonstrated that stressful environments may decrease the observed levels of genetic variance, sometimes to the point where no genetic variance can be estimated (Hoffmann and Merilä 1999; Teplitsky et al. 2014). The sexes can also differ in heritabilities of sex-specific variants of certain traits (Jensen et al. 2003; Foerster et al. 2007; Poissant et al. 2010; Wyman and Rowe 2014). Other studies suggested that the local environment generated by parents could also influence observed heritabilities. Such an “environment” may not always reflect typical notions of external habitats and can, for example, encompass characteristics of parents (e.g. parental age, a characteristic demonstrated to affect genetic variance in the offspring (Kim et al. 2011; Drobniak et al. 2015)). As variation in quantitative genetic parameters is one of the factors proposed to contribute to maintaining genetic variance in the wild (Hoffmann and Merilä 1999; Gienapp and Brommer 2014; Teplitsky et al. 2014), studying it is one of the most important avenues in evolutionary research.

Mechanisms underlying the abovementioned modifications of genetic variances are largely unknown. Irrespective of the underlying factor influencing genetic variance, several hypotheses have been proposed to explain observed patterns of heritability variation. It is possible that external/local/internal environments experienced by individuals modulate expression levels of genes at the molecular level, resulting in the observed patterns in quantitative genetic parameters (Jensen et al. 2003; Fox and Wolf 2006). Mutation accumulation, specific to certain environments or individual characteristics, could also be responsible for such patterns (this explanation has so far been considered mainly in the context of senescence and the increase of genetic variance with age) (Wilson et al. 2007; Charmantier et al. 2014). Finally, rich work on molecular mechanisms able to release existing cryptic genetic variation suggests that mechanisms such as heat-shock proteins (HSPs), prions or alternative splicing events may change conformational states of the proteins involved, and effectively add variation on top of that present due to the existence of different allelic variants of genes (Queitsch et al. 2002; Bergman and Siegal 2003; Gibson and Dworkin 2004; Masel 2006; Berger 2011). To date, all studies approaching the condition-dependent expression of genetic variance either focused on quantitative genetic estimates of heritabilities under varying rearing conditions (Hoffmann and Merilä 1999; Charmantier and Garant 2005) or explored evolutionary capacitance by creating genetic variants with alterations in specific molecular systems (e.g. prions, True and Lindquist 2000; HSPs, Queitsch et al. 2002). However, such approaches are largely correlative (e.g. assaying individual traits under existing conditions) or use tools (genetic knockouts or induced mutants) that are unlikely to reflect responsiveness of physiological systems to natural environments. The literature lacks proper experimental tests of the impact of well-known environment-sensing physiological mechanisms (e.g. hormonal milieu, oxidative status, neuronal signalling) on the expression of quantitative genetic variance, and hence on traits’ evolutionary potential.

In this study, we aimed to fill this gap by directly following one of the possible mechanisms. One of the ways environments can impact the development and fitness of individuals is through parental effects, i.e. consistent effects of parents on their offspring phenotype which, in certain cases, can correlate with environmental variability (Groothuis and Schwabl 2008). Maternal effects have received more attention, perhaps because maternal phenotypes can influence (or are perceived to influence) offspring traits in more ways (e.g. through maternal transfer of resources and biologically active compounds to the offspring at the stage of eggs or developing embryos (Groothuis and Schwabl 2008; Wolf and Wade 2009; Coslovsky et al. 2012)). Condition-dependent expression of genetic variance, also known as G × E (genotype-by-environment interaction, which essentially is a form of plasticity in genetic variances), has repeatedly been shown across different types of environmental gradients (reviewed in Charmantier and Garant (2005) and Rowiński and Rogell (2017)), e.g. between favourable and resource-limited rearing conditions (Gebhardt‐Henrich and Noordwijk 1991; Merilä 1997; Merilä and Fry 1998; Garant et al. 2005). At the same time, environmental heterogeneity that can drive such resource-related changes in the rearing environment has been demonstrated to modulate the amounts of hormones transmitted to eggs (Groothuis et al. 2005; Hegyi et al. 2011; Remeš 2011; Coslovsky et al. 2012). It is therefore likely that hormonally mediated maternal effects are one of the factors mediating the observed patterns of G × E in wild populations. Steroid hormones are particularly interesting in this context: they have profound effects on offspring development (Hayward and Wingfield 2004; Tschirren et al. 2005, 2009; Tobler and Sandell 2007; Coslovsky et al. 2012; Ruuskanen et al. 2012; Schweitzer et al. 2013; Lutyk et al. 2017) and they can directly impact the expression of genes by acting as transcription modulators in the nuclei of cells after binding to their specific receptors (Kawata 1995; Baker 1997; Podmokła et al. 2018). As such, steroid hormones are well documented as mediators of maternal effects, and one reason for this is the ease of manipulating their levels in eggs of wild birds (Groothuis and Schwabl 2008).

Here we employed a direct manipulation of levels of two steroid hormones (testosterone, TESTO henceforth, and corticosterone, CORT henceforth) in eggs of the blue tit (Cyanistes caeruleus), a model species in evolutionary ecology. Our experiment involved cross-fostering (Kruuk and Hadfield 2007) at the egg stage and a fully crossed, factorial design ensuring that genetic and environmental sources of trait variation in these birds would be fully crossed with the hormone level manipulation. The choice of hormones was deliberate: testosterone and corticosterone are well studied in both laboratory and wild populations (Podmokła et al. 2018) and were repeatedly shown to serve as mediators of environmental cues (Groothuis et al. 2005; Groothuis and Schwabl 2008; Lessells et al. 2016). They differ in physiological impact and often mediate different kinds of information (corticosterone is a well-established mediator of stress responses, whereas testosterone is involved in the development of primary sexual characters and the regulation of reproductive investment (Groothuis and Schwabl 2008)), although it should be noted that corticosterone is far less studied in evolutionary ecology contexts than testosterone. We predicted that the levels of genetic variation in certain phenotypic traits would be affected by this manipulation: assuming that by supplementing hormones we would simulate a stressful (CORT) or male-like sex-specific (TESTO) reaction, we expected a decrease in genetic variance in hormone-treated birds, compared to control birds receiving a sham manipulation (Merilä and Fry 1998; Jensen et al. 2003). The traits we chose comprised a set of body size descriptors (weights at different ages and tarsus length) and a frequently measured proxy of the immunological state of individuals, the phytohaemagglutinin hyperreactivity response. Earlier studies suggested the existence of G × E in all these traits in relation to a number of environmental factors (Merilä and Fry 1998; Ruuskanen et al. 2012; Drobniak et al. 2015).

Methods

General field methods

The experiment was performed in a wild population of blue tits, studied since 2002 on the Baltic island of Gotland, Sweden (57°01’ N; 18°16’ E) in three breeding seasons (2014–2016). In this population blue tits breed in wooden nest-boxes distributed uniformly across 23 study plots of varying size; density of breeding pairs is uniform across plots of different size (unpublished data). Most plots are covered by oak (Quercus robur), ash (Fraxinus excelsior) and poplar (Populus sp.) forests, with dense common hazel undergrowth (Corylus avellana). Some plots lack the undergrowth and are covered by bright, loose oak forests with wet, rich meadows abundant in orchids. In the studied population, tits lay almost exclusively one clutch per year. Females lay on average 11 eggs (range: 5–17) and incubate them for 13 days; chicks fledge at the age of 17–20 days.

All breeding attempts were regularly inspected by visiting all available nest-boxes every 4–5 days, recording nest construction/egg laying stage, and determining species occupying each nest-box (apart from blue tits, the population is also home to great tits (Parus major) and collared flycatchers (Ficedula albicollis)). Selected nests were assigned to experimental triplets (see Fig. 1 and the next section for a more detailed description of the experiment). Figure 1 provides a summary of procedures and measurements performed in each nest. Parents in each nest were caught on the 14th day post-hatching with mist-nets, ringed with aluminium bands (if not having one already) and measured for tarsus length, wing & tail length and body weight. Age of adults was determined based on the presence of moult limits in the tail and between primary and secondary wing coverts (Demongin 2016). Sex was determined by the presence of a brood patch in females.

Fig. 1: Schematic summary of the experiment and different measurements performed during its course.
The top time axis is not to scale. Represented egg/chick numbers may differ between nests; also, hatching failure or nestling mortality may lead to some individuals dropping out

Experimental nestlings were also injected (in two of the three seasons) with a small dose of phytohaemagglutinin (PHA, Sigma-Aldrich, Germany) to determine their cell-mediated immune response (Sarv and Horak 2009). Briefly, 0.2 mg of PHA suspended in 400 μl of buffered saline was injected into the right wing-web of each nestling on the 11th day post-hatching. The thickness of the web prior to the injection and 24 h afterwards was determined with three measurements each, using a pressure-sensitive spessimeter (Mitutoyo, Japan, model 7313), to the nearest 0.01 mm. The difference between the means of the three “after” and the three “before” measurements quantifies the amount of swelling resulting from the PHA hypersensitivity reaction and is treated as a proxy of cell-mediated immune response. The three initial and three post-injection measurements were highly repeatable (technical repeatability r² > 0.96 (p < 0.01) in all cases). Wing web thickness is positively correlated with body size; to account for this, all assayed nestlings were also weighed on the 12th day. The PHA assay was performed only in years 2015 and 2016. Nonetheless, our analyses remain robust and valid: PHA treatment (or lack thereof) was always applied to all three hormonal groups (i.e. it could not generate observed differences between hormone-treated groups) and all nests not treated with PHA are grouped in one year (i.e. possible effects of not receiving PHA injections on other measured variables, however small, are linked to the year effect and fully explained by the year factor).
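The swelling calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name and the thickness readings are hypothetical, and only the rule itself (mean of the three post-injection readings minus mean of the three pre-injection readings) is taken from the text.

```python
from statistics import mean

def pha_swelling(before_mm, after_mm):
    """Swelling (mm): mean post-injection thickness minus mean pre-injection thickness."""
    if len(before_mm) != 3 or len(after_mm) != 3:
        raise ValueError("expected three readings before and three after injection")
    return mean(after_mm) - mean(before_mm)

# Hypothetical wing-web thickness readings (mm), recorded to the nearest 0.01 mm
before = [0.52, 0.54, 0.53]
after = [1.10, 1.12, 1.11]
swelling = pha_swelling(before, after)
print(f"PHA swelling: {swelling:.2f} mm")
```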

Blood samples retrieved from nestlings were used to determine the sex of each chick, using the well-established protocol described by Griffiths et al. (1998). Briefly, after isolating bird DNA, a PCR was used to amplify a fragment of the chromo-helicase-DNA-binding (CHD) gene, which is located on the sex chromosomes and exhibits a sex-specific length dimorphism; this was scored after separating the PCR products on an agarose gel. In some nestlings (Table S3) the sex could not be assigned for technical reasons (not enough genetic material for reliable PCR, failure of the PCR reaction or an ambiguous result with the gel bands markedly differing in intensity).

Field procedures conformed with the legal requirements of Sweden (permit from Jordbruksverket to LG; Swedish ringing licence RC712 to SMD).

Steroid-injection experiment

Nests inspected during egg-laying were grouped into triplets based on their equal laying dates. In each triplet, at the stage of 9 laid eggs, a hormone injection manipulation was performed. The eggs were taken out of their nests and safely transported to the field laboratory. Prior to collection, the eggs were candled using a battery-powered torch to make sure no signs of early incubation were visible. For the time of manipulation, females were left with an equal number of dummy plastic eggs. After transport to the lab, the eggs were weighed, photographed, and labelled. Each egg was assigned at random to one of three experimental groups: testosterone group (TESTO), corticosterone group (CORT), and control group (C). Each egg was individually marked with a non-toxic marker and then injected with 3 μl of experimental solution. In group C this was pure sesame oil. In group TESTO each dose contained 1.7 ng of testosterone (17β‐hydroxy‐4‐androsten‐3‐one; Sigma-Aldrich, Germany); in the CORT group each dose contained 0.6 ng of corticosterone (11β,21-dihydroxyprogesterone; Sigma-Aldrich, Germany). Testosterone was dissolved directly in the sesame oil, whereas corticosterone (due to its poorer solubility in oil) was first dissolved in absolute ethyl alcohol (Gam et al. 2011), and then 10 μl of this stock was dissolved in the sesame oil. To make the groups fully comparable, oil in the C and TESTO groups was also spiked with 10 μl of 100% ethanol. The residual ethyl alcohol evaporates quickly from the resulting solutions and would in any case be present at a concentration of 0.5% or less.

The doses of hormones were determined following a hormone assay on randomly chosen unmanipulated eggs from the studied population, sampled in preceding seasons (TESTO concentration ± SD: 2.13 ± 0.81 ng/yolk; CORT concentration: 0.61 ± 0.26 ng/yolk; N = 10). It should be noted that these values represent a single snapshot of hormonal concentrations—which may vary across seasons and individuals but should nevertheless provide a useful baseline. These values are close to published estimates from the blue tit and the closely related great tit (Tschirren et al. 2004; Vedder et al. 2007; Kingma et al. 2009; Lessells et al. 2016). Final doses were calculated as 2 SDs rounded up to the nearest 0.1 ng, therefore ensuring that the distribution of hormone concentrations in manipulated eggs would be shifted by 2 SD of their natural values (likely the shift would be slightly smaller due to downward bias of variance estimation based on small sample of wild eggs, but it still would be substantial).
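As a quick arithmetic check, the dose rule stated above (2 SDs, rounded up to the nearest 0.1 ng) can be reproduced from the reported SDs. The helper below is a hypothetical sketch, not the authors' script; it recovers the 1.7 ng (TESTO) and 0.6 ng (CORT) doses used in the experiment.

```python
import math

def dose_from_sd(sd_ng, n_sd=2, step=0.1):
    """Return n_sd * SD rounded up to the next multiple of `step` (in ng)."""
    raw = n_sd * sd_ng
    # round(..., 9) removes floating-point noise before taking the ceiling
    n_steps = math.ceil(round(raw / step, 9))
    # one decimal place is enough for the default step of 0.1 ng
    return round(n_steps * step, 1)

testo_dose = dose_from_sd(0.81)  # 2 x 0.81 = 1.62, rounded up to 1.7 ng
cort_dose = dose_from_sd(0.26)   # 2 x 0.26 = 0.52, rounded up to 0.6 ng
print(testo_dose, cort_dose)
```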

Injections were performed using a 25 μl 702RN Hamilton (Hamilton, Nevada, USA) micro-syringe with a type-4 26 s removable needle. Each experimental group had its own syringe; we also used several replacement needles kept in 96% ethanol to keep them clean and sterile. Prior to each injection the egg was gently swabbed with a small amount of ethanol to disinfect a portion of its shell. Then, a disposable sterile needle was used to make a small hole in the shell, and the Hamilton syringe was inserted through it under visual control, with a flashlight illuminating the egg from beneath. To make sure that the content of the syringe was injected into the yolk, we performed several trial injections on eggs from deserted nests, using a food dye as the injected liquid. After freezing, these eggs were cut open to verify that the injection procedure delivered the liquid into the yolk and only there, which was the case in 100% of cases. Unfortunately, this verification could not indicate in how many cases the injection compromised the integrity of the yolk membrane (freezing preserves the yolk shape, and even if the yolk were severely damaged, its content would not leak out due to albumen pressure inside the egg). After injection the hole in the eggshell was closed with a drop of Vetbond (3M, Minnesota, USA), a tissue adhesive used in surgical procedures. All egg manipulations were performed on a clean table frequently wiped with ethyl alcohol to minimise the risk of egg contamination.

After injection, eggs were cross-fostered by randomly assigning each egg to one of the triplet nests. Egg randomisation was ensured by blinded sampling from the transportation container, just before marking each egg with a unique code. Afterwards the codes assigned to each rearing nest were matched, ensuring that one random egg from each origin nest by treatment combination ended up in a given rearing nest. Hence, this cross-fostering protocol ensured that all combinations of the experimental treatment and nest-of-origin were present in each nest-of-rearing. After cross-fostering the eggs were returned to their assigned nests and left there for incubation. On the following days any additional eggs were treated similarly (transported to the laboratory, injected with a randomly chosen sham/CORT/TESTO solution, and returned to a nest); this protocol was stopped once the onset of incubation was noted on a given day (i.e. eggs were warm and not covered with nest material). One to two days before the expected hatching date (11–12 days after the start of incubation) all experimental nests were visited again. After verifying the development stage of eggs (by egg candling), all injected eggs were again gently collected and transported in a warmed box to the lab (leaving females with dummy eggs to ensure they would not desert their nests). There, they were placed in individual paper containers and put in an incubator set to 38 °C and 70% relative humidity for hatching. From that moment the eggs were checked every hour. All chicks hatched between 0500 and 2000 hours were weighed, marked individually by nail clipping and taken back to their foster nests within 1 h of hatching. Chicks hatched after 2000 hours were left in the incubator until 0500 and brought to their nests the following day. Hatching the nestlings in the incubator allowed us to assign the experimental group to each chick upon hatching.
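The allocation logic of the cross-fostering step can be illustrated with a minimal sketch. It assumes a balanced case that the text implies but does not spell out: three nests per triplet, nine eggs per nest and exactly three eggs per treatment within each origin nest. The nest labels and function name are hypothetical.

```python
import random

TREATMENTS = ("C", "CORT", "TESTO")

def cross_foster(triplet, eggs_per_nest=9, seed=None):
    """Return (origin, egg, treatment, rearing) tuples for one triplet of nests."""
    rng = random.Random(seed)
    per_treatment = eggs_per_nest // len(TREATMENTS)  # 3 eggs per treatment (assumed)
    allocation = []
    for origin in triplet:
        eggs = list(range(eggs_per_nest))
        rng.shuffle(eggs)  # random assignment of eggs to treatments
        for t_idx, treatment in enumerate(TREATMENTS):
            group = eggs[t_idx * per_treatment:(t_idx + 1) * per_treatment]
            rearing = list(triplet)
            rng.shuffle(rearing)  # random rearing nest for each egg in the group
            # one egg of this origin x treatment combination goes to each rearing nest
            for egg, nest in zip(group, rearing):
                allocation.append((origin, egg, treatment, nest))
    return allocation

allocation = cross_foster(("A", "B", "C"), seed=1)
for nest in ("A", "B", "C"):
    reared = sorted((o, t) for o, _, t, r in allocation if r == nest)
    print(nest, reared)
```

Under these assumptions every rearing nest receives nine eggs, one for each origin-by-treatment combination, which is the property the protocol relies on.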

In 14 cases codes assigned to eggs were not fully legible on egg collection or were modified due to errors in marking the eggs. In all such cases the identity of nestlings (i.e. their assignment to the experimental group) was successfully recovered before recording the relevant data in the database. However, to remain conservative, we checked if omitting these eggs would have any impact on the main effects seen in our study. This sensitivity analysis indicated no such bias.

Statistical analyses

Analyses included between 621 (14-day-old chicks) and 689 (2-day-old chicks) individuals (but substantially fewer in the case of the PHA response). Individuals came from 156 genetic nests and were reared in 143 nests (the latter number being lower due to brood desertions).

To determine patterns of genetic variances in the offspring traits, we applied linear mixed models, fitting them to measured phenotypic traits: body weight measurements at days 2, 8 and 14, tarsus length (on the 14th day post-hatching) and PHA hypersensitivity response. In all models, response variables were assumed to be normally distributed (assumption checked by visually inspecting model residuals plotted against fitted values). In all models we also visually verified the distribution of estimated BLUPs to make sure they are approximately normal.

Each model contained fixed categorical effects of sex (males, females, and unknown sex; females as intercept reference group), study year (2014–2016; 2014 as intercept reference group), experimental treatment (control, CORT, TESTO; control as intercept reference group) and sex by treatment interaction. When interpreting fixed-effect results, non-significant sex-treatment interactions were removed. In addition, the PHA model also included body mass on day 12 as PHA-response is known to be partially correlated with body mass. Random effects included: nest-of-origin (genetic family, G or Gh – see Table 1 for details of abbreviations) effect, nest-of-rearing (foster nest, N or Nh) effect and residual error effect (R or Rh). Preliminary analyses included a random term of the nest triad. It was subsequently removed from final models to simplify analysis as it consistently explained non-significant amounts of variance and its omission had no impact on other estimates.

Table 1 Structure of random-effects models in successively simpler GLMMs fitted to the data

To estimate the effect sizes (i.e. the differences in genetic variances between treatment groups) and their sampling variances we used parametric bootstrapping. Bootstrapping was applied to the simplest model with separate treatment-specific genetic variances. We generated 1000 random samples from each model, expressed as y = Xb + Z_G u_G + Z_N u_N + e, where X is the appropriate design matrix of fixed effects; b is the vector of fixed-effect estimates; Z_G is the design matrix of the nest-of-origin effect; u_G ~ N(0, G) is the vector of genetic effects, with G = I V_G and V_G being the estimated 3 × 3 covariance matrix for treatment groups; Z_N and u_N ~ N(0, N), with N = I V_N, are the design matrix and vector of nest-of-rearing effects; and e ~ N(0, R), with R = I V_R, is the vector of random error deviations. V_R and V_N were either scalar variances or 3 × 3 covariance matrices, depending on the form of the best model for a given response. Each sample was then re-analysed with an appropriate model to generate 1000 estimates for treatment-specific genetic variances, from which the distributions of differences in genetic variance between treatment groups were extracted. Summary statistics were calculated as means of bootstrap samples, their 95% confidence intervals were obtained as the 2.5 and 97.5% quantiles, and the respective p values were calculated as proportions of bootstrap samples above (for negative estimates) or below (for positive estimates) zero. We also performed a parallel bootstrap analysis by resampling only the residuals of each model (i.e. keeping the random effects fixed across all samplings), but this generated qualitatively identical results and is not presented. When refitting the models to the resampled data, we discarded all models that failed to converge (i.e. we in fact generated >1000 samples, to obtain a final collection of at least 1000 estimates).
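The summary step of this bootstrap (mean, percentile interval and one-sided p value) can be sketched as follows. This covers only the summarising of replicates, not the simulation or refitting of the mixed models, and the replicate values are simulated placeholders rather than results from the study; the function name is hypothetical.

```python
import random
from statistics import mean, quantiles

def summarise_bootstrap(diffs):
    """Summarise bootstrap replicates of a between-group difference in genetic variance."""
    estimate = mean(diffs)
    # n=40 gives cut points at 2.5%, 5%, ..., 97.5%; keep the first and the last
    cuts = quantiles(diffs, n=40, method="inclusive")
    ci = (cuts[0], cuts[-1])
    if estimate < 0:
        # negative estimate: proportion of replicates above zero
        p = sum(d > 0 for d in diffs) / len(diffs)
    else:
        # positive estimate: proportion of replicates below zero
        p = sum(d < 0 for d in diffs) / len(diffs)
    return estimate, ci, p

# Placeholder replicates (NOT study data): 1000 simulated differences
rng = random.Random(42)
diffs = [rng.gauss(-0.3, 0.2) for _ in range(1000)]
estimate, ci, p_value = summarise_bootstrap(diffs)
print(f"mean = {estimate:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), p = {p_value:.3f}")
```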

The choice of models eventually analysed using bootstrapping was based on a likelihood-driven simplification of random term structures (i.e. we aimed at models that maximised likelihood and model parsimony). All random effects were structured to allow for hormone-specific estimates of relevant variances. In all cases the variance structures were set to allow for heterogeneous variances among the three experimental groups (3 × 3 covariance matrices; Xh models – Table 1). For each response variable we fitted a set of decreasingly complex models, testing various aspects of model variance structures, starting with the most complex model (which included all possible treatment-specific variances) and simplifying it to remove redundant terms. The order of tests ensured that factors possibly confounding the genetic variance (i.e. the treatment-specific residual and rearing variances) were tested first. The residual correlations between treatment groups are not identifiable in our experimental design and were not estimated (fixed at r = 0). To reduce the complexity of models and simplify presentation of results we also fixed the cross-treatment correlations (which in any case require considerably greater power to estimate, compared to variances) for genetic and rearing effects at r = 1 (their expected value). Table 1 provides an overview of all types of fitted models and the (co)variance constraints involved. In Supplementary Materials we provide an expanded sequence of tested models, including the intermediate stages of testing whether the cross-treatment correlations are lower than r = 1, as well as correlation estimates. In each successive step the more complex model was tested against the simplified one using a likelihood-ratio test. Logged ratios of model likelihoods d = 2 log(ℓ_Ha/ℓ_H0) were assumed to be distributed as a mixture of χ² variates with k ∈ {s, s + 1, …, s + q} degrees of freedom (Self and Liang 1987; Stram et al. 1994). 
The asymptotic distribution against which each likelihood-ratio statistic is tested is

$$d \sim \sum_{k = s}^{s + q} \binom{q}{k - s}\, 2^{-q}\, \chi_k^2$$

where s is the number of tested (co)variance parameters that lie inside the parameter space (e.g. correlations/covariances, where the tested hypothesis is θ = 0), and q is the number of tested (co)variance parameters restricted by the null hypothesis to the boundary of their parameter space (e.g. θ = 0 for variances, θ = 1 or −1 for correlations) (Self and Liang 1987). For a simple df = 1 test of one variance component (H0: σ² = 0) this simplifies to rescaling the p value of the resulting test by 0.5: p = 0.5[1 − P(χ²(df = 1) ≤ d)].
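To make the mixture test concrete, the sketch below computes the p value of a likelihood-ratio statistic d as a weighted sum of χ² tail probabilities, with weight C(q, j)/2^q on χ² with s + j degrees of freedom (j = 0, …, q), where χ² with 0 degrees of freedom is a point mass at zero. It uses only the Python standard library; the function names are hypothetical and this is an illustration, not the code used in the study.

```python
import math

def chi2_sf(x, k):
    """P(chi2_k > x) for integer k >= 0; chi2_0 is a point mass at zero."""
    if k == 0:
        return 1.0 if x < 0 else 0.0
    if x <= 0:
        return 1.0
    half = x / 2.0
    if k % 2 == 0:
        # even df: exp(-x/2) * sum_{i < k/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, k // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # odd df: erfc(sqrt(x/2)) plus a finite series
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for j in range(1, (k - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return total

def mixture_p_value(d, s, q):
    """p value of d under the mixture with weights C(q, j) / 2^q on chi2_(s + j)."""
    return sum(math.comb(q, j) * 2.0 ** (-q) * chi2_sf(d, s + j) for j in range(q + 1))

# Single variance component (s = 0, q = 1): half the usual chi2(1) p value
p_single = mixture_p_value(3.841458820694124, s=0, q=1)
print(round(p_single, 4))
```

For s = 0 and q = 1 this reduces to the halved χ²(1) p value described in the text.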

In addition to single-trait models that focused on differences between experimental groups, we also fitted a multivariate model that included tarsus length and body weights at days 2, 8 and 14, to estimate between-trait genetic correlations across the three experimental groups. In this model all (co)variance matrices were assumed to have a block-diagonal structure (i.e. the model did not allow for correlations between different traits measured in different experimental groups), and all cross-trait correlations were estimated at all random-effect levels. We present these results in the Supplementary Materials and discuss the associated power considerations in the Discussion.

Estimated variances were used to calculate heritabilities within experimental groups/traits (ratios of genetic variance to the sum of all variance components). Standard errors of heritabilities were derived using the delta method. A similar method was applied to calculations of genetic correlations presented in the Supplement. Since the model we employed assumes the chicks in one nest of origin are full-siblings, the nest-of-origin effect estimates ½ of the total genetic variance in traits (which includes additive genetic variance, dominance variance if present, as well as variance generated by early maternal effects) and so the heritability estimates we derive here are sensu lato. Dominance is assumed to be negligible in similar studies (Class and Brommer 2020; Tolvanen et al. 2020). Error variance is also composed of multiple components (it includes, apart from pure residual variance, also ½ of the additive genetic variance). However, interpreting relative contributions of residual and additive genetic variances is difficult as the exact sources of purely residual variance are unidentifiable. Maternal effects may contribute to variation between families but their effect should dissipate with time and hence their influence on traits in older nestlings should be small (Thomson et al. 2017), a pattern that we can clearly see in the case of body weights between day 2 and day 8 in our data (see Results and Discussion). The full-siblings assumption is also partly violated by a small but significant proportion of extra-pair young detected in the studied population in selected breeding seasons (Arct et al. 2013), resulting in some of the offspring actually being maternal half-siblings. However, the resulting bias in genetic variance estimates should be small and negligible (Firth et al. 2015), as contributions of this error to additive genetic variance and early maternal effects (both included in the nest-of-origin effect) would likely cancel out. We also assumed a random distribution of within-pair and extra-pair offspring across treatments, so this issue should not bias genetic parameter estimates systematically.
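The delta-method standard error mentioned above can be illustrated with a short sketch. It assumes, for illustration only, that heritability is taken as V_G / (V_G + V_N + V_R), without any rescaling of the nest-of-origin variance, and that a sampling covariance matrix of the three variance components is available from the fitted model; the exact ratio used by the authors may differ. All numbers and names below are hypothetical.

```python
import math

def heritability_with_se(v_g, v_n, v_r, cov):
    """Ratio V_G / (V_G + V_N + V_R) and its first-order delta-method standard error.

    cov is the 3 x 3 sampling covariance matrix of (V_G, V_N, V_R) (assumed known).
    """
    total = v_g + v_n + v_r
    h2 = v_g / total
    # gradient of the ratio with respect to (V_G, V_N, V_R)
    grad = [(total - v_g) / total ** 2, -v_g / total ** 2, -v_g / total ** 2]
    var_h2 = sum(grad[i] * cov[i][j] * grad[j] for i in range(3) for j in range(3))
    return h2, math.sqrt(var_h2)

# Hypothetical variance components and sampling covariance matrix (not study estimates)
cov = [[0.01, 0.0, 0.0],
       [0.0, 0.01, 0.0],
       [0.0, 0.0, 0.01]]
h2, se = heritability_with_se(0.2, 0.3, 0.5, cov)
print(f"h2 = {h2:.3f} +/- {se:.3f}")
```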

For all models, we performed three types of sensitivity analyses. First, to explore the possibility of correcting for prenatal maternal effects by including egg mass in the models, we refitted the models to the subset of data from 2015 and 2016, adding standardised egg mass as a fixed predictor. Secondly, to check whether the inclusion of nests where a significant proportion of chicks failed to hatch could bias the estimates of genetic parameters in any way, we refitted the models including only genetic nests where at least four chicks successfully hatched. Finally, we also checked whether the reduction in brood size resulting from hatching failure may contribute to the observed patterns; we did this by refitting models with the final brood size as an additional predictor.

When reporting heritabilities and genetic correlations, the reported variance component values are obtained from the best model (according to sequential LR tests) but, for illustrative purposes, with the reported component (e.g. genetic variances when reporting heritabilities, and genetic variances and correlations when reporting genetic correlations) left unconstrained. This way of reporting provides meaningful numbers instead of, e.g., three identical values in models where no heterogeneity in genetic variances was detected. Thus, heritabilities may be reported as different (although not significantly so) even when the components are constrained to be equal in the simplest model.

Egg hatching success was analysed using a generalised linear mixed model, with egg mass, year and experimental group as fixed predictors, and nest-of-origin and nest-of-incubation as random predictors. Hatching success was modelled as a binary (0 – failed; 1 – hatched) variable with a logit link function. Since egg mass data are available only for years 2015 and 2016, differences between years and experimental groups were also validated using a reduced model without egg mass (i.e. one that includes the year 2014), but no significant differences in the observed patterns were found.

Mixed models were run in ASReml-R v. 4.1 (Butler 2019); all analyses were run in the R computational environment v. 4.0.2 (R Core Team 2014). Before analysis we removed from the data all individuals for which the initial assignments to experimental groups were lost for some reason, e.g. because of premature hatching (and consequent failure to assign chicks to their respective experimental groups).

Hatching success

Despite keeping all procedures as precise and aseptic as possible, our manipulation had a measurable impact on the chicks’ hatchability (likely caused by water loss from incompletely closed eggs, introduction of microorganisms interfering with embryos’ development, or bursting of particularly small egg yolks after delivering an additional 3 µl of liquid), which is usual in similar studies. While natural hatchability in the studied population reaches 98.0% (mean based on a random sample of 57 non-manipulated nests in 2014), the hatchability in nests manipulated by egg injections dropped to 52.0% on average. Experimental groups did not differ in hatchability (control group: 51.1%, CORT: 49.0%, TESTO: 55.3%; binomial GLMM with control eggs as reference, estimates with SE: βCORT = −0.03 ± 0.14, βTESTO = 0.23 ± 0.14, p = 0.11). Experimental years did differ, with the year 2015 having slightly higher hatchability (binomial GLMM: p = 0.038). The median brood size in experimental nests was 5 hatchlings (IQR: 4–6).
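The group estimates above are on the logit scale. As a rough illustration, the sketch below converts them to odds ratios and approximate hatching probabilities. It assumes, for simplicity, that the raw control-group hatchability (51.1%) can stand in for the model intercept, which ignores the random effects and the other fixed predictors; the function names are illustrative and not part of the original analysis.

```python
import math


def logit(p: float) -> float:
    """Map a probability to the log-odds scale."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Values reported in the text. Using the raw control rate as the baseline
# is a simplifying assumption, not the fitted model intercept.
CONTROL_RATE = 0.511
BETA = {"CORT": -0.03, "TESTO": 0.23}


def approx_group_rate(beta: float, baseline: float = CONTROL_RATE) -> float:
    """Approximate hatching probability implied by a logit-scale contrast."""
    return inv_logit(logit(baseline) + beta)


if __name__ == "__main__":
    for group, beta in BETA.items():
        odds_ratio = math.exp(beta)
        print(f"{group}: odds ratio {odds_ratio:.2f}, "
              f"approx. hatchability {approx_group_rate(beta):.3f}")
```

Because the baseline is a raw proportion rather than a conditional model intercept, the back-transformed values should be read only as an order-of-magnitude check against the reported group percentages.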

When hatchability data are filtered by removing cases of egg dehydration (observation of sticky, thickened or completely dried egg content upon hatch checks), and cases of egg infection (egg content rotten and showing clear signs of bacterial infection) the hatchability levels are slightly higher: 63% overall (control: 60%; CORT: 61%; TESTO: 67%). They still do not differ significantly between treatment groups (binomial GLMM, estimates with SE: βCORT = 0.05 ± 0.15, βTESTO = 0.31 ± 0.15, p = 0.09).

In line with our expectation that smaller eggs would fail to hatch more frequently, we detected a significant positive relationship between egg size and probability of hatching (binomial GLMM, estimates ± SE: β = 2.96 ± 0.85, p < 0.001 for all hatching failures; β = 2.64 ± 0.92, p = 0.004 after excluding egg drying and egg infection). There was no significant interaction between experimental treatment and egg mass (p = 0.72 for filtered hatching failures, p = 0.76 for all unhatched eggs included). Several in ovo studies using species with eggs similar in size to those of blue tits report similar or lower (down to ~50%) hatchabilities (Winter et al. 2013; Marri and Richner 2014). Small-egged species may be more sensitive to such manipulations as their yolks are smaller (hence more prone to irreversible damage, or to accidental damage to the germinal disc itself), and their eggs contain smaller amounts of water (making them more prone to dehydration if the shell puncture is not sealed completely). It is also possible that, apart from the small size of blue tit eggs, transporting the eggs negatively impacted hatching success, compared to studies where eggs were not cross-fostered between nests (the majority of studies) and/or transported to incubators before hatching. Finally, within a random subsample of eggs that were collected during hatching checks and opened to determine the embryo stage, the majority of hatching failures were due to embryonic development stopping early, during the first stages of growth. We defined 5 stages (0 = no signs of fertilisation/only germinal disc visible; 1 = visible vascularisation but embryo <2 mm in size, no dark eye pigmentation; 2 = first signs of dark eye pigmentation, embryo up to 5 mm in size; 3 = embryo with well-developed beak and toes, approx. 8 mm in size, not filling the entire egg; 4 = fully formed nestling with little/no residual albumen); according to this staging, over 55% of failed eggs stopped developing at stages 0 or 1 (Fig. S1).

Results

Fixed effects

Means and standard deviations of raw data, together with sample sizes, are provided in Table S3. Body mass and tarsus length were sexually dimorphic (with males being heavier and larger, Table 2). A sex effect on body mass was not observed in 2nd day nestlings, although males still tended to be larger in this age group. Hormonal manipulations exerted no statistically significant effect on most of the measured variables, although day 14 body mass of nestlings tended to be lowest in the CORT-manipulated group (Table 2). However, body mass on the 2nd day was significantly influenced by an interaction between sex and treatment (Table 2). Males tended to be heavier than females in the control group, whereas in the hormone-treated groups the sexes had similar weights on the second day. In particular, in the testosterone group, a masculinising effect was observed, with females being on average heavier than males (although this difference was not statistically significant; Fig. S2). We detected no confounding effect of the person measuring the tarsus length.

Table 2 Fixed effects estimates for all response variables

Random effects

In all variables we observed statistically significant levels of genetic variance (Table 3), resulting in broad-sense heritabilities consistent with those reported elsewhere in the literature (Merilä and Fry 1998; Hadfield et al. 2007; Drobniak et al. 2015; Perrier et al. 2018). For tarsus length, bootstrapping indicated the presence of clear differences in nest-of-origin variances between the experimental groups. The differences between each of the C and TESTO groups and the CORT group were substantial and positive (VG(C) − VG(CORT) = 0.061, 95%CI: [0.001; 0.126], p = 0.026; VG(TESTO) − VG(CORT) = 0.050, 95%CI: [−0.001; 0.110], p = 0.048; between control and TESTO groups: VG(C) − VG(TESTO) = 0.012, 95%CI: [−0.060; 0.088], p = 0.632; Fig. 2). This pattern was confirmed by model likelihoods: the best supported model was the one showing a significant contribution of both nest-of-rearing and nest-of-origin effects, and a significant drop in genetic variance in the CORT-treated group, compared to the other two treatments (TESTO and C; model GhC ≡ TESTO+R+E; Table 3 and Table S1, Fig. 3A). When estimated separately for each experimental group, the genetic variance was highest in TESTO and C, resulting in the highest sensu lato heritabilities (h2 ± SE: 0.38 ± 0.13 and 0.35 ± 0.12, respectively; Fig. 3A). Heritability in the CORT group was significantly lower (0.12 ± 0.07; Fig. 3B).
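The bootstrap summaries reported above (a difference in genetic variances with a 95% CI and a p value) can be illustrated with a short sketch. It assumes a percentile interval and a one-sided p value defined as the share of replicates at or below zero, which is one common convention and not necessarily the procedure used in this study. The replicate values are simulated placeholders, and all function names are hypothetical.

```python
import random


def percentile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_vals) - 1)
    frac = pos - lower
    return sorted_vals[lower] * (1.0 - frac) + sorted_vals[upper] * frac


def summarise_differences(diffs):
    """Return a 95% percentile interval and a one-sided p value.

    The p value is the proportion of bootstrap replicates in which the
    variance difference is zero or negative (an assumed convention).
    """
    ordered = sorted(diffs)
    ci = (percentile(ordered, 0.025), percentile(ordered, 0.975))
    p_one_sided = sum(d <= 0 for d in ordered) / len(ordered)
    return ci, p_one_sided


if __name__ == "__main__":
    # Placeholder replicates, NOT the study's bootstrap output.
    rng = random.Random(1)
    fake_diffs = [rng.gauss(0.06, 0.03) for _ in range(2000)]
    (low, high), p = summarise_differences(fake_diffs)
    print(f"95% CI: [{low:.3f}; {high:.3f}], one-sided p = {p:.3f}")
```

With real output, `diffs` would hold one value of VG(C) − VG(CORT) (or another pairwise difference) per bootstrap replicate.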

Table 3 Detailed sets of models considered for each variable, with interpretation of all performed model comparisons
Fig. 2: Heritabilities and genetic variances of the studied traits (symbol-coded) split by the treatment group (colour-coded).

Genetic variances (A) are presented on trait-specific scales. Heritabilities (B) come from models with fully heterogenous genetic variance structures. Whiskers represent SE (estimated via the delta method for heritability or estimated by the model for variances)

Fig. 3: Results of bootstrapping simulations of the tarsus length variable mixed models.

Histograms represent simulated distributions of differences in genetic variances calculated between treatment groups indicated on the x-axes. Overlaid on the histograms are: kernel density estimators (blue solid line), zero lines (blue dotted) and differences in variances from original mixed models (red solid lines). Panels represent differences in genetic variance between control and CORT groups (A), TESTO and CORT groups (B), and control and TESTO groups (C)

In day 14 and day 8 body masses, both the nest-of-rearing and nest-of-origin effects explained considerable amounts of variation, but there was no sign of any treatment-specific effect on the estimated genetic (nest-of-origin) variances (Fig. 3; preferred models G over Gh). The effect sizes of variance differences between treatments were relatively small (the largest effect: for body mass on day 8, VG(C) − VG(CORT) = −0.130, 95%CI: [−0.329; 0.080], p = 0.110; Fig. S3). Body mass heritabilities had similar magnitudes across treatment groups, both at day 8 (C: 0.48 ± 0.12, CORT: 0.59 ± 0.14, TESTO: 0.64 ± 0.12) and at day 14 (C: 0.35 ± 0.13, CORT: 0.43 ± 0.13, TESTO: 0.37 ± 0.12). Treatment effects on the PHA hypersensitivity response were mostly visible as a marked drop in the control group genetic variance, compared to hormone-treated groups (VG(C) − VG(CORT) = −0.007, 95%CI: [−0.015; 0.001], p = 0.041; VG(C) − VG(TESTO) = −0.006, 95%CI: [−0.014; 0.002], p = 0.070; between TESTO and CORT groups: VG(TESTO) − VG(CORT) = −0.001, 95%CI: [−0.012; 0.010], p = 0.390; Fig. S3). Accordingly, heritabilities in steroid-treated groups were higher (CORT: 0.45 ± 0.19, TESTO: 0.47 ± 0.19) than the markedly smaller heritability in the control group, which was statistically indistinguishable from zero (C: 0.11 ± 0.11). Nevertheless, these effects should be treated with greater caution because of the reduced sample size for the PHA variable.

Genetic variances in the day 2 body mass were also heterogenous between experimental groups, with the CORT group having markedly larger origin-related variance than the C and TESTO groups, a pattern clearly visible in the effect sizes of variance differences between groups (VG(C) − VG(CORT) = −0.048, 95%CI: [−0.100; 0.005], p = 0.039; VG(TESTO) − VG(CORT) = −0.030, 95%CI: [−0.083; 0.020], p = 0.130; between control and TESTO groups: VG(C) − VG(TESTO) = −0.019, 95%CI: [−0.064; 0.030], p = 0.218; Fig. S3). This translated into differences in treatment-specific heritabilities (C: 0.51 ± 0.13; CORT: 0.83 ± 0.13; TESTO: 0.57 ± 0.12). In addition, two traits (body mass on day 2 and PHA response) exhibited significant, albeit relatively small, heterogeneity in residual variances (in both cases accounted for in the models used for bootstrapping; Table S1). Detailed estimates of all variance components from models selected as the best supported are presented in Table S1.

Body weight of nestlings on the day of hatching (day 0) was analysed separately and it did not show any differences due to experimental hormonal treatment (see Table S3). The nest-of-origin and nest-of-rearing effects showed no hormone-specific structuring. As expected, the nest-of-origin effect explained the majority of variance in this trait (55.7%); nest-of-rearing explained only 7.4% (which is expected as the only rearing component at this stage may result from incubation-related factors).

To account for some maternal effects that may correlate with natural maternally transmitted yolk hormones, we refitted all the above models with an additional fixed predictor of egg mass, and its interaction with experimental treatment. Explanatory power of these models was lower as egg masses were available only for 2 years (2015 and 2016). Nevertheless, inclusion of egg mass neither affected the sequence of preferred models nor changed the final conclusions about genetic variance patterns in tarsus length, PHA response and day 2 and 8 body masses. Models for body mass on day 14 showed lower estimates of nest-of-origin variance (Table S5) in the CORT group, but this effect was due to restricting the analysis to two out of three years. Nevertheless, when comparing compatible models (i.e. one including egg mass and one excluding it, both based on years 2015–2016), the inclusion of egg mass did lead to a decrease in nest-of-origin effects that was (in terms of its magnitude) the largest among all random terms. The egg mass effect on tarsus length was substantial (standardised effect with SE: β = 0.14 ± 0.05, p < 0.001), similarly for mass on day 14 (β = 0.20 ± 0.06, p < 0.001), mass on day 8 (β = 0.20 ± 0.08, p < 0.001) and mass on day 2 (β = 0.07 ± 0.03, p = 0.054), but it did not vary between experimental groups in any model (see also Supplementary Table S5). The effect of egg mass on PHA response was negligible (β = 0.008 ± 0.013, p = 0.55).

A sensitivity analysis performed to check the robustness of our analysis to the presence of genetic nests with overall low survival (performed by removing from the data all nests-of-origin with <4 chicks surviving to trait measurement; the subset retained 107 out of 159 genetic nests) returned the same sequences of preferred models and qualitatively identical results regarding genetic variances. Since chick hatchability and mortality affected final brood sizes in experimental triads, generating some differences in brood sizes, we also checked whether variation in intra-brood competition could contribute to the observed differences in variance components. In models including final brood size as a covariate, we observed no changes in final conclusions. The same conclusion held when the brood size of each nestling’s genetic nest was included. In line with this, the continuous brood size predictors were in both cases statistically non-significant in all analysed traits.

Discussion

Differences in heritabilities measured in different biological contexts are not uncommon in natural populations. Apart from population specificity of heritabilities (which contributes to marked variation in heritabilities even within one species (Lynch and Walsh 1998)), a great deal of attention has been paid to changes in trait genetic variances observed under varying biological conditions. Genetic variance in a trait is one of the most fundamental ingredients of phenotypic evolution (Lande and Shannon 1996; Walsh and Lynch 2018). As such, changes to genetic variance “expressed”—or visible to natural selection—induced, e.g. by individual characteristics or environmental variability should play an important role in modulating the course of evolutionary change and conserving the levels of genetic variance (Lynch and Walsh 1998). This issue is especially interesting as several physiological systems (e.g. heat-shock proteins) can act as “evolutionary capacitors” releasing, under certain conditions, cryptic genetic variance (Gibson and Dworkin 2004).

Here, we have observed that in ovo maternal hormones constitute one of the cues that may modulate observed levels of genetic variance (and consequently heritability). By altering the hormonal environment of developing embryos, in conjunction with a cross-fostering experiment that allows for separation of phenotypic variance contributors, we were able to show that steroid hormones acting early in individual development can alter the observed levels of genetic variance (see also the Supplement for additional results and discussion on cross-trait genetic correlations). Of all studied traits, we have observed an effect of hormonal manipulation on genetic variance in the tarsus length, a trait typically exhibiting moderate to high levels of heritability in wild populations. The heritability of tarsus length in our study (taking the control group as reference) agreed with other published estimates, including in the blue tit (Bonneaud et al. 2009; Teplitsky et al. 2009; Nilsson et al. 2009; Delahaie et al. 2017; Perrier et al. 2018), and with a more general set of published estimates of body size heritabilities from wild populations (Postma 2014). Yet, in the corticosterone-treated nestlings, heritability dropped to a level statistically indistinguishable from zero. These genetic variance differences (very low variance under one set of conditions, typical levels under two others) suggest the existence of a hormone-mediated G × E interaction. An opposite pattern was visible in the body weight on the 2nd day after hatching; here, individuals in the corticosterone-treated group had markedly higher heritabilities than the control and testosterone groups. The effects we observed in tarsus length were robust to including egg mass in the model, which indicates that the observed effect is robust to maternal effects correlated with egg size. However, a similar analysis performed on day 2 body mass (i.e. a trait that should reflect early maternal effects to a greater extent compared to day 14 body mass or tarsus length, data not presented) showed no change to genetic variance after accounting for egg mass. Therefore, whether this correction reflected pre-treatment variation in the magnitude of some maternal effects is speculative and requires more studies looking at the relationships between egg size and deposition of maternally derived compounds into eggs.

One reason explaining the low heritability of tarsus length (and the matching trend on day 14 body weight) in the corticosterone group may be the physiological function of this steroid. This hormone is usually considered to mediate stress response and an organism’s mobilisation after experiencing stress (Schoech et al. 2011). In birds corticosterone has been shown to mimic stress-induced body weight variations and suppressed immune response (Roberts et al. 2007), trigger development of stress-like phenotypes (Roulin et al. 2008), amplify behavioural differences along the shy-bold personality axis (Baugh et al. 2012), and impair parental care (Angelier et al. 2009) or learning behaviour (Kitaysky et al. 2003). If corticosterone exposure is regarded as mimicking stress exposure, its effects on genetic variance in offspring traits may mimic those observed in genotype-by-environment interactions, when individuals are exposed to stressful or unfavourable conditions. Although the impact of such conditions may differ from species to species (Rowiński and Rogell 2017), the often observed pattern is a reduction in observed levels of genetic variance (Hoffmann and Merilä 1999), supported by a meta-analysis of such results (Charmantier and Garant 2005). Hoffmann and Merilä (1999) argued that one possible mechanism of such reduction is stopping of offspring growth, caused by stress, before inter-individual variance in achieved body size can fully develop. In our study this explanation does not seem to be valid: we have not observed any systematic differences in body size between the three treatment groups. Another mechanism that can be invoked in cases of condition-varying genetic variances involves changes in gene expression at a molecular level (Hodgins-Davis and Townsend 2009). Hormonal influence on gene expression patterns could lead to modifications of breeding values underlying phenotypes, and eventually to changes in the levels of genetic variance. Some studies suggest that in ovo corticosterone can induce gene expression changes in birds (Ahmed et al. 2016), but such evidence is not unambiguous (Lutyk et al. 2017). The function of corticosteroid receptors as transcription factors directly modulating expression of certain genes is known (Kawata 1995; Baker 1997). Nonetheless, more work is needed, especially at the very basic, molecular level and in early developing embryos.

It is also interesting that the observed effect is to a certain extent complementary to the action of so-called “evolutionary capacitors”, i.e. actors that can release cryptic genetic variation under certain conditions (Queitsch et al. 2002; Gibson and Dworkin 2004). In our setup corticosterone seems to reduce genetic variance, which, under stressful conditions, may allow for the maintenance of certain genetic variants beyond the nestling period. Context-dependent changes in the expressed genetic variance can be expected as one of the properties of evolving, adapting systems (Masel 2006), and hormones with relatively short time windows of activity may constitute an important component of the general system regulating genetic variation “visible” to natural selection.

In contrast to corticosterone manipulation, testosterone-exposed nestlings showed much subtler and less convincing changes in genetic variances of their traits. Physiologically, testosterone is traditionally associated with sex-specific effects and is assumed to mediate trade-offs between body maintenance and resource use in production of secondary sexual characters (Peters 2007; Kingma et al. 2009). Exposure to testosterone early in development was shown to stimulate development of sexually selected traits, bias sex-ratios towards males and increase dominance and competitive behaviours (Podmokła et al. 2018). Unfortunately, due to sample size we could not robustly test for a sex-by-treatment interaction in genetic parameters. Nevertheless, in line with reported cases of sex-specific genetic variances in a number of traits (Wyman and Rowe 2014), we expected to see a significant impact of hormonal manipulation in our study. One reason for not seeing such an effect may be the use of testosterone alone, in contrast to many similar studies using both testosterone and androstenedione as two major sex-linked hormones (Podmokła et al. 2018). Our observation is similar to other studies where no significant in ovo testosterone impact was noted (Tschirren et al. 2005; Podmokła et al. 2018) and suggests that future studies should look more closely at complexes of similarly acting hormones (or even functionally different compounds; e.g. Giraudeau et al. (2017)), applying them together to better mimic biological reality.

Of all other studied traits, only the PHA hypersensitivity response showed some heterogeneity in genetic variances. This was also the only trait where some heterogeneity was visible in the variation of nest-of-rearing effects. For the PHA response, the heterogeneity of genetic variances was visible in the control group having four-fold lower heritability than hormone-treated groups. Steroid hormones are known modulators of immune response and the observed effect may reflect this link, with evidence of both immunosuppressive and stimulating influence (Casto et al. 2001; Andersson et al. 2004; Rubolini et al. 2005; Navara et al. 2006; Roberts et al. 2007). However, this trait was analysed using a limited dataset (2 out of 3 years) and so this pattern should be treated with caution.

Lack of nest-of-rearing vs. treatment interaction in other traits emphasises that the effects of post-hatching parental provisioning and variation associated with the foster nest (including habitat variation) do not depend on the amounts of steroid hormones transferred to eggs. Even in the remaining measured traits our design may still be not powerful enough to detect such interactive effects. It is of course possible that nest-of-rearing variation in other traits not considered in our study, or nest-related variation in traits measured well after leaving the nest, could depend on maternal effects. Natal environment and rearing effects can be detected in phenotypic traits long after fledging (Evans and Sheldon 2012), but gradual disappearance of nest-of-origin (i.e. including maternal) effects even within the nesting period (Thomson et al. 2017) would make observation of such rearing vs. maternal hormones interactions difficult.

In terms of trait means, in most cases we observed no impact of hormonal treatment. The only exception was body mass on the 2nd day: here, female offspring tended to be influenced by both testosterone and corticosterone, with their masses drawn closer to the values of males. A masculinising effect of testosterone on morphology is not uncommon (Podmokła et al. 2018), and evidence (albeit scarcer) also exists for such effects exerted by corticosterone (Roberts et al. 1997; Mankiewicz et al. 2013). By the time of fledging, all phenotypic effects of hormones on trait averages seem to dissipate, which is also in line with the generally weak evidence for strong mean effects of in ovo hormonal manipulations (Podmokła et al. 2018). All observed changes in patterns of genetic variances are therefore pure G × Es, without a measurable component of phenotypic plasticity (as trait means do not vary between treatments). In principle, G × E does not require a concurrent change in trait means (it is defined as variation in reaction norms of individual genotypes, which on average can cancel out, rendering net zero change in means; Anholt and Mackay 2004; Saltz et al. 2018; Wang et al. 2019; Huang et al. 2020). Still, by modifying the levels of genetic variance, such genotype-by-environment interactions can affect evolutionary trajectories of traits (Lynch and Walsh 1998).

On the methodological side, our study may suffer from several shortcomings. Our experiment was associated with a considerable reduction in egg hatching success. Smaller embryos generally hatch less successfully (Krist 2011), which was also the case in our data and could impose selection on hatchlings’ size and eventually reduce body size variation. However, we detected no differences in hatchability between treatment groups, so lowered hatchability should not contribute significantly to the observed differences in the broad-sense genetic variation. Most hatching failures occurred at very early stages of embryonic development, suggesting random causes such as permanent disruption of the yolk or damage to the germinal disc. This would also explain the increased hatching failure of smaller eggs: smaller yolks would be more prone to rupture upon solution injection or to damage of the delicate germinal disc. It should also be noted that any treatment-specific mortality of embryos could also be attributed to a potential G × E, if mortality occurred in a treatment- and genotype-specific way. Our study used a substantial increase in the in ovo hormone concentrations (up to 2 SDs over natural levels). It is unlikely that such concentrations could be toxic to embryos (partly because in that case toxicity would start at levels dangerously close to those seen in nature, and partly because embryos are expected to have some compensation mechanisms potentially shielding them from such detrimental effects of hormones (Groothuis and Schwabl 2008)).

One of the consequences of lower hatchability may be lowered statistical power, resulting from smaller numbers of individuals per nest-of-origin entering the analyses. Following Klein (1974), we can very roughly estimate the power to detect heritability differences similar to ours, with our numbers of genetic families and offspring per family, as varying between 47 and 54% (depending on the assumed numbers of offspring contributing to estimates). Klein’s method assumes a simple comparison of heritabilities; multi-level mixed models and likelihood-ratio tests should provide higher power still. Lowered power would explain why in some cases the observed differences border the nominal significance threshold. It would especially affect the more challenging estimation of genetic covariances, hence we decided to fix the cross-treatment correlations at their respective values, and did not focus on the estimates of cross-trait genetic correlations (see Supplementary Materials for a brief outline of those estimates). Nonetheless, focusing on effect sizes rather than arbitrarily set statistical significance can provide valid insights and reduce biases in published estimates of variance components. Future studies should focus on building more powerful designs to provide more robust tests of genetic covariance parameters in similar experiments.

We used cross-fostering to separate sources of phenotypic variation in measured traits. The estimated variation between full-sib families (nest-of-origin effect) should therefore contain ½ of a trait’s additive genetic variance. However, if present, this effect may also contain pre-hatching maternal effects and ¼ of genetic dominance variance (the latter can be assumed to be negligible; Class and Brommer 2020). In birds, females seem to be repeatable in the amount of hormones transferred to eggs (Tschirren et al. 2009; Ruuskanen et al. 2016)—therefore, if present, initial between-mothers variation in yolk hormones would be confounded with additive genetic variance in traits. Studies looking at the ontogenetic changes in nest-of-origin effects show that their impact decreases over the nesting period, with a parallel increase of the relative contribution of genetic effects (Pick et al. 2016; Thomson et al. 2017). Thus, our estimates for day 14 body mass and tarsus length should robustly reflect additive genetic variance contributions. Indeed, our heritability estimates in those two traits agree with other published values (Postma 2014). Early (day 0, 2 and 8) body masses, on the other hand, will likely be influenced to a varying degree by significant maternal effects—which is also reflected in the gradual drop of estimated broad-sense heritabilities seen in our study (Fig. 3).
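The relationship between the nest-of-origin variance and heritability described above can be made concrete with a minimal sketch. It assumes the usual full-sib reasoning, in which the between-family variance holds half of the additive genetic variance, so that heritability is approximated as twice the nest-of-origin share of total variance. The function name and the illustrative variance values are hypothetical; the exact estimator used in the fitted models may differ.

```python
def heritability_from_full_sibs(v_origin: float, v_rearing: float,
                                v_residual: float) -> float:
    """Approximate h2 as 2 * V(origin) / V(total).

    Assumes the nest-of-origin variance equals half of the additive
    genetic variance, i.e. no maternal or dominance contributions.
    """
    v_total = v_origin + v_rearing + v_residual
    if v_total <= 0:
        raise ValueError("total variance must be positive")
    return 2.0 * v_origin / v_total


if __name__ == "__main__":
    # Hypothetical variance components, chosen only for illustration.
    print(heritability_from_full_sibs(0.10, 0.05, 0.25))

    # Variance shares reported for day 0 body mass (55.7% origin, 7.4%
    # rearing, remainder treated here as residual): naive doubling
    # gives a value above 1.
    print(heritability_from_full_sibs(0.557, 0.074, 0.369))
```

Under this assumption, a nest-of-origin share above 50% cannot be purely additive genetic, which is consistent with the point made above that early body masses carry substantial pre-hatching maternal effects.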

The sensitivity analysis (inclusion of egg masses in the analysed models, based on the 2015–2016 subset of data) does to a certain extent explain the contribution of some maternal effects, but it cannot be seen as a completely robust solution. Therefore, the variation in nest-of-origin variances of the 2nd day body masses could be due to an interaction of hormonal treatment and genetic effects, but it can also be caused by an interaction between treatment and early maternal effects (especially given the unusually high estimates of day 2 body mass heritabilities). Interestingly, the inclusion of egg mass does seem to explain at least part of the nest-of-origin effects (Table S5). Its inclusion leads to changes in genetic (nest-of-origin) variances that are proportionally larger than changes in the nest-of-rearing and residual variances. The magnitude of these changes is also greater for a more labile and condition-sensitive trait (body mass, compared to tarsus length). Direct tests of the impact of egg mass on early nest-of-origin effects are not common. Some show measurable impact both on variance components and nestling growth patterns (Hadfield et al. 2013a), others suggest little to no influence (Hadfield et al. 2013b). More studies are needed, especially since the impact of egg characteristics may escape simple descriptors such as hormone concentrations (Valcu et al. 2019).

Our interpretation of a drop in genetic variance in corticosterone-treated tarsus length seems to also contradict the lack of a concurrent drop in the trait’s residual variance in this group (expected as within-family residual variance should contain a contribution of ½ of additive genetic variance). However, residual variances often interact with environmental factors on their own (Charmantier and Garant 2005). Within-family residual variance is also expected to contain a substantial amount of specific environmental variance, i.e. environment-generated variance arising at an individual level (e.g. due to developmental instabilities caused by environmental changes; Lynch and Walsh 1998). Thus, patterns of environment-specific changes in residual variance (or lack thereof) should be treated with caution, as we usually lack the information to attribute them causally to specific factors.

In theory, a physiological phenomenon could be responsible for the observed differences in nest-of-origin variances if indeed they were driven by maternal and not genetic effects. Even if variation in maternal hormone concentrations would be unaffected between control and hormone-treated groups (Fig. 4, horizontal axes; the distribution is shifted by an experimental increase in hormone concentration, but its variance remains constant), hormone-linked phenotypes could exhibit treatment-specific reductions in variance if their values would depend on hormone levels in a non-linear way. This could occur, e.g. due to hormones having a saturating effect on the affected phenotype (Fig. 4B, occurring when manipulation would shift hormone concentrations to the plateau region of the curve), or due to the hormone-phenotype mapping having different functional form (Fig. 4C, e.g. if compensation mechanism would decrease trait sensitivity to the hormone under its elevated concentrations). The only scenario when a change in trait variance does not occur (vertical axes on Fig. 4) is a linear mapping between hormones and affected phenotypes (Fig. 4A, D; the same pattern is seen assuming a normal (a) and lognormal (d; Lessells et al. 2016) hormone concentration). However, all scenarios assuming some change in variance would also require a shift in trait mean, a necessary consequence of a shift in hormone concentration along the mapping curve. Since in our case trait means remained unchanged, variance changes observed in our study should reflect genuine effects of reduction in genetic variances. Nonetheless, better understanding of patterns demonstrated in this study is needed, e.g. using laboratory setups and planned breeding designs to clearly separate additive genetic variance from other variance components.
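The argument above (a variance change produced by a non-linear hormone-phenotype mapping necessarily comes with a mean shift) can be checked with a small simulation. The sketch below is only illustrative: it assumes normally distributed hormone levels in standard-deviation units, a treatment shift of 2 SD, an arbitrary linear mapping and a logistic curve as the saturating mapping. None of these functional forms or parameter values are taken from the study.

```python
import math
import random
import statistics


def linear(h: float) -> float:
    """Linear hormone-to-phenotype mapping (arbitrary slope and intercept)."""
    return 1.0 + 0.5 * h


def saturating(h: float) -> float:
    """Logistic mapping that plateaus at high hormone levels (assumed form)."""
    return 1.0 / (1.0 + math.exp(-h))


def simulate(mapping, shift: float, n: int = 5000, seed: int = 1) -> dict:
    """Compare phenotype mean and variance before and after a hormone shift.

    Hormone levels are drawn from a standard normal distribution; the
    treatment adds `shift` to every individual, leaving hormone variance
    unchanged.
    """
    rng = random.Random(seed)
    hormone = [rng.gauss(0.0, 1.0) for _ in range(n)]
    control = [mapping(h) for h in hormone]
    treated = [mapping(h + shift) for h in hormone]
    return {
        "mean_shift": statistics.fmean(treated) - statistics.fmean(control),
        "variance_ratio": statistics.variance(treated)
        / statistics.variance(control),
    }


if __name__ == "__main__":
    for name, mapping in (("linear", linear), ("saturating", saturating)):
        result = simulate(mapping, shift=2.0)
        print(name, {k: round(v, 3) for k, v in result.items()})
```

In this toy setting the linear mapping leaves the phenotypic variance unchanged while shifting the mean, whereas the saturating mapping reduces the variance and shifts the mean at the same time. Neither produces a variance change without a mean change.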

Fig. 4: Possible scenarios of hormonal treatment affecting the variance in maternal effects, depending on distribution of hormone concentrations and the function mapping hormone levels to phenotype values.

Symmetrical distributions – with linear (A), sigmoidal (saturating; B) or heterogenous linear (concentration-dependent) mapping functions (C), and a linear mapping function with skewed (log-normal) distributions (D)

In summary, our study provides the first experimental attempt at identifying mechanisms that may be responsible for the modulation of the expressed genetic variance in nature. Clearly, steroid hormones do offer an interesting study system and are capable of producing patterns mimicking those seen in G × E in wild systems. Complementary studies in wild populations, as well as more in-depth analyses looking at the molecular pathways involved, are needed to better understand and explain the mechanism of how steroid hormones may mediate the observed effects. Subsequent studies should also attempt to increase the statistical power of similar analyses, e.g. by using model systems with larger eggs, better suited for such studies through increased resilience to in ovo manipulations. Taking the widespread occurrence of environmental effects on the levels of genetic variance in nature, and the commonly accepted role of steroid hormones in mediating environment-induced influences during development, we believe that our study provides a refreshing and novel perspective on the issue.