Equivocal evidence for a link between megalencephaly-related genes and primate brain size evolution

A large brain is a defining feature of modern humans, and much work has been dedicated to exploring the molecular underpinnings of this trait. Although numerous studies have focused on genes associated with human microcephaly, no studies have explicitly focused on genes associated with megalencephaly. Here, we investigate 16 candidate genes that have been linked to megalencephaly to determine if: (1) megalencephaly-associated genes evolved under positive selection across primates; and (2) selection pressure on megalencephaly-associated genes is linked to primate brain size. We found evidence for positive selection for only one gene, OFD1, with 1.8% of the sites estimated to have dN/dS values greater than 1; however, we did not detect a relationship between selection pressure on this gene and brain size across species, suggesting that selection for changes to non-brain size traits drove evolutionary changes to this gene. In fact, our primary analyses did not identify significant associations between selection pressure and brain size for any candidate genes. While we did detect positive associations for two genes (GPC3 and TBC1D7) when two phyletic dwarfs (i.e., species that underwent recent evolutionary decreases in brain size) were excluded, these associations did not withstand FDR correction. Overall, these results suggest that sequence alterations to megalencephaly-associated genes may have played little to no role in primate brain size evolution, possibly due to the highly pleiotropic effects of these genes. Future comparative studies of gene expression levels may provide further insights. This study enhances our understanding of the genetic underpinnings of brain size evolution in primates and identifies candidate genes that merit further exploration.


Results
Aim 1: To determine whether megalencephaly-associated genes evolved under positive selection across primates. To detect positive selection across primates, we employed site models in PAML 50 , which allow dN/dS ratios to vary across sites but not across lineages 51 . We then compared the likelihoods of two types of models, including: (1) a "nearly neutral" model which allows sites to fall into two categories, representing purifying selection and neutral evolution; and (2) a "positive selection" model which allows sites to fall into three categories, including purifying selection, neutral evolution, and positive selection 52 . We did not find evidence for positive selection for most (15/16) of the genes analyzed here. In each of these cases, the likelihood of the model that allowed sites to evolve by positive selection was not significantly higher than the likelihood Table 1. Results for Aim 1: Testing whether megalencephaly-associated genes evolved under positive selection across primates. For both the nearly neutral (M1a) and positive selection (M2a) models, we provide the proportion of sites falling into each selection category (purifying: dN/dS < 1; neutral: dN/dS = 1; positive: dN/dS > 1) and the estimated dN/dS values for each category. We also provide the likelihood values for each of these models, in addition to the one-rate (M0) model. Likelihood ratios were calculated using the formula:   predictors in the second row of table), and associated p values are provided for 4 models for each gene, including: Absolute brain size models. (1) log(brain size) ~ log(dN/dS). (2) log(brain size) ~ log(dN) + log(dS). Relative brain size models. (3) log(brain size) ~ log(body size) + log(dN/dS). (4) log(brain size) ~ log(body size) + log(dN) + log(dS). There is no evidence of a significant association between dN/dS and either absolute or relative brain size for any gene. P-adj = FDR corrected p-values. Significant p-values (p < 0.05) are in bold. Est = coefficient estimate. N spp = number of species included in models. *Models were run using an averaged value of lambda (see "Methods"). Results did not change when lambda values of 0 or 1 were used in these models. www.nature.com/scientificreports/ of the model that allowed sites to evolve by purifying or neutral selection only (Table 1). We did find evidence for positive selection for one gene, OFD1 (likelihood ratio = 35.405, p = 2e−8, p-adj = 3.2e−7), with 1.8% of sites estimated to have dN/dS values greater than 1 (= 6.224). This supports our Prediction 1.

Aim 2:
To determine whether selection pressure on megalencephaly-associated genes is linked to measures of primate brain size. To examine relationships between selection pressure (i.e., root-to-tip dN/dS) and either absolute or relative brain size, we employed free-ratios branch models in PAML 50 , which allow dN/dS ratios to vary across branches but not across sites. We calculated root-to-tip dN/dS values for each species by the addition of dN values and dS values from the root to the terminal species branch and taking the ratio of the sums. These values were then used as predictors in phylogenetic generalized least squares (PGLS) regression models of brain size. We found no evidence for significant associations between selection pressure (i.e., root-to-tip dN/dS values) and absolute brain size for any of the genes analyzed (p-adj > 0.05; Table 2). When Callithrix jacchus and Microcebus murinus were excluded (i.e., species that have experienced secondary evolutionary decreases in brain size 3 ; see "Methods"), we detected a positive relationship between selection pressure and absolute brain size for GPC3 and TBC1D7; however, these associations were not significant after FDR correction (GPC3: estimate = 1.637, p = 0.017, p-adj = 0.119) and TBC1D7 (estimate = 3.463, p = 0.011, p-adj = 0.119) ( Fig. 1; Supplementary Table S1). The relationship for GPC3 may be linked to increased dN specifically (multiple regression: dN estimate = 1.883, p = 0.012, p-adj = 0.168) ( Fig. 1; Supplementary Table S1). These results are bolstered by sensitivity analyses (in which we removed one species at a time and re-ran the regression models). Specifically, the association between root-to-tip dN/dS and absolute brain size was only significant for TBC1D7 when Callithrix was excluded (p = 0.003) and for GPC3 when Callithrix (p = 0.034) or Mandrillus (p = 0.042) were excluded. Overall, these results provide weak support for Prediction 2, as they do not survive FDR correction and may be sensitive to species sampling. Body size significantly predicted brain size in all models of relative brain size, and we did not find evidence for any significant associations between selection pressure and relative brain size in any analysis (p > 0.05; Table 2, Supplementary Table S1), consistent with our Prediction 3. We could not test any of these relationships for two genes in our analysis, AKT3 and PTEN, since most species had rootto-tip dN/dS values of zero for these genes (Supplementary Tables S2, S3).

Discussion
In the present study, we investigated the evolutionary histories of genes associated with human megalencephaly to examine their potential roles in primate brain size evolution. We tested whether any of the candidate genes that we selected evolved under positive selection across the primate phylogeny, and identified one gene with this pattern, namely OFD1; however, we did not detect a relationship between selection pressure on this gene and brain size across species. This suggests that selection for changes to other (i.e., non-brain size) phenotypes facilitated evolutionary changes to this gene. Although we did not identify any significant associations between selection pressure and brain size for any of our candidate genes, we did find positive relationships for GPC3 and TBC1D7 when phyletic dwarfs were excluded; however, these findings should be interpreted with caution since: (i) they did not survive FDR correction; (ii) the association for GPC3 may be sensitive to species sampling; and (iii) increasing dN/dS values may reflect positive or relaxed selection. Accordingly, our results represent equivocal evidence that some megalencephaly-associated genes may have been involved in primate brain size evolution.  www.nature.com/scientificreports/ Although we found that one of our candidate genes, OFD1, evolved under positive selection across primates, selection pressure on this gene does not predict brain size. Accordingly, it is likely that selection for changes to other, non-brain size phenotypes influenced by this gene drove evolutionary changes to its coding regions. While certain mutations to this gene cause syndromes that include megalencephaly and central nervous system deficits, reflecting its role in brain development, OFD1 is also involved in the growth and development of other parts of the body, including limb bud patterning and bone development 53 . Accordingly, selection for changes to limb morphology or proportions, which vary greatly across primates, could account for our findings. Interestingly, previous work identified signatures of positive selection on OFD1 for certain mammalian branches, including the internal branch leading to the eutherians and the terminal branches leading to opossum, horse and tree shrew 54 . Although this study did not detect positive selection on any primate branches, this may reflect that only four primate species were included in their analysis.
We did identify two instances in which selection pressure predicted absolute brain size across species, namely for GPC3 and TBC1D7, when Callithrix jacchus and Microcebus murinus were excluded, but the association did not survive FDR correction. While a trend of increasing brain size is typical of most primate lineages, previous work has suggested that both callitrichids and cheirogaleids underwent secondary reductions in brain size as a consequence of selection for decreased body size (phyletic dwarfism 3 ), and that some microcephaly genes actually facilitated this decrease in callitrichids 35 . In line with this, studies of microcephaly genes suggest that callitrichids are outliers in primate-wide analyses 33 , and that, within this clade, there is a negative relationship between rootto-tip dN/dS and brain size 35 . Together with these findings, our results suggest that megalencephaly-associated genes may similarly be involved in both evolutionary increases and decreases in brain size. However, root-to-tip dN/dS values were less than 1, so we must acknowledge that increasing dN/dS values in larger-brained lineages may reflect either increasing positive selection or relaxed selection.
For GPC3, the detected relationship between selection pressure and absolute brain size appears to be driven by an acceleration of dN specifically (Supplementary Table S1), although these results may be sensitive to species sampling (see "Results"). Previous work suggests that certain alterations to this gene lead to Simpson-Golabi-Behmel syndrome, an overgrowth syndrome that is often associated with megalencephaly. Specifically, this gene regulates embryonic growth by inhibiting the Hedgehog signaling pathway through competition for the hedgehog receptor 55,56 . These pathways may directly affect brain growth since hedgehog signaling increases the proliferation of neocortical precursors 57 . Additionally, neuronal differentiation is associated with upregulation of GPC3, suggesting the extracellular components encoded by this gene also help regulate the distribution and activity of extracellular signaling molecules during neuronal development 58 .
For TBC1D7, the relationship between selection pressure and brain size may be driven by an acceleration of dN and/or a deceleration of dS (neither coefficient estimate is significant in the multiple regression model; Supplementary Table S2). Although this result is not as straightforward as that for GPC3, it does not necessarily represent evidence against selection. Specifically, both dN and dS depend on local mutation rates, so when dS decreases (thereby changing dN/dS), we would expect dN to also decrease in the absence of selection. In this case, maintaining an elevated dN relative to a decreasing dS may reflect selection on TPC1D7 that is relevant to brain size evolution. This gene negatively regulates cell growth via suppression of mTOR (mammalian target of rapamycin) signaling 59 , and mTOR complexes regulate many functions critical to brain development, including proliferation, differentiation and migration 60 . Interestingly, a recent comparison of human organoid, chimpanzee organoid, and macaque primary cells suggested that human radial glia exhibit relatively greater increased mTOR activation 61 .
After FDR correction, we did not detect an association between selection pressure and brain size in any analysis. These findings were unexpected given the established links between primate brain size and genes associated with microcephaly, another disorder that produces abnormal brain sizes 33 . While parameter uncertainty may be due to low species sample sizes (and in some analyses, the inclusion of phyletic dwarfs), lack of significant associations may also reflect that genes associated with megalencephaly are within intermediary signaling pathways that have highly pleiotropic functions. In line with this, most syndromes that cause megalencephaly, including those linked to the genes examined here, also cause body overgrowth, cancers, and epilepsy as part of generalized overgrowth syndromes 45 . These pleiotropic, and sometimes also deleterious, effects of mutations to megalencephaly-related genes are likely to constrain their evolvability due to widespread, multivariate stabilizing selection 62 . For example, certain alterations to these genes could shift some traits (e.g., brain size) closer to their optima while simultaneously shifting other traits (e.g., body size) away from their optima, thereby hampering the ability of these genes to respond to selection for increased brain size. This is in contrast to microcephaly, which is not usually associated with profound somatic growth abnormalities 63 . Although the specific mechanisms that lead this phenotype to be brain-specific are not fully understood, new work suggests it may reflect differences between tissues in the expression of certain inhibitory splicing proteins when certain mutations create binding sites for these proteins within regions containing specific microcephaly genes 64 .
A subset of the results presented here partially overlap with prior work from Boddy et al. 11 , who took a different approach-performing a genome-wide analysis of thousands of orthologous genes across three independent episodes of brain size increase in primates. Many of the genes examined here were not included in the prior study (i. e., PTEN, PIK3CA, AKT3, MTOR, RIN2, EZH2, MED12, OFD1, BRWD3, GPC3) due to their justifiably conservative filtering approach. However, some genes did overlap, and neither study found significant associations between brain size and selection pressure for AKT1, STRADA, CCND2, or SPRED1. For KIF7, Boddy and colleagues 11 found that the dN/dS ratio was higher in the Colobus versus Papio lineages, even though the latter exhibit larger brains, in line with our finding of a (non-significant) negative association between selection pressure on this gene and brain size. Finally, Boddy and colleagues 11 found that selection pressure on TBC1D7 predicts a measure of relative brain size (EQ), while we found a weak association with absolute brain size; however, our analyses differ greatly with regard to species sampling (6 species versus 23 species). www.nature.com/scientificreports/ The work presented here enhances our understanding of the shared genetic basis underlying changes in brain size across primates. While numerous studies have examined genes associated with human microcephaly, this is the first study, to our knowledge, to focus explicitly on genes associated with megalencephaly. While we focused on how changes to the structure of the proteins encoded by these genes may have played a role in primate brain size evolution, future studies of comparative variation in the regulation and expression levels of these genes may provide further insights. Interestingly, previous work suggests that the expression levels of multiple genes analyzed here: (1) are differentially expressed between human and rhesus macaque brains during the prenatal and early postnatal periods 65 ; (2) are differentially expressed between human and chimpanzee cerebral organoids 66 ; and (3) exhibit delayed patterns of change during human brain development relative to macaque brain development 67 . In addition, the expression level of PTEN appears to be correlated with brain size across primate species (N = 18), and there is evidence for selection in the proximal regulatory region of this gene 68 . Given that different genes in our analyses showed signatures of positive selection or equivocal gene-phenotype associations, this study highlights the importance of including phenotypic data when attempting to identify genes involved in brain size evolution. Furthermore, although the majority of neurological differences between species may be due to differences in gene expression, this work, along with studies of microcephaly-related genes, suggest that coding sequence changes also play an important role.  Tables S4, S5). Annotations of coding sequences were confirmed via BLAST searches against each species' genome assembly, using the Homo sapiens exon sequences as queries. Exons missing from automated annotations were completed via these BLAST searches, where possible. We confirmed the existence of or filled gaps in the coding regions of the candidate genes by conducting BLAST searches against the Short Read Archive (SRA). Exon sequences were aligned in Geneious 69 using MUSCLE and default settings (alignments are available in Supplementary Data). Sequences that aligned poorly were removed from alignments and excluded from downstream analyses. Missing alignment data were excluded from PAML analyses (cleandata = 1) 50 . Accordingly, to preserve robust sampling across each gene sequence, species with < 70% gene coverage were excluded, and species sample sizes  Supplementary Table S6).

Statistical analyses.
A common measure used to infer selection pressures acting on coding regions of genes is the ratio of the rates of nonsynonymous to synonymous base changes (dN/dS). Nonsynonymous changes refer to those that result in an amino acid change, while synonymous mutations refer to those that do not cause a change in the amino acid sequence. Aim 1: To determine whether megalencephaly-associated genes evolved under positive selection across primates. To detect positive selection across primates, we employed site models in PAML (version 4.8) 50 , which allow dN/dS ratios to vary across sites but not across lineages 51 . We compared two types of models, including: (1) a "nearly neutral" model (denoted as 'M1a' in PAML) which allows sites to fall into two categories, representing purifying selection (dN/dS < 1) and neutral evolution (dN/dS = 1); and (2) a "positive selection" model (denoted as 'M2a' in PAML) which allows sites to fall into three categories, including purifying selection (dN/dS < 1), neutral evolution (dN/dS = 1), and positive selection (dN/dS > 1) 52 . These nested models were compared using the likelihood ratio test statistic − 2[loglikelihood(M1a) − loglikelihood(M2a)], with degrees of freedom equal to the difference in the number of parameters estimated by each model. When model M2a exhibited a significantly higher likelihood value than model M1a, this was taken as evidence for positive selection. Small negative likelihood ratio values were assumed to be estimates of zero 73 .

Aim 2:
To determine whether selection pressure on megalencephaly-associated genes is linked to measures of primate brain size. To examine relationships between selection pressure (i.e., root-to-tip dN/dS) and either absolute or relative brain size, we followed other studies [74][75][76][77][78] in employing freeratios branch models in PAML (version 4.8) 50 , which allow dN/dS ratios to vary across branches but not across sites. We then calculated root-to-tip dN/dS values for each species by the addition of dN values and dS values from the root to the terminal species branch and taking the ratio of the sums. These values were set as species data and used as predictors in regression models of brain size. Although some previous studies that used this approach incorporated the dN/dS values of terminal branches 79,80 , the root-to-tip dN/dS is a more appropriate measure because it is more inclusive of the evolutionary history of a locus. In addition, some studies have used lineage-specific branch models (i.e., two-branch models), in which one model is run for each lineage with dN/ dS constrained to one value over all branches leading to that lineage, to obtain root-to-tip dN/dS values for each species 11,33 . The approach used here (and in other studies [74][75][76][77][78] ) may produce root-to-tip dN/dS values that are more comparable across species, since species with shared internal branches will have the same values for those branches included in their root-to-tip dN/dS calculations; however, we acknowledge that the free ratio model www.nature.com/scientificreports/ implemented here is relatively more parameter rich than the two-branch model, which may impact parameter estimation. In any case, the results of both methods are expected to be similar since they both rely on reconstructed nucleotide sequences and neither is able to address the uncertainty surrounding rate estimates. For models of absolute brain size, we modelled brain size as a function of root-to-tip dN/dS. We modeled relative brain size by also including body size as a predictor. Additionally, we performed multiple regressions for both relative and absolute brain size models in order to examine relationships with dN and dS as independent variables. Prior to analysis, all variables were log-transformed to ensure that residuals were normally distributed. We report nominal and FDR adjusted p-values 81 . Species do not represent independent data points due to their shared evolutionary history, such that closely related species are more likely to exhibit similar phenotypes to each other than are distantly related species. To control for shared evolutionary history, we used phylogenetic generalized least squares (PGLS) regression models, incorporating the topologies and branch lengths from the GenBank taxonomy consensus tree provided on the 10kTrees website 82 . For each model, we allowed the phylogenetic scaling factor (λ) to take the value of its maximum likelihood. In some of the analyses, maximum-likelihood estimations of λ were equal to zero, and the log-likelihood plots of lambda were very flat (i.e., all values of lambda had very similar likelihood values, suggesting uncertainty regarding the best value of lambda), which may reflect low species sample sizes. Accordingly, these models were run using an average value of lambda, weighted according to likelihood (although results did not change when lambda values of 0 were used for these models). We also repeated these analyses with lambda = 0 and 1.
Previous work has suggested that callitrichids and cheirogaleids underwent recent evolutionary decreases in brain size 3 and that some microcephaly genes actually facilitated this decrease in callitrichids, altering the expected relationship between root-to-tip dN/dS and brain size in these species 35 . Accordingly, we repeated our regression analyses excluding both Callithrix jacchus and Microcebus murinus. To test the sensitivity of these results, we re-ran relevant models removing one species at a time.

Data availability
GenBank accession numbers for all coding sequences retrieved and aligned for the current study are listed in Supplementary Tables S4, S5. All alignments generated and analyzed are available as supplementary material (as .phy files).