Abstract
Ant colonies are higher-level organisms consisting of specialized reproductive and non-reproductive individuals that differentiate early in development, similar to germ–soma segregation in bilateral Metazoa. Analogous to diverging cell lines, developmental differentiation of individual ants has often been considered in epigenetic terms but the sets of genes that determine caste phenotypes throughout larval and pupal development remain unknown. Here, we reconstruct the individual developmental trajectories of two ant species, Monomorium pharaonis and Acromyrmex echinatior, after obtaining >1,400 whole-genome transcriptomes. Using a new backward prediction algorithm, we show that caste phenotypes can be accurately predicted by genome-wide transcriptome profiling. We find that caste differentiation is increasingly canalized from early development onwards, particularly in germline individuals (gynes/queens) and that the juvenile hormone signalling pathway plays a key role in this process by regulating body mass divergence between castes. We quantified gene-specific canalization levels and found that canalized genes with gyne/queen-biased expression were enriched for ovary and wing functions while canalized genes with worker-biased expression were enriched in brain and behavioural functions. Suppression in gyne larvae of Freja, a highly canalized gyne-biased ovary gene, disturbed pupal development by inducing non-adaptive intermediate phenotypes between gynes and workers. Our results are consistent with natural selection actively maintaining canalized caste phenotypes while securing robustness in the life cycle ontogeny of ant colonies.
Main
In his Ants monograph, William Morton Wheeler concluded that there is a striking analogy “between the ant colony and the cell colony which constitutes the body of a Metazoan animal; and many of the laws that control the cellular origin, development, growth, reproduction and decay of the individual Metazoan, are seen to hold good also of the ant society regarded as an individual of a higher order”1. This century-old statement highlights putative parallels between irreversible major transitions to organismal multicellularity with a differentiated germline and altruistic cellular soma on the one hand and to colonial superorganismality with physically differentiated queen and worker castes as higher-level germline and soma on the other hand2. It implies that, once cell fate or caste fate have been determined early in development, individual cells or ant larvae should follow analogous developmental trajectories that give rise to terminally specialized cell types or morphologically distinct adult caste phenotypes, respectively.
Some decades later, Conrad H. Waddington depicted metazoan embryogenesis as a pebble rolling downhill in a rugged epigenetic landscape of divergent valleys, with cells losing pluripotency as they, regardless of minor genetic or environment disturbances, commit to specific developmental trajectories, a process that he termed canalization3,4. In combination, these early insights suggest that there should be Waddington landscapes for ant colony development reflecting the analogous Wheelerian understanding of developmental processes at two hierarchical levels of organismality. More specifically, as (super)organismal differentiation proceeds, we expect individual cells or multicellular ants to increasingly commit to their target functional phenotype within the (super)organism by parallel transcriptomical and anatomical differentiation. Waddington also maintained that the degree of canalization is under natural selection and can be different across organs and tissues within metazoan organisms3, so we expected to also find such differences among castes within superorganismal colonies. While molecular biology technology based on differential gene expression has now largely replaced Waddington’s organismal perception of development, his diagrams remain instructive heuristic tools for analogous understanding of gene regulatory networks (GRNs) that affect cell differentiation in metazoan bodies and, as we conjecture here, caste differentiation in ant colonies.
Recent advances in single-cell transcriptomics have revealed many molecular details of Waddingtonian landscape differentiation5,6,7, while reconstructing developmental trajectories in unprecedented detail and identifying key GRNs for cell fate determination8. However, no studies of comparable ambition have been pursued to track ontogenetic development of superorganismal colonies sensu Wheeler and quantify the canalization properties of caste differentiation. The phylogenetically diverse holometabolous ants, with their clear developmental stages, are particularly inviting to embark on such investigations but studies have so far used pooled samples or obtained rather few individual transcriptomes9,10,11, which has precluded formal analyses of individual heterogeneity during the entire developmental process of caste differentiation. In particular, whether the sequence of larval and pupal caste differentiation is a canalized developmental process, in which specific genes initiate and regulate cascades of differential gene expression while shaping morphologically diverging phenotypes, is unknown.
We used low-input RNA sequencing to obtain >1,400 genome-wide individual transcriptomes covering the major developmental stages of the ants Monomorium pharaonis and Acromyrmex echinatior, while using Drosophila melanogaster for outgroup comparisons. These two ant species belong to the same subfamily, Myrmicinae, but differ in social and developmental characteristics. M. pharaonis is a highly polygynous (multiqueen) invasive ant with a monomorphic worker caste, where caste is known to be determined ‘blastogenically’ in early embryos before eggs hatch12,13. In contrast, the fungus-growing leaf-cutting ant A. echinatior has mostly single queen colonies but a polymorphic worker caste with small workers for brood nursing and gardening and large workers for foraging and colony defence, where caste determination occurs during early larval development14,15. All ants share the same common ancestor that evolved superorganismal colonies with specialized queens and lifetime unmated workers, with only a few cases of secondary loss of the queen caste being known16. The two ant species that we studied therefore probably represent typical models of colony and caste development found throughout most extant ants. We reconstructed developmental trajectories for gyne and worker caste differentiation via genome-wide gene expression profiling, using a novel algorithm for predicting caste phenotype before larvae express morphological differences. We then focused on the larval–pupal transition to quantify caste-specific canalization effects and their underlying pathways and we finally examined some of the key genes regulating caste phenotype canalization.
Results
A transcriptomic atlas for ant development
Developmental trajectory networks constructed from whole-genome transcriptomes (Fig. 1a and Extended Data Fig. 1; Methods) clustered individuals primarily by developmental stage and gradually also by caste phenotype. Adjacent developmental stages always grouped next to each other, as expected when development is a largely continuous process but we also observed distinct clusters in early embryonic stages (0–24 h) and for the late larvae to early pupae transition, indicating more discrete stage-specific transcriptomes. Principal component analysis (PCA) for the combined data from both species showed that the first axis separated the two ant species (P < 0.0001; two-sided t-test) while the second and third axes jointly separated individuals by developmental stage and caste identity (P < 0.05 for two-way analysis of variance (ANOVA) on the association between PC2 and developmental stage and caste; P < 0.005 for one-way ANOVA on the association between PC3 and developmental stage) (Fig. 1b).
Aligning developmental transcriptomes showed 67–81% similarity between the two ant species across the developmental stages (Fig. 1c), reflecting considerable conservation of developmental GRNs in the Myrmicinae subfamily to which both ant species belong. Developmental transcriptomes were more similar for gynes than for workers across the developmental stages, both when comparing the two ant species with each other and when contrasting the two M. pharaonis caste profiles with those of D. melanogaster females (P < 0.01 for all examined stages and for both comparisons; two-sided t-tests) (Fig. 1c). These patterns are consistent with gyne development being under stronger selection constraint than worker development across social insects with permanent caste differentiation11. However, it is important to acknowledge that gynes are elaborations of the ancestral reproductive phenotype of solitary female Hymenoptera, while differentiated worker castes are later innovations whose foundational GRNs evolved analogously, not homologously, in the superorganismal ants, bees and vespine wasps17.
Caste prediction in morphologically undifferentiated larvae
Morphological differences between gyne and worker individuals cannot be detected before the second and third larval instar in M. pharaonis and A. echinatior, respectively14,18. To identify individual caste phenotypes in earlier stages lacking morphological markers, we developed the backward progressive algorithm (BPA), which retrospectively infers the likelihood of individuals belonging to one caste or another (Methods; Extended Data Fig. 2a). BPA assumes that key genes active in the GRN at a specific stage should, albeit with modified expression, also participate in caste differentiation during the subsequent developmental stage, analogous to what is known for key transcription factors that specify cell types during metazoan development5,8,19. We validated BPA using embryonic sex differentiation data from Drosophila (Extended Data Fig. 2b) and confirmed the accuracy of BPA in samples of M. pharaonis larvae with known caste identity (Extended Data Fig. 2c).
We applied BPA to 54 transcriptomes of first instar M. pharaonis larvae (Fig. 2a) and predicted 12 of these to be reproductives (gynes and males) and 18 to be workers with >90% probability. We validated these predictions with RNA fluorescent in situ hybridization (HCR-FISH20) to assess the expression colocalization between vasa, a germline marker of first instar larvae and late embryos21 (Extended Data Fig. 2d) and LOC105839887 and histone-lysine N-methyltransferase SMYD3 (LOC105830671). These two genes exhibit strong differential expression between predicted first instar caste phenotypes (Supplementary Table 1) and have binary gyne-worker expression in second instar larvae. First instar LOC105839887 expression is visible in fat bodies while SMYD3 colocalizes with vasa in the larval gonads (Fig. 2b, left panel). Both genes could be unambiguously detected in individuals with a vasa-specified germline and were always absent in individuals without a germline (Fig. 2b, right panel).
We also applied BPA in A. echinatior where we lacked morphological markers for second and first instar larvae. While third instar gyne larvae of this species can be unambiguously distinguished from worker larvae by their full-body curly hairs14, this pilosity is not yet expressed in second instar larvae. BPA found that the first two PC axes constructed from the second instar transcriptomes separated second instar larvae by body size and third instar larvae by caste identities (Fig. 2c). Further inspection showed that the larger second instar larvae (suspected gynes) in fact have some gyne-like curly hairs in their ventral thorax region (Extended Data Fig. 2e), indicating that caste differentiation in A. echinatior begins before the second larval instar. To our knowledge, BPA is the first algorithm to achieve such accurate backwards predictions of developmental stages.
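The assumption underlying BPA, that genes separating castes at a stage with known phenotypes already carry caste signal at the preceding stage, can be illustrated with a minimal Python sketch. It is not the published algorithm (see Methods and Extended Data Fig. 2a): the function names, the selection of top-ranked marker genes and the summed z-score are assumptions made for illustration only, and the expression values are invented toy data.

```python
from statistics import mean, pstdev

def marker_genes(gyne, worker, top_n):
    """Return (gene index, sign) for the genes with the largest caste
    difference at the stage where caste is known (hypothetical helper)."""
    diffs = [mean(g) - mean(w) for g, w in zip(zip(*gyne), zip(*worker))]
    top = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]), reverse=True)[:top_n]
    return [(i, 1 if diffs[i] > 0 else -1) for i in top]

def caste_scores(earlier, markers):
    """Signed sum of per-gene z-scores among earlier-stage individuals.
    Positive scores lean towards gynes, negative scores towards workers."""
    scores = [0.0] * len(earlier)
    for i, sign in markers:
        column = [sample[i] for sample in earlier]
        mu, sd = mean(column), pstdev(column)
        if sd == 0:
            continue  # gene carries no information at the earlier stage
        for k, value in enumerate(column):
            scores[k] += sign * (value - mu) / sd
    return scores

# Toy data: rows are individuals, columns are three genes.
later_gynes = [[9, 1, 5], [8, 2, 5]]
later_workers = [[2, 8, 5], [1, 9, 5]]
earlier_stage = [[5, 3, 5], [3, 5, 5], [4, 4, 5]]

markers = marker_genes(later_gynes, later_workers, top_n=2)
scores = caste_scores(earlier_stage, markers)
```

In this toy example the first earlier-stage individual receives a positive (gyne-like) score, the second a negative (worker-like) score and the third a score of zero, so it would remain unassigned.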
Caste differentiation in ants is developmentally canalized
We next focused on the overall degree of canalization in genome-wide gene expression. For practical purposes, we defined transcriptome-level canalization as the statistical tendency for individual transcriptomes to start with a unimodal (pluripotent) distribution and gradually change to a bimodal (phenotypically committed) distribution with increasingly distinct peaks as development proceeds. For this purpose, we quantified the distributions of genome-wide developmental potential (Δ) as a gyne or worker individual, using deviations in gene expression from average target profiles in subsequent developmental stages (Methods). The Δ-values range between −1 and 1, with a positive value indicating that development in a target individual is gyne-biased and a negative value indicating worker-biased development. We found that the absolute Δ-value between castes increased steadily in both ant species, while the variance of Δ-values within castes, which we validated not to have been notably affected by technical artifacts (Supplementary information), became gradually reduced as development proceeds (both P < 0.0001; two-way ANOVA) (Fig. 3a and Extended Data Fig. 3). During this process, transcriptomic canalization in gynes was invariably stronger than in workers (P < 0.05 in all stages; two-sided F-tests), both in transcriptomic variation per se and in PCA patterns (Extended Data Fig. 3), indicating a higher degree of transcriptomic canalization in colony germline individuals.
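As an illustration of how a bounded developmental potential of this kind could be computed, the following Python sketch contrasts the distance of an individual transcriptome to the average gyne and worker profiles of the subsequent stage. The exact definition of Δ is given in the Methods and is not reproduced here; the Euclidean distance, the normalized contrast and all function names are assumptions, chosen only because they yield values between −1 and 1 with positive values indicating gyne bias.

```python
import math

def mean_profile(samples):
    """Average expression per gene across the individuals of one caste."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def distance(x, y):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def delta(individual, gyne_target, worker_target):
    """Normalized contrast in [-1, 1]; positive means closer to the gyne target."""
    d_g = distance(individual, gyne_target)
    d_w = distance(individual, worker_target)
    if d_g + d_w == 0:
        return 0.0  # both targets coincide with the individual
    return (d_w - d_g) / (d_w + d_g)

# Toy target profiles from the subsequent stage (two genes).
gyne_target = mean_profile([[4.0, 0.0], [6.0, 0.0]])    # [5.0, 0.0]
worker_target = mean_profile([[0.0, 4.0], [0.0, 6.0]])  # [0.0, 5.0]

committed_gyne = delta([5.0, 0.0], gyne_target, worker_target)
committed_worker = delta([0.0, 5.0], gyne_target, worker_target)
uncommitted = delta([2.5, 2.5], gyne_target, worker_target)
```

Under this sketch, canalization would appear as Δ-values that start near zero in early stages and move towards the two extremes, with shrinking within-caste variance, as development proceeds.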
JH signalling regulates developmental caste canalization
Genome-wide transcriptomic canalization signatures amplified beyond the third instar when pupal metamorphosis starts (Fig. 3a), a critical stage in all holometabolous insects22. To understand the entirety of upstream regulation of caste differentiation, we used generalized linear models to account for the effect of larval body mass (Methods)23 and identified 65 conserved genes with parallel gyne-worker bias that were associated with larval differentiation in both ant species (Supplementary Table 2). These early caste differentially expressed genes (DEGs) are significantly enriched for genes involved in fatty acid and hormone metabolism (Supplementary Table 3)—orthologues of these genes with gyne-biased expression are also highly expressed in the fat-body tissues and tracheal system of Drosophila larvae (Extended Data Fig. 4a). In addition, multiple larval caste DEGs are associated with the juvenile hormone (JH) pathway, a key regulator of larval growth and molting in insects22,24. These included the genes daywake, encoding a haemolymph JH-binding protein, LOC118646735, a duplicate gene of Drosophila Juvenile hormone acid O-methyltransferase (jhamt) and hexamerin25 (Extended Data Fig. 4b), confirming the important role of JH for caste differentiation26,27.
We found that many genes involved in JH and ecdysone metabolism exhibited both body length-specific and caste-specific expression when larvae transition from the third instar to the prepupal stage (Fig. 3b and Extended Data Fig. 5a). In particular, the expression of jhamt, which delays the metamorphic molt28, started decreasing in third instar worker larvae when body length exceeded 0.7 mm, while expression in gyne larvae did not decrease until larvae had reached twice that length of 1.4 mm (Methods) (Fig. 3b). A similar difference occurred in the expression of Ecdysone-induced protein 93 (E93), a downstream transcription factor of the JH pathway for initiating metamorphosis29 (Fig. 3b), which started increasing when worker larval body length exceeded 1.4 mm but not in gyne larvae before they had reached 2.0 mm. These differences in body-length thresholds indicate heterochronic shifts between gyne and worker larvae in the JH signalling pathway, confirming the critical role of the JH pathway in regulating caste differentiation.
To test the role of JH in caste canalization experimentally, we fed a JH analogue (JHA) to third instar worker larvae and precocene I, a JH inhibitor30, to third instar gyne larvae of M. pharaonis. This showed that JHA significantly increased worker body length and that precocene I significantly reduced gyne body length (Fig. 3c). Furthermore, wing buds and simple eyes (ocelli), both typical gyne characters, were induced in the JHA-fed worker larvae (Extended Data Fig. 5c) while precocene I-treated gyne larvae had a significantly higher frequency of abnormal wings when they hatched as adults (P < 0.05 for Fisher’s exact test; Extended Data Fig. 5d). These aberrant patterns of development also confirmed that the JH signalling pathway is a key regulator for caste canalization in ants.
Focal genes mediating canalization
To understand which genes are canalized during caste differentiation in the pupal stage, we developed a gene-specific canalization score (C) to track the developmental dynamics of between-caste gene expression divergence via the ratio of between-caste gene expression difference and stage-specific expression variance within castes (Methods). We identified 1,140 and 2,478 genes showing canalized expression in M. pharaonis (gynes versus workers) and A. echinatior (gynes versus small workers), respectively (C > 3 and P < 0.05 in one-sided Spearman’s correlation tests) (Fig. 4a and Supplementary Table 4). Among these canalized genes, 457 showed the same caste-bias direction in both species, a significantly higher number than the background expectation of 88 (P < 10^−15; Fisher’s exact test), indicating that gene-level canalization is evolutionarily conserved.
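The ratio at the core of such a score can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the score C as defined in the Methods: the function name, the use of the pooled within-caste standard deviation as denominator (so that the ratio is unitless) and the toy expression values are all hypothetical.

```python
from statistics import mean, variance

def stage_divergence(gyne_expr, worker_expr):
    """Absolute difference between caste means for one gene at one stage,
    divided by the pooled within-caste standard deviation (assumed form)."""
    diff = abs(mean(gyne_expr) - mean(worker_expr))
    pooled_var = (variance(gyne_expr) + variance(worker_expr)) / 2
    if pooled_var == 0:
        return float("inf")  # no within-caste variation at all
    return diff / pooled_var ** 0.5

# Toy expression values for one gene: (gynes, workers) per stage.
stages = {
    "2nd instar": ([5.0, 6.0, 7.0], [4.0, 5.0, 6.0]),
    "3rd instar": ([7.0, 8.0, 9.0], [3.0, 4.0, 5.0]),
    "pupa": ([9.5, 10.0, 10.5], [1.5, 2.0, 2.5]),
}
scores = [stage_divergence(g, w) for g, w in stages.values()]
```

For a canalized gene, this ratio would be expected to rise across successive stages because the caste means diverge while the within-caste spread shrinks, as it does in the toy data above.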
Comparison of the conserved canalized genes with tissue-level gene expression data in Drosophila31 showed that gyne-biased canalized genes were highly expressed in ovaries, whereas worker-biased canalized genes were highly expressed in the brain and central nervous system (Extended Data Fig. 6a). Further analyses indicated that gyne-biased canalized genes were significantly enriched in flight muscle functions (for example, tropomyosin-2 (Tm2), troponin I (TnI) and troponin C (TnC)) and female reproductive functions (for example, T-complex protein 1 (Tcp-1), krasavietz (kra) and merry-go-round (mgr)) (P < 0.05 for both categories in hypergeometric tests). In contrast, worker-biased canalized genes were significantly enriched in neuronal and behavioural processes (P < 0.05 for both categories in hypergeometric tests), including twin of eyeless (toy), hormone receptor 51 (Hr51) and octopamine receptors (Octbeta1R and Octalpha2R). These gene functions in a solitary insect are consistent with gyne caste specialization being targeted at dispersal and reproduction and worker caste specialization at more variable ‘somatic’ social tasks (Fig. 4a, Extended Data Fig. 6b and Supplementary Table 5).
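For readers unfamiliar with hypergeometric enrichment tests of the kind reported above, a minimal Python sketch of the upper-tail probability is given here. The function name and the gene counts are hypothetical and do not correspond to the numbers underlying Supplementary Table 5.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from a
    background of N genes, K of which carry the annotation of interest."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total

# Toy example: background of 10 genes, 4 annotated, 3 canalized genes drawn,
# at least 2 of which are annotated.
p_value = hypergeom_upper_tail(k=2, N=10, K=4, n=3)
```

A small P value indicates that the annotation occurs among the canalized genes more often than expected from the background alone.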
Freja as a crucial regulator of queen phenotype
The top gene with gyne-biased canalization in M. pharaonis is Hymenoptera-specific (Methods) and encodes a protein containing a predicted signal peptide and a leucine-rich repeat domain (Extended Data Fig. 7a). Caste-biased expression of this gene begins in second instar larvae and amplifies as development progresses (Extended Data Fig. 7b). In adult gynes, this gene is mainly expressed in the ovaries (Extended Data Fig. 7c) where its expression is restricted mostly to the ovarian follicle cells (Fig. 4b) which are essential for oogenesis32. We therefore named this gene follicle related [gene-]expression in juvenile ants (Freja, goddess of fertility in Old Norse).
We investigated Freja’s function through RNAi knock-down in late third instar gyne larvae. Relative to GFP-RNAi controls, adults of the Freja-RNAi treatment group had significantly reduced body and head size (two-sided t-tests; P < 0.0001 for both; Fig. 4c) and a higher frequency of abnormal wing morphology. As wings and large bodies relative to workers are unambiguous phenotypic markers of the adult reproductive (gyne/queen) caste in ants, we conclude that manipulation of Freja expression disturbed normal developmental canalization.
Because Freja remains highly expressed after pupal eclosion, we also manipulated Freja expression in adult Monomorium gynes with RNAi and examined effects on fertility. Both the size and the number of oocytes were significantly reduced in the Freja-RNAi group compared to the controls (two-sided t-tests; P < 0.05 for both; Fig. 4d and Extended Data Fig. 7e), indicating that Freja’s continued expression is crucial for gyne maturation and fertility after insemination. Thus, in addition to its necessary role in canalization of larval caste divergence, Freja maintains its differentiating functionality in adult gyne phenotypes. However, Freja was not a canalized gene in A. echinatior, a species where workers have retained ovaries to produce unfertilized (male) or inviable trophic eggs33 (Extended Data Fig. 6).
Discussion
We have shown that ant caste differentiation is canalized throughout development in a way that is remarkably analogous to the canalized development of metazoan cell lineages starting with the first cell divisions of a zygote, a potential functional similarity that was noted more than a century ago34. Also the deeper conservation of colony-level germline development is consistent with the secondary evolution of worker subcastes in ants being more common than the evolution of queen subcastes35,36 and appears broadly convergent to what is known from development in animal bodies5,6,37. This suggests that ant colony-level ontogeny, unfolding as caste differentiation, has been maintained by consistent stabilizing selection, in spite of substantial modifications in the details of caste differentiation during the huge adaptive radiation of the ants. We also showed that the JH signalling pathway, known to be important for caste differentiation in social insects in general26,38,39,40,41, mediates a heterochronic shift between gyne and worker larvae, suggesting that crucial metabolic changes control the regulation of both individual body size and the amplification of caste phenotype canalization.
Our study used an algorithmic approach (BPA) to predict caste phenotypes backwards in developmental time to identify gene markers before morphological caste differences emerge, a technique that should be broadly applicable in other kinds of developmental differentiation studies. Our analyses strongly suggest that caste in ants is determined by the interaction of body size and gene expression, rather than by one of these factors alone42. We covered the entire developmental process, starting with genome-wide transcriptomics and then zooming in on the JH and ecdysone pathways to finally focus on specific genes with key roles in canalization of caste development. Intriguingly, experimental inhibition of Freja finally showed that disruption of canalized genes results in non-adaptive intermediate phenotypes between gynes and workers, both early in development and in adults.
Our findings emphasize that caste-differentiated superorganismal colonies, as originally defined by Wheeler34, are shaped by higher-level adaptations to predictably reproduce entire life-cycles. In this process, the complementary development of caste phenotypes requires coordination and buffering via expression changes of canalized genes, analogous to the dynamic regulation of cell differentiation during development of metazoan bodies5,6,43,44. We conjecture that obligate canalization of caste development will be a defining feature for all insect lineages that independently realized irreversible evolutionary transitions to superorganismality (ants, corbiculate bees, vespine wasps and higher termites)2, setting the Wheeler-superorganisms—both annual and perennial—apart from their society-forming sister lineages45. However, our results indicate that even canalized developmental pathways can be changed when relevant genetic variation emerges and selection for directional change in caste phenotype is strong enough. This seems analogous once more with how cell lines diverge during adaptive radiation of multicellular organisms. Selection pressures thus appear broadly similar across domains of organizational complexity but the GRNs behind these convergent developmental processes will be uniquely different across lineages.
Methods
Experimental design
Ultralow-input transcriptome sequencing was performed for individual samples of two ant species, M. pharaonis and A. echinatior (both subfamily Myrmicinae), covering four larval stages, the prepupal stage, the early and late pupal stage and the adult stage (Extended Data Table 1). Individual transcriptomes were also obtained for nine embryonic stages (starting 12 h after egg laying and continuing sampling of older eggs with 12-h intervals) in M. pharaonis. In this ant species, caste is known to be determined ‘blastogenically’ in early embryos12,13, unlike most other ants where caste phenotype is determined during larval development46. For later developmental stages where caste phenotypes were morphologically distinguishable (Extended Data Table 2), ~30 individuals per caste (gynes and workers in M. pharaonis; gynes, large workers and small workers in A. echinatior) were collected for each stage (Extended Data Table 1). For stages where caste phenotype could not be identified morphologically (before the second instar in M. pharaonis and before the third instar in A. echinatior), 30 eggs or 60 larvae per stage were collected to transcriptomically infer caste. Body length and head capsule width were also measured for all larval individuals to obtain another measure of developmental age that integrated with transcriptomic information during analyses. Reference individual transcriptomes for D. melanogaster were further obtained, covering development from newly laid eggs to newly eclosed adults, sampling ~20 eggs at 3-h intervals and ~20 larvae at 6-h intervals. See Supplementary information for details of sample collection, stage and caste identification, RNA and DNA extraction, transcriptome profiling and sample quality control.
Testing RNA quality, constructing cDNA libraries and RNA sequencing
RNA quality testing, production of complementary DNA libraries and sequencing were performed at BGI, China. The quality of RNA samples was tested with Agilent 2100 Bioanalyzer and only samples with good RNA quality (RNA integrity number (RIN) > 4) were retained for RNA sequencing47. Ambion ERCC RNA Spike-In Mix (catalogue no. 4456740) was added to each sample according to the manufacturer’s instructions before cDNA library construction, to be able to later verify the qualities of sequenced RNAseq data.
RNA sequences for each sample were first reverse transcribed into cDNA following the Smart-seq2 protocol48, which was then randomly fragmented with Tn5 enzymes and linked with sequencing adaptors to obtain a complete cDNA library for each individual sample. Primers were then added to the cDNA libraries for PCR amplification and fragments ranging from 150 to 350 base pairs were selected for further cDNA circularization to construct sequencing libraries. The samples were then sequenced on a BGISEQ-500 platform using a 100 nucleotide paired-end ultralow-input RNA sequencing protocol.
In situ hybridization chain reaction
In situ probes
Sequences for LOC105837931 (Freja) (XM_012683128.3), vasa (XM_012686851.3), LOC105839887 (XM_036293539.1), LOC105830671 (Smyd3) (XM_036287663.1), actin5c (XM_012666578.3), tubulin (XM_012685189.3) and septin2 (XM_012667189.2) were downloaded from NCBI and provided to Molecular Instruments for probe set synthesis. Alexa Fluor 488 was used for the detection of vasa; Alexa Fluor 546 was used for the detection of Freja, actin5c and tubulin; Alexa Fluor 594 was used for the detection of LOC105839887; and Alexa Fluor 647 was used for the detection of Smyd3 and septin2.
RNA fluorescence in situ hybridization in M. pharaonis larvae and ovaries
RNA fluorescence in situ hybridization (FISH) was performed following the whole-mount Drosophila HCR v.3.0 protocol20 with some modifications. M. pharaonis larvae and ovaries were fixed at room temperature in scintillation vials with 50% FPE (4% formaldehyde; 0.5× PBS; 25 mM EGTA) and 50% heptane. Fixation time was adjusted so that ovaries were fixed for 30 min, first and second larval instars were fixed for 1–2 h and third larval instars were fixed for 12 h. Following fixation, the lower layer (FPE) was removed and replaced with methanol followed by vigorous shaking. The lower layer was replaced once more with methanol, at which point larvae and ovaries sink to the bottom of the vial. Larvae and ovaries were then dehydrated with several changes of methanol and stored at −20 °C. Proteinase K concentration and treatment time were adjusted to 30 µg ml−1 for 7 min for first and second larval instars and 10 min for third larval instars and ovaries. Following amplification, one SSCT wash (5× SSC; 0.1% Tween-20; pH 7.0) was extended to 1 h with the addition of 4′,6-diamidino-2-phenylindole (DAPI; 1:1,000) for nuclear staining in first and second larval instars or overnight for third larval instars and ovaries.
FISH-stained larvae and ovaries were then transferred into increasing concentrations of glycerol in 5× SSC and mounted in Vectashield or 70% glycerol/ 30% 5× SSC for imaging. Images were captured on a Leica SP5-X inverted confocal laser scanning microscope. Image stacks were processed using Fiji/ImageJ49.
RNA interference experiments
Third instar reproductive (gyne or male) larvae were taken from nests and fixed in place on double-sided adhesive tape. Double-stranded RNA (dsRNA) for interference was then injected into the upper ventral abdomen with a capillary needle (1B100F, World Precision Instrument), prepared with a micropipette puller (P-2000, Sutter) and mounted on an Eppendorf FemtoJet injector (Femtojet 4i and Transferman 4r system, Eppendorf). For adult gynes, dsRNA was injected into the thorax, using the same equipment as in the larval injections, 5, 7 and 10 d after they had eclosed from the pupal stage. For both larval and adult RNAi, 7,500 ng μl−1 of dsRNA was used in both experimental (Freja) and control (GFP) groups. To improve the efficiency of larval injection, lipofectamine 2000 (11668019, Thermo Fisher) was added to the Freja and GFP dsRNA solutions (dsRNA:lipofectamine 2000 ratio of 3:1). The dsRNA was synthesized in vitro, following the instructions of the MEGAscript RNAi Kit (AM1626, Thermo Fisher).
The Freja T7 primer sequences for dsRNA synthesis were:
Forward: 5′-TAATACGACTCACTATAGGGCATCCATATCGTTGAAGGGC-3′
Reverse: 5′-TAATACGACTCACTATAGGGGTCCAGGTCGGTGAAGTTGT-3′
The GFP T7 primer sequences for dsRNA synthesis were:
Forward: 5′-TAATACGACTCACTATAGGGAGTGCTTCAGCCGCTACCC-3′
Reverse: 5′-TAATACGACTCACTATAGGGCATGCCGAGAGTGATCCCG-3′
Quantification of tissue-expression abundance with RT–qPCR
Heads, thoraxes and gasters (fourth and higher abdominal segments) were dissected from newly eclosed gynes and gasters were subdivided by tissue into digestive glands, cuticles, fat bodies and ovaries. Total messenger RNA of the dissected tissues was then isolated using TRIzol reagent (15596018, Thermo Fisher). Reverse transcription was performed using the PrimeScript RT reagent Kit with gDNA Eraser (RR047B, Takara) and mRNA levels were quantified using TB Green Premix EX Taq™ II (Tli RNaseH Plus, RR820A, Takara) on a CFX96™ Real-Time system (BIO-RAD). Expression of Freja was normalized to the expression of EF1α in each sample.
The quantitative PCR with reverse transcription (RT–qPCR) primer sequences for Freja were:
Forward: 5′-AACAGGGCAAACTCAGATATTTAC-3′
Reverse: 5′-AGGCATCGATCGTTATCTCGG-3′
The RT–qPCR primer sequences for EF1α were:
Forward: 5′-TTCATTTATTGCTCTCACATCTACG-3′
Reverse: 5′-ACCGTTGCCCTTTCTACTCTAA-3′
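Normalization of a target gene to a reference gene in RT–qPCR is commonly expressed with the 2^−ΔCt method, which can be sketched in Python as below. The text states only that Freja expression was normalized to EF1α; the use of this particular formula, the assumption of approximately 100% amplification efficiency, the function name and the Ct values are all illustrative assumptions.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression by the 2^-dCt method, assuming that both
    primer pairs amplify with roughly 100% efficiency."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** -delta_ct

# Hypothetical Ct values per tissue: (target gene, reference gene).
ct_values = {
    "ovary": (20.0, 18.0),
    "fat body": (25.0, 18.0),
}
relative = {tissue: relative_expression(t, r) for tissue, (t, r) in ct_values.items()}
```

With these invented values the target gene would be 32-fold more abundant in ovaries than in fat bodies relative to the reference gene, because the two tissues differ by five cycles in ΔCt.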
Quantification of ovary developmental status
Ovary developmental status was quantified by the number and the surface area of yolky oocytes. Ovaries were dissected and collected from 12-day-old gynes, where the number and the total surface area of yolky oocytes were counted and measured, respectively, from individual samples.
JHA and precocene I feeding experiment
Third instar worker larvae of intermediate body length were treated with a juvenile hormone analogue (JHA; methoprene, MCE HY-B1161; 5 mg ml−1 in 10% EtOH PBS solution), while control worker larvae were fed with 10% EtOH PBS solution. Both 0.5 mg ml−1 JHA and 10% EtOH were mixed with food and offered on days 1, 3 and 8. To confirm the efficiency of JHA, the treated worker larvae were collected 24 h after day 1 feeding. After isolating total RNA, the expression of Kr-h1, a downstream gene of JH, was determined in both control and JHA groups. The expression of Kr-h1 was normalized to EF1α in each sample. The RT–qPCR primer sequences for Kr-h1 were:
Forward: 5′-AGGATATAACGCAGCTTCCTGT-3′
Reverse: 5′-GTGTGGCAGCGAACATTGTG-3′
Third instar gyne larvae of intermediate body length were treated with 1% precocene I (Sigma-195855; in 10% EtOH in PBS), while control gyne larvae were fed with 10% EtOH PBS solution. Both 1% precocene I and 10% EtOH were mixed with food and offered on days 1, 3, 5 and 7.
To reveal the effects of JHA and precocene I on larval development, pupal stage samples were collected and body lengths were measured under a stereomicroscope (SMZ18, Nikon). For JHA and control groups, cohort percentages of pupation on days 10, 15 and 20 were also recorded to check whether development time was affected. When pupae eclosed into adults, their head width across the eyes and scape lengths (proxies of body size) were measured in the control and JHA groups using the same stereomicroscope.
Constructing the developmental trajectory network
Within each species, the developmental trajectory network was constructed on the basis of the transcriptomic similarities among all samples. These similarities, as described in the Supplementary Information ‘Sample quality control’, formed a pair-wise similarity matrix among all samples. Values of this matrix were used to construct a weighted undirected network, where nodes represent samples and edges represent similar transcriptome profiles between connected samples. Edge weights were the Spearman’s correlation coefficients between the connected samples, indicating the level of transcriptome similarity. To increase the overall signal of the network, weak edges (weight < 0.8) were removed. This threshold follows the empirical suggestion that correlation coefficients for anisogenic samples should be >0.8 (ref. 50).
The weighted undirected network was then visualized with the force-directed layout algorithm using the igraph package (v.1.2.9) in R (ref. 51). This algorithm takes the weights of edges into account, so that nodes with strong edges (samples with highly similar transcriptomes) cluster together. For visualization purposes, edge colour was set according to edge weight, ranging from white (weight = 0.8) to black (weight = 1).
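The network construction step can be illustrated with a minimal Python sketch. It is not the authors' R/igraph code: the function names (`ranks`, `spearman`, `build_network`) and the toy samples are hypothetical, and only the rule stated above (Spearman's correlation as edge weight, edges with weight < 0.8 removed) is taken from the text.

```python
from itertools import combinations

def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def build_network(expr, threshold=0.8):
    """expr: dict sample -> expression values (same gene order in every sample).
    Returns {(sample_a, sample_b): weight}, keeping only edges with weight >= threshold."""
    edges = {}
    for a, b in combinations(sorted(expr), 2):
        w = spearman(expr[a], expr[b])
        if w >= threshold:
            edges[(a, b)] = w
    return edges

# Hypothetical toy transcriptomes (five genes each).
toy = {
    "s1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "s2": [1.1, 2.2, 2.9, 4.5, 5.1],  # same gene ranking as s1
    "s3": [5.0, 4.0, 3.0, 2.0, 1.0],  # reversed gene ranking
}
network = build_network(toy)
```

In this toy case only the s1-s2 edge survives the threshold; the resulting edge list could then be passed to any force-directed layout routine.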
Alignment of developmental transcriptomes and measurement of between-species transcriptomic similarity
Rate of development and number of instars differ between ant species and castes (refs. 1,52). In our study species, there are three larval instars in M. pharaonis and three to four in A. echinatior. Embryogenesis in M. pharaonis lasts for ~9 d, which is long compared to D. melanogaster where eggs hatch after ~24 h. Developmental stages thus need to be aligned between species before comparative analyses can be done.
To align developmental stages, either between castes or between species, developmental stage similarities were measured by their overall transcriptomic distance, calculated as 1 − Spearman’s correlation coefficient of transcriptomes (or orthologous transcriptomes for cross-species comparison). Stages with the lowest overall transcriptomic distance (mean value for all same-stage samples) were assessed as the best-matched (aligned) stages and used for downstream between-caste and between-species comparisons.
Between-species transcriptomic similarity for each developmental stage was calculated as the mean value of 1 − Spearman’s correlation coefficients among samples of the best-aligned stages, separately for each caste.
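As a hedged illustration of the alignment rule, the sketch below matches each query stage to the reference stage with the lowest mean transcriptomic distance (1 − Spearman's correlation over all sample pairs). The stage names, the toy values and the helper names are hypothetical; the real analysis used orthologous genes and many samples per stage.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5

def align_stages(query, reference):
    """query, reference: dict stage -> list of samples (expression vectors).
    Returns dict query stage -> best-matched reference stage."""
    best = {}
    for q_stage, q_samples in query.items():
        mean_dist = {}
        for r_stage, r_samples in reference.items():
            d = [1 - spearman(q, r) for q in q_samples for r in r_samples]
            mean_dist[r_stage] = sum(d) / len(d)
        best[q_stage] = min(mean_dist, key=mean_dist.get)
    return best

# Hypothetical toy data: four orthologous genes, one sample per stage.
reference = {"larva": [[1.0, 2.0, 3.0, 4.0]], "pupa": [[4.0, 3.0, 2.0, 1.0]]}
query = {"L3": [[1.0, 2.0, 4.0, 3.0]], "P1": [[4.0, 3.0, 1.0, 2.0]]}
alignment = align_stages(query, reference)
```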
The BPA
We developed a new algorithm that allows backwards prediction of caste identity on the basis of the sequential overall transcriptomic patterns. The algorithm compares (1) the transcriptomes of individuals at a target developmental stage with unknown caste identity and (2) the transcriptomes of individuals at the later stage where caste identity was known or had been assigned in a previous round. For each round of prediction, BPA performed four steps: normalization, feature selection, model training and prediction.
Because prediction data (of differentiation in unassigned transcriptomes of the target developmental stage) and training data (caste-assigned transcriptomes from the subsequent stage) represent two continuous developmental stages next to each other in time, developmental effects are expected to always contribute, but with quantitatively different effects, to both datasets. The first step of BPA (normalization) therefore removes such developmental effects by subtracting the mean expression levels and normalizing the expression variation across the two datasets, using the ComBat function (from the sva package, v.3.40.0) in R, with developmental stage set as the batch covariate (ref. 53). The normalized transcriptomes thus represent expression levels that are independent of developmental stage and with a maximal likelihood to reflect segregation by caste in both datasets.
The second essential step of BPA is feature selection. This starts with a PCA of the prediction data, assuming that one or several of the PC axes from this dataset should be related to as-yet-unspecified caste identities in the target stage. We thus assumed that one or several of the top PC axes should include the best-possible set of caste PC axes driven by the expression difference between the caste phenotypes not yet identified. The second substep of feature selection is then to confront these PC axes with training data, by projecting them on the samples of the subsequent developmental stage. This comparison uses singular value decomposition to extract the coefficients of each PC axis and then multiplies them with the training data. This process produces new PC scores for each individual in the training data, which then allows the third substep of identifying the PC axes that best separate the known caste identities in the training data when performing ANOVA. The most significant PC axes are then assumed to be the shared caste PCs for both prediction and training data.
With the selected candidate feature (the best-fitting PC axes for caste) in place, the third and fourth steps of BPA are model training and prediction. We first applied linear discriminant analysis on the known caste PC values to train a predictive model for caste segregation. We then predicted individual caste identities of target stage individuals and assigned a probability of being gyne (reproductive of unspecified sex before the second larval instar) or worker to each individual. Once a round is completed, a prediction result for developmental stage n (Sn) can then be used to predict individual caste identities of transcriptomes at stage n − 1 (Sn−1), following the same four steps as in the previous round of prediction, except that now the samples at Sn−1 were used as prediction data and the samples at Sn as training data.
For BPA prediction of first instar larvae in M. pharaonis, body size independent caste DEGs of second instar larvae (number of genes = 173) were used for feature selection. Body size independent caste DEGs were identified from samples of these second instar larvae with approximately equal body lengths (ranging between 0.52 and 1.08 mm), using DESeq2 (ref. 54) with the model: Exp ≈ caste + log(body length) (see section Detecting DEGs between castes). A gene was retained for feature selection if its adjusted P value was <0.05 and its log2 expression fold-change between castes was >0.5. Compared to using all genes, our use of caste DEGs for feature selection substantially increased the accuracy in our testing dataset (Extended Data Fig. 2c), probably because it removed the housekeeping genes whose expression is unrelated to caste differentiation. See Supplementary information for ‘Validating the accuracy of BPA’ and ‘Testing the influence of sex on early developmental transcriptomes in M. pharaonis’.
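One round of the four BPA steps can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the published implementation: per-stage standardization stands in for ComBat, PCA is done by power iteration, only the single PC axis with the highest ANOVA F statistic is kept, and the linear discriminant model is one-dimensional. All function names and the simulated data are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def standardize(samples):
    """Step 1 stand-in (assumption, not ComBat): centre and scale each gene within a stage."""
    n = len(samples)
    cols = []
    for col in zip(*samples):
        mu = sum(col) / n
        sd = math.sqrt(sum((v - mu) ** 2 for v in col) / (n - 1)) or 1.0
        cols.append([(v - mu) / sd for v in col])
    return [list(row) for row in zip(*cols)]

def top_pcs(data, k, iters=300):
    """Leading k principal axes of centred data (power iteration with deflation)."""
    g = len(data[0])
    cov = [[sum(r[a] * r[b] for r in data) for b in range(g)] for a in range(g)]
    rng, axes = random.Random(0), []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(g)]
        for _ in range(iters):
            w = [dot(row, v) for row in cov]
            for u in axes:  # remove components along axes already found
                proj = dot(w, u)
                w = [x - proj * y for x, y in zip(w, u)]
            norm = math.sqrt(dot(w, w))
            v = [x / norm for x in w]
        axes.append(v)
    return axes

def f_stat(scores, labels):
    """One-way ANOVA F statistic of PC scores grouped by caste label."""
    groups = {}
    for s, lab in zip(scores, labels):
        groups.setdefault(lab, []).append(s)
    grand = sum(scores) / len(scores)
    means = {lab: sum(v) / len(v) for lab, v in groups.items()}
    between = sum(len(v) * (means[lab] - grand) ** 2 for lab, v in groups.items())
    within = sum((x - means[lab]) ** 2 for lab, v in groups.items() for x in v)
    return (between / (len(groups) - 1)) / (within / (len(scores) - len(groups)))

def lda_1d(scores, labels, positive="gyne"):
    """Two-class LDA on one axis; returns a function giving P(positive | score)."""
    pos = [s for s, lab in zip(scores, labels) if lab == positive]
    neg = [s for s, lab in zip(scores, labels) if lab != positive]
    mp, mn = sum(pos) / len(pos), sum(neg) / len(neg)
    var = (sum((x - mp) ** 2 for x in pos) + sum((x - mn) ** 2 for x in neg)) / (len(scores) - 2)
    log_prior = math.log(len(pos) / len(neg))
    def prob(x):
        z = log_prior + (x * (mp - mn) - (mp ** 2 - mn ** 2) / 2) / var
        return 1 / (1 + math.exp(-max(-700.0, min(700.0, z))))
    return prob

def bpa_round(target, training, training_labels, n_pcs=2):
    tgt, trn = standardize(target), standardize(training)          # 1. normalization
    axes = top_pcs(tgt, n_pcs)                                      # 2a. PCA on target stage
    trn_scores = [[dot(r, a) for a in axes] for r in trn]           # 2b. project training data
    best = max(range(n_pcs),                                        # 2c. most caste-associated PC
               key=lambda k: f_stat([s[k] for s in trn_scores], training_labels))
    prob = lda_1d([s[best] for s in trn_scores], training_labels)   # 3. model training
    return [prob(dot(r, axes[best])) for r in tgt]                  # 4. prediction

# Simulated toy data: genes 0-1 differ between castes, genes 2-3 are noise.
rng = random.Random(1)
def toy_sample(caste):
    shift = 2.0 if caste == "gyne" else -2.0
    return [shift + rng.gauss(0, 0.3), shift + rng.gauss(0, 0.3),
            rng.gauss(0, 0.3), rng.gauss(0, 0.3)]

true_target = ["gyne", "worker"] * 8
training_labels = ["gyne", "worker"] * 8
target = [toy_sample(c) for c in true_target]
training = [toy_sample(c) for c in training_labels]
probs = bpa_round(target, training, training_labels)
predicted = ["gyne" if p > 0.5 else "worker" for p in probs]
```

To go further back in development, the `predicted` labels would serve as `training_labels` for the next-earlier stage, as described for stages Sn and Sn−1 above.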
Quantification of transcriptome variation and difference
Within-stage transcriptome variation was measured as the mean value of 1 − Spearman’s correlation coefficient between the transcriptome of a target sample and the transcriptomes of all other samples within the same stage, either regardless of caste identity (overall measurement) or separately for each caste (within-caste measurement).
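Between-caste transcriptomic differences were calculated as the average expression difference between gynes and workers (of the same stage) for all genes. The expression difference of a single gene (g) was calculated as:
$$ d_g = \left| \overline{\mathrm{Exp}}_{g,\mathrm{gyne}} - \overline{\mathrm{Exp}}_{g,\mathrm{worker}} \right| $$
where $\overline{\mathrm{Exp}}_{g,\mathrm{caste}}$ is the mean log2-normalized expression level of gene g across same-stage individuals of that caste and the absolute value removes the sign difference between gyne-biased and worker-biased expression.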
As the gene expression levels were already normalized by log2, the expression difference between castes is equivalent to the absolute value of log2 expression fold-change between castes.
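A minimal sketch of this calculation, assuming that the per-gene difference is taken between caste means of log2 expression values (the function name and toy values are hypothetical):

```python
def between_caste_difference(gyne, worker):
    """gyne, worker: lists of samples; each sample is a list of log2 expression values.
    Returns (per-gene absolute differences of caste means, their average over genes)."""
    n_genes = len(gyne[0])
    diffs = []
    for g in range(n_genes):
        mean_gyne = sum(s[g] for s in gyne) / len(gyne)
        mean_worker = sum(s[g] for s in worker) / len(worker)
        # On the log2 scale this equals the absolute log2 fold-change.
        diffs.append(abs(mean_gyne - mean_worker))
    return diffs, sum(diffs) / n_genes

# Hypothetical toy data: two individuals per caste, three genes.
gyne = [[5.0, 2.0, 7.0], [7.0, 2.0, 7.0]]
worker = [[3.0, 4.0, 7.0], [5.0, 4.0, 7.0]]
per_gene, overall = between_caste_difference(gyne, worker)
```

Here the first gene is gyne-biased and the second worker-biased, yet both contribute the same absolute difference of 2 to the stage-level average.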
Quantification of caste developmental potential
Developmental potential of target individuals was based on the transcriptomic distance between a target individual and its focal caste (measured as the average transcriptome of all same-caste individuals) at the subsequent developmental stage. If the transcriptomic distance between a target individual and a representative gyne was smaller than the distance between that target individual and a representative worker, the target individual was classified as being more likely to start developing into (or continue its development into) a gyne rather than a worker.
The mean transcriptomic differences between stages were first normalized by standardizing the expression level of each gene for all same-stage individuals. This step removed the quantitative differences between developmental stages (see first step (normalization) in The BPA section). The developmental potential (Δ) of each individual (i) at stage t was then calculated with the following formula:
$$ \Delta_{i,t} = \frac{\mathrm{dist}(i, \mathrm{worker}_{t+1}) - \mathrm{dist}(i, \mathrm{gyne}_{t+1})}{\mathrm{dist}(\mathrm{gyne}_{t+1}, \mathrm{worker}_{t+1})} $$
Here, $\mathrm{dist}(i, \mathrm{caste}_{t+1})$ is the transcriptomic distance between individual i and the focal caste at the subsequent stage and is calculated as the Manhattan distance, a robust measure for transcriptomes that is commonly used for arithmetic calculations. The transcriptomic distance difference between castes was then normalized by $\mathrm{dist}(\mathrm{gyne}_{t+1}, \mathrm{worker}_{t+1})$, which measures the transcriptomic distance between a focal gyne and a focal worker at the subsequent stage, so that $\Delta_{i,t}$ becomes a dimensionless measure. $\Delta_{i,t} = 1$ then indicates that individual i is equivalent to a representative gyne in the subsequent stage while $\Delta_{i,t} = -1$ indicates that the individual is equivalent to the representative worker.
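The developmental potential can be sketched as follows, assuming the form implied by the description: distance to the representative worker minus distance to the representative gyne, scaled by the gyne-worker distance, so that the end points are +1 and −1. Function names and toy values are hypothetical, and the per-stage standardization is assumed to have been applied already.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two expression vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def centroid(samples):
    """Average transcriptome of a group of same-caste individuals."""
    return [sum(col) / len(samples) for col in zip(*samples)]

def developmental_potential(individual, gynes_next, workers_next):
    """Delta in [-1, 1] when the individual lies between the two caste centroids."""
    g, w = centroid(gynes_next), centroid(workers_next)
    return (manhattan(individual, w) - manhattan(individual, g)) / manhattan(g, w)

# Hypothetical toy data: two genes, two individuals per caste at stage t+1.
gynes_next = [[2.0, 2.0], [4.0, 4.0]]    # centroid (3, 3)
workers_next = [[0.0, 0.0], [2.0, 2.0]]  # centroid (1, 1)
delta_gyne_like = developmental_potential([3.0, 3.0], gynes_next, workers_next)
delta_worker_like = developmental_potential([1.0, 1.0], gynes_next, workers_next)
delta_intermediate = developmental_potential([2.0, 2.0], gynes_next, workers_next)
```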
Quantification of gene-level canalization scores
Gene-level canalization scores were calculated based on the developmental dynamics of expression divergence between gyne and worker castes. For each developmental stage with known/predicted caste phenotypes, a modified t score for expression divergence of a target gene (g) was first calculated as:
Here, sp is the pooled standard deviation for the expression levels of gynes and workers:
A high absolute value of a t score indicates a high between-caste expression difference or a low within-caste expression variance(s).
On the basis of the t scores across developmental stages, the canalization score (C) was then quantified to measure the developmental trend for the expression differences between castes. We defined the canalization score as:
where Pg is the P value of the correlation test for the absolute values of t scores across developmental stages and tg,final is the t score of the late pupal stage when the morphological differentiation process between gynes and workers is largely completed. With the canalization score, we can thus capture the canalization level for each gene because a high value of −log10(Pg) indicates an increasing between-caste difference or a decreasing within-caste variance across developmental stages and a high absolute value of tg,final indicates a large between-caste expression divergence at the terminal stage.
On the basis of these canalization scores, we defined a gene as being canalized if Pg was <0.05 and the absolute value of Cg was >3.
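The sketch below illustrates how these quantities could fit together. Only the final threshold rule (Pg < 0.05 and |Cg| > 3) is taken directly from the text. The t score (difference of caste means divided by the standard pooled standard deviation) and the canalization score (−log10(Pg) multiplied by the signed final-stage t score) are assumed forms chosen to match the verbal description, and all names and values are hypothetical.

```python
import math

def pooled_sd(a, b):
    """Standard pooled standard deviation of two groups (assumed form)."""
    def sum_sq(v):
        m = sum(v) / len(v)
        return sum((x - m) ** 2 for x in v)
    return math.sqrt((sum_sq(a) + sum_sq(b)) / (len(a) + len(b) - 2))

def t_score(gyne, worker):
    """Assumed form: difference of caste means divided by the pooled SD."""
    mean_diff = sum(gyne) / len(gyne) - sum(worker) / len(worker)
    return mean_diff / pooled_sd(gyne, worker)

def canalization_score(p_value, t_final):
    """Assumed form: -log10(P_g) times the signed t score of the final stage."""
    return -math.log10(p_value) * t_final

def is_canalized(p_value, c_score):
    """Rule stated in the text: P_g < 0.05 and |C_g| > 3."""
    return p_value < 0.05 and abs(c_score) > 3

# Hypothetical expression values of one gene at the late pupal stage.
t_final = t_score([9.0, 10.0, 11.0], [4.0, 5.0, 6.0])
c_score = canalization_score(0.01, t_final)
```

Under these assumptions a positive score marks gyne-biased and a negative score worker-biased canalization, which is why the threshold is applied to the absolute value.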
Identifying the phylogenetic origin of Freja
To investigate the phylogenetic origin of Freja (NCBI ID: LOC105837931), the orthologue group of this gene, that is, the set of genes (including both orthologues and paralogues) descended from a single gene in the last common ancestor, was identified (with OrthoFinder) among 17 selected species, including 11 Hymenoptera and 6 species outside the Hymenoptera (Supplementary information on ‘Detection of orthologues and homologues across species’). All gene members of the Freja orthologue group were found exclusively in the Hymenoptera, indicating that Freja is a hymenopteran order-specific gene.
Multiple sequence alignments were further performed on the protein sequences of the Freja orthologue group, using T-Coffee (v.13.45.0) with default parameters (ref. 55). With the aligned protein sequences, a gene tree was reconstructed, using IQ-TREE (v.2.1.4) with 1,000 replicates for bootstrapping and 1,000 replicates for the Shimodaira–Hasegawa approximate likelihood ratio test (ref. 56).
Body-length threshold regression model
To identify the threshold for a change in expression dynamics at a certain larval body size, gene expression levels of each caste were fitted with a continuous two-phase (segmented) model (M11) using the R package chngpt (v.2021.5-12; ref. 57). The threshold model can be expressed as:
Here, the threshold regression model detects a significant change of slopes (α and β) before and after a body size threshold. A threshold model is significant if the P value for the log likelihood ratio test between the threshold model and the null model, Exp ≈ body size, is <0.05.
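For intuition, a continuous two-phase model can be fitted by a simple grid search over candidate thresholds. The sketch below assumes the hinge form Exp = a + b·size + c·max(size − e, 0) and ordinary least squares; it does not reproduce the estimation or the likelihood ratio test implemented in chngpt, and the function names and toy data are hypothetical.

```python
def solve(a, b):
    """Solve the small linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def ols(design, y):
    """Least-squares coefficients and residual sum of squares."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    coef = solve(xtx, xty)
    rss = sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2 for r, yi in zip(design, y))
    return coef, rss

def fit_segmented(size, expr):
    """Grid search over interior thresholds e; returns (e, [a, b, c], rss)."""
    best = None
    for e in sorted(set(size))[1:-1]:
        coef, rss = ols([[1.0, s, max(s - e, 0.0)] for s in size], expr)
        if best is None or rss < best[2]:
            best = (e, coef, rss)
    return best

# Hypothetical toy data: flat expression up to size 4, then a slope of 2.
size = [0, 1, 2, 3, 4, 5, 6, 7, 8]
expr = [1.0, 1.0, 1.0, 1.0, 1.0, 3.0, 5.0, 7.0, 9.0]
threshold, coef, rss_segmented = fit_segmented(size, expr)
_, rss_null = ols([[1.0, s] for s in size], expr)  # null model: Exp ~ body size
```

Comparing `rss_segmented` with `rss_null` conveys the idea behind testing the threshold model against the null model, although a formal test requires the procedure provided by the package.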
Detecting DEGs between castes
Caste DEGs at each developmental stage were detected using the DESeq2 (v.1.32.0) package in R (ref. 54). For all samples within each developmental stage, a read count matrix of transcriptomes (output from the Salmon mapping; see above) was loaded using tximport (ref. 58), which integrated expression profiles from the transcript level to the gene level. With these gene-level transcriptomes, DESeq2 was then used to model the expression level of each gene as Exp ≈ caste. Genes were defined as caste DEGs for a target developmental stage if their adjusted P values were <0.05.
To reduce the confounding effect of body size on gene expression differences between castes (for example, second instar sexuals are always larger than second instar workers, so the expression difference might be the result of a larger body size; ref. 23), body length measurements were further integrated to adjust for the influence of body length and thus identify body size independent caste DEGs for second instar larvae. The expression level of each gene was modelled as:
Exp ≈ caste + log(body length)
A gene was then assessed as a body size independent caste DEG if the adjusted P value for caste was <0.05 and there was at least a 1.6-fold expression difference between castes. The expression difference between castes was estimated with a robust linear regression model (ref. 59).
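The final filtering rule can be written as a short sketch. It assumes that the model output has already been summarized as an adjusted P value for caste and a log2 fold-change per gene; the gene names, values and function name are hypothetical.

```python
import math

def body_size_independent_degs(results, alpha=0.05, min_fold=1.6):
    """results: dict gene -> (adjusted P value for caste, log2 fold-change between castes).
    Keeps genes with adjusted P < alpha and at least a min_fold expression difference."""
    cutoff = math.log2(min_fold)  # a 1.6-fold difference is ~0.68 on the log2 scale
    return sorted(
        gene for gene, (padj, lfc) in results.items()
        if padj < alpha and abs(lfc) >= cutoff
    )

# Hypothetical model output for four genes.
toy_results = {
    "geneA": (0.001, 1.2),   # significant, gyne-biased
    "geneB": (0.20, 2.0),    # large difference but not significant
    "geneC": (0.01, 0.3),    # significant but below the fold threshold
    "geneD": (0.04, -0.9),   # significant, worker-biased
}
degs = body_size_independent_degs(toy_results)
```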
Determination of tissue origins of gene expression based on Drosophila database
Tissue origins of expression of target genes were based on the D. melanogaster gene expression atlas (FlyAtlas2; ref. 31). For larval stages, the expression data from the main larval tissues, including brain, midgut, hindgut, Malpighian tubules, fat body, salivary gland and trachea, were used and their relative expression levels were calculated on the basis of their transcripts per million (TPM) values:
where one pseudo count was added to obtain a robust estimation in case of low TPM values.
For the pupal and adult stages, the expression data from adult female tissues, including brain, eye, thoracicoabdominal ganglion, midgut, hindgut, Malpighian tubules, fat body, salivary gland and ovary, were used. Relative expression levels were calculated as in the larval stages (see above).
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Data availability
All transcriptomic data and the Whole Genome Shotgun project of A. echinatior are deposited at NCBI GEO and GenBank (BioProject Number: PRJNA767561).
Code availability
Schematic diagrams and computational codes developed for this project and our in-house genome annotation for A. echinatior can be found at: https://github.com/BitaoQiu/devo-ants
References
Wheeler, W. M. Ants: Their Structure, Development and Behavior (Columbia Univ. Press, 1910).
Boomsma, J. J. & Gawne, R. Superorganismality and caste differentiation as points of no return: how the major evolutionary transitions were lost in translation. Biol. Rev. 93, 28–54 (2018).
Waddington, C. H. The Strategy of the Genes: A Discussion of Some Aspects of Theoretical Biology (George Allen & Unwin, 1957).
Waddington, C. H. Organisers and Genes (Cambridge Univ. Press, 1940).
Farrell, J. A. et al. Single-cell reconstruction of developmental trajectories during zebrafish embryogenesis. Science 360, eaar3131 (2018).
Wagner, D. E. et al. Single-cell mapping of gene expression landscapes and lineage in the zebrafish embryo. Science 360, eaar4362 (2018).
Sladitschek, H. L. et al. MorphoSeq: full single-cell transcriptome dynamics up to gastrulation in a chordate. Cell 181, 922–935 (2020).
Schiebinger, G. et al. Optimal-transport analysis of single-cell gene expression identifies developmental trajectories in reprogramming. Cell 176, 928–943 (2019).
Schrader, L., Simola, D. F., Heinze, J. & Oettler, J. Sphingolipids, transcription factors, and conserved toolkit genes: developmental plasticity in the ant Cardiocondyla obscurior. Mol. Biol. Evol. 32, 1474–1486 (2015).
Warner, M. R., Qiu, L., Holmes, M. J., Mikheyev, A. S. & Linksvayer, T. A. Convergent eusocial evolution is based on a shared reproductive groundplan plus lineage-specific plastic genes. Nat. Commun. 10, 2651 (2019).
Warner, M. R., Mikheyev, A. S. & Linksvayer, T. A. Genomic signature of kin selection in an ant with obligately sterile workers. Mol. Biol. Evol. 34, 1780–1787 (2016).
Edwards, J. P. Caste Regulation and Determination in the Pharaoh’s Ant Monomorium pharaonis (L.). PhD thesis, Univ. of Southampton (1985).
Khila, A. & Abouheif, E. Evaluating the role of reproductive constraints in ant social evolution. Philos. Trans. R. Soc. B 365, 617–630 (2010).
Adams, R. M. M. et al. Hairs distinguish castes and sexes: identifying the early ontogenetic building blocks of a fungus-farming superorganism (Hymenoptera: Formicidae). Myrmecol. News 31, 201–216 (2021).
Schmidt, A. M., D’Ettorre, P. & Pedersen, J. S. Low levels of nestmate discrimination despite high genetic differentiation in the invasive pharaoh ant. Front. Zool. 7, 20 (2010).
Peeters, C. Convergent evolution of wingless reproductives across all subfamilies of ants, and sporadic loss of winged queens (Hymenoptera: Formicidae). Zenodo https://doi.org/10.5281/zenodo.844241 (2012).
Qiu, B. et al. Towards reconstructing the ancestral brain gene-network regulating caste differentiation in ants. Nat. Ecol. Evol. 2, 1782–1791 (2018).
Pontieri, L. et al. From egg to adult: a developmental table of the ant Monomorium pharaonis. Preprint at bioRxiv https://doi.org/10.1101/2020.12.22.423970 (2020).
Holmberg, J. & Perlmann, T. Maintaining differentiated cellular identity. Nat. Rev. Genet. 13, 429–439 (2012).
Choi, H. M. T. et al. Third-generation in situ hybridization chain reaction: multiplexed, quantitative, sensitive, versatile, robust. Development 145, dev165753 (2018).
Khila, A. & Abouheif, E. Reproductive constraint is a developmental mechanism that maintains social harmony in advanced ant societies. Proc. Natl Acad. Sci. USA 105, 17884–17889 (2008).
Truman, J. W. & Riddiford, L. M. The evolution of insect metamorphosis: a developmental and endocrine view. Philos. Trans. R. Soc. B 374, 20190070 (2019).
Montgomery, S. H. & Mank, J. E. Inferring regulatory change from gene expression: the confounding effects of tissue scaling. Mol. Ecol. 25, 5114–5128 (2016).
Nijhout, H. F. et al. The developmental control of size in insects. WIREs Dev. Biol. 3, 113–134 (2014).
Zhou, X., Tarver, M. R. & Scharf, M. E. Hexamerin-based regulation of juvenile hormone-dependent gene expression underlies phenotypic plasticity in a social insect. Development 134, 601–610 (2006).
Libbrecht, R. et al. Interplay between insulin signaling, juvenile hormone, and vitellogenin regulates maternal effects on polyphenism in ants. Proc. Natl Acad. Sci. USA 110, 11050–11055 (2013).
Penick, C. A., Prager, S. S. & Liebig, J. Juvenile hormone induces queen development in late-stage larvae of the ant Harpegnathos saltator. J. Insect Physiol. 58, 1643–1649 (2012).
Shinoda, T. & Itoyama, K. Juvenile hormone acid methyltransferase: a key regulatory enzyme for insect metamorphosis. Proc. Natl Acad. Sci. USA 100, 11986–11991 (2003).
Belles, X. Krüppel homolog 1 and E93: the doorkeeper and the key to insect metamorphosis. Arch. Insect Biochem. 103, e21609 (2020).
Amsalem, E., Teal, P., Grozinger, C. M. & Hefetz, A. Precocene-I inhibits juvenile hormone biosynthesis, ovarian activation, aggression and alters sterility signal production in bumble bee (Bombus terrestris) workers. J. Exp. Biol. 217, 3178–3185 (2014).
Leader, D. P., Krause, S. A., Pandit, A., Davies, S. A. & Dow, J. A. T. FlyAtlas 2: a new version of the Drosophila melanogaster expression atlas with RNA-Seq, miRNA-Seq and sex-specific data. Nucleic Acids Res. 46, gkx976 (2017).
Wu, X., Tanwar, P. S. & Raftery, L. A. Drosophila follicle cells: morphogenesis in an eggshell. Semin. Cell Dev. Biol. 19, 271–282 (2008).
Dijkstra, M. B., van Zweden, J. S., Dirchsen, M. & Boomsma, J. J. Workers of Acromyrmex echinatior leafcutter ants police worker-laid eggs, but not reproductive workers. Anim. Behav. 80, 487–495 (2010).
Wheeler, W. M. The ant‐colony as an organism. J. Morphol. 22, 307–325 (1911).
Hölldobler, B. & Wilson, E. O. The Ants (Harvard Univ. Press, 1990).
Rüppell, O. & Heinze, J. Alternative reproductive tactics in females: the case of size polymorphism in winged ant queens. Insect Soc. 46, 6–17 (1999).
Extavour, C. G. & Akam, M. Mechanisms of germ cell specification across the metazoans: epigenesis and preformation. Development 130, 5869–5884 (2003).
Corona, M., Libbrecht, R. & Wheeler, D. E. Molecular mechanisms of phenotypic plasticity in social insects. Curr. Opin. Insect Sci. 13, 55–60 (2015).
Korb, J. & Belles, X. Juvenile hormone and hemimetabolan eusociality: a comparison of cockroaches with termites. Curr. Opin. Insect Sci. 22, 109–116 (2017).
Barchuk, A. R. et al. Molecular determinants of caste differentiation in the highly eusocial honeybee Apis mellifera. BMC Dev. Biol. 7, 70 (2007).
Nijhout, H. F. & Wheeler, D. E. Juvenile hormone and the physiological basis of insect polymorphisms. Q. Rev. Biol. 57, 109–133 (1982).
Trible, W. & Kronauer, D. J. C. Ant caste evo-devo: size predicts caste (almost) perfectly. Trends Ecol. Evol. 36, 671–673 (2021).
Briggs, J. A. et al. The dynamics of gene expression in vertebrate embryogenesis at single-cell resolution. Science 360, eaar5780 (2018).
Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019).
Boomsma, J. J. Domains and Major Transitions of Social Evolution (Oxford Univ. Press, 2022).
Schwander, T., Lo, N., Beekman, M., Oldroyd, B. P. & Keller, L. Nature versus nurture in social insect caste differentiation. Trends Ecol. Evol. 25, 275–282 (2010).
Schroeder, A. et al. The RIN: an RNA integrity number for assigning integrity values to RNA measurements. BMC Mol. Biol. 7, 3 (2006).
Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171 (2014).
Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods 9, 676–682 (2012).
ENCODE Guidelines and Best Practices for RNA-Seq (Encode, 2016); https://www.encodeproject.org/documents/cede0cbe-d324-4ce7-ace4-f0c3eddf5972/@@download/attachment/ENCODE%20Best%20Practices%20for%20RNA_v2.pdf
Csardi, G. & Nepusz, T. The igraph software package for complex network research (InterJournal, 2006).
Schultner, E., Oettler, J. & Helanterä, H. The role of brood in eusocial Hymenoptera. Q. Rev. Biol. 92, 39–78 (2017).
Leek, J. T., Johnson, W. E., Parker, H. S., Jaffe, A. E. & Storey, J. D. The SVA package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883 (2012).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Notredame, C., Higgins, D. G. & Heringa, J. T-coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302, 205–217 (2000).
Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Fong, Y., Huang, Y., Gilbert, P. B. & Permar, S. R. chngpt: threshold regression model estimation and inference. BMC Bioinf. 18, 454 (2017).
Soneson, C., Love, M. I. & Robinson, M. D. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences [version 2; peer review: 2 approved]. F1000Research 4, 1521 (2016).
Huber, P. J. Robust Statistics (John Wiley & Sons, 1981). https://doi.org/10.1002/0471725250.ch1
Acknowledgements
We thank the China National Genebank for providing transcriptomics sequencing service with MGI sequencers and Danish National Life Science Supercomputing Center, Computerome, for providing computational resources. We thank Z. Xiong for uploading the A. echinatior genomic data to NCBI. This work was supported by the Lundbeck Foundation (grant no. R190-2014-2827 to G.Z.), the Villum Foundation (Villum Investigator Grant, grant no. 25900 to G.Z.), the European Research Council (advanced grant no. 323085 to J.J.B.), the National Natural Science Foundation of China (grant no. 31820573 to G.Z.; no. 31501057 to Q.L. and no. 31900399 to W.L.).
Author information
Contributions
B.Q., J.J.B. and G.Z. conceived the project. B.Q., X.D., R.S.L., W.L. and G.Z. designed the experiments. B.Q. and G.Z. developed the bioinformatic methods. B.Q. analysed the data. X.D., P.L., R.S.L. and R.L. led experimental work, assisted by B.Q., A.L.P., G.D., M.J.T., X.Z., D.Z., Q.G., T.W., L.P. and L.W. B.Q. and P.L. managed the data, assisted by R.S.L., W.J. and C.G. K.R., Q.L., W.L. and G.Z. provided resources for the experiments. B.Q., J.J.B. and G.Z. prepared the manuscript, assisted by X.D., R.S.L., R.L., A.L.P. and M.J.T.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Ecology & Evolution thanks Christopher Wyatt and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Developmental transcriptomes in ants.
a, Transcriptomic developmental trajectories in A. echinatior, based on Spearman rank correlation similarity in gene expression across individual transcriptomes. Trajectories were constructed and visualized with a similar procedure as in Fig. 1a, except for A. echinatior having three worker subcastes, which exhibited increasing transcriptomic divergence across the pupal and adult stages. b, Between-stage transcriptomic similarity matrix in M. pharaonis (upper panel) and A. echinatior (lower panel), based on the mean values of within-group and between-group correlation coefficients. In M. pharaonis, correlation coefficients within the same stage were always higher than between stages, and transcriptomic similarities consistently clustered by adjacent developmental stages. In A. echinatior, correlation coefficients within the same caste were higher than within the same stage, consistent with morphological caste differences being more substantial in A. echinatior.
Extended Data Fig. 2 Prediction of caste identities with BPA.
a, The Backward Progressive Algorithm (BPA) predicts caste identities in previous developmental stages, using non-validated transcriptomes at the target stage (St) to construct PCAs and then projecting these onto the subsequent stage (St+1) where caste identities are known. The BPA then uses the known caste labels at St+1 to identify PC axes that are associated with confirmed caste identity and uses linear discriminant analysis to train a predictive model, assuming that the PC axes at St+1 are also associated with caste identities at St as expected under developmental continuity. The trained model then predicts caste identities at St, after which it assumes these predicted caste identities to be real and initiates a next round to predict caste identities at stage St−1. This process continues until the prediction likelihoods at St−n become too low to be informative. b, BPA predicted the sex of sampled individuals among 1st instar (44 hour) larvae in D. melanogaster (left panel). While sex can be distinguished in 2nd instar (50 hour) larvae (right panel) via genotyping after simultaneous DNA and RNA extraction, the biomass of 1st instar larvae was too small to perform such simultaneous extractions. By examining the proportion of reads that mapped to the Y chromosome of Drosophila, we found that predicted males in 1st instar larvae had a significantly higher proportion of reads mapped to the Y chromosome, confirming our prediction. Individual samples are coloured according to their predicted probability to be female or male in the 1st instar and symbols were sized according to the number of reads mapped to the Y chromosome per million reads (YPM). c, BPA was validated on developmental stages with known caste identities, using M. pharaonis transcriptomes of individuals with distinct morphology.
The table presents prediction accuracies for each target stage, calculated as the ratio of the number of correctly assigned individuals and the total number of individuals sampled at each stage, using two alternative approaches. (1) Independent: predicting caste at each targeted stage (Sn) using the observed (true) caste labels at the training stage (Sn+1) to examine the prediction accuracy when the training caste identities are in fact known from morphological information. Here, the ratio in each stage reflects the accuracy of BPA in each stage. (2) Progressive: predicting caste starts from the late pupal stage using adults of known caste identity as training data. Here, BPA is then performed progressively using the predicted caste labels in late pupae to predict the caste identity in early pupae. This process was repeated recurrently until the 2nd larval instar. As the first step of BPA constructs a PCA from target stage data, we also compared the accuracies between PCAs obtained from whole transcriptomes and PCAs obtained from caste DEGs at the subsequent stage (training data). We achieved a higher prediction accuracy when PCAs were constructed with caste DEGs at the subsequent stage compared to using whole transcriptome PCAs, probably because the DEG method excluded uninformative housekeeping genes. d, Antibody staining (VASA protein, red; RRID: AB_2893405) and in situ hybridization (vasa RNA, green) in 192-hour old embryos of M. pharaonis, showing that germline differentiation has already occurred at this stage. Among the 67 examined embryos, 18 (27%) could be documented to have no germline, indicating that it should be possible to match these presence/absence results among 192-hour embryos with BPA predictions based on 1st instar larval transcriptomes. e, Second instar larvae of A. echinatior lack the full-body curly hairs that distinguish gynes from workers in the 3rd larval instar, which means caste cannot be identified morphologically.
We applied BPA to predict caste identities among 2nd-instar larvae (Fig. 2a). A closer inspection showed that suspected gynes have in fact some gyne-like curly hairs, which are thicker than those in suspected workers, on their ventral thorax (arrows). These observations indicated these individuals are future gynes and were consistent with our BPA predictions. f, Among 2nd instar larvae of M. pharaonis, PCA with whole transcriptomes showed that the overall transcriptomic difference between gynes and males was not significant (P = 0.28) while reproductive larvae of both sexes were always separated from worker larvae (P < 5e-5 and P < 5e-6 for gynes and males, respectively). This is consistent with images of 2nd instar gyne and male larvae being indistinguishable after we used microsatellite genotyping to determine whether individuals were haploid (male) or diploid (female). Numbers of differentially expressed genes (adjusted P value < 0.05, detected with a generalized linear model that accounted for body size differences) also support this conclusion: 152 genes were differentially expressed between gyne and worker larvae, while 50 genes were differentially expressed between gyne and male larvae.
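The backward prediction loop described in panels a and c can be illustrated with a minimal sketch. It is not the authors' implementation: it keeps only the single PC axis that best separates the labelled castes, replaces full linear discriminant analysis with its one-dimensional, equal-prior form (assignment to the nearer class mean), centres each stage separately, and runs on simulated data. All function names (`principal_axes`, `predict_previous_stage`, `backward_chain`, `simulate`) and all numbers are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def centre(rows):
    """Subtract per-gene means; each stage is centred separately (assumption)."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def principal_axes(rows, k=2, iters=100):
    """Top-k PCA axes of centred rows, by power iteration with deflation."""
    rng = random.Random(0)
    axes = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in rows[0]]
        for _ in range(iters):
            for a in axes:  # keep v orthogonal to the axes already found
                proj = dot(v, a)
                v = [vi - proj * ai for vi, ai in zip(v, a)]
            scores = [dot(row, v) for row in rows]
            v = [sum(s * row[j] for s, row in zip(scores, rows))
                 for j in range(len(v))]
            norm = math.sqrt(dot(v, v))
            v = [vi / norm for vi in v]
        axes.append(v)
    return axes


def predict_previous_stage(target, train, train_labels, k=2):
    """One backward step: PCA on the target stage, classifier trained on St+1."""
    target_c, train_c = centre(target), centre(train)
    best = None
    for axis in principal_axes(target_c, k):
        scores = [dot(row, axis) for row in train_c]
        g = [s for s, lab in zip(scores, train_labels) if lab == "gyne"]
        w = [s for s, lab in zip(scores, train_labels) if lab == "worker"]
        mg, mw = sum(g) / len(g), sum(w) / len(w)
        pooled = sum((s - mg) ** 2 for s in g) + sum((s - mw) ** 2 for s in w)
        sep = abs(mg - mw) / math.sqrt(pooled / (len(scores) - 2))
        if best is None or sep > best[0]:  # keep the most caste-associated axis
            best = (sep, axis, mg, mw)
    _, axis, mg, mw = best
    labels = []
    for row in target_c:
        s = dot(row, axis)
        # 1-D LDA with equal priors: assign to the nearer class mean
        labels.append("gyne" if abs(s - mg) < abs(s - mw) else "worker")
    return labels


def backward_chain(stages, last_stage_labels):
    """stages are ordered early -> late; only the last stage has known labels."""
    labels, predictions = last_stage_labels, {}
    for i in range(len(stages) - 2, -1, -1):
        labels = predict_previous_stage(stages[i], stages[i + 1], labels)
        predictions[i] = labels  # predicted labels train the next round
    return predictions


def simulate(n_per_caste, effect, rng, n_genes=6):
    """Hypothetical data: two caste-informative genes plus noise genes."""
    rows, labels = [], []
    for caste, sign in (("gyne", 1), ("worker", -1)):
        for _ in range(n_per_caste):
            row = [rng.gauss(0, 0.3) for _ in range(n_genes)]
            row[0] += sign * effect
            row[1] += sign * effect
            rows.append(row)
            labels.append(caste)
    return rows, labels


rng = random.Random(1)
early, early_truth = simulate(10, 1.0, rng)
late, late_truth = simulate(10, 1.5, rng)
adult, adult_labels = simulate(10, 2.0, rng)
pred = backward_chain([early, late, adult], adult_labels)
# Accuracy as defined in panel c: correctly assigned / total individuals
accuracy = sum(p == t for p, t in zip(pred[0], early_truth)) / len(early_truth)
```

Here `accuracy` corresponds to the "Progressive" column of the table in panel c, because the early-stage prediction is trained on predicted rather than observed labels of the later stage. Centring each stage separately is only reasonable when both castes are sampled in similar proportions at every stage.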
Extended Data Fig. 3 Transcriptomic canalization during caste differentiation in ants.
a, Within-stage transcriptome variation in M. pharaonis (upper panel) from 0–12 h old embryos to late pupae, plotted separately for gynes (red), workers (blue) and all individuals within each stage (black) depending on available information. The lower panel gives the same information for A. echinatior, where embryonic data were not available and 1st instar caste phenotypes (grey) were inseparable with BPA. Transcriptome variation was quantified as 1 – r, the extent of imperfection of transcriptome-level Spearman’s correlations between a target individual and all other same-stage and same-caste individuals. Caste identities of 1st instar individuals of M. pharaonis and 2nd instar individuals of A. echinatior were predicted by BPA. In M. pharaonis, transcriptome variation for all individuals peaked in 36–48 h old embryos (equivalent to the gastrulation stage, 6–7, in Drosophila larvae). For both species, transcriptome variation among gynes was consistently lower than among workers. In pupal stages of M. pharaonis, transcriptome variation across all individuals exceeded transcriptome variation for the gyne and worker subsets, indicating that transcriptome differences primarily reflected realized caste differentiation, in contrast to the pattern observed across the larval stages, where the black curve was intermediate between the red and blue curves. Fourth larval instar and prepupal gyne samples of A. echinatior were excluded from this analysis, because these samples were sequenced in a different technical batch, making their transcriptome variation incomparable with the other samples. b, Developmental potential (∆) for individual gynes and workers in A. echinatior, measured as the transcriptomic distance between a focal individual and an average gyne or worker (pooling all three worker subcastes) phenotype in the next developmental stage. Developmental potential was quantified and presented as in M. pharaonis (Fig. 
3a), except that all three worker subcastes were included. Caste identities for gynes and (pooled) workers in 2nd instar larvae were predicted by BPA. As in panel a, fourth larval instar and prepupal gyne samples were excluded to avoid a batch effect. c, PCAs of early and late pupal stage transcriptomes in M. pharaonis (early pupa gynes, n = 17; early pupa workers, n = 28; late pupa gynes, n = 30; late pupa workers, n = 22) (left) and A. echinatior (early pupa gynes, n = 18; early pupa workers, n = 47; late pupa gynes, n = 18; late pupa workers, n = 42) (right). For both species, the first PC axis (PC1) separates individual transcriptomes by developmental stage (early pupae to the left and late pupae to the right) while PC2 captures the caste-related transcriptomic variation. The overall transcriptomic difference between gynes (red) and workers (blue) increases from the early to the late pupal stage (upper panels), and the absolute values of the PC2 residuals (lower panels), representing the variation within each caste, were always lower among gynes than among workers (P < 1 × 10⁻³ for both species, two-sided t-tests). This is consistent with the mean extent of canalization being stronger in gynes than in workers. In A. echinatior, the absolute residual differences increase for the workers, consistent with A. echinatior having worker subcastes that differentiate rather late in development. Box plots show the median (centre line), 25% and 75% quartiles (boxes), outermost values (whiskers) and data points (overlapping with box and whiskers).
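The within-stage transcriptome variation in panel a (1 – r, based on Spearman's correlation) can be sketched as follows. The legend does not state how the pairwise values are combined per individual; this sketch assumes the mean over all other individuals of the same stage and caste. The function names and the toy expression profiles are hypothetical.

```python
import math


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's correlation = Pearson's correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


def transcriptome_variation(profiles):
    """Per individual: mean of (1 - r) against every other profile in the group."""
    out = []
    for i, focal in enumerate(profiles):
        others = [p for j, p in enumerate(profiles) if j != i]
        out.append(sum(1 - spearman(focal, p) for p in others) / len(others))
    return out


# Toy group of three individuals measured on four genes (hypothetical values)
group = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
variation = transcriptome_variation(group)
```

In this toy group the first two profiles are perfectly rank-correlated with each other and perfectly anti-correlated with the third, so the third individual receives the highest variation score.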
Extended Data Fig. 4 Early larval caste differentiation in ants.
a, Tissue-specific relative expression levels for the conserved caste-biased DEGs in early larvae, shown separately for gyne-biased (rows marked in red) and worker-biased (blue) genes. Heatmap brightness of cells reflects tissue specificity, the percentage of transcripts from targeted tissues (columns), ranging from 0% (black) to 100% (yellow). These relative abundances, based on the larval gene expression atlas of Drosophila, show that the gyne-biased DEGs in the early larval stages were mainly expressed in the midgut, fat body and tracheal tissues, while the worker-biased DEGs were mainly expressed in the brain and central nervous system. b, Expression profiles of circadian clock-controlled protein (daywake), juvenile hormone acid O-methyltransferase-like (jhamt-like) and hexamerin among gynes and workers of the two ant species as larvae grow. All three genes are associated with the juvenile hormone signalling pathway and are significantly differentially expressed between castes in 2nd and 3rd instar larvae. Expression profiles are plotted against body length (log scale) to show expression dynamics as larvae grow in body length. c, DAPI staining of a representative early 3rd instar worker larva and a representative 2nd instar gyne larva of M. pharaonis. These animals display similar body size but wing discs (arrows) were only visible in the gyne larva, indicating that caste determination and differentiation have already been initiated well before this early larval stage.
Extended Data Fig. 5 JH and E20 signalling pathways play a key role in the regulation of canalized caste phenotypes.
a, Expression profiles of eight key regulators of insect metamorphosis that are part of the juvenile hormone and ecdysone signalling pathways (Fig. 3b), plotted against body length (log scale) of 2nd and 3rd instar M. pharaonis larvae. The expression levels of half of these genes (jheh2, jhamt, usp and E93) showed caste-specific body length thresholds in the 3rd larval instar. This pattern indicates that gyne and worker individuals are gated by different critical masses for entering the metamorphic molt. b, Compared to the control group (3rd instar worker larvae fed with 10% EtOH PBS), feeding JH analogue (JHA) to 3rd instar worker larvae delayed pupation. c, Feeding JHA to 3rd instar worker larvae induced intercastes with phenotypes intermediate between gynes and workers. JHA-fed workers had a larger body size and developed wing buds (arrowed); however, they never developed ovaries (not shown), indicating early bifurcation between colony germ–soma phenotypes. d, Compared to the control group (3rd instar gyne larvae fed with 10% EtOH PBS), precocene I-fed 3rd instar gyne larvae were smaller and developed abnormal wings.
Extended Data Fig. 6 Canalized genes play important roles in producing adaptive caste phenotypes.
a, Tissue-specific relative expression abundances (based on Drosophila gene orthologs) among canalized genes in M. pharaonis (left panel) and A. echinatior (right panel), plotted separately for gyne-biased and worker-biased genes (M. pharaonis gyne-biased, n = 411; M. pharaonis worker-biased, n = 482; A. echinatior gyne-biased, n = 1119; A. echinatior worker-biased, n = 899). Compared to the whole-genome background (M. pharaonis background genes, n = 8887; A. echinatior background genes, n = 7651), worker-biased canalized genes had a significantly higher relative expression in the brain, eyes and thoracic ganglia in both ant species (one-sided t-tests; M. pharaonis: brain, P = 2.04 × 10⁻⁹³; thoracic ganglia, P = 2.29 × 10⁻⁸⁷; eye, P = 3.04 × 10⁻³¹; A. echinatior: brain, P = 9.20 × 10⁻¹²⁰; thoracic ganglia, P = 2.73 × 10⁻¹⁸¹; eye, P = 1.74 × 10⁻¹³⁰). In M. pharaonis, gyne-biased canalized genes had a significantly higher relative transcript abundance in the midgut, fat body and ovaries (one-sided t-tests; midgut, P = 7.01 × 10⁻⁵; fat body, P = 1.93 × 10⁻⁶; ovary, P = 3.04 × 10⁻¹⁰). However, in A. echinatior, gyne-biased canalized genes showed no difference from background genes in their relative transcript abundance in ovaries (one-sided t-test; P = 0.97), consistent with the workers having retained smaller ovaries. Box plots show the median (centre line), 25% and 75% quartiles (boxes) and outermost values (whiskers); *P < 0.01, red for higher relative expression abundance in gyne-biased genes, blue for higher abundance in worker-biased genes. b, Diagrammatic illustration of gyne-biased canalized genes being associated with traits in ovaries and wing muscles, whereas worker-biased canalized genes are associated with brain function and behaviour (see Supplementary Table 4 for the full list of canalized genes). 
c, Canalization scores for flight-related (left) and ovary-specific (right) genes in the two ant species. Flight-related genes were identified based on their D. melanogaster homologues associated either with flight performance itself or with striated muscle functionality (the crucial wing muscle tissue in insects). Ovary-specific genes were genes having > 30% expression abundance in ovaries of D. melanogaster females compared to the sum of their expression in all tissues (see Methods). Colours of cells represent the canalization score, ranging from −3 (blue, canalized in the worker-biased direction) to 3 (red, canalized in the gyne-biased direction). Canalization scores in A. echinatior were calculated by comparing gyne and small worker transcriptomes. d, Developmental expression dynamics of ATP-dependent RNA helicase vasa (vas), vitellogenin receptor (yl) and serine/threonine-protein kinase Chk2 (lok) in the two ant species. All three genes are ovary-specific, with high expression abundance in D. melanogaster ovaries. Although these three genes showed increasing gyne-biased canalization in M. pharaonis as development proceeds, there was little expression difference between gyne and worker individuals in A. echinatior, except for yl in prepupae and to a lesser extent also in 3rd instar larvae.
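The ovary-specificity criterion in panel c (more than 30% of a gene's summed expression across tissues coming from the ovary) can be written out as a short sketch. The gene names, tissue names and expression values below are hypothetical placeholders, not data from the Drosophila atlas, and the helper names `tissue_fraction` and `ovary_specific` are illustrative only.

```python
def tissue_fraction(expression_by_tissue, tissue):
    """Share of a gene's total expression (summed over tissues) found in one tissue."""
    total = sum(expression_by_tissue.values())
    if total <= 0:
        return 0.0  # gene not expressed anywhere: treat as non-specific
    return expression_by_tissue.get(tissue, 0.0) / total


def ovary_specific(genes, threshold=0.30):
    """Genes whose ovary share exceeds the threshold (30% in the legend)."""
    return [name for name, expr in genes.items()
            if tissue_fraction(expr, "ovary") > threshold]


# Hypothetical expression values for two placeholder genes
atlas = {
    "geneA": {"ovary": 40.0, "brain": 10.0, "midgut": 50.0},
    "geneB": {"ovary": 5.0, "brain": 80.0, "midgut": 15.0},
}
selected = ovary_specific(atlas)
```

The same fraction, expressed as a percentage per tissue, is the quantity shown as tissue specificity in the heatmaps of Extended Data Fig. 4a.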
Extended Data Fig. 7 Freja’s functional role in canalizing gyne phenotypes.
a, Predicted functional domains of the protein encoded by Freja (LOC10587931), annotated with InterPro (see Methods). Freja contains a signal peptide domain at the N terminus, indicating secretion or membrane insertion. In addition, Freja contains a leucine-rich-repeat domain, suggesting its role in protein binding. b, Freja is the most strongly canalized gene in M. pharaonis, showing an increasing between-caste expression difference and a decreasing within-caste expression variance as development proceeds (ngyne = 168 and nworker = 161). The caste identities in 1st instar larvae are based on BPA prediction. c, Tissue-specific RT-PCR quantification of Freja transcript abundance in adult gynes, showing Freja’s expression is restricted to the abdomen and especially highly abundant in the ovaries. d, RT-PCR quantification of the efficiency of Freja-RNAi in 3rd instar larvae (left) and adult gynes (right). Compared to the GFP-RNAi control group, Freja-RNAi significantly reduced the expression level of Freja (p < 1e-3 in one-way ANOVAs in each age group). e, Compared with the control group, Freja-RNAi significantly reduced the number of yolky oocytes in adult gynes (p = 0.004 in one-way ANOVA). For Extended Data Figure 7b–e, box plots show the median (centre line), 25% and 75% quartiles (boxes), outermost values (whiskers) and data points (overlapping with box and whiskers).
Supplementary information
Supplementary Information
Supplementary Methods and references.
Supplementary Tables
Supplementary Table 1, Predicted caste DEGs for first instar larvae; Supplementary Table 2, List of caste DEGs at each developmental stage; Supplementary Table 3, Functional enrichment for caste DEGs at each developmental stage; Supplementary Table 4, Canalized genes; Supplementary Table 5, Functional enrichment of canalized genes.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Qiu, B., Dai, X., Li, P. et al. Canalized gene expression during development mediates caste differentiation in ants. Nat Ecol Evol 6, 1753–1765 (2022). https://doi.org/10.1038/s41559-022-01884-y
DOI: https://doi.org/10.1038/s41559-022-01884-y