## Main

In his Ants monograph, William Morton Wheeler concluded that there is a striking analogy “between the ant colony and the cell colony which constitutes the body of a Metazoan animal; and many of the laws that control the cellular origin, development, growth, reproduction and decay of the individual Metazoan, are seen to hold good also of the ant society regarded as an individual of a higher order”1. This century-old statement highlights putative parallels between irreversible major transitions to organismal multicellularity with a differentiated germline and altruistic cellular soma on the one hand and to colonial superorganismality with physically differentiated queen and worker castes as higher-level germline and soma on the other hand2. It implies that, once cell fate or caste fate have been determined early in development, individual cells or ant larvae should follow analogous developmental trajectories that give rise to terminally specialized cell types or morphologically distinct adult caste phenotypes, respectively.

Some decades later, Conrad H. Waddington depicted metazoan embryogenesis as a pebble rolling downhill in a rugged epigenetic landscape of divergent valleys, with cells losing pluripotency as they, regardless of minor genetic or environmental disturbances, commit to specific developmental trajectories, a process that he termed canalization3,4. In combination, these early insights suggest that there should be Waddington landscapes for ant colony development reflecting the analogous Wheelerian understanding of developmental processes at two hierarchical levels of organismality. More specifically, as (super)organismal differentiation proceeds, we expect individual cells or multicellular ants to increasingly commit to their target functional phenotype within the (super)organism by parallel transcriptomic and anatomical differentiation. Waddington also maintained that the degree of canalization is under natural selection and can be different across organs and tissues within metazoan organisms3, so we expected to also find such differences among castes within superorganismal colonies. While molecular biology technology based on differential gene expression has now largely replaced Waddington’s organismal perception of development, his diagrams remain instructive heuristic tools for analogous understanding of gene regulatory networks (GRNs) that affect cell differentiation in metazoan bodies and, as we conjecture here, caste differentiation in ant colonies.

Recent advances in single-cell transcriptomics have revealed many molecular details of Waddingtonian landscape differentiation5,6,7, while reconstructing developmental trajectories in unprecedented detail and identifying key GRNs for cell fate determination8. However, no studies of comparable ambition have been pursued to track ontogenetic development of superorganismal colonies sensu Wheeler and quantify the canalization properties of caste differentiation. The phylogenetically diverse holometabolous ants, with their clear developmental stages, are particularly inviting for such investigations, but studies have so far used pooled samples or obtained rather few individual transcriptomes9,10,11, which has precluded formal analyses of individual heterogeneity during the entire developmental process of caste differentiation. In particular, whether the sequence of larval and pupal caste differentiation is a canalized developmental process, in which specific genes initiate and regulate cascades of differential gene expression while shaping morphologically diverging phenotypes, is unknown.

We used low-input RNA sequencing to obtain >1,400 genome-wide individual transcriptomes covering the major developmental stages of the ants Monomorium pharaonis and Acromyrmex echinatior, while using Drosophila melanogaster for outgroup comparisons. These two ant species belong to the same subfamily, Myrmicinae, but differ in social and developmental characteristics. M. pharaonis is a highly polygynous (multiqueen) invasive ant with a monomorphic worker caste, where caste is known to be determined ‘blastogenically’ in early embryos before eggs hatch12,13. In contrast, the fungus-growing leaf-cutting ant A. echinatior has mostly single queen colonies but a polymorphic worker caste with small workers for brood nursing and gardening and large workers for foraging and colony defence, where caste determination occurs during early larval development14,15. All ants share the same common ancestor that evolved superorganismal colonies with specialized queens and lifetime unmated workers, with only a few cases of secondary loss of the queen caste being known16. The two ant species that we studied therefore probably represent typical models of colony and caste development found throughout most extant ants. We reconstructed developmental trajectories for gyne and worker caste differentiation via genome-wide gene expression profiling, using a novel algorithm for predicting caste phenotype before larvae express morphological differences. We then focused on the larval–pupal transition to quantify caste-specific canalization effects and their underlying pathways and we finally examined some of the key genes regulating caste phenotype canalization.

## Results

### A transcriptomic atlas for ant development

Developmental trajectory networks constructed from whole-genome transcriptomes (Fig. 1a and Extended Data Fig. 1; Methods) clustered individuals primarily by developmental stage and gradually also by caste phenotype. Adjacent developmental stages always grouped next to each other, as expected when development is a largely continuous process but we also observed distinct clusters in early embryonic stages (0–24 h) and for the late larvae to early pupae transition, indicating more discrete stage-specific transcriptomes. Principal component analysis (PCA) for the combined data from both species showed that the first axis separated the two ant species (P < 0.0001; two-sided t-test) while the second and third axes jointly separated individuals by developmental stage and caste identity (P < 0.05 for two-way analysis of variance (ANOVA) on the association between PC2 and developmental stage and caste; P < 0.005 for one-way ANOVA on the association between PC3 and developmental stage) (Fig. 1b).

Aligning developmental transcriptomes showed 67–81% similarity between the two ant species across the developmental stages (Fig. 1c), reflecting considerable conservation of developmental GRNs in the Myrmicinae subfamily to which both ant species belong. Developmental transcriptomes were more similar for gynes than for workers across the developmental stages, both when comparing the two ant species with each other and when contrasting the two M. pharaonis caste profiles with those of D. melanogaster females (P < 0.01 for all examined stages and for both comparisons; two-sided t-tests) (Fig. 1c). These patterns are consistent with gyne development being under stronger selection constraint than worker development across social insects with permanent caste differentiation11. However, it is important to acknowledge that gynes are elaborations of the ancestral reproductive phenotype of solitary female Hymenoptera, while differentiated worker castes are later innovations whose foundational GRNs evolved analogously, not homologously, in the superorganismal ants, bees and vespine wasps17.

### Caste prediction in morphologically undifferentiated larvae

Morphological differences between gyne and worker individuals cannot be detected before the second and third larval instar in M. pharaonis and A. echinatior, respectively14,18. To identify individual caste phenotypes in earlier stages lacking morphological markers, we developed the backward progressives algorithm (BPA) that retrospectively infers the likelihood of individuals belonging to one caste or another (Methods; Extended Data Fig. 2a). BPA assumes that key genes active in the GRN at a specific stage should, albeit with modified expression, also participate in caste differentiation during the subsequent developmental stage, analogous to what is known for key transcription factors that specify cell types during metazoan development5,8,19. We validated BPA using embryonic sex differentiation data from Drosophila (Extended Data Fig. 2b) and confirmed the accuracy of BPA in samples of M. pharaonis larvae with known caste identity (Extended Data Fig. 2c).

We applied BPA to 54 transcriptomes of first instar M. pharaonis larvae (Fig. 2a) and predicted 12 of these to be reproductives (gynes and males) and 18 to be workers with >90% probability. We validated these predictions with RNA fluorescence in situ hybridization (HCR-FISH20) to assess whether the expression of vasa, a germline marker of first instar larvae and late embryos21 (Extended Data Fig. 2d), colocalizes with that of two candidate genes: LOC105839887 and histone-lysine N-methyltransferase SMYD3 (LOC105830671). These two genes exhibit strong differential expression between predicted first instar caste phenotypes (Supplementary Table 1) and have binary gyne-worker expression in second instar larvae. First instar LOC105839887 expression is visible in fat bodies while SMYD3 colocalizes with vasa in the larval gonads (Fig. 2b, left panel). Both genes could be unambiguously detected in individuals with a vasa-specified germline and were always absent in individuals without a germline (Fig. 2b, right panel).

We also applied BPA in A. echinatior where we lacked morphological markers for second and first instar larvae. While third instar gyne larvae of this species can be unambiguously distinguished from worker larvae by their full-body curly hairs14, this pilosity is not yet expressed in second instar larvae. BPA found that the first two PC axes constructed from the second instar transcriptomes separated second instar larvae by body size and third instar larvae by caste identities (Fig. 2c). Further inspection showed that the larger second instar larvae (suspected gynes) in fact have some gyne-like curly hairs in their ventral thorax region (Extended Data Fig. 2e), indicating that caste differentiation in A. echinatior begins before the second larval instar. To our knowledge, BPA is the first algorithm to achieve such accurate backwards predictions of developmental stages.

### Caste differentiation in ants is developmentally canalized

We next focused on the overall degree of canalization in genome-wide gene expression. For practical purposes, we defined transcriptome-level canalization as the statistical tendency for individual transcriptomes to start with a unimodal (pluripotent) distribution and gradually change to a bimodal (phenotypically committed) distribution with increasingly distinct peaks as development proceeds. For this purpose, we quantified the distributions of genome-wide developmental potential (Δ) as a gyne or worker individual, using deviations in gene expression from average target profiles in subsequent developmental stages (Methods). The Δ-values range between −1 and 1, with a positive value indicating that development in a target individual is gyne-biased and a negative value indicating worker-biased development. We found that the absolute Δ-value between castes increased steadily in both ant species, while the variance of Δ-values within castes, which we validated not to have been notably affected by technical artifacts (Supplementary information), became gradually reduced as development proceeds (both P < 0.0001; two-way ANOVA) (Fig. 3a and Extended Data Fig. 3). During this process, transcriptomic canalization in gynes was invariably stronger than in workers (P < 0.05 in all stages; two-sided F-tests), both in transcriptomic variation per se and in PCA patterns (Extended Data Fig. 3), indicating a higher degree of transcriptomic canalization in colony germline individuals.

### JH signalling regulates developmental caste canalization

Genome-wide transcriptomic canalization signatures amplified beyond the third instar when pupal metamorphosis starts (Fig. 3a), a critical stage in all holometabolous insects22. To understand the entirety of upstream regulation of caste differentiation, we used generalized linear models to account for the effect of larval body mass (Methods)23 and identified 65 conserved genes with parallel gyne-worker bias that were associated with larval differentiation in both ant species (Supplementary Table 2). These early caste differentially expressed genes (DEGs) are significantly enriched for genes involved in fatty acid and hormone metabolism (Supplementary Table 3)—orthologues of these genes with gyne-biased expression are also highly expressed in the fat-body tissues and tracheal system of Drosophila larvae (Extended Data Fig. 4a). In addition, multiple larval caste DEGs are associated with the juvenile hormone (JH) pathway, a key regulator of larval growth and molting in insects22,24. These included the genes daywake, encoding a haemolymph JH-binding protein, LOC118646735, a duplicate gene of Drosophila Juvenile hormone acid O-methyltransferase (jhamt) and hexamerin25 (Extended Data Fig. 4b), confirming the important role of JH for caste differentiation26,27.

We found that many genes involved in JH and ecdysone metabolism exhibited both body length-specific and caste-specific expression when larvae transition from the third instar to the prepupal stage (Fig. 3b and Extended Data Fig. 5a). In particular, the expression of jhamt, which delays the metamorphic molt28, started decreasing in third instar worker larvae when body length exceeded 0.7 mm, while expression in gyne larvae did not decrease until larvae had reached twice that length of 1.4 mm (Methods) (Fig. 3b). A similar difference occurred in the expression of Ecdysone-induced protein 93 (E93), a downstream transcription factor of the JH pathway for initiating metamorphosis29 (Fig. 3b), which started increasing when worker larval body length exceeded 1.4 mm but not in gyne larvae before they had reached 2.0 mm. These differences in body-length thresholds indicate heterochronic shifts between gyne and worker larvae in the JH signalling pathway, confirming the critical role of the JH pathway in regulating caste differentiation.

To experimentally test the role of JH in caste canalization, we fed a JH analogue (JHA) to third instar worker larvae and precocene I, a JH inhibitor30, to third instar gyne larvae of M. pharaonis. This showed that JHA significantly increased worker body length and that precocene I significantly reduced gyne body length (Fig. 3c). Furthermore, wing buds and simple eyes (ocelli), both typical gyne characters, were induced in the JHA-fed worker larvae (Extended Data Fig. 5c) while precocene I-treated gyne larvae had a significantly higher frequency of abnormal wings when they hatched as adults (P < 0.05 for Fisher’s exact test; Extended Data Fig. 5d). These aberrant patterns of development further confirmed that the JH signalling pathway is a key regulator for caste canalization in ants.

### Focal genes mediating canalization

To understand which genes are canalized during caste differentiation in the pupal stage, we developed a gene-specific canalization score (C) to track the developmental dynamics of between-caste gene expression divergence via the ratio of between-caste gene expression difference and stage-specific expression variance within castes (Methods). We identified 1,140 and 2,478 genes showing canalized expression in M. pharaonis (gynes versus workers) and A. echinatior (gynes versus small workers), respectively (C > 3 and P < 0.05 in one-sided Spearman’s correlation tests) (Fig. 4a and Supplementary Table 4). Among these canalized genes, 457 showed the same caste-bias direction in both species, a significantly higher number than the background expectation of 88 (P < 10⁻¹⁵; Fisher’s exact test), indicating that gene-level canalization is evolutionarily conserved.
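As an illustration of the kind of ratio described above, the following is a minimal sketch rather than the authors' implementation. The exact formula for C is defined in the Methods and is not reproduced here, so the sketch assumes that the between-caste difference is the absolute difference in caste means and that the within-caste spread is the pooled within-caste standard deviation. The function name `canalization_score` and all expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def canalization_score(gyne_expr, worker_expr):
    """Toy per-stage score for one gene (assumed form, not the paper's formula):
    |difference in caste means| / pooled within-caste standard deviation."""
    diff = abs(mean(gyne_expr) - mean(worker_expr))
    n_g, n_w = len(gyne_expr), len(worker_expr)
    pooled_var = (
        (n_g - 1) * variance(gyne_expr) + (n_w - 1) * variance(worker_expr)
    ) / (n_g + n_w - 2)
    if pooled_var == 0:
        # No within-caste spread: the ratio is unbounded.
        return float("inf")
    return diff / sqrt(pooled_var)


# Hypothetical expression values for one gene at two stages (not real data).
stages = {
    "early": ([5.0, 6.0, 7.0], [4.0, 5.0, 6.0]),
    "late": ([9.0, 10.0, 11.0], [1.0, 2.0, 3.0]),
}
scores = {stage: canalization_score(g, w) for stage, (g, w) in stages.items()}
print(scores)
```

In this toy case the score rises from the early to the late stage because the caste means diverge while the within-caste spread stays the same, which is the pattern that a canalized gene would be expected to show.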

Comparison of the conserved canalized genes with tissue-level gene expression data in Drosophila31 showed that gyne-biased canalized genes were highly expressed in ovaries, whereas worker-biased canalized genes were highly expressed in the brain and central nervous system (Extended Data Fig. 6a). Further analyses indicated that gyne-biased canalized genes were significantly enriched in flight muscle functions (for example, tropomyosin-2 (Tm2), troponin I (TnI) and troponin C (TnC)) and female reproductive functions (for example, T-complex protein 1 (Tcp-1), krasavietz (kra) and merry-go-round (mgr)) (P < 0.05 for both categories in hypergeometric tests). In contrast, worker-biased canalized genes were significantly enriched in neuronal and behavioural processes (P < 0.05 for both categories in hypergeometric tests), including twin of eyeless (toy), hormone receptor 51 (Hr51) and octopamine receptors (Octbeta1R and Octalpha2R). These gene functions in a solitary insect are consistent with gyne caste specialization being targeted at dispersal and reproduction and worker caste specialization at more variable ‘somatic’ social tasks (Fig. 4a, Extended Data Fig. 6b and Supplementary Table 5).

### Freja as a crucial regulator of queen phenotype

The top gene with gyne-biased canalization in M. pharaonis is Hymenoptera-specific (Methods) and encodes a protein containing a predicted signal peptide and a leucine-rich repeat domain (Extended Data Fig. 7a). Caste-biased expression of this gene begins in second instar larvae and amplifies as development progresses (Extended Data Fig. 7b). In adult gynes, this gene is mainly expressed in the ovaries (Extended Data Fig. 7c) where its expression is restricted mostly to the ovarian follicle cells (Fig. 4b) which are essential for oogenesis32. We therefore named this gene follicle related [gene-]expression in juvenile ants (Freja, goddess of fertility in Old Norse).

We investigated Freja’s function through RNAi knock-down in late third instar gyne larvae. Relative to GFP-RNAi controls, adults of the Freja-RNAi treatment group had significantly reduced body and head size (two-sided t-tests; P < 0.0001 for both; Fig. 4c) and a higher frequency of abnormal wing morphology. As wings and large bodies relative to workers are unambiguous phenotypic markers of the adult reproductive (gyne/queen) caste in ants, we conclude that manipulation of Freja expression disturbed normal developmental canalization.

Because Freja remains highly expressed after pupal eclosion, we also manipulated Freja expression in adult Monomorium gynes with RNAi and examined effects on fertility. Both the size and the number of oocytes were significantly reduced in the Freja-RNAi group compared to the controls (two-sided t-tests; P < 0.05 for both; Fig. 4d and Extended Data Fig. 7e), indicating that Freja’s continued expression is crucial for gyne maturation and fertility after insemination. Thus, in addition to its necessary role in canalization of larval caste divergence, Freja maintains its differentiating functionality in adult gyne phenotypes. However, Freja was not a canalized gene in A. echinatior, a species where workers have retained ovaries to produce unfertilized (male) or inviable trophic eggs33 (Extended Data Fig. 6).

## Discussion

We have shown that ant caste differentiation is canalized throughout development in a way that is remarkably analogous to the canalized development of metazoan cell lineages starting with the first cell divisions of a zygote, a potential functional similarity that was noted more than a century ago34. Also the deeper conservation of colony-level germline development is consistent with the secondary evolution of worker subcastes in ants being more common than the evolution of queen subcastes35,36 and appears broadly convergent to what is known from development in animal bodies5,6,37. This suggests that ant colony-level ontogeny, unfolding as caste differentiation, has been maintained by consistent stabilizing selection, in spite of substantial modifications in the details of caste differentiation during the huge adaptive radiation of the ants. We also showed that the JH signalling pathway, known to be important for caste differentiation in social insects in general26,38,39,40,41, mediates a heterochronic shift between gyne and worker larvae, suggesting that crucial metabolic changes control the regulation of both individual body size and the amplification of caste phenotype canalization.

Our study used an algorithmic approach (BPA) to predict caste phenotypes backwards in developmental time to identify gene markers before morphological caste differences emerge, a technique that should be broadly applicable in other kinds of developmental differentiation studies. Our analyses strongly suggest that caste in ants is determined by the interaction of body size and gene expression, rather than by one of these factors alone42. We covered the entire developmental process, starting with genome-wide transcriptomics and then zooming in on the JH and ecdysone pathways to finally focus on specific genes with key roles in canalization of caste development. Intriguingly, experimental inhibition of Freja finally showed that disruption of canalized genes results in non-adaptive intermediate phenotypes between gynes and workers, both early in development and in adults.

Our findings emphasize that caste-differentiated superorganismal colonies, as originally defined by Wheeler34, are shaped by higher-level adaptations to predictably reproduce entire life-cycles. In this process, the complementary development of caste phenotypes requires coordination and buffering via expression changes of canalized genes, analogous to the dynamic regulation of cell differentiation during development of metazoan bodies5,6,43,44. We conjecture that obligate canalization of caste development will be a defining feature for all insect lineages that independently realized irreversible evolutionary transitions to superorganismality (ants, corbiculate bees, vespine wasps and higher termites)2, setting the Wheeler-superorganisms—both annual and perennial—apart from their society-forming sister lineages45. However, our results indicate that even canalized developmental pathways can be changed when relevant genetic variation emerges and selection for directional change in caste phenotype is strong enough. This seems analogous once more with how cell lines diverge during adaptive radiation of multicellular organisms. Selection pressures thus appear broadly similar across domains of organizational complexity but the GRNs behind these convergent developmental processes will be uniquely different across lineages.

## Methods

### Experimental design

Ultralow-input transcriptome sequencing was performed for individual samples of two ant species, M. pharaonis and A. echinatior (both subfamily Myrmicinae), covering four larval stages, the prepupal stage, the early and late pupal stage and the adult stage (Extended Data Table 1). Individual transcriptomes were also obtained for nine embryonic stages (starting 12 h after egg laying and continuing sampling of older eggs with 12-h intervals) in M. pharaonis. In this ant species, caste is known to be determined ‘blastogenically’ in early embryos12,13, unlike most other ants where caste phenotype is determined during larval development46. For later developmental stages where caste phenotypes were morphologically distinguishable (Extended Data Table 2), ~30 individuals per caste (gynes and workers in M. pharaonis; gynes, large workers and small workers in A. echinatior) were collected for each stage (Extended Data Table 1). For stages where caste phenotype could not be identified morphologically (before the second instar in M. pharaonis and before the third instar in A. echinatior), 30 eggs or 60 larvae per stage were collected to transcriptomically infer caste. Body length and head capsule width were also measured for all larval individuals to obtain another measure of developmental age that integrated with transcriptomic information during analyses. Reference individual transcriptomes for D. melanogaster were further obtained, covering development from newly laid eggs to newly eclosed adults, sampling ~20 eggs at 3-h intervals and ~20 larvae at 6-h intervals. See Supplementary information for details of sample collection, stage and caste identification, RNA and DNA extraction, transcriptome profiling and sample quality control.

### Testing RNA quality, constructing cDNA libraries and RNA sequencing

RNA quality testing, production of complementary DNA libraries and sequencing were performed at BGI, China. The quality of RNA samples was tested with an Agilent 2100 Bioanalyzer and only samples with good RNA quality (RNA integrity number (RIN) > 4) were retained for RNA sequencing47. Ambion ERCC RNA Spike-In Mix (catalogue no. 4456740) was added to each sample according to the manufacturer’s instructions before cDNA library construction, so that the quality of the sequenced RNA-seq data could be verified later.

RNA sequences for each sample were first reverse transcribed into cDNA following the Smart-seq2 protocol48, which was then randomly fragmented with Tn5 enzymes and linked with sequencing adaptors to obtain a complete cDNA library for each individual sample. Primers were then added to the cDNA libraries for PCR amplification and fragments ranging from 150 to 350 base pairs were selected for further cDNA circularization to construct sequencing libraries. The samples were then sequenced on a BGISEQ-500 platform using a 100 nucleotide paired-end ultralow-input RNA sequencing protocol.

### In situ hybridization chain reaction

#### In situ probes

Sequences for LOC105837931 (Freja) (XM_012683128.3), vasa (XM_012686851.3), LOC105839887 (XM_036293539.1), LOC105830671 (Smyd3) (XM_036287663.1), actin5c (XM_012666578.3), tubulin (XM_012685189.3) and septin2 (XM_012667189.2) were downloaded from NCBI and provided to Molecular Instruments for probe set synthesis. Alexa Fluor 488 was used for the detection of vasa; Alexa Fluor 546 was used for the detection of Freja, actin5c and tubulin; Alexa Fluor 594 was used for the detection of LOC105839887; and Alexa Fluor 647 was used for the detection of Smyd3 and septin2.

#### RNA fluorescence in situ hybridization in M. pharaonis larvae and ovaries

RNA fluorescence in situ hybridization (FISH) was performed following the whole-mount Drosophila HCR v.3.0 protocol20 with the following modifications. M. pharaonis larvae and ovaries were fixed at room temperature in scintillation vials with 50% FPE (4% formaldehyde; 0.5× PBS; 25 mM EGTA) and 50% heptane. Fixation time was adjusted so that ovaries were fixed for 30 min, first and second larval instars for 1–2 h and third larval instars for 12 h. Following fixation, the lower layer (FPE) was removed and replaced with methanol followed by vigorous shaking. The lower layer was replaced once more with methanol, at which point larvae and ovaries sank to the bottom of the vial. Larvae and ovaries were then dehydrated with several changes of methanol and stored at −20 °C. Proteinase K concentration and treatment time were adjusted to 30 µg ml⁻¹ for 7 min for first and second larval instars and 10 min for third larval instars and ovaries. Following amplification, one SSCT wash (5× SSC; 0.1% Tween-20; pH 7.0) was extended to 1 h with the addition of 4′,6-diamidino-2-phenylindole (DAPI; 1:1,000) for nuclear staining in first and second larval instars or overnight for third larval instars and ovaries.

FISH-stained larvae and ovaries were then transferred into increasing concentrations of glycerol in 5× SSC and mounted in Vectashield or 70% glycerol/30% 5× SSC for imaging. Images were captured on a Leica SP5-X inverted confocal laser scanning microscope. Image stacks were processed using Fiji/ImageJ49.

### RNA interference experiments

Third instar reproductive (gyne or male) larvae were taken from nests and immobilized on double-sided adhesive tape. Interfering double-stranded RNA (dsRNA) was then injected into the upper ventral abdomen with a capillary needle (1B100F, World Precision Instrument) prepared with a micropipette puller (P-2000, Sutter) and mounted on an Eppendorf FemtoJet injector (Femtojet 4i and Transferman 4r system, Eppendorf). For adult gynes, dsRNA was injected into the thorax, using the same equipment as in the larval injections, 5, 7 and 10 d after they had eclosed from the pupal stage. For both larval and adult RNAi, 7,500 ng μl⁻¹ of dsRNA was used in both experimental (Freja) and control (GFP) groups. To improve the efficiency of larval injection, Lipofectamine 2000 (11668019, Thermo Fisher) was added to the Freja and GFP dsRNA solutions at a dsRNA:Lipofectamine 2000 ratio of 3:1. The dsRNA was synthesized in vitro, following the instructions of the MEGAscript RNAi Kit (AM1626, Thermo Fisher).

The Freja T7 primer sequences for dsRNA synthesis were:

Forward: 5′-TAATACGACTCACTATAGGGCATCCATATCGTTGAAGGGC-3′

Reverse: 5′-TAATACGACTCACTATAGGGGTCCAGGTCGGTGAAGTTGT-3′

The GFP T7 primer sequences for dsRNA synthesis were:

Forward: 5′-TAATACGACTCACTATAGGGAGTGCTTCAGCCGCTACCC-3′

Reverse: 5′-TAATACGACTCACTATAGGGCATGCCGAGAGTGATCCCG-3′

### Quantification of tissue-expression abundance with RT–qPCR

Heads, thoraxes and gasters (fourth and higher abdominal segments) were dissected from newly eclosed gynes and gasters were subdivided by tissue into digestive glands, cuticles, fat bodies and ovaries. Total messenger RNA of the dissected tissues was then isolated using TRIzol reagent (15596018, Thermo Fisher). Reverse transcription was performed using the PrimeScript RT reagent Kit with gDNA Eraser (RR047B, Takara) and mRNA levels were quantified using TB Green Premix Ex Taq™ II (Tli RNaseH Plus, RR820A, Takara) on a CFX96™ Real-Time system (BIO-RAD). Expression of Freja was normalized to the expression of EF1α in each sample.

The quantitative PCR with reverse transcription (RT–qPCR) primer sequences for Freja were:

Forward: 5′-AACAGGGCAAACTCAGATATTTAC-3′

Reverse: 5′-AGGCATCGATCGTTATCTCGG-3′

The RT–qPCR primer sequences for EF1α were:

Forward: 5′-TTCATTTATTGCTCTCACATCTACG-3′

Reverse: 5′-ACCGTTGCCCTTTCTACTCTAA-3′
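The normalization of Freja expression to EF1α can be illustrated with a short sketch. The text states only that expression was normalized to EF1α, so the calculation below is an assumption: it uses the common 2^(−ΔCt) form, which presumes roughly 100% amplification efficiency for both primer pairs. The function name `relative_expression` and the Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Target expression relative to a reference gene, assuming the 2^(-dCt)
    form with ~100% amplification efficiency (an assumption, see lead-in)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for one tissue sample (not real data):
# target gene at cycle 24, reference gene at cycle 20.
freja_vs_ef1a = relative_expression(ct_target=24.0, ct_reference=20.0)
print(freja_vs_ef1a)
```

A target that crosses the threshold four cycles later than the reference is thus scored at one sixteenth of the reference level under these assumptions.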

### Quantification of ovary developmental status

Ovary developmental status was quantified by the number and the surface area of yolky oocytes. Ovaries were dissected and collected from 12-day-old gynes, where the number and the total surface area of yolky oocytes were counted and measured, respectively, from individual samples.

### JHA and precocene I feeding experiment

Third instar worker larvae of intermediate body length were treated with JHA (Methoprene, MCE HY-B1161; 5 mg ml⁻¹ in 10% EtOH PBS solution), while control worker larvae were fed with 10% EtOH PBS solution. Both 0.5 mg ml⁻¹ JHA and 10% EtOH were mixed with foods and offered on days 1, 3 and 8. To confirm the efficiency of JHA, the treated worker larvae were collected 24 h after day 1 feeding. After isolating total RNA, the expression of Kr-h1, a downstream gene of JH, was determined in both control and JHA groups. The expression of Kr-h1 was normalized to EF1α in each sample. The RT–qPCR primer sequences for Kr-h1 were:

Forward: 5′-AGGATATAACGCAGCTTCCTGT-3′

Reverse: 5′-GTGTGGCAGCGAACATTGTG-3′

Third instar gyne larvae with intermediate body length were treated with 1% precocene I (Sigma-195855; in 10% EtOH in PBS), while control gyne larvae were fed with 10% EtOH PBS solution. Both 1% precocene I and 10% EtOH were mixed with foods and offered on days 1, 3, 5 and 7.

To reveal the effects of JHA and precocene I on larval development, pupal stage samples were collected and body lengths were measured under a stereomicroscope (SMZ18, Nikon). For JHA and control groups, cohort percentages of pupation on days 10, 15 and 20 were also recorded to check whether development time was affected. When pupae eclosed into adults, their head width across the eyes and scape lengths (proxies of body size) were measured in the control and JHA groups using the same stereomicroscope.

### Constructing the developmental trajectory network

Within each species, the developmental trajectory network was constructed on the basis of the transcriptomic similarities among all samples. These similarities, as described in the Supplementary Information ‘Sample quality control’, formed a pair-wise similarity matrix among all samples. Values of this matrix were used to construct a weighted undirected network, where nodes represent samples and edges represent similar transcriptome profiles between connected samples. The weights of edges were given by the Spearman’s correlation coefficients between samples, indicating the level of transcriptome similarity. To increase the overall signal of the network, weak connecting edges (weight < 0.8) were removed. This threshold follows the empirical suggestion that correlation coefficients for anisogenic samples should be >0.8 (ref. 50).

The weighted undirected network was then visualized with the force-directed layout algorithm using the igraph package (v.1.2.9) in R (ref. 51). This algorithm takes the weights of edges into account, so that nodes with strong edges (samples with highly similar transcriptome abundances) cluster together. For visualization purposes, edge colour was set according to edge weight, ranging from white (weight = 0.8) to black (weight = 1).
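The edge-building step described above can be sketched as follows. This is a minimal illustration in plain Python, not the R/igraph code used in the study; the function names (`ranks`, `spearman`, `build_edges`, `edge_grey`) and the toy profiles are assumptions introduced here, and the force-directed layout itself is not reproduced.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation (assumes neither vector is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def build_edges(profiles, threshold=0.8):
    """profiles: dict sample -> expression values in a shared gene order.
    Returns {(sample_a, sample_b): weight}, dropping edges with weight < threshold."""
    edges = {}
    for a, b in combinations(sorted(profiles), 2):
        rho = spearman(profiles[a], profiles[b])
        if rho >= threshold:
            edges[(a, b)] = rho
    return edges


def edge_grey(weight, low=0.8, high=1.0):
    """Grey level for an edge: 1.0 (white) at `low`, 0.0 (black) at `high`."""
    return 1.0 - (weight - low) / (high - low)


# Toy example with three hypothetical samples and five genes.
profiles = {
    "s1": [1, 2, 3, 4, 5],
    "s2": [2, 4, 6, 8, 10],
    "s3": [5, 4, 3, 2, 1],
}
edges = build_edges(profiles)
```

In this toy input only the `s1`–`s2` pair survives the threshold, because `s3` is anticorrelated with both other samples.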

### Alignment of developmental transcriptomes and measurement of between-species transcriptomic similarity

Rate of development and number of instars differ between ant species and castes1,52. In our study species, there are three larval instars in M. pharaonis and three to four in A. echinatior. Embryogenesis in M. pharaonis lasts for ~9 d, which is long compared to D. melanogaster where eggs hatch after ~24 h. Developmental stages thus need to be aligned between species before comparative analyses can be done.

To align developmental stages, either between castes or between species, developmental stage similarities were measured by their overall transcriptomic distance, calculated as 1 − Spearman’s correlation coefficient of transcriptomes (or orthologous transcriptomes for cross-species comparison). Stages with the lowest overall transcriptomic distance (mean value for all same-stage samples) were assessed as the best-matched (aligned) stages and used for downstream between-caste and between-species comparisons.

Between-species transcriptomic similarity for each developmental stage was calculated as the mean value of 1 − Spearman’s correlation coefficients among samples of the best-aligned stages, separately for each caste.

### The BPA

We developed a new algorithm that allows backwards prediction of caste identity on the basis of the sequential overall transcriptomic patterns. The algorithm compares (1) the transcriptomes of individuals at a target developmental stage with unknown caste identity and (2) the transcriptomes of individuals at the later stage where caste identity was known or had been assigned in a previous round. For each round of prediction, BPA performed four steps: normalization, feature selection, model training and prediction.

Because prediction data (of differentiation in unassigned transcriptomes of the target developmental stage) and training data (caste-assigned transcriptomes from the subsequent stage) represent two continuous developmental stages next to each other in time, developmental effects are expected to always contribute, but with quantitatively different effects, to both datasets. The first step of BPA (normalization) therefore removes such developmental effects by subtracting the mean expression levels and normalizing the expression variation across the two datasets, using the ComBat function (from the sva package (v.3.40.0)) in R, with developmental stage set as the batch covariate53. The normalized transcriptomes thus represent expression levels that are independent of developmental stage and are most likely to reflect segregation by caste in both datasets.

The second essential step of BPA is feature selection. This starts with a PCA of the prediction data, assuming that one or several of the PC axes from this dataset should be related to as-yet-unspecified caste identities in the target stage. We thus assumed that one or several of the top PC axes should include the best-possible set of caste PC axes driven by the expression difference between the caste phenotypes not yet identified. The second substep of feature selection is then to confront these PC axes with training data, by projecting them on the samples of the subsequent developmental stage. This comparison uses singular value decomposition to extract the coefficients of each PC axis and then multiplies them with the training data. This process produces new PC scores for each individual in the training data, which then allows the third substep of identifying the PC axes that best separate the known caste identities in the training data when performing ANOVA. The most significant PC axes are then assumed to be the shared caste PCs for both prediction and training data.

With the selected candidate feature (the best-fitting PC axes for caste) in place, the third and fourth steps of BPA are model training and prediction. We first applied linear discriminant analysis on the known caste PC values to train a predictive model for caste segregation. We then predicted individual caste identities of target stage individuals and assigned a probability of being gyne (reproductive of unspecified sex before the second larval instar) or worker to each individual. Once a round is completed, the prediction result for developmental stage n (Sn) can then be used to predict individual caste identities of transcriptomes at stage n − 1 (Sn−1), following the same four steps as in the previous round of prediction, except that now the samples at Sn−1 are used as prediction data and the samples at Sn as training data.

For BPA prediction of first instar larvae in M. pharaonis, body size independent caste DEGs of second instar larvae (number of genes = 173) were used for feature selection. Body size independent caste DEGs were identified from samples of these second instar larvae with approximately equal body lengths (ranging between 0.52 and 1.08 mm), using DESeq2 (ref. 54) with the model: Exp ≈ caste + log(body length) (see section ‘Detecting DEGs between castes’). A gene was retained for feature selection if its adjusted P value was <0.05 and its log2 expression fold-change between castes was >0.5. Compared to using all genes, our use of caste DEGs for feature selection substantially increased the accuracy in our testing dataset (Extended Data Fig. 2c), probably because it removed the housekeeping genes whose expression is unrelated to caste differentiation. See Supplementary Information for ‘Validating the accuracy of BPA’ and ‘Testing the influence of sex on early developmental transcriptomes in M. pharaonis’.

### Quantification of transcriptome variation and difference

Within-stage transcriptome variation was measured by the mean value of 1 − Spearman’s correlation coefficients between the transcriptome of a target sample and the transcriptomes of all other samples within the same stage, either regardless of caste identity (overall measurement) or separately for each caste (within-caste measurement).

Between-caste transcriptomic differences were calculated as the average expression difference between gynes and workers (of the same stage) for all genes. The expression difference of a single gene was calculated as:

$${{{\mathrm{Dif}}}}_{{{{\mathrm{gene}}}}} = \left| {\overline {{\mathrm{Exp}_{\rm{gyne}}}} - \overline {{\mathrm{Exp}_{\rm{worker}}}} } \right|$$

where the absolute value removes the sign difference between worker-biased and gyne-biased expression.

As the gene expression levels were already log2-transformed, the expression difference between castes is equivalent to the absolute value of the log2 expression fold-change between castes.
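As a worked illustration of this calculation, the sketch below computes the per-gene absolute difference of caste means and then averages it over genes. The function name and the toy values are hypothetical, and the inputs are assumed to be log2 expression values for samples of a single developmental stage.

```python
from statistics import mean


def between_caste_difference(gyne, worker):
    """gyne, worker: lists of samples; each sample lists log2 expression per gene
    (same gene order). Returns (mean difference over genes, per-gene differences)."""
    n_genes = len(gyne[0])
    per_gene = []
    for g in range(n_genes):
        gyne_mean = mean(sample[g] for sample in gyne)
        worker_mean = mean(sample[g] for sample in worker)
        # The absolute value discards the direction (gyne or worker bias).
        per_gene.append(abs(gyne_mean - worker_mean))
    return mean(per_gene), per_gene


# Toy example: two gynes, two workers, two genes.
gyne = [[2.0, 5.0], [4.0, 5.0]]
worker = [[1.0, 7.0], [1.0, 9.0]]
overall, per_gene = between_caste_difference(gyne, worker)
```

Here the first gene is gyne-biased and the second worker-biased, yet both contribute positively to the overall difference.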

### Quantification of caste developmental potential

Developmental potential of target individuals was based on the transcriptomic distance between a target individual and its focal caste (measured as the average transcriptome of all same-caste individuals) at the subsequent developmental stage. If the transcriptomic distance between a target individual and a representative gyne was smaller than the distance between that target individual and a representative worker, the target individual was classified as being more likely to start developing into (or continue its development into) a gyne rather than a worker.

The mean transcriptomic differences between stages were first normalized by standardizing the expression level of each gene for all same-stage individuals. This step removed the quantitative differences between developmental stages (see first step (normalization) in The BPA section). The developmental potential (Δ) of each individual (i) was then calculated with the following formula:

$${\varDelta}_{{{{X}}},{{{t}}}} = \frac{{{{{\mathrm{dist}}}}\left( {{{{i}}},{{{\mathrm{worker}}}}_{{{{t}}} + 1}} \right) - {{{\mathrm{dist}}}}\left( {{{{i}}},{{{\mathrm{gyne}}}}_{{{{t}}} + 1}} \right)}}{{{{{\mathrm{dist}}}}\left( {{{{\mathrm{gyne}}}}_{{{{t}}} + 1},{{{\mathrm{worker}}}}_{{{{t}}} + 1}} \right)}}$$

Here, dist (i, castet+1) is the transcriptomic distance between individual i and the focal caste at the subsequent stage and is calculated as the Manhattan distance, a robust measure for transcriptomes that is commonly used for arithmetic calculations. The transcriptomic distance difference between castes was then normalized by dist (gynet+1, workert+1), which measures the transcriptomic distance between a focal gyne and a focal worker at the subsequent stage, so that ΔX,t becomes a dimensionless measure. ΔX,t = 1 then indicates that individual i is equivalent to a representative gyne in the subsequent stage while ΔX,t = −1 indicates that the individual is equivalent to the representative worker.
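The developmental potential formula can be illustrated with a short sketch. It assumes that expression values have already been standardized per gene within each stage, and it represents each focal caste by the average transcriptome of its samples at the subsequent stage; the function names and the toy data are introduced here for illustration only.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two expression vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def centroid(samples):
    """Average transcriptome of a list of samples (gene-wise mean)."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def developmental_potential(individual, gynes_next, workers_next):
    """Developmental potential of one individual relative to the
    representative gyne and worker of the subsequent stage."""
    gyne = centroid(gynes_next)
    worker = centroid(workers_next)
    numerator = manhattan(individual, worker) - manhattan(individual, gyne)
    return numerator / manhattan(gyne, worker)


# Toy example: two genes, two samples per caste at stage t + 1.
gynes_next = [[2.0, 2.0], [4.0, 4.0]]    # representative gyne: [3, 3]
workers_next = [[0.0, 0.0], [2.0, 2.0]]  # representative worker: [1, 1]

like_gyne = developmental_potential([3.0, 3.0], gynes_next, workers_next)
like_worker = developmental_potential([1.0, 1.0], gynes_next, workers_next)
halfway = developmental_potential([2.0, 2.0], gynes_next, workers_next)
```

An individual identical to the representative gyne scores 1, one identical to the representative worker scores −1, and one equidistant from both scores 0.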

### Quantification of gene-level canalization scores

Gene-level canalization scores were calculated based on the developmental dynamics of expression divergence between gyne and worker castes. For each developmental stage with known/predicted caste phenotypes, a modified t score for expression divergence of a target gene (g) was first calculated as:

$$t_g = \frac{{\overline {\mathrm{Exp}} _{\mathrm{gyne}} - \overline {\mathrm{Exp}} _{\mathrm{worker}}}}{{s_\mathrm{p}}}$$

Here, sp is the pooled standard deviation for the expression levels of gynes and workers:

$$s_\mathrm{p} = \sqrt {S_{\mathrm{gyne}}^2 + S_{\mathrm{worker}}^2}$$

A high absolute value of a t score indicates a high between-caste expression difference or low within-caste expression variances.

On the basis of the t scores across developmental stages, the canalization score (C) was then quantified to measure the developmental trend for the expression differences between castes. We defined the canalization score as:

$$C_g = - \mathrm{log}_{10}\left( P_g \right) \times t_{g,\mathrm{final}}$$

where Pg is the P value of the correlation test for the absolute values of t scores across developmental stages and tg,final is the t score of the late pupal stage when the morphological differentiation process between gynes and workers is largely completed. With the canalization score, we can thus capture the canalization level for each gene because a high value of −log10(Pg) indicates an increasing between-caste difference or a decreasing within-caste variance across developmental stages and a high absolute value of tg,final indicates a large between-caste expression divergence at the terminal stage.

On the basis of these canalization scores, we defined a gene as being canalized if Pg was <0.05 and the absolute value of Cg was >3.
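The scoring steps above can be sketched as follows. Two points are assumptions made here rather than statements from the text: the within-caste terms S² are taken to be sample variances, and the P value of the correlation test across stages is supplied as an input, because the specific test is not restated in this section. All function names are hypothetical.

```python
import math
from statistics import mean, variance


def t_score(gyne, worker):
    """Modified t score for one gene at one stage.
    gyne, worker: expression values of that gene in each caste.
    Assumption: S^2 is the sample variance within each caste."""
    s_p = math.sqrt(variance(gyne) + variance(worker))
    return (mean(gyne) - mean(worker)) / s_p


def canalization_score(p_value, t_final):
    """C = -log10(P) * t_final, with P taken from the correlation test
    of |t| across developmental stages (computed elsewhere)."""
    return -math.log10(p_value) * t_final


def is_canalized(p_value, t_final):
    """Thresholds from the text: P < 0.05 and |C| > 3."""
    return p_value < 0.05 and abs(canalization_score(p_value, t_final)) > 3


# Toy example for a single gene at the final (late pupal) stage.
t_final = t_score([5.0, 7.0], [1.0, 3.0])
score = canalization_score(0.01, t_final)
```

Because the score keeps the sign of the final t score, a positive value marks gyne-biased and a negative value worker-biased expression in this sketch.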

### Identifying the phylogenetic origin of Freja

To investigate the phylogenetic origin of Freja (NCBI ID: LOC105837931), the orthologue group of this gene, that is, the set of genes (including both orthologues and paralogues) descended from a single gene in the last common ancestor, was identified (with OrthoFinder) among 17 selected species, including 11 Hymenoptera and 6 species outside the Hymenoptera (Supplementary Information on ‘Detection of orthologues and homologues across species’). All gene members of the Freja orthologue group were found exclusively in the Hymenoptera, indicating that Freja is a hymenopteran order-specific gene.

Multiple sequence alignments were further performed on the protein sequences of the Freja orthologue group, using T-Coffee (v.13.45.0) with default parameters55. With the aligned protein sequences, a gene tree was reconstructed, using IQ-TREE (v.2.1.4) with 1,000 replicates for bootstrapping and 1,000 replicates for the Shimodaira–Hasegawa approximate likelihood ratio test56.

### Body-length threshold regression model

To identify the threshold for a change in expression dynamics at a certain larval body size, gene expression levels of each caste were fitted with a continuous two-phase (segmented) model (M11) using the R package chngpt (v.2021.5-12)57. The threshold model can be expressed as:

$$\mathrm{Exp}\approx \alpha \times {\mathrm{body}}\,{\mathrm{size}}_{{\mathrm{before}}} + \beta \times {\mathrm{body}}\,{\mathrm{size}}_{{\mathrm{after}}}$$

Here, α and β are the slopes before and after the body size threshold, and the threshold regression model detects a significant change between them. A threshold model was considered significant if the P value of the log likelihood ratio test between the threshold model and the null model (Exp ≈ body size) was <0.05.
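One way to read the two body size terms, assuming a continuous segmented model with an intercept c and a threshold e (both symbols are introduced here for illustration and do not appear in the original text), is:

$$\mathrm{Exp} \approx c + \alpha \times \min \left( x, e \right) + \beta \times \max \left( x - e, 0 \right)$$

where x is larval body size, so that body size before the threshold corresponds to min(x, e) and body size after the threshold to max(x − e, 0). Under this reading the fitted line is continuous at x = e, with slope α below the threshold and slope β above it, and it reduces to the null model Exp ≈ body size when α = β.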

### Detecting DEGs between castes

Caste DEGs at each developmental stage were detected using the DESeq2 (v.1.32.0) package in R (ref. 54). For all samples within each developmental stage, a read count matrix of transcriptomes (output from the Salmon mapping; see above) was loaded using tximport58, which summarized expression profiles from the transcript level to the gene level. With these gene-level transcriptomes, DESeq2 was then used to model the expression level of each gene as Exp ≈ caste, and genes were defined as caste DEGs for a target developmental stage if their adjusted P values were <0.05.

To reduce the confounding effect of body size on gene expression difference between castes (for example, second instar sexuals are always larger than second instar workers, so the expression difference might be the result of a larger body size)23, body length measurements were further integrated to adjust slopes for the influence of body length and thus identify body size independent caste DEGs for second instar larvae. The expression level of each gene was modelled as:

$$\mathrm{Exp}\approx \mathrm{caste + log}\left( {\mathrm{body}}\,{\rm{length}} \right)$$

A gene was then assessed as a body size independent caste DEG if the adjusted P value for caste was <0.05 and there was at least a 1.6-fold expression difference between castes. The expression difference between castes was estimated with a robust linear regression model59.

### Determination of tissue origins of gene expression based on Drosophila database

Tissue origins of expression of target genes were determined from the D. melanogaster gene expression atlas (FlyAtlas2)31. For larval stages, the expression data from the main larval tissues, including brain, midgut, hindgut, Malpighian tubules, fat body, salivary gland and trachea, were used and their relative expression levels were calculated on the basis of their transcripts per million (TPM) values:

$${\mathrm{RExp}_{\rm{tissue}}} = \frac{{\mathrm{log}_{2}\left( {\mathrm{TPM}_{\rm{tissue}} + 1} \right)}}{{{\sum} {\left( {\mathrm{log}_{2}\left( {\mathrm{TPM}_{\rm{tissue}} + 1} \right)} \right)} }},$$

where a pseudocount of 1 was added to obtain a robust estimate in case of low TPM values.

For the pupal and adult stages, the expression data from adult female tissues, including brain, eye, thoracicoabdominal ganglion, midgut, hindgut, Malpighian tubules, fat body, salivary gland and ovary, were used. Relative expression levels were calculated as in the larval stages (see above).
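The relative expression calculation can be illustrated with a minimal sketch. The function name and the TPM values below are invented for illustration and are not FlyAtlas2 data; returning zeros when a gene has no expression in any tissue is a choice made here to avoid division by zero.

```python
import math


def relative_expression(tpm_by_tissue):
    """tpm_by_tissue: dict tissue -> TPM for one gene.
    Returns each tissue's share of the summed log2(TPM + 1) values."""
    logged = {tissue: math.log2(tpm + 1) for tissue, tpm in tpm_by_tissue.items()}
    total = sum(logged.values())
    if total == 0:
        # Gene not expressed in any tissue: no tissue share can be assigned.
        return {tissue: 0.0 for tissue in logged}
    return {tissue: value / total for tissue, value in logged.items()}


# Toy example with made-up TPM values for one gene.
shares = relative_expression({"brain": 3.0, "midgut": 1.0, "fat body": 0.0})
```

The shares sum to 1 across tissues, so genes with very different absolute expression levels can be compared by where they are expressed.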

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.