Introduction

Angiosperms have sexual systems including hermaphroditism, monoecy, and dioecy, among others (reviewed by Baránková et al. 2020), and approximately 5-6% of flowering plant species are dioecious plants (Charlesworth 1985; Renner 2014). Sex determination systems (SDSs) have evolved independently in different dioecious plant lineages (Bull 1985; Charlesworth 1985; Ming et al. 2011; Westergaard 1958), and are classified into male (XX/XY) and female heterogametic (ZW/ZZ) systems. The chromosomes carrying sex determining regions (SDRs) may differ cytologically (heteromorphic) and include extensive non-recombining regions (Bull 1985; Westergaard 1958), but are often indistinguishable (homomorphic). It is thought that suppressed recombination has sometimes evolved, and that this leads to the heteromorphic state (reviewed by Charlesworth 2017). Sex chromosomes often have recombining regions at one or both ends, termed pseudo-autosomal regions (PARs, reviewed by Otto et al. 2011). Genetic maps of dioecious plants are scarce, but the species for which maps or genomic data provide evidence about recombination suggest that plant sex chromosomes also have PARs, and that these are sometimes physically extensive.

In some taxa, sex chromosomes have been maintained for long evolutionary times. However, there are many examples of changes, termed “sex chromosome turnovers”, between closely related species (reviewed by Vicoso 2019 and Kuhl et al. 2021), including in fish, amphibia, and plants (reviewed in Renner and Müller 2021). Two major kinds of turnovers are possible. Either a new sex determining gene arises on an autosome (or in a different gene on the same chromosome as the previous sex-determining locus), replacing the ancestral one, in a process termed “allelic diversification” by Pan et al. (2021) and “SNP” or “expression” differences by Tao et al. (2021, see also Saunders et al. 2018), or a pre-existing sex determining gene (or set of genes) is transposed to a new genome location (termed “duplication” in Tao et al. and Pan et al. 2021); the new region created will be hemizygous in the heterozygous sex, and not present in the homozygous sex. Either type of turnover can create a new, non-degenerated sex-linked region (reviewed by Kuhl et al. 2021). Changes between female and male heterogamety also fall into two categories (Bull 1983; van Doorn and Kirkpatrick 2010). In “nonhomologous” transitions, the ancestral and the new sex determining loci are on different linkage groups, and the turnover converts the ancestral sex chromosome pair into an autosomal pair, as in the fish taxa Tilapia and Oryzias (reviewed in Pan et al. 2021). In homologous transitions, the chromosome that carries the sex determining gene does not change. A change from female to male heterogamety may, for example, occur, if a Z chromosome (whose sex-determining function is recessive to that of the W) acquires a masculinizing effect gene that is dominant to the W, and this Y chromosome replaces its homolog. This situation was found in Platyfish (Kallman 1984).

Recent studies have detected turnovers in plant lineages (Charlesworth 2015; Moore et al. 2016; Henry et al. 2018). A change from female to male heterogamety, or vice versa, involving different chromosomes, was discovered in section Otites of the genus Silene (Balounova et al. 2019; Martin et al. 2019). In the Salicaceae, heterogamety in Populus has apparently changed from female to male, or vice versa (Renner and Müller 2021; Tuskan et al. 2012). Whether these turnovers involved duplication or allelic diversification is not known, though, in both Silene and the Salicaceae, they involve changes of the sex determining locus to different chromosomes. In the wild North American octoploid strawberries (Fragaria) with female heterogamety, the “duplication” process is involved, as the sex-determining locus maps to three different chromosomes in different taxa; these turnovers therefore involve translocations to different chromosomes (Tennessen et al. 2018).

The ecological or genetic factors involved in observed turnovers are still unknown. Four hypotheses have been proposed: 1) accumulation of deleterious mutation on the sex chromosomes (Blaser et al. 2013, 2014), 2) neutrality, in which a new sex determining allele can undergo genetic drift (random variations of allelic frequencies) and becomes ‘fixed’ (Bull and Charnov 1977; Saunders et al. 2018; Veller et al. 2017), 3) selection for a new sex-determining system to equalise the sex ratio (Jaenike 2001; Scott et al. 2018; Werren and Beukeboom 1998), and 4) presence of a sexually antagonistic polymorphism on an autosome favouring the fixation of a new sex-determining locus in a genome region closely linked to it, as modelled by van Doorn and Kirkpatrick (2007, 2010).

The recent development of whole genome re-sequencing has created the possibility of identifying SDSs in non-model species, sometimes even if they are physically small, using genome-wide association study (GWAS) and analysis of sequence differentiation between the sexes (often using FST) (Darolti et al. 2019; Franchini et al. 2018; Gammerdinger et al. 2020; He et al. 2021a). This can reveal differences in SDS locations, even between closely related species, using the genome of one species as the reference (He et al. 2022).

Populus and Salix are closely related genera in the family Salicaceae, almost all members of which are dioecious (Argus et al. 2010; Fang et al. 1999). The two genera are estimated to have diverged about 45 million years ago, and to share a common dioecious ancestor (Dai et al. 2014; Tuskan et al. 2006). Despite this long history of dioecy, their sex chromosomes are homomorphic (Zhou et al. 2020). The sex-determining locus is on different chromosome in different Populus species (reviewed in Renner and Müller 2021), and both XY and ZW systems exist in both genera (He et al. 2021a) (Table 1). These observations suggest the possibility of sex chromosome turnovers, and that at least four turnover events have been suggested in the Salicaceae (Manchester et al. 2006; Mank et al. 2014; Yang et al. 2021). More recent studies detected at least four turnovers in Populus alone; two translocation events may have occurred in P. alba (Li et al. 2022; Müller et al. 2020). Here, we study diploid Salix species.

Table 1 Possible transitions in sex determination and sex chromosomes in poplars and willows.

The genus Salix includes ~450 species, in two main groups Salix and Vetrix (Gulyaev et al. 2022; He et al. 2021b). Studies of diploid species show that most in the Vetrix clade have female heterogamety, while Salix clade species have male heterogamety (Gulyaev et al. 2022; He et al. 2021a). The Salix clade includes species from subgenera Salix, Longifoliae and Protitea, and the Vetrix clade includes subgenera Chamaetia and Vetrix (Chamaetia/Vetrix clade), sect. Amygdalinae and the formerly recognized genera Chosenia (S. arbutifolia) and Toisusu (S. cardiophylla) (Gulyaev et al. 2022; Lauron-Moreau et al. 2015; Wu et al. 2015). Three Salix clade species have XY SDS on chromosome 7, including S. nigra (Sanderson et al. 2021), S. dunnii (He et al. 2021a), and newly identified S. chaenomeloides (Wang et al. 2022), whereas all five Vetrix clade species so far studied (S. suchowensis, S. viminalis, S. purpurea, S. triandra, and S. polyclona) have ZW SDS on chromosome 15 (Almeida et al. 2020; He et al. 2022; Hou et al. 2015; Li et al. 2020; Zhou et al. 2020). However, a very recent study suggested that one species, S. arbutifolia, in the Vetrix clade, has an XY SDS on chromosome 15 (Wang et al. 2022). Willows usually have physically longer sex-linked regions than poplars (He et al. 2021a), and these may be slightly degenerated (Almeida et al. 2020), though more data are needed; if so, willows’ sex chromosomes could be more stable, as degenerated sex chromosomes are predicted to be less likely to undergo turnovers, which can generate low fitness of individuals homozygotes for these chromosomes (Bull 1983).

The genome structure of Salix species is conserved, with clear synteny between chromosomes of different species, allowing homologs to be reliably identified (He et al. 2021a). Therefore, it should be possible to identify the SDS locations of willows reliably using reference genomes of close relatives to assemble sequences from heterogametic (with phased Z and W or X and Y) or homogametic individuals.

Our present study involved two species, S. triandra and S. arbutifolia, whose relationships within the genus Salix have been debated. Our results resolve some of the uncertainties, and illuminate the evolution of Salix sex chromosomes. As will be described, this revealed turnover events in willows, which were previously unknown in this genus. These findings suggest that turnover events may be common in the Salicaceae, not just in the genus Populus. The Salicaceae may be similar to frogs (Stock et al. 2021) or some fish (Ross et al. 2009), in which SDS are diverse, sometimes even within different populations of the same species (for example, the platyfish, Volff and Schartl 2001).

Salix triandra is a species of sect. Amygdalinae native to Europe and Western and Central Asia, and its male flowers usually have three stamens (Ohashi 2006; Skvortsov 1999). Abdollahzadeh et al. (2011), Chen et al. (2010), Wagner et al. (2021) and Wu et al. (2015) concluded that S. triandra and S. triandroides formed a distinct clade (subg. Triandrae) sister to Chamaetia/Vetrix, but Lauron-Moreau et al. (2015) and Gulyaev et al. (2022) suggested that S. triandra is on a basal branch of the Vetrix clade. Li et al. (2020) suggested that S. triandra may have a ZW SDS on chromosome 15, based on linkage analysis. However, they used female S. purpurea v1.0 (https://phytozome-next.jgi.doe.gov/info/Spurpurea_v1_0) as the reference genome. Chromosome 15 in this assembly contains unphased W- and Z-linked regions (Zhou et al. 2020). Given its phylogenetic position (Gulyaev et al. 2022; Wagner et al. 2021; Wu et al. 2015) and its currently identified SDS, this species is key to drawing conclusions about the evolutionary changes in SDS locations in Salix, but it was not included by Wang et al. (2022).

Salix arbutifolia is a diploid dioecious species, mainly distributed in northeastern Asia (Ohashi 2006; Skvortsov 1999). It was treated as a member of the genus Chosenia by Nakai (1920), based on its morphological characteristics, drooping male catkins like poplars and single bud scale like willows, which suggested that it is an intermediate phenotype between genus Salix and Populus. Some molecular systematic studies supported merging Chosenia into the genus Salix (Chen et al. 2010; Hardig et al. 2010; Kimura 1988), though its phylogenetic position has remained controversial. Lauron-Moreau et al. (2015) concluded that S. arbutifolia belongs to the subg. Vetrix (Vetrix clade), while Wu et al. (2015) proposed that it belongs to subg. Pleuradenia, and is sister to Chamaetia/Vetrix clade. The SDS of S. arbutifolia was recently identified as XY on chromosome 15, using female and male genome sequences of the species as references (Wang et al. 2022). As its close relatives have a ZW-linked region on chromosome 15 (reviewed by Gulyaev et al. 2022), it is worth testing whether some populations of S. arbutifolia could have ZW systems, or whether this species consistently has an XY system.

We use whole genome re-sequencing data to reconstruct the phylogeny of S. arbutifolia, S. triandra and 6 willow species with known SDSs, with P. euphratica as an outgroup, (i) to investigate the phylogenetic position of S. arbutifolia and S. triandra; (ii) to investigate the SDS of S. triandra to test the results of previous research; (iii) to test whether S. arbutifolia’s sex-determining locus is on chromosome 15 as Wang et al. (2022) revealed, by using a closely related willow as reference, and whether different populations have consistent SDS; (iv) to test the association between sex determining systems and morphological characters of male flowers; (v) to investigate turnover events in Salicaceae, to explore the possible role of turnover in homomorphic sex chromosomes in Salicaceae species.

Materials and methods

Taxa sampling

We collected 39 individuals (20 females and 19 males, Table S1) from a wild population (Tieling City, Liaoning Province, China; latitude, 42.13194227 N; longitude, 123.4225462 E) of S. triandra for whole genome re-sequencing. We also sampled 41 individuals (21 females and 20 males, Table S1) from two wild populations (Mohe City, Heilongjiang Province, China, 52.8348395 N, 123.1632497 E; and Huma County, Heilongjiang Province, 51.7521601 N, 123.5635467 E) of S. arbutifolia far from Baishan (Jilin, China) where Wang et al. (2022)’s samples were collected. The two species were identified according to the descriptions of Fang et al. (1999) and Ohashi (2006). Leaves of each individual were dried with silica gel. Voucher specimens of these samples are deposited in the Beijing Forestry University herbarium (BJFC) and Shanghai Chenshan Botanical Garden (CSH). To estimate the phylogeny, we also downloaded whole genome sequencing data for six Salix species whose SDSs are known, covering the two main clades (Gulyaev et al. 2022): Salix clade S. dunnii (SRR12893418) and S. nigra (SRR16018717); and Vetrix clade S. polyclona (SRR16018674), S. purpurea (SRR3927002), S. suchowensis (SRR10197854), and S. viminalis (ERR1558612). We did not include S. chaenomeloides, which was recently studied by Wang et al. (2022) in our analysis, because it is a close relative of S. dunnii (Gulyaev et al. 2022), and therefore will not affect our results. We chose Populus euphratica (SRR13324572) as outgroup.

Sequencing, Reads Mapping, and Variant Calling

Total genomic DNA was extracted from leaves of S. triandra and S. arbutifolia using the Qiagen DNeasy Plant Mini Kit (Qiagen, Valencia, CA) following the manufacturer instructions. Whole genome re-sequencing using paired-end libraries were performed on Illumina NovaSeq by Beijing BerryGenomics and Beijing Novogene Bioinformatics Technology. The S. purpurea genome assembly v5.1 (Zhou et al. 2020), excluding chromosome 15 W, was used as a reference genome for mapping S. triandra and S. arbutifolia reads and subsequent variant calling. The genome of S. purpurea v5.1 (Zhou et al. 2020), again excluding chromosome 7 and 15 W, was used as the reference genome for mapping the downloaded samples of reads from the seven other species.

The sequenced reads were filtered and trimmed by fastp 0.20.0 (Chen et al. 2018). Then clean reads were aligned to the reference genomes using the BWA-MEM algorithm from BWA 0.7.12 (Li and Durbin 2009; Li 2013). We used SAMTOOLS 0.1.19 (Li et al. 2009) to extract primary alignments, sort, and merge the mapped data. Sambamba 0.7.1 (Tarasov et al. 2015) was used to mark clonal duplicates of library preparation. The variants were called using ‘HaplotypeCaller’ and ‘GenotypeGVCFs’ (Genome Analysis Toolkit v. 4.1.8.1). ‘HaplotypeCaller’ was run with ‘–sample-ploidy 2’. Joint genotyping was conducted using ‘GenotypeGVCFs’. Hard filtering of the SNP calls was carried out with best practices quality recommendations (QD < 2.0, FS > 60.0, MQ < 40.0, MQRankSum < −12.5, ReadPosRankSum < −8.0, SOR > 3.0). We kept the biallelic sites for subsequent filtering. We excluded sites with coverage exceeding twice the mean depth at variant sites across all samples. Genotypes with depth less than four in a sample were treated as no-call, and sites with no-call genotypes in more than 10% samples were filtered out, and sites with minor allele frequency less than 0.05 were discarded.

Phylogenetic analysis

We analysed sequences of one individual from each of the 9 species. The dataset included 870,166 single nucleotide variants at sites that are four-fold degenerate in the S. purpurea genome sequence, detected using a python script (https://github.com/zhangrengang/degeneracy) based on the gene annotation (S. purpurea GFF3 file, Zhou et al. 2020). Phylogenetic relationships were inferred by a maximum likelihood approach using RAXML v.8.2.4 (Stamatakis 2014). Support values were calculated using 100 rapid bootstrap replicates (-f a option) based on the GTR + GAMMA nucleotide substitution model (Pattengale et al. 2009), following He et al. (2021b) and Gulyaev et al. (2022).

Genome wide association studies between SNPs and sexes

We extracted 1,467,556 and 1,619,524 high-quality SNPs, respectively, from the 39 S. triandra and 41 S. arbutifolia samples. These two datasets were used in a standard case-control GWAS of allele frequencies and sex phenotypes using PLINK v1.90b6.18 (Chang et al. 2015). SNPs with α < 0.05 after Bonferroni correction for multiple testing yielded 5 and 91 sex linked-SNPs on chromosome 15Z, respectively, in S. triandra and S. arbutifolia.

BLASTP (Madden 2002) was used to identify the Arabidopsis thaliana homologs of the genes containing these putatively sex linked-SNPs), with “blastp -evalue -e-5 -max_target-seqs 1”.

Divergence of female and male populations

We also used the two SNP datasets to evaluate genetic differentiation (estimated as FST) between the male and female individuals. Weighted FST values between the sexes were calculated using the Weir and Cockerham (1984) estimator with 100-kb windows and 10-kb steps using VCFtools (Danecek et al. 2011). Regions with significantly higher FST values than other parts of the chromosome are considered candidate sex-linked regions (He et al. 2021a). A changepoint package (Killick and Eckley 2014) was used to assess significance of differences in the mean and variance of the FST values in chromosome 15 windows, using the function cpt.meanvar, algorithm PELT and penalty CROPS. We extracted SNPs in windows with FST values > 0.3 in the S-LRs identified from the two SNP datasets. VCFtools (Danecek et al. 2011) was then used to calculate heterozygote frequencies of these SNPs on a per-individual basis. High heterozygosity in males compared with females suggests male heterogamety, and higher heterozygosity in females suggests female heterogamety. Since ARR17-like genes (Potri.019G133600, Müller et al. 2020; Xue et al. 2020) have been identified as playing a role in Populus sex determination, we also blasted the Potri.019G133600 sequence (of Müller et al. 2020; Xue et al. 2020) against the S. purpurea reference genome to search for possible homologs within the putative sex-linked regions (S-LRs) identified by our FST and changepoint analyses.

Ancestral character and SDS reconstruction

We recorded the stamen numbers of the eight selected diploid willows described in floras and papers (Argus et al. 2010; Fang et al. 1999; Liu et al. 2020; Ohashi 2006; Skvortsov 1999), and the SDSs of these species using the relevant papers (reviewed in Gulyaev et al. 2022). To test the possible association between these two characters, we reconstructed the characters ancestral states. Willows with two (filament connate or distinct) stamens per flower were coded as 0, and those at least three stamens per flower as 1. An XY system on chromosome 7 was coded as 0, a ZW system on chromosome 15 as 1, and an XY system on chromosome 15 as 2. To reconstruct the ancestral states, we used a current probability model in Mesquite v. 2.75 (Maddison and Maddison 2014) based on the ML tree (Fig. 1).

Fig. 1: Phylogenetic tree of willow species with known sex systems and the outgroup Populus euphratica.
figure 1

Values above branches refer to bootstrap. The scale bar refers the mean number of nucleotide substitution per site. Species in red font are studied in this study.

The sex-linked region and proportion on sex chromosomes of willows in Vetrix clade

To test whether the S-LR sizes of S. arbutifolia and S. triandra were affected by the turnovers (see below), we compared the proportion of the sex chromosome that our analyses above found to be completely sex-linked in each Vetrix clade species, using S. purpurea as the reference genome. These were compared with the Z- and W-linked regions in S. purpurea (Zhou et al. 2020) and sex-linked region of S. polyclona (He et al. 2022). One-way analysis of variance (ANOVA, performed with SPSS 16.0 (SPSS, Inc., Chicago, IL, USA) was used to analyze the significance of the size differences.

Results

DNA resequencing data and variant calling

After quality control, a total of 2737.71 and 2684.69 million clean reads were obtained for 39 and 41 individuals of S. triandra (from 31.98 to 38.97 million reads per individual, mean 35.1; Table S2) and S. arbutifolia (from 23.06 to 68.27 million reads per individual, mean 32.74; Table S3), respectively. The mapping ratio of the reads that mapped to the S. purpurea reference genome ranged from 79.2 to 85.2% in S. triandra (mean 82.85%; Table S4) and 84.50 to 90.00% in S. arbutifolia (mean 88.07%; Table S5). The average sequencing depths were 25.0× and 28.2×, respectively. Using S. purpurea as the reference genome, we detected 1,467,556 and 1,619,524 high-quality SNPs (Table 2).

Table 2 List of the high-quality SNP datasets used for relevant analyses.

Phylogenetic analysis

We reconstructed the phylogenetic tree of willows with known SDSs based on 870,166 four-fold site variants (Fig. 1). This resolved four clades with high bootstrap support. Salix arbutifolia (clade 2) and S. triandra (clade 3) sister to clade 1 (S. polyclona, S. viminalis, S. suchowensis, and S. purpurea) formed a monophyletic group here described as Vetrix clade. Clade 4 (the Salix clade) includes S. nigra and S. dunnii. This topology is consistent with previous conclusions (Gulyaev et al. 2022; Wu et al. 2015).

Identification the sex determination systems of Salix triandra and S. arbutifolia

Using S. purpurea as the reference genome, GWAS analysis in S. triandra and S. arbutifolia recovered a total of 5 and 91 SNPs on chromosome 15Z, respectively, with significant associations with sex (Figs. 2a, b, S1a, 3a, b, S1b, Table S6; Table S7). Changepoint analysis of FST between male and female individuals of S. triandra and S. arbutifolia detected candidate fully sex-linked regions between 3.67 and 6.47 Mb (about 20.9% of chromosome 15Z; Figs. 2c, d; S2) and 3.21 and 6.54 Mb (about 24.9% of chromosome 15Z; Figs. 3c, d; S3). A total of 748 and 8026 SNPs, respectively, were within windows with FST values > 0.3 in S. triandra and S. arbutifolia. More than 89 and 74% of these SNPs are heterozygous in S. triandra and S. arbutifolia males, while only 15.53 and 18.80% of these SNPs are heterozygous for the females, respectively, suggesting that both species have XY systems on chromosome 15.

Fig. 2: Identification of the sex determination system of Salix triandra.
figure 2

a Results of genome wide association studies (GWAS) between SNPs and the sexes of 39 individuals. The y axis is the negative logarithm of p values, and the red line shows the Bonferroni-corrected significance level corresponding to α < 0.05. b Plot for GWAS p-values of all SNPs of chromosome 15Z. Red dots show significantly sex-associated SNPs. c FST values between the sexes for 100-kb overlapping windows of chromosome 15Z with 10-kb steps. Red lines represent significant regions on chromosome 15Z suggested by changepoint analysis. d The positions of the PAR1, S-LR and PAR2 regions of chromosome 15Z.

Fig. 3: Identification of the sex determination system of Salix arbutifolia.
figure 3

a Results of genome wide association studies (GWAS) between SNPs and the sexes of 41 individuals. The y axis is the negative logarithm of p values, and the red line shows the Bonferroni-corrected significance level corresponding to α < 0.05. b Plot for GWAS p-values of all SNPs of chromosome 15Z. Red dots show significantly sex-associated SNPs. c FST values between the sexes for 100-kb overlapping windows of chromosome 15Z with 10-kb steps. Red lines represent significant regions on chromosome 15Z suggested by changepoint analysis. d The positions of the PAR1, S-LR and PAR2 regions of chromosome 15Z.

Our GWAS identified only a single likely fully sex-linked gene in each of S. triandra (Sapur.15ZG056900) and S. arbutifolia (Sapur.15ZG052500). The Sapur.15ZG056900 (reciprocal best hits found the A. thaliana gene ATPPRT3) is involved in positive regulation of abscisic acid-activated signaling pathway and the Sapur.15ZG052500 (reciprocal best hits found the A. thaliana gene ATGUS2) is involved in cell elongation. We found no ARR17 like genes in our identified S-LRs.

Ancestral character and SDS reconstruction

The SDS of previously studied Vetrix clade species is on chromosome 15, while the Salix clade SDS is on chromosome 7 (Gulyaev et al. 2022). Our analysis suggests an equivocal state for ancestors of S. triandra and S. arbutifolia, but a transition to the XY system on chromosome 15 that is the ancestral state to clade 1 (S. polyclona, S. viminalis, S. suchowensis, and S. purpurea) (Figs. 1, 4a). The relative likelihoods of ancestral SDS states for the genus Salix are 0.395 for XY on chromosome 7, 0.356 for XY on chromosome 15, and 0.249 ZW on chromosome 15 (Fig. 4a).

Fig. 4: Reconstructions of ancestral sex determination system and character of willow species.
figure 4

Ancestral state reconstruction of a the sex determination system of willows with known sex determination systems and b stamen numbers based on a ML tree (Fig. 1).

The ancestral character state reconstructions of stamen number and SDS imply that species with multiple stamens (≥ 3) usually have XY systems, while species with less than three stamens usually have a ZW system (Fig. 4b).

Sex-linked region sizes of willows of Vetrix clade

The S-LR of S. purpurea 15 W is the largest (6.7 M, or 42.7% of chromosome 15 in the reference S. purpurea v5.1 sequence), followed by that of S. polyclona (4.68 Mb, or 35.2% of the sex chromosome). The sex-linked regions of S. arbutifolia and S. triandra are smaller, accounting, respectively, for 24.9 and 20.9% of the sex chromosome 15Z (reference S. purpurea v5.1), significantly smaller (p < 0.01) than those of other willows in the Vetrix clade (Table 3).

Table 3 The sex-linked region and proportion on sex chromosomes of willows in Vetrix clade.

Discussion

A XY system on chromosome 15 of Salix triandra and S. arbutifolia, and possible turnovers in willows

The two species studied in detail here, S. triandra and S. arbutifolia, both have SDSs on chromosome 15 (Figs. 2, 3) and our heterozygosity analyses suggest that both have male heterogamety. Li et al. (2020) suggested S. triandra has a ZW SDS on chromosome 15, based on linkage mapping, but they used the female S. purpurea v1.0 reference genome sequence, whose Z and W were not phased, and represent a chimera (https://phytozome-next.jgi.doe.gov/info/Spurpurea_v1_0; Li et al. 2020; Zhou et al. 2020), which can cause false inferences. The chromosome 15 of S. purpurea v1.0 is 22.6 Mb in length, and after phasing (S. purpurea v5.1), its 15 W and 15Z sequences are 15.7 Mb and 13.3 Mb, respectively (Zhou et al. 2020). It is however worth considering that the SDSs are quite flexible in Salix, it might be possible that there is a change within S. triandra. It is a very widespread species and different evolution of SDSs could happen in different areas.

We used the phased S. purpurea genome sequence as the reference to identify the S. arbutifolia SDS, and obtained the same result as Wang et al. (2022) recently reported, indicating that the different S. arbutifolia populations share the same SDS. However, we estimate that the S-LR of S. arbutifolia is 24.9% of the 15Z chromosome of S. purpurea (Table 3), rather more than Wang et al.’s (2022) estimate (for the same species) that the X-LR is 15.34% of chromosome 15X. This difference may be due to the different sequencing and assembly strategies used (Almeida et al. 2020; He et al. 2021a; Wang et al. 2022; Zhou et al. 2020), or the sex chromosomes of these close relatives may have different gene/spacer density. Despite this, one can identify the SDS of willow species first and then assemble genomes of heterogametic individuals using accurate long HiFi (high-fidelity) reads to obtain high quality X and Y or Z and W sex chromosomes (reviewed by He and Hörandl 2022) rather than assemble genomes of both sexes.

Our ancestral state reconstruction suggests that S. arbutifolia and S. triandra are in a transitional stage between clades 1 and 4, whereas a Y-linked region on chromosome 7 appears to be the ancestral state for clade 4 (Salix) (Figs.1, 4a). Since our ancestral state analysis yielded similar probabilities for XY on chromosome 7 (0.395) or chromosome 15 (0.356) as the ancestral state for the genus Salix, we cannot rule out the possibility that XY on chromosome 7 or 15 is the ancestral state of the genus Salix. Two scenarios are thus possible, both involving at least two turnover events (Fig. 4a): 1) if the Y-linked region on chromosome 7 is the ancestral state for the genus Salix, at least two turnover events may have occurred in the Vetrix clade (XY on chromosome 7 to XY on chromosome 15, and to ZW on chromosome 15), while the entire Salix clade maintained the chromosome 7 ancestral state (Fig. 4a, blue arrow); or 2) if an XY system on chromosome 15 is the ancestral state for genus Salix, at least two turnover events must have occurred within the genus Salix (XY to ZW on chromosome 15 in the Vetrix clade, and XY on chromosome 15 to XY on chromosome 7 in the Salix clade, Fig. 4a, gray arrow). Wang et al. (2022) found some evidence supporting the first scenario. However, their preliminary hypothesis was based only on comparing two out of ~70 species, and finding that these two may have XY SDS on chromosomes 7 and 15 (Gulyaev et al. 2022; He et al. 2021a, 2021b). To draw a reliable conclusion, it will be necessary to include phased 15X and 15Y data from S. triandra, which is related to both their studied species (S. chaenomeloides and S. arbutifolia) and lies between them in the phylogeny (Fig. 1; Gulyaev et al. 2022; Wang et al. 2022). Furthermore, as the sex-linked regions of Populus, the sister genus of willows, are mostly on chromosome 19, the two groups of willows with sex-linked regions on different chromosome probably evolved independently from chromosome 19 (He et al. 2021a) (Table 1), perhaps involving different sex determination gene(s). This hypothesis is supported by the findings of Wang et al. (2022) that intact ARR17-like genes are present on autosome 19 of S. arbutifolia and S. chaenomeloides and expressed female-specifically. However, this remains to be tested, as willow sex determination gene(s) have not yet been confirmed.

Our phylogenomic analyses and the sex-linked region identified in S. arbutifolia and S. triandra supports their assignment to the Vetrix clade (Fig. 1), consistent with previously studies (Gulyaev et al. 2022; Lauron-Moreau et al. 2015).

Possible reasons for turnovers in the genus Salix

Bull (1983) and Vicoso (2019) proposed that some clades have extremely conserved sex chromosomes because they have become genetically degenerated, while others have not degenerated and can undergo sex chromosome turnovers. The physical sizes of sex-linked regions of willows are relatively large (Table 1), but their S-LRs are, at most, only slightly degenerated (Almeida et al. 2020; He et al. 2021a). Hence, turnover events should be possible in Salix and Populus (Gulyaev et al. 2022; He et al. 2021a). It has, however, been proposed that genetic degeneration can promote turnovers, replacing degenerated Y with new, non-degenerated sex chromosomes (Blaser et al. 2014). Accumulation of deleterious mutations on the ancestral sex chromosomes seems unlikely to explain the XY to ZW change on chromosome 15 in the Vetrix clade, because such “homologous” turnovers are predicted to maintain the ancestral heterozygous sex (Blaser et al. 2014; Bull 1983; Jeffries et al. 2018; van Doorn and Kirkpatrick 2010; Scott et al. 2018). Wang et al. (2022) suggested that deleterious mutation load may have contributed to the turnover (from XY to ZW) because more pseudogenes were identified in sex-linked regions in S. chaenomeloides than predicted in S. arbutifolia. However, they did not compare their results with the pseudogene proportions of any previously studied Salix W- or Z-linked regions (Almeida et al. 2020; Zhou et al. 2020). The alternative of neutral turnovers due to drift has not yet been excluded, and these are also generally expected to maintain the ancestral heterogamety (Veller et al. 2017).

The sex-ratio selection and sexually antagonistic selection models are more likely to lead to changes in the heterogametic sex (Jeffries et al. 2018) like those inferred in willows. To explain the XY to ZW transition in the Vetrix clade, the W-linked factor must be dominant and cause femaleness in the presence of the Y. Sexual selection in plants, such as pollinator attraction, is very similar to some forms of sexual selection in many animals (Charlesworth et al. 1987). In S. dunnii, Apis cerana’s significant preference for visiting female flowers suggests that sexual selection pressure can be placed on male flowers (Zeng et al. 2022). Likewise, evidence of sexual dimorphism was also found in S. purpurea (Gouker et al. 2021). Hence, sexually antagonistic selection may play a role in the turnover of the genus Salix.

Sex ratio selection is, however, likely in plants, and could be involved in the turnover events of the genus Salix. Sex ratio imbalances are common in nature (Scott et al. 2018), and can favour a new sex determining locus that restores an equal sex ratio (Beukeboom and Perrin 2014; Bull 1983; Mank et al. 2014). In willows, female biased sex ratios appear to be associated with ZW systems as known so far (Table S8), possibly reflecting incompatibility between maternally and paternally inherited Z-linked alleles (Pucholt et al. 2017a). Willows with XY SDS may generally have male-biased or balanced sex ratios (Table S8). However, more data are needed to support this hypothesis. If the ancestor of the Vetrix clade had male heterogamety and a male biased sex ratio, a feminizing gene might be able to invade and cause a transition to ZW SDS (Kozielska et al. 2010). Male or female bias can be also introduced during polyploidization, and many willows are polyploids (He and Hörandl 2022). Hence, the relationship between turnover events and biased sex ratios remains uncertain.

Our ancestral state reconstruction supports changes in both the heterogamety and the stamen number, with a reduction to two stamens accompanying the change from male to female heterogamety. However, a recent study reported that S. exigua (in the Vetrix clade) has an XY SDS on chromosome 15, but its male flowers have only two stamens (Argus et al. 2010; Hu et al. 2022). It is also worth noting that in Salix group there are some tetraploid species with just 2 stamens, eg. S. alba, S. babylonica, S. chienii, S. fragilis, and S. matsudana (Gulyaev et al. 2022; Wagner et al. 2021). Therefore, a correlation of stamen number and SDS system cannot be predicted at the present stage of research.

Can turnovers explain homomorphic sex chromosomes in Salicaceae?

The sex chromosomes in the Salicaceae stricto sensu (Populus and Salix) show no signs of major genetic degeneration, and most have remained homomorphic (Pucholt et al. 2017b), despite almost all species being dioecious, and dioecy having evolved more than 45 million years ago in the genus Salix. The rate at which genetic degeneration occurs after loss of recombination between sex-linked regions is not yet understood (reviewed by Charlesworth 2021). However, a turnover event will evidently create a new non-degenerated sex-linked region (Bachtrog et al. 2014; Beukeboom and Perrin 2014; Gamble et al. 2015; Jeffries et al. 2018; Kuhl et al. 2021; Vicoso 2019). Turnovers via chromosomal mutations (translocations) are in Salix probably feasible because the chromosomes are morphologically very similar to each other (Pucholt et al. 2017b). Hence, turnovers and also the return of previous SDS-bearing chromosomes to autosomes would be probably easier than in genera with more differentiated karyotypes. Turnovers, in turn, would inhibit degeneration and loss of size of sex chromosomes. Turnovers, including transitions between XY and ZW systems, may be associated with a lack of sex chromosome heteromorphism (Bachtrog et al. 2014; Zhou et al. 2018). The genome organization of the sister Populus and Salix are well conserved, except for differences in size and gene orders of chromosomes 1 and 16 (He et al. 2021a), so the S-LR locations in different species can be compared. We summarize current information in Table 1. The estimated sizes of Salix S-LRs are generally considerably larger than those in Populus. This might reflect a history of more recent turnover events in Populus, which has more known turnover events than willows (Table 1). Another possible explanation is that the S-LRs of Salix species are within pericentromeric regions that recombine very rarely (Chen et al. 2016; He et al. 2021a; Zhou et al. 2018), creating larger sex-linked regions than in Populus.

The S-LRs in S. arbutifolia and S. triandra were estimated above to be smaller than those of other willows of the Vetrix clade (see Table 3; Wang et al. 2022). This might seem to be evidence against the turnover events inferred here, since, if the ZW on chromosome 15 is the derived state, it might be expected to be smaller than the S-LRs in species with a male-determining region on chromosome 15. However, again the size of the regions detected may reflect the sizes of the non-recombining regions in which new sex-determining factors appear during turnovers.

We identified only two sex-link genes in the S-LRs identified here, among which Sapur.15ZG052500 (best hit in the A. thaliana genes: ATGUS2) is involved cell elongation and beta-glucuronidase activity, and Sapur.15ZG056900 (reciprocal best hits found the A. thaliana genes ATPPRT3) in positive regulation of the abscisic acid-activated signaling pathway. The homolog of Sapur.15ZG052500 also shows female-biased expression in shoot tip of S. purpurea (Carlson et al. 2017).

In poplars, turnovers have also been inferred, and different partial duplications of ARR17-like genes have been shown to be involved in different species (Li et al. 2022; Xue et al. 2020; Yang et al. 2020). We searched for ARR17-like genes in the S-LRs of S. arbutifolia and S. triandra, but did not find any sex-linked copies in either species. We speculate that willows probably do not share the same sex-determining gene as those in poplars, which (as summarized in Table 1) are partial duplications of an ARR17-like gene (He et al. 2021a; Müller et al. 2020; Xue et al. 2020). Wang et al. (2022) found partial duplications of an ARR17-like gene in the Y-LR of S. arbutifolia. However, copies of ARR17-like genes were also found on multiple autosomes (1, 3, 8, 13, and 19) of S. dunnii (He et al. 2021a). Moreover, turnovers that involve a change from male to female heterogamety, or vice versa, also seem unlikely to involve the same gene (for example, a dominant male-determining factor in a population with male heterogamety cannot cause a turnover by simply being duplicated to a different location, as a dominant factor is required that can convert individuals to females, even if they carry the ancestral male-determining factor). It is currently unknown whether the turnovers in Populus and Salix involved duplications or involved take-overs of sex determination by different genes.

The results presented here are consistent with the expectation that, when heterogamety changes, female heterogamety is generally the derived state. This is because, if the first step in de novo evolution of separate sexes in plants is probably a loss of function sterility mutation, which is likely to be recessive. A recessive male sterility mutation, a common kind of mutation, can initiate evolution of an XY system, but a ZW system requires an unusual type of mutation causing dominant male sterility. A turnover can, however, readily produce a ZW system through a mutation that dominantly suppresses male fertility, for example a duplication that leads to production of an RNA sequence that suppresses a function essential for anther or pollen development.