Island-specific evolution of a sex-primed autosome in a sexual planarian

Guo, Longhua; Bloom, Joshua S.; Dols-Serrate, Daniel; Boocock, James; Ben-David, Eyal; Schubert, Olga T.; Kozuma, Kaiya; Ho, Katarina; Warda, Emily; Chui, Clarice; Wei, Yubao; Leighton, Daniel; Lemus Vergara, Tzitziki; Riutort, Marta; Sánchez Alvarado, Alejandro; Kruglyak, Leonid

doi:10.1038/s41586-022-04757-3

Download PDF

Article
Open access
Published: 01 June 2022

Island-specific evolution of a sex-primed autosome in a sexual planarian

Nature volume 606, pages 329–334 (2022)Cite this article

15k Accesses
13 Citations
126 Altmetric
Metrics details

Subjects

Abstract

The sexual strain of the planarian Schmidtea mediterranea, indigenous to Tunisia and several Mediterranean islands, is a hermaphrodite^1,2. Here we isolate individual chromosomes and use sequencing, Hi-C^3,4 and linkage mapping to assemble a chromosome-scale genome reference. The linkage map reveals an extremely low rate of recombination on chromosome 1. We confirm suppression of recombination on chromosome 1 by genotyping individual sperm cells and oocytes. We show that previously identified genomic regions that maintain heterozygosity even after prolonged inbreeding make up essentially all of chromosome 1. Genome sequencing of individuals isolated in the wild indicates that this phenomenon has evolved specifically in populations from Sardinia and Corsica. We find that most known master regulators^{5,6,7,8,9,10,11,12,13} of the reproductive system are located on chromosome 1. We used RNA interference^14,15 to knock down a gene with haplotype-biased expression, which led to the formation of a more pronounced female mating organ. On the basis of these observations, we propose that chromosome 1 is a sex-primed autosome primed for evolution into a sex chromosome.

Network of large pedigrees reveals social practices of Avar communities

Article Open access 24 April 2024

Positive selection in the genomes of two Papua New Guinean populations at distinct altitude levels

Article Open access 30 April 2024

A draft human pangenome reference

Article Open access 10 May 2023

Main

Sex chromosomes evolve from homologous autosomes that acquire sex-determining genes and lose their ability to recombine^{16,17,18,19,20,21}. As such, sex chromosome evolution and recombination suppression are closely associated^{16,17,18,19,20,21}. However, because direct evidence of such homologous autosomes primed for evolution into sex chromosomes is difficult to capture, little is known about the molecular signatures associated with the evolution of recombination suppression.

The freshwater planarian Schmidtea mediterranea, an important model organism for studies of regeneration^22,23, exists as asexual and sexual reproductive strains. The sexual strain is distributed mostly in Tunisia and on the islands of Sardinia, Corsica and Sicily¹. The sexual strain is a simultaneous hermaphrodite that develops both male and female reproductive systems in the same adult individual and obligately outcrosses to fertilize other individuals^24,25. Individuals in the asexual strain do not develop sexual reproductive systems. We considered that studying chromosome evolution in a simultaneous hermaphrodite might provide insight into the early evolution of a primitive sex chromosome.

S. mediterranea has four pairs of chromosomes, which are stably diploid. The genome is reported to comprise 774 Mb assembled as 481 non-contiguous series of genomic sequences, or scaffolds^26,27,28. A previous study found that approximately 300 Mb of the genome remained heterozygous even after extensive inbreeding of laboratory strains, and that this phenomenon also occurs naturally in wild populations in Sardinia²⁴. The two sets of heterozygous alleles were collectively named J and V haplotypes. To define the chromosomal locations of these alleles and investigate the reasons underpinning the persistence of heterozygosity in S. mediterranea, a detailed assembly of all four chromosomes is needed.

Chromosome-scale genome assembly

To transform the 481 scaffolds²⁶ into a chromosome-scale genome reference, we carried out chromosome sequencing (ChrSeq)^29,30 of a laboratory strain, as well as chromatin proximity sequencing by Hi-C³ (Fig. 1a). To do so, we dissected individual chromosomes from mitotic cells using laser capture and amplified and sequenced each chromosome individually (Fig. 1b). We examined the sequencing depth of each scaffold across multiple samples of the same and different chromosomes to ensure reproducibility and specificity, respectively (Extended Data Fig. 1). Overall, we successfully amplified and confidently assigned 740 Mb of the 774-Mb genome to one of four chromosomes (Supplementary Table 1).

We used ChrSeq information and chromatin interaction data generated by Hi-C to correct and connect the individual scaffolds within a chromosome into a chromosome-scale genome, hereafter referred to as Smed_chr_ref_v1. The Hi-C data analysed using the SALSA scaffolding algorithm³ resolved the 481 scaffolds into 57 super-scaffolds and 104 singletons. The ChrSeq data indicated that 3 of the 57 super-scaffolds were disjointed inter-chromosomal fragments, consistent with the Hi-C contact heat map (Fig. 1c). We split the disjointed super-scaffolds, merged the scaffolds into chromosomes, and ordered and oriented all scaffolds within the chromosomes using Juicebox visualization software⁴ (Methods). Chromosome assignments by ChrSeq alone (Supplementary Table 1) and by Hi-C alone were inconsistent for only 3 of 384 (0.8%) scaffolds (Supplementary Table 2). We manually assigned these three scaffolds to chromosomes according to the Hi-C data. Hi-C also detected 26 inter-chromosomal or intra-chromosomal assembly errors (Supplementary Table 2) in the previous assembly²⁶, 5 of which were confirmed by ChrSeq to be inter-chromosomal disjointed scaffolds (Supplementary Table 2). The final genome assembly, Smed_chr_ref_v1 (Fig. 1c), had four chromosomes with a total size of 764 Mb, which is 98.4% of that reported previously²⁶. Of the 1.6% of the previous assembly not contained in Smed_chr_ref_v1 by Hi-C, about half (52.6%) of the scaffolds could not be assigned to a chromosome or were assigned to two chromosomes by ChrSeq, indicating lower assembly quality (Supplementary Table 3). Moreover, 33.0% and 12.4% of the scaffolds had chromatin interaction signals with structurally complex regions of chromosomes 1 and 2, respectively (Extended Data Fig. 2), suggesting that they might be alternative assemblies or repetitive sequences. In our new assembly, Smed_chr_ref_v1, the two ends of chromosome 4 were capped by >1,000 copies of the telomere repeat TTAGGG, indicating high assembly quality.

To validate the linearity of the chromosomes in Smed_chr_ref_v1, we generated a linkage map. We crossed two divergent laboratory strains of S. mediterranea, S2F10b and D5, to produce an F₂ population, and we genotyped individual worms with RADseq³¹. Eighty markers that were evenly distributed and genotyped in at least 98% of the F₂ segregants (91 of 93) were used to establish four linkage groups (Fig. 1d and Supplementary Table 4) representing the four chromosomes. The ordering of the 80 markers in the linkage map is consistent with Smed_chr_ref_v1, independently supporting the quality of our chromosome-scale genome assembly. This highly contiguous and complete genome assembly and linkage map together facilitate further genetic and epigenetic investigations of the functions of the genome in this model planarian.

Chromosome 1 recombination suppression

We next re-examined the heterozygous alleles that define the J and V haplotypes in the newly assembled genome. We found that 87.7% of the genetic markers that remained heterozygous were located on chromosome 1, spanning 333 Mb at a density of 30,148 variants per 10 Mb (Fig. 2a). The remaining 12.3% of the heterozygous markers were located on the other three chromosomes at a density of 3,274 variants per 10 Mb and probably correspond to differences between highly similar copies of repetitive elements rather than true polymorphisms (Supplementary Table 5). All F₂ worms (n = 93) in this study were heterozygous for chromosome 1 but not for chromosomes 2–4 (Supplementary Table 6), which is consistent with the previous study²⁴. Hence, we concluded that the J/V haplotypes were on chromosome 1.

**Fig. 2: Recombination suppression and inversions on chromosome 1.**

Our linkage map revealed an extremely low rate of recombination on chromosome 1 of only 0.5 cM for the entire chromosome (Fig. 1d). This is particularly notable because, at 333 Mb, chromosome 1 was the largest of the four chromosomes, containing more than 40% of the genome.

To directly examine whether chromosome 1 can recombine, we sequenced 45 single sperm cells and 28 single oocytes from J/V line S2 (Fig. 2d). Gamete sequencing is preferred because recombination events in hatchlings can be selected for by differential fertilization or embryonic lethality. We identified 3,197 single-nucleotide variants (SNVs) on chromosome 1 and 3,312 SNVs on chromosome 3, covering 99% of the length of each chromosome (Supplementary Table 7). The SNVs were distributed at a similar density across 20-Mb windows, with coefficients of variation of 0.38 and 0.31 for chromosomes 1 and 3, respectively (Supplementary Table 7). We observed that 98% of the sperm cells (44 of 45) and 93% of the oocytes (26 of 28) had no crossovers on chromosome 1. By contrast, most sperm cells and oocytes had crossovers on chromosomes 2, 3 and 4 (Fig. 2b, Extended Data Fig. 3a, b and Supplementary Table 7). We thus concluded that recombination on chromosome 1 was strongly suppressed.

Consistent with this conclusion, we found that during prophase I, when other chromosomes had numerous crossovers, chromosome 1 formed a ring structure in the oocyte in J/V worms but not in J/J worms (Fig. 2c and Extended Data Fig. 4). Fluorescence in situ hybridization (FISH) with telomere probes suggested that crossovers between the two homologous pairs of chromosome 1 occurred only in regions close to the telomeres, leading to the observed ring conformation rather than the side-by-side pairing observed for chromosomes 2, 3 and 4. Furthermore, Hi-C analysis showed that chromosome 1 had three putative inversions, each >20 Mb in size (Fig. 2d); such inversions can cause crossover suppression^32,33,34. The rest of the genome had only one large inversion of approximately 10 Mb on chromosome 2 (Extended Data Fig. 2b).

Island-specific evolution of chromosome 1

To investigate the genetic diversity of S. mediterranea in its natural environment and determine whether chromosome 1 J/V heterozygosity occurs throughout the entire species, we used RADseq³¹ to sample the genomes of 70 sexual individuals from Sardinia, Corsica, Sicily and Tunisia and 2 asexual individuals from Menorca¹ (Fig. 3a). To look for genetic relationships between the individuals, we determined clustering of relatedness measured by identity-by-state pairwise distances³⁵ and identified two superclusters. Animals from Sicily were closely related to those from Tunisia, and animals from Sardinia were closely related to those from Corsica (Fig. 3b). Phylogenetic clustering (Extended Data Fig. 5), fixation index (F_ST) values (Extended Data Fig. 6) and analyses with Structure (Extended Data Fig. 7) further supported this observation.

The relatedness of the populations from Sardinia and Corsica suggests that they may share genome characteristics that differ from those of the populations from Sicily and Tunisia. Indeed, animals in the populations from Sardinia and Corsica had greater heterozygosity on chromosome 1 than expected under Hardy–Weinberg equilibrium (Fig. 3c), whereas the heterozygosity on the other three chromosomes in these populations and on all four chromosomes in the other populations closely followed expectations. By analysing the J/V haplotype markers in wild populations, we found that animals from Sardinia and Corsica were heterozygous J/V, whereas those from Sicily and Tunisia were homozygous J/J. Moreover, the animals from Sardinia (n = 28) had many sites that were heterozygous in more than 80% of the individuals and were distributed along the length of chromosome 1 except near the ends; conversely, the animals from Sicily (n = 27) had very few such heterozygous sites (Fig. 3d, e and Supplementary Table 8). Few such heterozygous sites were observed on chromosomes 2, 3 and 4 in either population (Fig. 3d, e and Extended Data Fig. 3c, d). These analyses suggest that chromosome 1 specifically evolved and diverged on the islands of Sardinia and Corsica.

A sex-primed chromosome

To gain insight into the island-specific suppression of recombination on chromosome 1, we examined the genes located on this chromosome, which contained 39% of all annotated genes. Five of the seven known master regulators of the reproductive system were found on chromosome 1, including nanos⁵, nhr-1 (refs. ^8,9), npy-8 (ref. ⁶, npyr-1 (ref. ¹¹) and CPEB-2 (ref. ¹²) (ophis¹¹ and boule-2 (ref. ¹⁰) were the two exceptions) (Fig. 4a). Knockdown of any one of these master regulators leads to depletion of both male and female reproductive tissues^{5,6,8,9,10,11,12}. The presence of these genes with crucial roles in sexual development on chromosome 1 suggests that chromosome 1 integrity is important for the maintenance of sexual reproduction. Indeed, the asexual lineage of S. mediterranea, which is devoid of any reproductive organs²³, had a translocation from chromosome 1 to chromosome 3 and probably evolved through loss of function of one or more of these genes.

**Fig. 4: Acquisition and haplotype-specific expression of sex-related genes.**

Chromosome 1 also contained three of the four known master regulators of male or female reproductive tissues (CPEB-1 (ref. ¹²), onecut¹³ and Bic-C⁵; dmd-1 (ref. ³⁶) was the exception) (Fig. 4a). The CPEB-1 gene is specifically required for the development of oocytes and yolk glands¹². Loss of onecut and Bic-C expression leads to a ‘no-testes’ phenotype without affecting the ovary^5,13. The presence of a female-determining gene on a chromosome that does not recombine provides an ideal foundation for the evolution of a sex chromosome. A loss-of-function mutation in the gene on one of the two homologous chromosomes would turn the chromosome with the mutation into a male-determining chromosome. Similar logic applies to a male-determining gene.

Haplotype-specific expression of sex-specific genes is a predicted signature of a sex-primed autosome. To test this prediction, we examined the expression of 20 genes in sexually mature adult worms and in the male copulatory organ, the penis papilla⁷, in a J/V line. The 20 genes were chosen because they were characterized as having sex-specific function and/or expression in the literature and include the 8 master regulators mentioned above^5,12 (Supplementary Table 9). Of these 20 genes, 13 contained at least one heterozygous variant in the coding sequence, which allowed us to determine whether these genes showed biased expression from the J or V haplotype (Extended Data Fig. 8). We found that a gelsolin-like protein¹², Smed-glp, was expressed predominantly from the J haplotype in both whole worms and the penis papilla, with more haplotype-specific expression in the male copulatory organ. Four other genes showed biased expression from the J haplotype in whole worms (KY847529.1 and KY847536.1) or in the penis papilla (KY847539.1 and CPEB-2), and two additional genes showed biased expression from the V haplotype in the male copulatory organ (KY847532.1) or in whole worms (Smed-MSY4) (Fig. 4b and Supplementary Table 9). All seven of these genes were localized within the inversions or close to inversion breakpoints (Supplementary Table 5).

We used FISH with hybridization chain reaction to confirm that the expression of Smed-glp is highly enriched in both the penis papilla and the bursa canal^7,37 (Extended Data Fig. 9), which is a female organ used to receive sperm from mating partners. To examine the function of Smed-glp, we used RNA interference (RNAi)^14,15 to knock down its expression in young hatchlings. After 8 weeks of feeding, both the knockdown and control hatchlings reached sexual maturity. Notably, the bursa canal was much more pronounced in the knockdown animals (Fig. 4c, d). No obvious morphological differences were observed in the male copulatory organ. These results suggest that the J allele of Smed-glp may have a greater role in preventing over-development of the female bursa canal (Fig. 4e), probably by controlling the number of organ-specific muscle fibres³⁸. This is consistent with a previous observation that J/J individuals show higher egg production, suggesting a larger contribution to female reproduction (Fig. 4b in ref. ²⁴; Supplementary Information).

We have thus identified a chromosome in the hermaphrodite planarian S. mediterranea that does not recombine, is enriched in master regulators of reproductive systems, shows allele-specific expression of sex-related genes, and contains genes orthologous to those on the sex chromosome of Schistosoma mansoni (Supplementary Tables 9 and 10, and Supplementary Information), which is the only known Platyhelminthes species with differentiated sex chromosomes³⁹. These observations collectively led us to propose that chromosome 1 of S. mediterranea is primed for evolution into a sex chromosome.

Chromosome 1 of S. mediterranea marks an intriguing system for studying sex chromosome evolution⁴⁰. Planarians from the islands of Sardinia and Corsica are heterozygous for the J and V haplotypes, whereas those in Sicily and Tunisia exist as J/J homozygotes. In laboratory crosses involving J/V lines, all hatchlings were J/V heterozygotes²⁴. No V/V planarians have been identified either in nature or in laboratory crosses. We genotyped single zygotes from crosses between J/V individuals and determined that homozygous zygotes (J/J or V/V) exist (Extended Data Fig. 10a and Supplementary Information). We propose that early embryonic lethality leads to the loss of J/J and V/V adults in both nature and laboratory crosses. Such lethality may arise from degeneration of coding sequences on the J and V haplotypes as a consequence of crossover suppression. Indeed, chromosome 1 has an elevated rate of transposable elements and mutations introducing stop codons relative to the rest of the genome (Supplementary Table 11) as well as higher nonsynonymous substitution rates (Supplementary Table 13 and Extended Data Fig. 10b). As a consequence, the J and V haplotypes each carry unique sets of functional genes that are silenced or truncated on the other haplotype (Supplementary Table 11).

The presence of three nested inversions on chromosome 1 suggests the possibility that recombination suppression on this chromosome may have evolved in a stepwise manner similar to that reported for the human X and Y chromosomes¹⁷. In support of this possibility, we observed evidence for three evolutionary strata corresponding to the inverted regions. We used PacBio genome sequencing data²⁶ to identify long reads that bridged the inversion breakpoints, and we validated the three inversions identified by Hi-C (Supplementary Table 12 and Supplementary Information). We observed that the rates of heterozygous sites and synonymous substitutions were elevated within the three inversions in a pattern consistent with the evolutionary strata (Fig. 2a, Supplementary Tables 5 and 13, and Extended Data Fig. 10b, c). We used deep whole-genome sequencing data of individuals from laboratory crosses²⁴ to estimate the de novo mutation rate in sexual planarians to be approximately 1.0 × 10⁻⁸ mutations per nucleotide per generation. On the basis of this estimate, it is likely that inversions 3, 1 and 2 evolved approximately 450,000, 320,000 and 260,000 generations ago, respectively.

Although it is not possible to know how chromosome 1 will evolve, our findings provide a snapshot of an incipient stage that supports the hypothesis that sex chromosomes evolve from homologous autosomes that acquire sex-specific roles and cease to recombine^16,17,18,19. The locked J/V heterozygous system may facilitate this process by maintaining sex-specific alleles in the planarian population before the evolution of dioecy⁴¹. We propose that the planarian chromosome 1 haplotypes provide a unique opportunity to directly examine the molecular characteristics of a sex-primed autosome.

Methods

Planarian husbandry, RNAi and phenotyping

Sexual planarians were fed organic beef liver once a week. All animals used for experiments were selected randomly. For RNAi, hatchlings around 2 weeks old from strain S2F10 were used. RNAi food was prepared by mixing 1 µg of double-stranded RNA with 10 µl of liver paste¹⁵. To examine the RNAi phenotypes, photographs were taken with a stereomicroscope (Zeiss) when the animals were freely swimming. The lengths of the worms or bursa canal were measured with ImageJ software by multiple researchers in a double-blind experimental design. The ratio of the bursa canal to the whole worm was used to mitigate potential variations in worm size and their degree of relaxation caused by their soft bodies.

Chromosome sequencing

Chromosomes were collected from multiple animals of one clonal line, S2, maintained in the laboratory by amputation and regeneration. Chromosome spreads were prepared on nuclease-free membrane slides (Zeiss) according to a previously developed protocol except that, at the last step, the tissues were dissociated into single nuclei and placed onto the slides without squashing with a coverslip⁴². Single chromosomes were identified under a ×40 lens and were collected into the caps of single PCR tubes by PALM MicroBeam laser microdissection (Zeiss). The collected chromosomes were spun down with a tabletop centrifuge in 4 µl PBS, and the DNA in the pellets was amplified with a REPLI-g Single Cell Kit (Qiagen) for sequencing on a MiSeq or HiSeq 3000 sequencing system (Illumina).

Chromosome-scale genome assembly

A Hi-C sequencing library was prepared from multiple animals of the S2 strain using the enzyme DpnII. Sequenced reads were aligned to dd_Smes_g4.fasta²⁶ with bwa mem (version 0.7.17)⁴³. An assembly file was prepared from the SALSA³ output FINAL.fasta with juicebox_scripts (Phase Genomics). The .hic file was prepared by run-assembly-visualizer.sh from three-dimensional de novo assembly⁴⁴. The two files were loaded into Juicebox⁴ for scaffold manipulation using split, merge, order and orient commands and for chromosome assembly. The modified assembly file (chromosome-scale) was converted to fasta (Smed_chr_ref_v1) with juicebox_assembly_converter.py.

Hi-C library construction

A Hi-C library was generated using Phase Genomics Proximo Animal Kit version 3.0. Approximately four worms were finely chopped and were then cross-linked for 20 min at room temperature with end-over-end mixing in 1 ml of Proximo cross-linking solution. The cross-linking reaction was terminated with quenching solution for 15 min at room temperature, again with end-over-end mixing. The quenched tissue was rinsed once with 1× chromatin rinse buffer (CRB), transferred to a liquid nitrogen-cooled mortar and ground to a fine powder. The powder was resuspended in 700 µl Proximo lysis buffer 1 and lysed with glass beads for 20 min at room temperature on a vortex mixer. A low-speed spin was used to clear the large debris, and the chromatin-containing supernatant was transferred to a new tube. After a second spin at higher speed, the supernatant was removed, and the pellet containing the nuclear fraction of the lysate was washed with 1× CRB. After removal of the 1× CRB wash, the pellet was resuspended in 100 µl Proximo lysis buffer 2 and incubated at 65 °C for 15 min. Chromatin was bound to the recovery beads for 10 min at room temperature. The beads were placed on a magnetic stand and washed with 200 µl of 1× CRB.

The chromatin bound on the beads was resuspended in 150 µl Proximo fragmentation buffer, and 2.5 µl of Proximo fragmentation enzyme was added. The reaction was incubated for 1 h at 37 °C, cooled to 12 °C and then incubated with 2.5 µl of finishing enzyme for 30 min. After the addition of 6 µl of Stop Solution, the beads were washed with 1× CRB and were resuspended in 100 µl of Proximo ligation buffer supplemented with 5 µl of proximity ligation enzyme. The proximity ligation reaction was incubated at room temperature for 4 h with end-over-end mixing. To this, 5 µl of reverse cross-linking enzyme was added, and the reaction was incubated at 65 °C for 1 h.

After reversing the cross-links, the free DNA was purified with recovery beads, and the Hi-C junctions were bound to streptavidin beads and washed to remove unbound DNA. The washed beads were used to prepare paired-end deep sequencing libraries using Proximo library preparation reagents.

Oocyte and sperm cell sequencing

Sperm cells were released from sexually mature S2 strain animals into calcium- and magnesium-free buffer (1% BSA). The cell dissociation solution was placed onto a slide and examined under a phase-contrast microscope to identify single sperm cells. Oocytes were released from egg capsules. Single sperm cells or oocytes were transferred into single PCR tubes for amplification with a REPLI-g Single Cell Kit (Qiagen). RADseq libraries and whole-genome libraries were prepared and sequenced on a HiSeq 3000 or NovaSeq S2 sequencing system (Illumina). RADseq sequencing data were analysed as described in the linkage map section. Whole-genome sequencing data were analysed as described in the recombination section.

Synteny analysis

For candidate genes, the protein sequences of genes of interest were obtained from Schmidtea specimens. Such genes were aligned to S. mansoni genomes³⁹ using the Protein to Nucleotide BLAST (tblastn) tool. To systematically examine the synteny of the whole genome, we used SonicParanoid (version 1.3.8)⁴⁵ to identify one-to-one protein orthologues between S. mediterranea and S. mansoni. The S. mansoni genome assembly (V9) and its protein annotations are available at https://zenodo.org/record/5149023#.Ybk9jn3MK3I.

Linkage map

A J/J line, D5, was crossed to a J/V line, S2F8b, to build an F₂ population of 93 animals. Genomic DNA was extracted from a fragment of each animal using an Easy-DNA gDNA Purification Kit (K180001, ThermoFisher). Sequencing libraries for RADseq were prepared according to the procedures of Adapterama III³¹ with a few modifications. Genetic variants were identified using Stacks (version 2.41)^46,47]. All variants were filtered with VCFtools (version 0.1.14)⁴⁸ to remove insertions and deletions and to select biallelic SNVs. Clusters of markers located within 200 bp were removed because they were likely to correspond to repetitive elements. Markers disobeying Mendelian segregation were also removed. Only markers that were homozygous in both parents were used. The linkage map was built with R/QTL⁴⁹.

Quantifying recombination

Sequencing reads from S2 and its oocytes and sperm cells were aligned to Smed_chr_ref_v1 with bwa mem (version 0.7.17). Genetic variants were jointly called by using the Genome Analysis Toolkit (GATK, version 4.1.4.1) with GenomicsDB and GenotypeGVCFs⁵⁰. Biallelic heterozygous markers in S2 were further filtered by removing abnormal markers, including those with no segregation of two alleles in the gametes, clusters of markers in close proximity (<200 bp apart) and markers of heterozygosity in sperm cells. The J and V haplotypes were manually phased for oocytes without recombination and were phased by MPR.genotyping⁵¹ for all gametes. The MPR.genotyping package was also used to impute or correct missing or erroneous genotypes. The final genotype bins were used to identify and visualize recombination with customized R code. Quantification of recombination was focused on crossovers between long tracks of haplotypes along a chromosome. Putative gene conversion events such as short tracks of haplotype switches encompassing <1% of the chromosome length were not included.

Gene expression analysis

To examine the gene content on chromosome 1, transcriptome data were downloaded from the NCBI Sequence Read Archive (SRA). To examine genes related to sexual development, transcriptomes from sexual adults^9,12,52, sexual juveniles⁹ and sexual adults with nhr-1 RNAi⁹ were used. To examine stem cell-enriched genes, transcriptomes from sorted X1 cells and CIW4 were used^53,54.

All sequencing data were aligned to dd_Smed_v6 (ref. ²⁷) with bwa mem (version 0.7.17). Differential gene expression was analysed with DESeq2 (version 1.26.0)⁵⁵. Expression was quantified at the transcript level with kallisto (version 0.44.0)⁵⁶ and was imported and summarized to gene-level count matrices by tximport⁵⁷.

Haplotype-specific expression

To examine the haplotype-specific expression of critical regulators of the reproductive system, mRNA was extracted from six sexually mature animals of the S2 J/V line and was analysed as three biological replicates, with two animals pooled into each replicate. Nine penis papillae were dissected from nine sexual adult animals of the same line and were analysed as three biological replicates, with penis papillae from three animals pooled into each replicate. All mRNA samples were extracted on the same day and were processed at the same time for library preparation and sequencing to minimize technical variation. Libraries for RNA-seq were prepared with a Clontech SMARTer Stranded Total RNA-seq (Pico) Kit. The workflow consisted of converting total RNA to cDNA and then adding adaptors for Illumina sequencing through PCR. The PCR products were purified, and ribosomal DNA was depleted. The cDNA fragments were further amplified with primers universal to all libraries. Lastly, the PCR products were purified again to yield the final cDNA library. Different adaptors were used to multiplex samples in one lane. Sequencing was performed on an Illumina NovaSeq 6000 with a 150-bp paired-end-read run. Data quality checks were conducted using the Illumina Sequencing Analysis Viewer. Demultiplexing was performed using Illumina Bcl2fastq2 version 2.17.

All sequencing data were aligned to dd_Smed_v6 (ref. ²⁷) with bwa mem (version 0.7.17). To ensure accuracy, haplotype-specific expression of the 14 genes of interest was manually examined using the Integrative Genomics Viewer. J or V allele counts were identified for each biallelic variant in the exons. For a particular gene, the allele counts were aggregated from all variants on all exons for the J and V haplotypes. The allele counts were then subjected to binomial testing and Bonferroni correction⁵⁸ to determine whether the observed allele bias was statistically significant. P > 1 after Bonferroni correction was set to 1.

De novo map, parameter optimization and phylogenetic inference

All datasets were run through the de novo pipeline as implemented in Stacks version 2.52 (refs. ^46,47). First, paired-end reads were demultiplexed and filtered for quality using PROCESS_RADTAGS and were truncated to a length of 135 bp. Individual reads with Phred scores below 30 or uncalled bases were discarded (96.9% of reads passed the quality filters). Optimal parameters were identified following the guidelines of ref. ⁵⁹ by running multiple iterations of the de novo pipeline and varying just one parameter with each new iteration on a subset of 13 samples from the same population, [Sic_mar], following recommendations⁶⁰. We varied the minimum stack depth (-m) between 1 and 6 (m1–m6), the number of mismatches allowed between stacks (-M) between 1 and 10 (M1–M10), and the number of mismatches allowed to merge catalogue loci (-n) while keeping all other parameters constant (m3, M2 and n0). We then compared the number of polymorphic assembled loci across samples using a sample representation cut-off of 80% (r80) and gain or loss of polymorphic loci with each new iteration. Once -m and -M were optimized, we assessed -n by evaluating the change in the number of polymorphic loci for n = M − 1, n = M and n = M + 1. RAD loci were then assembled using the denovo_map.pl wrapper in Stacks, and the following parameters were set: m = 3, M = 2 and n = 3.

Assembled loci that were present in 75% of all individuals were kept from POPULATIONS, a minor allele frequency (MAF) filter of 0.04 (--min-maf) was used to filter out singleton SNPs that could mask population structure and a maximum observed heterozygosity (--max-obs-het) filter of 0.99 was used to remove potentially paralogous loci⁶¹. Additionally, to build the phylogenetic tree, we concatenated all RADseq loci after filtering (--phylip-var-all). Phylogenetic trees were built by maximum likelihood using RAxML-NG version 0.9.0 (ref. ⁶²), starting from a random seed and applying a GTR+G substitution model and 1,000 bootstrap replicates. The sample from Menorca (Sme7-5_men) was used as an outgroup to root the tree.

Population structure analysis

For this analysis, we excluded the outgroup sequence and ran POPULATIONS to retain loci present in all populations (-p 10) and 75% of the individuals present in each population (-r 0.75). On the basis of the loci that passed our filtering criteria, a random whitelist of 1,000 loci was generated and again run through POPULATIONS with the same criteria but retaining the first SNP at each locus (--write-single-snp). The output was exported in Structure format, and Structure version 2.3.4 (ref. ⁶³) was used to infer population structure with 10,000 chains as burn-in and 100,000 MCMC chains with 20 iterations for K = [1–11]. The resulting files were run through Structure Harvester⁶⁴, and the optimal K was determined⁶⁵.

DNA FISH

For telomere FISH on oocyte chromosomes, ovaries were dissected as reported previously^66,67. Hybridization was carried out as reported with chromosome spreads⁴² except that the ovaries were kept in suspension in washing buffer or hybridization buffer. Before hybridization, the dissected ovaries were treated with a digestion buffer (0.1% SDS and 10 µg ml⁻¹ proteinase K (Qiagen) in 0.3% Triton X-100 in PBS) for 10 min at room temperature. The repeat located near centromeres in Fig. 2c had the following sequence: TCTGGACGGAAATTTTTTAATCTTTATAGGCTTGTATCTCTGTCAATTTTTATTTGTTTTCATAATCTTTGATATATTTCTCGATAACTTTTGATTCTCTACATGATAGCATTTTAAAAATTGCAAAAATCATAACGGGCTCGTCAAACACAAGTCAT.

Hybridization chain reaction and RNA FISH

To examine tissue expression of the genes glp and smedwi-1, we used probes and buffers for third-generation ISH chain reaction purchased from Molecular Instruments. Sexually mature planarians were treated with 7.5% N-acetyl-l-cysteine (Sigma-Aldrich) for 10 min and were then fixed in 4% paraformaldehyde (Electron Microscopy Sciences; 16% solution diluted 1:4 in PBS) for 20 min. The copulatory apparatus was dissected into a 1.5-ml RNase-free tube. The rest of the procedures followed the hybridization chain reaction RNA FISH protocol of Molecular Instruments for whole-mount mouse embryos.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Data availability

The authors confirm that all data underlying the findings are fully available without restriction. All sequencing data are available from the NCBI SRA database (accession number PRJNA731187). The chromosome-scale genome assemblies for sexual S. mediterranea (including phased genomes and genomes with repetitive elements) are openly available on the Planosphere (https://planosphere.stowers.org/), PlanMine (https://planmine.mpibpc.mpg.de/planmine/begin.do), GenBank (GCA_022537955.1) and Zenodo (https://doi.org/10.5281/zenodo.5807415) databases. We used publicly available NCBI PacBio sequencing data (accession numbers SRX2700681–SRX2700684) and planarian transcriptome data (accession numbers SRR2658118–SRR2658125, SRR2658134–SRR2658141, SRR3473955–SRR3473957, SRR3629945–SRR3629952, SRR6351185–SRR6351188, SRR6351201–SRR6351204, SRR6351213–SRR6351216, SRR6363910–SRR6363927 and SRR6364586–SRR6364588).

Code availability

Custom code for data analysis can be openly accessed on the Zenodo database at https://doi.org/10.5281/zenodo.5807415.

References

Lazaro, E. M. et al. Schmidtea mediterranea phylogeography: an old species surviving on a few Mediterranean islands? BMC Evol. Biol. 11, 274 (2011).
Article PubMed PubMed Central Google Scholar
Benazzi, M., Baguná, J., Ballester, R., Puccinelli, I. & Papa, R. D. Further contribution to the taxonomy of the «Dugesia Lugubris-Polychroa Group» with description of Dugesia Mediterranea N.S.P. (Tricladida, Paludicola). Boll. Zool. 42, 81–89 (1975).
Article Google Scholar
Burton, J. N. et al. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat. Biotechnol. 31, 1119–1125 (2013).
Article CAS PubMed PubMed Central Google Scholar
Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 3, 99–101 (2016).
Article CAS PubMed PubMed Central Google Scholar
Wang, Y., Zayas, R. M., Guo, T. & Newmark, P. A. Nanos function is essential for development and regeneration of planarian germ cells. Proc. Natl Acad. Sci. USA 104, 5901–5906 (2007).
Article ADS CAS PubMed PubMed Central Google Scholar
Collins, J. J. III et al. Genome-wide analyses reveal a role for peptide hormones in planarian germline development. PLoS Biol. 8, e1000509 (2010).
Article PubMed PubMed Central Google Scholar
Chong, T., Stary, J. M., Wang, Y. & Newmark, P. A. Molecular markers to characterize the hermaphroditic reproductive system of the planarian Schmidtea mediterranea. BMC Dev. Biol. 11, 69 (2011).
Article PubMed PubMed Central Google Scholar
Tharp, M. E., Collins, J. J. III & Newmark, P. A. A lophotrochozoan-specific nuclear hormone receptor is required for reproductive system development in the planarian. Dev. Biol. 396, 150–157 (2014).
Article CAS PubMed PubMed Central Google Scholar
Zhang, S. et al. A nuclear hormone receptor and lipid metabolism axis are required for the maintenance and regeneration of reproductive organs. Preprint at bioRxiv https://doi.org/10.1101/279364 (2018).
Iyer, H., Issigonis, M., Sharma, P. P., Extavour, C. G. & Newmark, P. A. A premeiotic function for boule in the planarian Schmidtea mediterranea. Proc. Natl Acad. Sci. USA 113, E3509–E3518 (2016).
Article CAS PubMed PubMed Central Google Scholar
Saberi, A., Jamal, A., Beets, I., Schoofs, L. & Newmark, P. A. GPCRs direct germline development and somatic gonad function in planarians. PLoS Biol. 14, e1002457 (2016).
Article PubMed PubMed Central Google Scholar
Rouhana, L., Tasaki, J., Saberi, A. & Newmark, P. A. Genetic dissection of the planarian reproductive system through characterization of Schmidtea mediterranea CPEB homologs. Dev. Biol. 426, 43–55 (2017).
Article CAS PubMed PubMed Central Google Scholar
Li, P. et al. Single-cell analysis of Schistosoma mansoni identifies a conserved genetic program controlling germline stem cell fate. Nat. Commun. 12, 485 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Newmark, P. A., Reddien, P. W., Cebrià, F. & Alvarado, A. S. Ingestion of bacterially expressed double-stranded RNA inhibits gene expression in planarians. Proc. Natl Acad. Sci. USA 100, 11861–11865 (2003).
Article ADS CAS PubMed PubMed Central Google Scholar
Rouhana, L. et al. RNA interference by feeding in vitro-synthesized double-stranded RNA to planarians: methodology and dynamics. Dev. Dynam. 242, 718–730 (2013).
Article CAS Google Scholar
Bachtrog, D. A dynamic view of sex chromosome evolution. Curr. Opin. Genet. Dev. 16, 578–585 (2006).
Article CAS PubMed Google Scholar
Lahn, B. T. & Page, D. C. Four evolutionary strata on the human X chromosome. Science 286, 964–967 (1999).
Article CAS PubMed Google Scholar
Rice, W. R. Evolution of the Y sex chromosome in animals. BioScience 46, 331–343 (1996).
Article Google Scholar
Charlesworth, B. The evolution of sex chromosomes. Science 251, 1030–1033 (1991).
Article ADS CAS PubMed Google Scholar
Muller, H. J. Genetic variability, twin hybrids and constant hybrids, in a case of balanced lethal factors. Genetics 3, 422–499 (1918).
Article CAS PubMed PubMed Central Google Scholar
Charlesworth, D. Evolution of recombination rates between sex chromosomes. Philos. Trans. R. Soc. Lond. B 372, 20160456 (2017).
Article Google Scholar
Reddien, P. W. & Sanchez Alvarado, A. Fundamentals of planarian regeneration. Annu. Rev. Cell Dev. Biol. 20, 725–757 (2004).
Article CAS PubMed Google Scholar
Newmark, P. A. & Sanchez Alvarado, A. Not your father’s planarian: a classic model enters the era of functional genomics. Nat. Rev. Genet. 3, 210–219 (2002).
Article CAS PubMed Google Scholar
Guo, L., Zhang, S., Rubinstein, B., Ross, E. & Alvarado, A. S. Widespread maintenance of genome heterozygosity in Schmidtea mediterranea. Nat. Ecol. Evol. 1, 19 (2016).
Article PubMed Google Scholar
Zayas, R. M. et al. The planarian Schmidtea mediterranea as a model for epigenetic germ cell specification: analysis of ESTs from the hermaphroditic strain. Proc. Natl Acad. Sci. USA 102, 18491–18496 (2005).
Article ADS CAS PubMed PubMed Central Google Scholar
Grohme, M. A. et al. The genome of Schmidtea mediterranea and the evolution of core cellular mechanisms. Nature 554, 56–61 (2018).
Article ADS CAS PubMed PubMed Central Google Scholar
Brandl, H. et al. PlanMine—a mineable resource of planarian biology and biodiversity. Nucleic Acids Res. 44, D764–D773 (2016).
Article CAS PubMed Google Scholar
Robb, S. M., Ross, E. & Sanchez Alvarado, A. SmedGD: the Schmidtea mediterranea genome database. Nucleic Acids Res. 36, D599–D606 (2008).
Article CAS PubMed Google Scholar
Weise, A. et al. High-throughput sequencing of microdissected chromosomal regions. Eur. J. Hum. Genet. 18, 457–462 (2010).
Article PubMed Google Scholar
Ma, L. et al. Direct determination of molecular haplotypes by chromosome microdissection. Nat. Methods 7, 299–301 (2010).
Article CAS PubMed PubMed Central Google Scholar
Bayona-Vasquez, N. J. et al. Adapterama III: quadruple-indexed, double/triple-enzyme RADseq libraries (2RAD/3RAD). PeerJ 7, e7724 (2019).
Article PubMed PubMed Central Google Scholar
Dobzhansky, T. & Epling, C. The suppression of crossing over in inversion heterozygotes of Drosophila pseudoobscura. Proc. Natl Acad. Sci. USA 34, 137–141 (1948).
Article ADS CAS PubMed PubMed Central Google Scholar
Miller, D. E. et al. The molecular and genetic characterization of second chromosome balancers in Drosophila melanogaster. G3 8, 1161–1171 (2018).
Article CAS PubMed PubMed Central Google Scholar
Sun, Y., Svedberg, J., Hiltunen, M., Corcoran, P. & Johannesson, H. Large-scale suppression of recombination predates genomic rearrangements in Neurospora tetrasperma. Nat. Commun. 8, 1140 (2017).
Article ADS PubMed PubMed Central Google Scholar
Zheng, X. et al. A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics 28, 3326–3328 (2012).
Article CAS PubMed PubMed Central Google Scholar
Chong, T., Collins, J. J. III, Brubacher, J. L., Zarkower, D. & Newmark, P. A. A sex-specific transcription factor controls male identity in a simultaneous hermaphrodite. Nat. Commun. 4, 1814 (2013).
Article ADS PubMed Google Scholar
Hyman, L. H. in The Invertebrates, vol. II (ed. Boell, E. J.) 52–458 (McGraw-Hill Book Company, 1951).
Bertin, B. et al. Gelsolin and dCryAB act downstream of muscle identity genes and contribute to preventing muscle splitting and branching in Drosophila. Sci. Rep. 11, 13197 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Buddenborg, S. et al. Assembled chromosomes of the blood fluke Schistosoma mansoni provide insight into the evolution of its ZW sex-determination system. Preprint at bioRxiv https://doi.org/10.1101/2021.08.13.456314 (2021).
Charlesworth, B. & Charlesworth, D. A model for the evolution of dioecy and gynodioecy. Am. Nat. 112, 975–997 (1978).
Article Google Scholar
Charlesworth, D. & David, S. in Sex Determination in Plants, 1st ed. (ed. Ainsworth, C. C.) 25–50 (Garland Science, 1999).
Guo, L. et al. An adaptable chromosome preparation methodology for use in invertebrate research organisms. BMC Biol. 16, 25 (2018).
Article PubMed PubMed Central Google Scholar
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Article CAS PubMed PubMed Central Google Scholar
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).
Article ADS CAS PubMed PubMed Central Google Scholar
Cosentino, S. & Iwasaki, W. SonicParanoid: fast, accurate and easy orthology inference. Bioinformatics 35, 149–151 (2019).
Article CAS PubMed Google Scholar
Catchen, J., Hohenlohe, P. A., Bassham, S., Amores, A. & Cresko, W. A. Stacks: an analysis tool set for population genomics. Mol. Ecol. 22, 3124–3140 (2013).
Article PubMed PubMed Central Google Scholar
Catchen, J. M., Amores, A., Hohenlohe, P., Cresko, W. & Postlethwait, J. H. Stacks: building and genotyping loci de novo from short-read sequences. G3 1, 171–182 (2011).
Article CAS PubMed PubMed Central Google Scholar
Danecek, P. et al. The variant call format and VCF tools. Bioinformatics 27, 2156–2158 (2011).
Article CAS PubMed PubMed Central Google Scholar
Broman, K. W., Wu, H., Sen, S. & Churchill, G. A. R/qtl: QTL mapping in experimental crosses. Bioinformatics 19, 889–890 (2003).
Article CAS PubMed Google Scholar
McKenna, A. et al. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
Article CAS PubMed PubMed Central Google Scholar
Xie, W. et al. Parent-independent genotyping for constructing an ultrahigh-density linkage map based on population sequencing. Proc. Natl Acad. Sci. USA 107, 10578–10583 (2010).
Article ADS CAS PubMed PubMed Central Google Scholar
Davies, E. L. et al. Embryonic origin of adult stem cells required for tissue homeostasis and regeneration. eLife 6, e21052 (2017).
Article PubMed PubMed Central Google Scholar
Zeng, A. et al. Prospectively isolated tetraspanin⁺ neoblasts are adult pluripotent stem cells underlying planaria regeneration. Cell 173, 1593–1608 (2018).
Article CAS PubMed Google Scholar
Duncan, E. M., Chitsazan, A. D., Seidel, C. W. & Sanchez Alvarado, A. Set1 and MLL1/2 target distinct sets of functionally different genomic loci in vivo. Cell Rep. 13, 2741–2755 (2015).
Article CAS PubMed PubMed Central Google Scholar
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Article PubMed PubMed Central Google Scholar
Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).
Article CAS PubMed Google Scholar
Soneson, C., Love, M. I. & Robinson, M. D. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res 4, 1521 (2015).
Article PubMed Google Scholar
Etymologia: Bonferroni correction. Emerg. Infect. Dis. 21, 289 (2015).
Paris, J. R., Stevens, J. R. & Catchen, J. M. Lost in parameter space: a road map for STACKS. Methods Ecol. Evol. 8, 1360–1373 (2017).
Article Google Scholar
Rochette, N. C. & Catchen, J. M. Deriving genotypes from RAD-seq short-read data using Stacks. Nat. Protoc. 12, 2640–2659 (2017).
Article CAS PubMed Google Scholar
Stobie, C. S., Oosthuizen, C. J., Cunningham, M. J. & Bloomer, P. Exploring the phylogeography of a hexaploid freshwater fish by RAD sequencing. Ecol. Evol. 8, 2326–2342 (2018).
Article PubMed PubMed Central Google Scholar
Kozlov, A. M., Darriba, D., Flouri, T., Morel, B. & Stamatakis, A. RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics 35, 4453–4455 (2019).
Article CAS PubMed PubMed Central Google Scholar
Pritchard, J. K., Stephens, M. & Donnelly, P. Inference of population structure using multilocus genotype data. Genetics 155, 945–959 (2000).
Article CAS PubMed PubMed Central Google Scholar
Earl, D. A. & vonHoldt, B. M. Structure Harvester: a website and program for visualizing Structure output and implementing the Evanno method. Conserv. Genet. Resour. 4, 359–361 (2012).
Article Google Scholar
Evanno, G., Regnaut, S. & Goudet, J. Detecting the number of clusters of individuals using the software Structure: a simulation study. Mol. Ecol. 14, 2611–2620 (2005).
Article CAS PubMed Google Scholar
Guo, L. et al. Subcellular analyses of planarian meiosis implicates a novel, double-membraned vesiculation process in nuclear envelope breakdown. Preprint at bioRxiv https://doi.org/10.1101/620609 (2019).
Guo, F. et al. Planarian ovary dissection for ultrastructural analysis and antibody staining. J. Vis. Exp. 175, e62713 (2021).
Google Scholar
Ouellette, L. A., Reid, R. W., Blanchard, S. G. & Brouwer, C. R. LinkageMapView—rendering high-resolution linkage and QTL maps. Bioinformatics 34, 306–307 (2018).
Article CAS PubMed Google Scholar

Download references

Acknowledgements

We thank L. Rouhana, G. Bruni and S. Zdraljevic for discussions; Z. Kashif, E. Belfer and B. Pakfar for planarian husbandry; and X. Li and the UCLA Technology Center for Genomics & Bioinformatics for next-generation sequencing. We thank Y. Zeng, E. Noskova and D. Pasha for discussions on demographic history. We acknowledge Life Science Editors for professional editing services. Confocal laser-scanning microscopy was performed at the Advanced Light Microscopy/Spectroscopy Laboratory and the Leica Microsystems Center of Excellence at the California Nano Systems Institute at UCLA with funding from NIH Shared Instrumentation grant S10OD025017 and NSF Major Research Instrumentation grant CHE-0722519. This work was supported by grants from the Howard Hughes Medical Institute (L.K.) and the Helen Hay Whitney Foundation (L.G.).

Author information

Authors and Affiliations

Department of Human Genetics and Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA, USA
Longhua Guo, Joshua S. Bloom, James Boocock, Olga T. Schubert, Kaiya Kozuma, Katarina Ho, Emily Warda, Clarice Chui, Daniel Leighton, Tzitziki Lemus Vergara & Leonid Kruglyak
Howard Hughes Medical Institute, Chevy Chase, MD, USA
Longhua Guo, Joshua S. Bloom, James Boocock, Olga T. Schubert, Daniel Leighton, Tzitziki Lemus Vergara, Alejandro Sánchez Alvarado & Leonid Kruglyak
Departament de Genètica, Microbiologia i Estadística, Institut de Recerca de la Biodiversitat, Universitat de Barcelona, Barcelona, Spain
Daniel Dols-Serrate & Marta Riutort
Department of Biochemistry and Molecular Biology, Institute for Medical Research Israel-Canada, Hebrew University of Jerusalem–Hadassah Medical School, Jerusalem, Israel
Eyal Ben-David
Institute of Reproductive Medicine, Henan Provincial People’s Hospital, Zhengzhou University, Zhengzhou, China
Yubao Wei
Stowers Institute for Medical Research, Kansas City, MO, USA
Alejandro Sánchez Alvarado

Authors

Longhua Guo
View author publications
You can also search for this author in PubMed Google Scholar
Joshua S. Bloom
View author publications
You can also search for this author in PubMed Google Scholar
Daniel Dols-Serrate
View author publications
You can also search for this author in PubMed Google Scholar
James Boocock
View author publications
You can also search for this author in PubMed Google Scholar
Eyal Ben-David
View author publications
You can also search for this author in PubMed Google Scholar
Olga T. Schubert
View author publications
You can also search for this author in PubMed Google Scholar
Kaiya Kozuma
View author publications
You can also search for this author in PubMed Google Scholar
Katarina Ho
View author publications
You can also search for this author in PubMed Google Scholar
Emily Warda
View author publications
You can also search for this author in PubMed Google Scholar
Clarice Chui
View author publications
You can also search for this author in PubMed Google Scholar
Yubao Wei
View author publications
You can also search for this author in PubMed Google Scholar
Daniel Leighton
View author publications
You can also search for this author in PubMed Google Scholar
Tzitziki Lemus Vergara
View author publications
You can also search for this author in PubMed Google Scholar
Marta Riutort
View author publications
You can also search for this author in PubMed Google Scholar
Alejandro Sánchez Alvarado
View author publications
You can also search for this author in PubMed Google Scholar
Leonid Kruglyak
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

L.K. and L.G. conceptualized the study. L.G. coordinated the project, conducted experiments and wrote the original draft. L.G., E.B.-D. and T.V.L. developed methods for single-sperm sequencing. L.G., K.K., K.H., E.W., C.C. and Y.W. collected the data. L.G., J.S.B., D.D.-S., M.R., J.B. and D.L. analysed the data. K.K., K.H., E.W., C.C. and Y.W. maintained planarian colonies. L.K., L.G., O.T.S. and A.S.A. edited the manuscript. A.S.A. provided the D5 and S2 strains, and sequencing data of a dozen oocytes. M.R. provided wild isolate DNA. L.K. acquired funding and supervised the project.

Corresponding authors

Correspondence to Longhua Guo or Leonid Kruglyak.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Bo Wang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Chromosome assignment by ChrSeq.

(A-C) Chromosome sequencing depth of different scaffolds is consistent with linkage group assignment. Linkage groups: lg11 = chr1, lg1+lg4 = chr2, lg7 = chr3, lg17 = chr4. (D-F) Chromosome sequencing depth of different scaffolds is consistent with chromosome assignment in the genome assembly smed_chr_ref_v1. Chromosome sequencing samples: chromosome 1 (A, D); chromosome 2 (B, E); chromosome 3 (C, F).

Extended Data Fig. 2 Putative inversions in the planarian genome detected by chromatin contact heatmap.

(A) chromosome 1 chromatin contact heatmap showing 3 potential inversions (dashed-line rectangles) visualized in Juicebox. (B) A complex structural variant on chromosome 2. Green outline: dd_Smes_g4 scaffolds. Blue outline: chromosome boundaries.

Extended Data Fig. 3 Recombination in gametes and heterozygosity distribution in the wild on chromosomes 2 and 4.

(A-B) Number of recombinations in oocytes and sperm on chromosomes 2 and 4 summarized as a violin plot (A) and a box plot (B). The median number of recombinations on chromosome 2 is 3 for sperm and 2 for oocytes, derived from a total of 49 sperm and 38 oocytes. The median number of recombinations on chromosome 4 is 2 for sperm and 1 for oocytes, derived from a total of 50 sperm and 26 oocytes. Chr2 sperm: min = 0, max = 7, median = 3, first quartile = 2, third quartile = 4. Chr2 oocytes: min = 0, max = 7, median = 2, first quartile = 0, third quartile = 4; Chr4 sperm: min = 0, max = 7, median = 2, first quartile = 1, third quartile = 4; Chr4 oocytes: min = 0, max = 3, median = 1, first quartile = 1, third quartile = 2. (C-D) Very few sites are consistently heterozygous in more than 80% of the wild isolates on chromosomes 2 and 4 in Sicily (C) and in Sardinia (D).

Extended Data Fig. 4 Representative images of late prophase I oocytes with ring-shaped chromosome 1.

More than 100 oocytes from > 40 ovaries and > 20 sexual strains were examined for the presence of a ring-shaped chromosome 1. The experiment was repeated >10 times independently with similar results. Chromosome 1 always forms a ring in J/V strains in late prophaseI oocytes. Shown in the figure are chromosomes from 8 different oocytes atlate prophase I. Arrows point to the ring-shaped chromosome 1.

Extended Data Fig. 5 Phylogenetic tree of wild isolates of Schmidtea mediterranea.

The tree is rooted with one asexual animal from Menorca (Sme7-5). The tree on the left (A) includes two animals from SAR and SIC with higher genome heterozygosity than the cohorts from the same collection sites. Phylogenetic relations agree with the previous designation of eastern and western populations¹. SAR: Sardinia; COR: Corsica; SIC: Sicily; TUN: Tunisia.

Extended Data Fig. 6 Fixation index (F_ST) analysis of wild isolates.

Pairwise population F_ST for all collection sites from Sardinia (SAR), Corsica (COR), Menorca (MEN), Sicily (SIC), and Tunisia (TUN). The size and darkness of the blue circles (top right) correspond to the values of F_ST (bottom left).

Extended Data Fig. 7 STRUCTURE analysis of wild isolates.

Evanno analysis supports a K = 2 split, dividing the samples geographically into Eastern and Western groups. With K = 3, a Tunisian signal was found to be present in all other 3 groups (light blue).

Extended Data Fig. 8 Datasets for the examination of haplotype-specific expression.

The sequencing reads from genomes (J-haplotype oocytes, V-haplotype oocytes, diploid S2 worms) and transcriptomes (pp, ww from the diploid S2 worms) were aligned to Smed_chr_ref_v1. Shown in the figure are reads for the gene CPEB-2. Two variants with J and V alleles are marked by colored bars. In the genome of the adult worm S2 (i.e., DNA), both J and V alleles are present at equal ratios. In the genome of oocytes (J or V haplotype, i.e., DNA), the variants have only one allele. These observations validate the heterozygous variants. In the transcriptomes (i.e., RNA) of penis papillae (pp) and whole worms (ww) from S2 adults, both J and V alleles are present. In the three transcriptomes of pp, J alleles were more highly expressed. In the three transcriptomes of ww, the J:V allele ratios vary and do not significantly depart from equal expression.

Extended Data Fig. 9 Specific expression of smed-glp in the male reproductive organ (penis papillae) and female reproductive organ (bursa canal).

Expression of the glp gene (in red color) and the stem cell marker smedwi-1 (in green color) in the dissected copulatory apparatus was visualized by Fluorescence in situ hybridization chain reaction. (A) White dashed lines outline penis papillae and bursa canal. Nuclei were stained by DAPI (in blue color). No glp expression in tissues surrounding penis papillae or bursa canal. Less than 1% of glp+ cells express the stem cell marker smedwi-1. (B) Cells expressing the two markers do exist in both bursa canal and penis papillae. This experiment was repeated 3 times independently with 6 sexually mature worms, producing similar results.

Extended Data Fig. 10 Embryonic lethality and synonymous divergence in Schmidtea mediterranea.

(A) Two J/V lines with differential SNP markers on chromosome 3 were crossed. Zygotes were collected for genotyping. Heterozygous chromosome 3 markers validate the occurrence of fertilization. All three genotypes of chromosome 1 (J/J, J/V and V/V) were observed in zygotes. (B-C) Non-synonymous divergence (B) and Synonymous divergence (C) for all heterozygous sites in the J/V strain S2 and their distribution in different chromosomes summarized as boxplots. Chromosome 1 has the highest mean (red dots) and median dN or dS values. The min, max, median, mean, first quartile, and third quartile of dN (B) for each chromosome are as following: chr1 = (0, 0.0052, 0, 0.0017, 0, 0.0021), chr2 = (0, 0, 0, 0.000092, 0, 0), chr3 = (0, 0, 0, 0.00031, 0, 0), and chr4 = (0, 0, 0, 0.000088, 0, 0). The min, max, median, mean, first quartile, and third quartile of dS (C) for each chromosome are as following: chr1 = (0, 0.024, 0.0022, 0.006, 0, 0.0098), chr2 = (0, 0, 0, 0.00021, 0, 0), chr3 = (0, 0, 0, 0.00093, 0, 0), and chr4 = (0, 0, 0, 0.00026, 0, 0). (D) Synonymous divergence for all heterozygous sites in the J/V strain S2, and their distribution in the three chromosome 1 inversions and the rest of the genome, summarized as boxplots. Mean (red dots) and median dS decreases in the following order: inversion 3 (inv3), inversion 1 (inv1), inversion 2 (inv2), and the rest of the genome (Not INV). One way ANOVA and Turkey Honest Significant Differences tests showed that dS were significantly different between all 3 inversions (inv2-inv1: p.adj = 0.005; inv3-inv1: p.adj < 1e-06; inv3-inv2: p.adj < 1e-06). The min, max, median, mean, first quartile, and third quartile of dS for each inversion are as following: inversion 1 = (0, 0.026, 0.0034, 0.0064, 0, 0.01), inversion 2 = (0, 0.021, 0.0026, 0.0053, 0, 0.0083), inversion 3 = (0, 0.036, 0.0084, 0.0091, 0, 0.015), and non-inversion = (0, 0, 0, 0.0005, 0, 0). A total of 8317 genes from chromosome 1, 6568 genes from chromosome 2, 3394 genes from chromosome 3, and 1834 genes from chromosome 4 were plotted for non-synonymous divergence (B) and synonymous divergence (C). A total of 6633 genes from inversion 1, 754 genes from inversion 2, 298 genes from inversion 3, and 12428 genes from the rest of the genome were plotted for synonymous divergence (D).

Supplementary information

Supplementary Information

This file contains supplementary discussion plus methods.

Reporting Summary

Supplementary Table 1

Assignment of 481 scaffolds to chromosomes with 17 sequenced chromosomes. Chromosome sample information (ploidy, identity and potential contamination), sequencing coverage and the chromosome assignment for each of the 481 scaffolds from the assembly dd_Smes_g4 are given. For chromosome assignment, 0 represents scaffolds that could not be assigned to chromosomes by ChrSeq, 5 represents scaffolds that were assigned to two chromosomes by ChrSeq and 1–4 are chromosomes.

Supplementary Table 2

Linking 481 of dd_Smes_g4 scaffolds into chromosomes with Hi-C chromatin contact sequencing. The dd_Smes_g4 scaffolds that were assigned or unassigned to Smed_chr_ref_v1 (chrAssembly), their identity in the SALSA assembly of raw Hi-C sequencing data (salsa_scaffolds) and their assignment to chromosomes by ChrSeq are given. For chromosome assignment, 0 represents scaffolds that could not be assigned to chromosomes by ChrSeq, 5 represents scaffolds that were assigned to two chromosomes by ChrSeq and 1–4 are chromosomes.

Supplementary Table 3

Summary of the final chromosome-scale genome assembly (Smed_chr_ref_v1). Hi-C detected 26 assembly errors in dd_Smes_g4, 5 of which were confirmed by ChrSeq to be inter-chromosomal misjoining. A total of 97 dd_Smes_g4 scaffolds were not assigned to Smed_chr_ref_v1.

Supplementary Table 4

Genetic markers and distances in the linkage map established from an F₂ mapping population (supporting data for Fig. 1d). Linkage groups L.3, L.6 and L.8 contained only one genetic marker and are not included in the table. Chromosomes 2 and 4 were split into two linkage groups each, probably due to the small size of the F₂ mapping population.

Supplementary Table 5

Distribution of genetic variants that maintained heterozygosity in the inbreeding pedigree (supporting data for Fig. 2a). The chromosomes were divided into 10-Mb windows. Heterozygous variants were identified from the inbreeding pedigree from S2 to S2F9b.

Supplementary Table 6

Chromosome 1 showing heterozygosity in all samples of an F₂ population. The genetic markers were heterozygous in the J/V parent (parent_S2F10B_2A and parent_S2F10B_2B) and homozygous in the J/J parent (parent_D5-1), both of which are clones. Genotyping data from RADseq of 291 F₂ samples are listed. The 291 F₂ samples correspond to 93 unique segregants. 0/0, homozygous reference allele; 1/1, homozygous alternative allele; 0/1, heterozygous; ./., missing data.

Supplementary Table 7

Recombination in the gametes (supporting data for Fig. 2b and Extended Data Fig. 3a, b). Distributions of heterozygous variants identified in the S2 J/V line and used in the crossover assessment along chromosomes 1 and 3 in 20-Mb windows are shown. The number of crossovers identified per gamete on chromosomes 1 and 3 is also shown.

Supplementary Table 8

Chromosome 1 genotypes of wild isolates in Sardinia and Sicily (supporting data for Fig. 3d, e). Genotyping data for six different collection sites in Sardinia and two different collection sites in Sicily were aggregated. Row 1 shows the names of individual animals. 0/0, homozygous reference allele; 1/1, homozygous alternative allele; 0/1, heterozygous; ./., missing data.

Supplementary Table 9

Allele-specific expression of male and female genes (supporting data for Fig. 4a, b). Genes with well-characterized sex-related functions were identified from published work. Their locations on Smed_chr_ref_v1 and S. mansoni chromosomes were determined. The expressed J or V alleles for each of the nine male or female genes were quantified in the transcriptomes of whole worms and the isolated male copulatory organ, the penis papilla.

Supplementary Table 10

Distribution of orthologues between S. mediterranea and S. mansoni chromosomes. The numbers of orthologous genes shared between S. mediterranea and S. mansoni chromosomes are given. Sman_W is the sex chromosome of S. mansoni.

Supplementary Table 11

Distribution of transposons and stop codon mutations on different chromosomes of S. mediterranea. The number of repetitive elements such as transposons and short repeats on different chromosomes of g4wRepeat_chr_ref and the number of stop codons on different chromosomes of smed_chr_ref_v1 are shown. The smed_chr_ref_v1 genome was phased as J haplotype or V haplotype (Supplementary Information).

Supplementary Table 12

Bridging reads spanning the distant ends of the three inversions on chromosome 1. Publicly available PacBio sequencing data (SRX2700681–SRX2700684) were used to identify reads that span the breakpoints of the three inversions on chromosome 1. The same reads were aligned to smed_chr_ref_v1 and g4wRepeat_chr_ref. The assembly with repeats unmasked (g4wRepeat_chr_ref) identified more bridging reads, particularly for inversion 1.

Supplementary Table 13

Synonymous divergence of coding regions in the J/V strain S2 (supporting data for Extended Data Fig. 10b, c). The phased S2 genome was used to determine synonymous divergence of coding regions.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Guo, L., Bloom, J.S., Dols-Serrate, D. et al. Island-specific evolution of a sex-primed autosome in a sexual planarian. Nature 606, 329–334 (2022). https://doi.org/10.1038/s41586-022-04757-3

Download citation

Received: 15 September 2021
Accepted: 12 April 2022
Published: 01 June 2022
Issue Date: 09 June 2022
DOI: https://doi.org/10.1038/s41586-022-04757-3

This article is cited by

Pan-evolutionary and regulatory genome architecture delineated by an integrated macro- and microsynteny approach
- Hongwei Yu
- Yuli Li
- Shi Wang
Nature Protocols (2024)
CRISPR/Cas9-based depletion of 16S ribosomal RNA improves library complexity of single-cell RNA-sequencing in planarians
- Kuang-Tse Wang
- Carolyn E. Adler
BMC Genomics (2023)
The planarian flatworm Schmidtea mediterranea
- Leonard Drees
- Jochen C. Rink
Nature Methods (2023)
A chromosome predisposed for sex
- Yuying Lin
- Judith E. Mank
Nature (2022)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Main

Chromosome-scale genome assembly

Chromosome 1 recombination suppression

Island-specific evolution of chromosome 1

A sex-primed chromosome

Methods

Planarian husbandry, RNAi and phenotyping

Chromosome sequencing

Chromosome-scale genome assembly

Hi-C library construction

Oocyte and sperm cell sequencing

Synteny analysis

Linkage map

Quantifying recombination

Gene expression analysis

Haplotype-specific expression

De novo map, parameter optimization and phylogenetic inference

Population structure analysis

DNA FISH

Hybridization chain reaction and RNA FISH

Reporting summary

Data availability

Code availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing interests

Peer review

Peer review information

Additional information

Extended data figures and tables

Supplementary information

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Comments

Search

Quick links