Cell landscape of larval and adult Xenopus laevis at single-cell resolution

The rapid development of high-throughput single-cell RNA sequencing technology offers a good opportunity to dissect cell heterogeneity of animals. A large number of organism-wide single-cell atlases have been constructed for vertebrates such as Homo sapiens, Macaca fascicularis, Mus musculus and Danio rerio. However, an intermediate taxon that links mammals to vertebrates of more ancient origin is still lacking. Here, we construct the first Xenopus cell landscape to date, including larval and adult organs. Common cell lineage-specific transcription factors have been identified in vertebrates, including fish, amphibians and mammals. The comparison of larval and adult erythrocytes identifies stage-specific hemoglobin subtypes, as well as a common type of cluster containing both larval and adult hemoglobin, mainly at NF59. In addition, cell lineages originating from all three layers exhibits both antigen processing and presentation during metamorphosis, indicating a common regulatory mechanism during metamorphosis. Overall, our study provides a large-scale resource for research on Xenopus metamorphosis and adult organs.

T he rapid development of high-throughput single-cell RNA sequencing (scRNA-seq) technology offer a good opportunity to dissect cell heterogeneity of Xenopus, from embryogenesis 1,2 to individual organs such as cornea and kidney 3,4 , and to reveal the cellular mechanism of Xenopus regeneration such as in limb 5,6 , tail 7,8 and spinal cord 9 . A large number of organism-wide single-cell atlases have been constructed for various species, such as Homo sapiens 10 , Macaca fascicularis 11 , Mus musculus 12 , Danio rerio 13 , Drosophila melanogaster 14 , Ciona intestinalis 15 , Caenorhabditis elegans 16 , Schmidtea mediterranea 17 , and Stylophora pistillata 18 . From an evolutionary perspective, the amphibian is an intermediate taxon that links mammals to vertebrates of more ancient origin. A comprehensive larval and adult Xenopus cell landscape is helpful in all aspects of a better understanding of genetic evolution relations. For example, to achieve the transition from an aquatic lifestyle to a terrestrial one, anuran tadpoles undergo drastic morphological and physiological changes during metamorphosis, such as the appearance of limbs, loss of larval tail, remodeling of digestive organs, and decrease in regeneration ability 19 . The mammalian perinatal period shares strong similarities with amphibian metamorphosis 20 , making the process of metamorphosis an excellent model from which to learn about mammalian postembryonic development. However, the molecular networks that regulate metamorphosis at single-cell resolution are still unclear.
In this work, we perform highly parallel single-cell transcriptomics with Microwell-seq to construct an initial "Xenopus Cell Landscape (XCL)", comprising more than 500,000 cells isolated from four Xenopus laevis stages, from larva to juvenile, and 17 adult tissues. XCL, the first such comprehensive resource for clawed frog research, profiles more than 100 major cell types in Xenopus laevis, which provides valuable scientific evidence of cross-species evolution. The XCL database website is publicly available at http://bis.zju.edu.cn/XCL/ for all researchers in the Xenopus community.

Results
Construction of the Xenopus cell landscape using Microwellseq. Xenopus laevis is a classical model of vertebrate embryonic development 1 , regeneration potential 8 , and neuronal regulation 21 . Using Microwell-seq, we constructed a comprehensive XCL including adult and larval metamorphosis stages (Fig. 1a) 12 . A detailed multi-organ transcriptome landscape covered brain, liver, lung, kidney, intestine, heart, muscle, skin, pancreas, eye, testis, ovary, bladder, stomach, bone marrow, spleen, and oviduct tissues from 1-year-old Xenopus laevis. The whole body of Nieuwkoop and Faber (NF) stage 48 (premetamorphosis), NF54 (prometamorphosis), NF59 (metamorphosis climax), and NF66 (end of metamorphosis) were dissociated respectively to fully analyze the process of metamorphosis. Tissues and tadpoles were prepared as single-cell suspensions (Supplementary Dataset 1).
The single-cell transcriptomics data were processed using published pipelines 22 . In total, we obtained 501, 358 qualified single cells from 17 adult tissues and tadpoles at NF48, NF54, NF59, and NF66 ( Fig. 1b-d and Fig. S1a, b). The complete tissue dataset was grouped into 106 clusters and over 581 cell-type subclusters in the hierarchy (Fig. 1b, c and Fig. S1c). The numbers of cells and unique molecular identifiers (UMIs) per cell detected for each tissue are shown in Fig. 1d and Fig. S1b. The average number of genes detected per cell was~600. The tissues with the highest and lowest average numbers of genes detected were the testis and pancreas, with~1000 genes and 300 genes respectively (Fig. S1a). We annotated the cell types based on clustering markers referred in previous studies. The marker genes for each cell type are listed in Fig. S1d and Supplementary Dataset 1. Multiple tissues, including bladder, eye, heart, lung, muscle, ovary, pancreas, oviduct, and stomach, contributed to the defined adult stromal cells (Cluster (C)1, C22, and C59), while several clusters (C0, C12, C24, and C105) were larva-specific stromal cells. Other clusters with significant multi-tissue contributions corresponded to epithelial cells (C8), endothelial cells (C9), and muscle cells (C43) (Fig. 1b, c).
To define the cell-type similarities and relationships, pairwise unsupervised MetaNeighbor 23 analyses were performed for these 106 clusters (Fig. 1e). The resulting circular cell-type dendrogram was organized according to major lineages rather than tissues. For example, most cell types were matched within the same category, such as stromal, muscle, and epithelial cells.
In total, our landscape provided cellular signatures for major tissues of Xenopus laevis, providing a valuable resource at singlecell resolution for the Xenopus community. The resource is publicly available at http://bis.zju.edu.cn/XCL/.
Cellular heterogeneity in diverse Xenopus laevis tissues. Using t-distributed stochastic neighbor embedding (t-SNE) and differential gene expression analyses, we constructed more detailed transcriptional profiles of Xenopus tissues, including previously unrecognized cell heterogeneity. The brain is the most complicated tissue, and our data covered most known cell types, such as excitatory and inhibitory neurons, radial glia, astrocytes and oligodendrocytes. We also identified several endocrine cells from the pituitary gland, including gonadotroph cells, growth hormone cells, thyrotroph cells, and melanotrope cells ( Fig. 2a and Supplementary Dataset 2). We further examined specific marker genes related to each cluster. Three GABAergic neurons (inhibitory neuron) subclusters (C5, C6, and C27) expressed high levels of slc32a1 and gad2, whereas three excitatory neuron subclusters (C19, C28, and C32) expressed high levels of slc17a7. Myelinating oligodendrocytes (C2) expressed high levels of myelin-associated genes plp1, mbp, and olig2, and progenitor cells (C17) expressed high levels of pdgfra and the neurogenesisrelated transcription factor nkx2-2 24 ( Fig. 2b and Supplementary Dataset 2). Notably, a large number of endocrine cells were identified from the pituitary gland that expressed different hormone-associated genes as well as co-expression of high levels of chga and chgb (Fig. S2). For instance, gonadotroph cells (C1 and C26) expressed high levels of fshb, growth hormone cells (C3, C7, and C15) expressed high levels of gh1, prolactin progenitor cells (C9) expressed high levels of prl.1, and thyrotroph cells (C12) expressed high levels of tshb. In addition, C10, which expressed the astrocyte marker atp1a2, was identified as representative fibrous astrocytes for its high expression of the myelinassociated gene pmp2 25 (Fig. 2b).
The lung is an organ of particular interest because Xenopus undergoes a transition from living in an aquatic environment to living in a terrestrial environment. We annotated 17 cell-type clusters in the lung with analysis based on known markers (Fig. 2c, d and Supplementary Dataset 2). Alveolar epithelial (AE) cells (C5) were enriched in sftpb and sftpc, which are typical markers of alveolar type 2 (AT2) cells in the mouse and human lungs (Fig. S3a). The lack of alveolar type 1 (AT1) cells which are easily found in the lungs of mice and humans were missed in our data and enriched pulmonary ionocytes (C10) with high expression of atp6v0d2 and atp6v1b1 ( Fig. 2d and Fig. S3a), which implied that the respiratory organ of Xenopus was intermediate between the gill enrichment of ionocytes and mammalian lung enrichment of alveolar epithelial cells. We also identified melanocytes (C15) with high expression of pmel-like and tyrp1, which were previously reported to exist in the lungs of Xenopus with unknown function 26 (Fig. 2c and Supplementary Dataset 2). To infer the evolutionary relationship between the mouse lung and zebrafish respiratory organs, we performed a cross-species clustering analysis of these tissues. To enable crossspecies analysis, orthologous and paralogous genes from mouse, zebrafish and Xenopus were extracted 27 . As revealed in Fig. S3b, several cell types were conserved among zebrafish, Xenopus, and mouse. For example, the gene expression patterns of Xenopus pulmonary ionocytes showed strong correlations with zebrafish gill ionocytes, zebrafish swim bladder epithelial cells and mouse AT2 cells. Interestingly, although sftpb and sftpc were highly expressed, the gene expression patterns of Xenopus AE cells showed strong correlations with not only mouse AT2 cells but also mouse AT1 cells, Club cells, and alveolar bipotent progenitors, indicating that Xenopus AE cells performed a variety of functions similar to those of mouse AT1, AT2 and Club cells in adult Xenopus lungs. Xenopus AE cells also showed strong correlations with zebrafish swim bladder epithelial cells, implying the evolutionary relationships among the Xenopus lung, zebrafish swim bladder, and mouse lung. These results were consistent with      a previous hypothesis that cell types rather than organs might be proposed as "evolutionary units" in evolutionary biology 28,29 . Unlike that in mammals, hematopoiesis in Xenopus occurs in the liver rather than bone marrow 30,31 . To further study the function of hematopoiesis at the single-cell level, we analyzed the cellular components in Xenopus liver and bone marrow, respectively. We identified 21 clusters in the Xenopus liver (Fig. 2e, f) and expectedly most were hematopoiesis-related clusters except liver parenchyma cell hepatocytes (C14). For example, hematopoietic stem and progenitor cells (HSPCs and C16) were enriched in hematopoiesis transcription factors gata2, runx1, and tal1 ( Fig. 2f and Supplementary Dataset 2). Next, we identified erythrocytes at various differentiation stages, including erythroid progenitor cells (C4), proliferating early erythrocytes (C15), proliferating middle erythrocytes (C8), and late erythrocytes (C5) based on previously reported marker genes 32 (Fig. S4a). In addition, among 16 identified clusters in Xenopus bone marrow, mainly composing granulocytes such as neutrophils and eosinophils (Fig. 2g, h). In contrast to adult mouse bone marrow, Xenopus bone marrow contained no HSPCs, but its other cellular components, such as major granulocytes, were similar (Fig. S4b). These data indicated that the liver was the primary hematopoietic organ of adult Xenopus while the bone marrow harbored granulocytic potential.
Further, we defined cell-type clusters and their markers in all the other tissues and summarized them in Supplementary Dataset 2 and on the XCL website ( Fig. S5 and Supplementary Dataset 2).
Cell-type evolution between adult Xenopus, zebrafish, and mammals. Cross-species single-cell RNA analysis provides new insight into animal evolution at the single-cell level 28,33 . To identify the evolutionary process in vertebrates, updated datasets from the Human Cell Landscape (HCL), the Mouse Cell Atlas (MCA), and the Zebrafish Cell Landscape (ZCL) were collected. Details of selected homologous genes across four species were shown in Supplementary Dataset 3. To proceed with network construction, pseudocells for each cell cluster of Xenopus laevis were constructed and an integrated manifold with a high degree of cross-species alignment was produced by SAMap to ensure the accuracy of cell types and cell states comparison (Fig. S6). As shown in Fig. 3a, the gene expression patterns of endothelial cells showed the strongest correlations between both species compared with other lineages. Besides, stromal cells were also highly conserved across these four species. The germline of Xenopus showed the strongest correlation with the zebrafish germline and lower correlations with mouse germline, and few correlations with human germline, indicating that the germline of Xenopus was evolutionarily closer to fish than to mammals.
To investigate the genetic network that underlies cell lineage fate determination, transcription factor specificity scores were calculated for each cell type of Xenopus laevis (Fig. 3b). Stromal, muscle, and endothelial cells shared several TFs, such as the notch effector hes1/4, which are reported to regulate myogenesis 34,35 . Twist1/2 and snail, which are related to the epithelial-tomesenchymal transition in humans, were highly associated with stromal cells 36 . Meox2, a response gene for cell migration and fatty acid transport, was identified in endothelial cells 37,38 . The key regulators of erythropoiesis gata1 and tal1 in mice were also highly associated with erythrocytes in Xenopus 39,40 . For immune cells of Xenopus, the major hematopoietic TF runx1 and the key regulator of T cell differentiation runx3 were identified 41,42 . Sox1, reported to be implicated in Xenopus neurogenesis and the Xenopus myelination-related gene sox10 were detected in our neuronal cells 43,44 (Fig. 3b). To confirm the conserved cell-lineage-specific TFs in vertebrates, we collected TF data from humans, mice, and zebrafish, and further identified conserved TFs for each lineage based on homologous relationships and expression ( Fig. 3c and Figs. S7, S8). Our data showed that endothelial cells shared several TFs in all lineages, such as the regulator of endothelial differentiation erg 45 , a key gene for vascular development sox7 46 , and the regulator of transcription of lipid transport meox2 for endothelial cells 47 (Fig. 3c). In addition, neurons shared most TFs in all lineages. For instance, the key neural crest specifier sox10 48 , a common regulator of the neuronal conversion gene neurod1 49 , and the regulator of the oligodendrocyte lineage and neural cell specification olig2 for neurons 50 (Fig. S8a). In particular, some highly conserved TFs such as the regulator of neural stem cells tcf4 51 and the known neural developmental regulator lhx6 52 were detected in four species (Fig. S8a). Notably, olig1, a common oligodendrocyte transcription factor in mice and humans 53 , was not detected in Xenopus adult neurons and was only weakly expressed prior to the tadpole stage 54 (Fig. S8a).
Cell landscape of larval Xenopus during metamorphosis. To describe the cellular mechanisms regulating amphibian metamorphosis at the single-cell level, NF48, NF54, NF59, and NF66 tadpoles were processed using Microwell-seq and 188,020 single cells passed our quality control tests. Metamorphosis is a process regulated by thyroid hormone (TH) T3. Previous research indicates that T3 is able to induce complete metamorphosis and blockade of T3 synthesis prevents metamorphosis 55 . To ensure that the collected samples represented different stages of metamorphosis, we cross-referenced our data with the expression patterns of four previously reported T3-dependent genes 56 , which indicated that our samples represented premetamorphosis, prometamorphosis, metamorphic climax, and the end of metamorphosis (Fig. 4a). The marker genes for each major cell type were listed in Fig. S9a. In our data, several major cell type categories were present in all four stages, such as the defined neurons (C13, C20, C29, C41, and C43), immune cells (C4, C11, and C31), and proximal tubule cells (C27) (Fig. 4b, c). Several digestion system clusters were specifically present in certain stages. For instance, stomach parietal cells (C22) and hepatocytes (C24) were present only in NF66 and NF54, respectively. Meanwhile, enterocytes (C12) were present in NF59 and enterocytes (C15 and C25) were present in N66; however, enterocytes (C48) were present in NF54, NF59, and NF66. Our results showed that remodeling of the intestine is the one of the most dramatic changes during metamorphosis as noted previously 57 . Enterocytes, especially at later stages, exhibited high heterogeneity due to changes in diet during metamorphosis to form the complex adult intestine 56,58 .
Next, the different stage tadpole datasets were grouped into 38, 46, 48, and 56 clusters respectively ( Fig. S9b-e). We analyzed the proportions of four stages in each cluster. As shown in Fig. 4d, almost no intertemporal cell types emerged, indicating reliable temporality of our data. To further understand the transformations that occur during metamorphosis, we performed another round of clustering specifically for epithelial, stromal, and immune cells ( Fig. S10 and Supplementary Dataset 4). Obviously, epithelial and stromal cells exerted strong transcriptomic heterogeneity rather than immune cells. Overall, NF66 is more self-consistent in Epithelial and Stromal in batch-integrated cellular profiles. We also analyzed the relationship between larval and adult cell clusters (Figs. S11-S12 and Supplementary Dataset 5). As shown in Fig. S11b  shown in Fig. S11c. The high correlation coefficient between C3 and C4 with the most proportion of NF54 and NF59 data suggested some NF-specific regulation, while DGEs contributing to the combined cluster from larval and adult integration indicated that enterocytes were undergoing extensive remodeling during metamorphosis. Similar to the enterocytes, most stomach parietal cells at NF66 and adults shared transcriptional similarity but low correlation with that at NF54 and NF59 (Fig. S11d-f). Hepatocytes at NF54 showed transcriptional heterogeneity but hepatocytes at NF59 displayed a high correlation with NF66 and adult hepatocytes (Fig. S12a-c). In terms of neuron cells, general integration showed a high correlation among adult and most larval neurons, with some adult neurons showing heterogeneity mostly in C1 and C3 (Fig. S12d-f). Taken together, these data suggested that significant transcriptome changes occurred in the vast majority of adult cells compared with larval cells that underwent metamorphosis.
To identify the stage-specific TFs for each cell type, we collected TF data from all cell types at four stages and defined the stagespecific TFs as "driver TFs    regulatory factor expression during metamorphosis. By contrary, erythrocytes dTFs underwent tremendous changes, with the exception of gata1. It was worth noting that lyar, which regulates globin gene expression during development was no longer a specific transcription factor at NF66 59 , indicating a hemoglobin (Hb) transition during this stage. Meanwhile, tal1, a regulator of erythroid differentiation, became a specific transcription factor at NF66 60 . These changes were likely important for the larval erythrocytes to undergo Hb transition and to differentiate into adult erythrocytes. In total, our data provided a useful resource for identifying the key regulators of each cell lineage during metamorphosis.
Conversion of erythrocyte hemoglobin from larval to adult. Hb transition from larva to adult during vertebrate development is a physiologically important process for animals to adapt to environmental changes from the fetal environment to an atmospheric environment in animals. Hb switching is also TH-dependent and which kinds of globin are expressed in individual erythrocytes at different stages in Xenopus laevis has been controversial 61 . To investigate this transition during metamorphosis in Xenopus, we reanalyzed the whole larval and adult erythrocytes. As expected, adult and larval erythrocytes showed strong heterogeneity before and after metamorphosis (Fig. 5a, b). In adults and NF66, erythrocytes mainly expressed hba1 and hbg1, while in early tadpoles at NF48 and NF54, erythrocytes expressed hbg2 and hbd instead. Notably, during premetamorphosis at NF59, the cluster of erythrocytes coexpressing hba1, hbg1, hbg2, and hbd indicated a transient state in the development of erythrocytes (Fig. 5c). The expression of hba1 and hbd confirmed that hemoglobin was coexpressed in individual erythrocytes (Fig. 5d). Although several reports have claimed that no erythrocytes contained both larval and adult globin 61,62 , our data identified that there was a cluster of erythrocytes, mainly at NF59, that contained both larval and adult globin, as reported previously 63,64 . Interestingly, our data showed that few or no mhc1a (MHC class I) molecules were detected in larval erythrocytes before metamorphosis, consistent with a previous report (Fig. 5c), but a continuous increase in expression occurred following development until adult erythrocyte transition 65 . Our result suggested that immune cells might be involved in the removal of larval erythrocytes without mhc1a. To investigate the molecular pathways involved in erythrocyte development, we sub-clustered whole erythrocytes into three groups: a larval cluster (C0) of erythrocytes mainly at NF48 and NF54; a transient cluster (C6) of erythrocytes at NF59; and an adult cluster (C2), of erythrocytes mainly at NF66 and in adults. GO classification was performed to analyze the biological functions of the differentially expressed genes (DEGs) in each group based on human homologs (Fig. 5e). The larval cluster was enriched mainly in biological pathway categories associated with RNA metabolism and telomere maintenance, suggesting that transcription and telomerase activity were necessary at the early stage of erythrocytes. Erythrocyte differentiation-and hemoglobin biosynthesis-related biological pathways were enriched in transient clusters. This result implied that erythrocytes at this stage might synthesize new hemoglobin and differentiate into new cell types during the transient state. In conclusion, our data indicated that hba1 and hbg1 are adult-special hemoglobin in Xenopus laevis, while hbg2 and hbd are larval-special hemoglobin, and that there is a transient cluster of cells containing both larval and adult hemoglobin only at NF59 that might be a progenitor of adult erythrocytes.
Gene expression profiles of the remodeling multisystem during metamorphosis. As shown in Fig. 4c, digestion system cells derived from the endoderm, including enterocytes, hepatocytes and stomach parietal cells showed strong heterogeneity during metamorphosis. To investigate common molecular pathways involved in tissue development during metamorphosis, we recollected data from digestion system cells, erythrocytes derived from mesoderm and neurons derived from ectoderm to analyze. We clustered selected genes expressed (for details, see Methods) into four modules: module 1, thyroid hormone-synchronous upregulated genes; module 2, thyroid hormone-synchronous downregulated genes; module 3, continuously downregulated genes; module 4, continuously upregulated genes ( Fig. 6a and Supplementary Dataset 7). The number of genes in module 1 was larger than that of module 4 for each cell lineage (Fig. 6b), indicating that upregulated genes involved in metamorphosis were mainly thyroid hormone-dependent.
Next, GO analysis was performed (based on the human homologs) to clarify the biological and molecular functions of the genes in each module (Fig. 6c-f). The entire GO term for each module could be found in Supplementary Dataset 8. The common genes in module 1 across cell types during metamorphosis were enriched in antigen processing and presentation of exogenous antigen, immune system process, cellular location, granulocyte activation, and heterocycle catabolic process. Interestingly, the shared genes enriched in antigen processing and presentation of exogenous antigen catalog were proteasome family genes, although different subtypes, such as psmb4, psmb5, and psmc2, are involved in the expression of MHC class I antigens and protein degradation 66 . MHC class I antigens were reported to be important surface markers of adult erythrocytes during Xenopus metamorphosis 67 . Our data suggested that antigen processing and presentation in NF59 was required for the formation of MHC I molecules at the end of metamorphosis and was a common regulatory mechanism in each lineage during metamorphosis. To detect the common genes shared by multiple cell types, we also generated Venn diagrams of the genes in different cell types. Genes present with a high frequency in module 1 were zan (freq = 5), nme2 (freq = 4), cwcl5 (freq = 4), lypla2 (freq = 4), taf13 (freq = 4), and so on.
The common genes in module 2 across cell types during metamorphosis were mainly enriched in metabolic processes such as nitrogen compound metabolic process, organic substance metabolic process, mRNA metabolic process, and cellular macromolecule metabolic process. The common genes in module 3 across cell types during metamorphosis were also mainly enriched in metabolic processes, such as cellular metabolic process, mRNA metabolic process, and oxoacid metabolic process (Fig. 6d, e). These data suggested that some cellular metabolic processes such as nitrogen compound and mRNA metabolic process were continually downregulated from the climax of metamorphosis to the end of metamorphosis. Genes with a high frequency in module 2 and 3 were also mainly translation-related genes, such as eef1d (freq = 4), eif2s3 (freq = 4), and ilf3 (freq = 4), which are involved in the elongation and translation initiation process of protein synthesis 68,69 .
Overall, our data demonstrated that upregulation of antigen processing and presentation and downregulation of cellular metabolic processes are common biological processes in Xenopus metamorphosis for major cell lineage.

Discussion
The constructed organism-wide cell atlases, such as those for Homo sapiens, Mus musculus, and Danio rerio, alongside that of Xenopus laevis generated here, allowed us to systematically compare cell types across vertebrates. During the transition from an aquatic to a terrestrial lifestyle, the respiratory system of amphibians undergoes drastic changes. The evolutionary relationships among the lung, gill and swim bladder are controversial based on current evidence 70  Recently, cell types rather than tissues have been proposed as "evolutionary units" 28,29 . Our results also demonstrated that the evolutionary conservation and divergence of the vertebrate respiratory system relied mainly on cell types and not tissues, as previous research proposed.
Amphibian metamorphosis is a dramatic biological phenomenon that could be controlled by the simple TH. Our data indicated that antigen processing and presentation was a common regulatory mechanism involved in cell apoptosis in lineages derived from three germ layers during metamorphosis. GO analysis of thyroid hormone-synchronous upregulated genes during metamorphosis revealed enrichment in genes associated with antigen processing and presentation of exogenous antigens. MHC class I antigens were used to distinguish larval and adult erythrocyte clusters in a previous study 67 . The absence of MHC antigens on the surface of larval erythrocytes is consistent with our results. However, larval erythrocytes were cleared from circulation after 60 days after metamorphosis, indicating a mechanism for the protection of adult erythrocytes after metamorphosis. To our surprise, other cell types, such as hepatocytes, enterocytes, neurons, and stomach parietal cells, contained both antigen processing and presentation genes. These data from ectoderm, mesoderm, and endoderm implied that antigen processing and presentation might be a shared biological pathway involving tissue remodeling during metamorphosis.
In conclusion, we constructed the first Xenopus cell landscape at the single-cell level, providing large-scale resource for research on Xenopus metamorphosis and adult organs. We performed comparative analyses among vertebrate species at single-cell resolution and revealed an evolutionary relationship between the respiratory systems of aquatic and terrestrial animals. We identified a common biological pathway involving cell apoptosis and differentiation for cell lineages from the three germ layers. These data are valuable for future research in vertebrate biology. Fabrication of the microwell device. Firstly, a silicon plate (Suzhou chip scientific instrument Co., Ltd., Suzhou, China) containing 100,000 microwells whose diameter and depth were 28 and 35 µm respectively was manufactured. Secondly, a polydimethylsiloxane (PDMS) plate with the same number of micropillars was made using the silicon microwell plate. Lastly, the PDMS plate was used as a mold to make a disposable agarose microwell plate by pouring 5% agarose (Cat. No. BY-R0100, Baygene) solution onto the surface of the PDMS plate. In the second split-pool, the beads were divided into each well of a new 96-well plate after washing with water. A 15 µl polymerase chain reaction (PCR) mix (1× Phanta Master Mix, Cat. No. P510-01, Vazyme) and 5 µM oligonucleotides were added to each well. The oligonucleotides in each well included a sequence with reverse complementarity to linker 1, a unique barcode and a linker 2 sequence, and the PCR program was as follows: 94°C for 5 min; five cycles of 94°C for 15 s, 48.8°C for 4 min, and 72°C for 4 min; and a 4°C hold. Notably, the beads were mixed between denaturation (95°C) and primer annealing (48.8°C) in each cycle. After PCR program, the beads were collected in a 95°C water bath for 6 min and separated using a magnet, removing complementary chains, two times.
In the third split-pool, the PCR program was as follows: 94°C for 5 min, 48.8°C for 20 min, and 72°C for 4 min and a 4°C hold. The oligonucleotides used in each well included a linker 2 reverse-complementary sequence, a unique barcode, a UMI sequence, and a poly-T tail. After the PCR program, the beads were collected and suspended in 200 µl exonuclease I mix (containing 1× exonuclease I buffer and 1 U/µl exonuclease I) and incubated at 37°C for 15 min with a rotary mixer. Then the beads were washed with 200 µl TE-TW [10 mM Tris (pH 8.0) (Cat. No. 15568-025, Invitrogen), 1 mM EDTA (Cat. No. AM9260G, Invitrogen), 0.01% Tween 20 (Cat. No. A600560-0500, Sangon)], 200 µl of 10 mM Tris-HCl (pH 8.0), and 1 ml double-distilled water. Finally, the beads could be stored in 1 ml TE-TW for one month at 4°C.
Cell preparation. Adult Xenopus laevis tissues were minced into pieces (~1 mm) at room temperature using sterilized scissors. The tissue pieces were then transferred to a 15 ml centrifuge tube, washed with DPBS, and suspended in a 5 ml solution containing various dissociation enzymes for different amounts of time until no tissue fragments were visible. Details for single-cell isolation from different tissues are listed in Supplementary Dataset 1. To collect dissociated cells, the centrifuge tube was at 300 × g for 5 min at 4°C, removed the supernatant, and resuspended in 2 ml DPBS. Then the dissociated cells were passed through a 70 µm strainer (Cat. No. F613462-9001, Sangon). After being washed twice (centrifuged at 300 × g for 5 min at 4°C), the cells were resuspended at a density of 1 × 10 5 cells/ml in DPBS.
Tadpoles whole body were also minced into pieces (~1 mm) at room temperature using scissors. The tadpole pieces were transferred to a 15 ml centrifuge tube and dissociated in 5 ml 1x TrypLE containing collagenase I (0.25 mg/ml), collagenase II (0.25 mg/ml), collagenase IV (0.25 mg/ml) and collagenase V (0.25 mg/ml). The remaining steps were consistent with adult Xenopus laevis tissues mentioned above.
Cell collection and lysis. To eliminate the rate of the double cell during Microwell-seq, the cell concentration should be carefully controlled. The suitable cell concentration is about 100,000/ml (with 10% of the wells occupied by individual cells) and can be estimated using a hemocytometer. The cell suspension was pipetted onto the microwell plate and the plate was detected under a microscope to wash extra cells away. Then the bead suspension was loaded into the microwell plate which was placed on a magnet. The four sides of the microwell plate were The size of the dot encodes the percentage of cells within a cell type, and the color encodes the average expression level. d Feature plot in the t-SNE map of erythrocytes collected from larval and adult Xenopus. Cells are colored according to the expression of the indicated marker genes hba1.S (red), hbd.S (blue), and co-expression hba1.S and hbd.S (pink). e Representative GO terms enriched in each module for three cell types. Different cell clusters are labeled in different colors. P value was calculated by the hypergeometric distribution, a statistical test is one-sided, adjustments P values were made after P value is corrected by Benjamin-Hochberg multiple test.

Quantification and statistical analysis
Preprocessing of Microwell-seq sequence data. Microwell-seq datasets were processed using the protocols described by ref. 71 . The raw fastq-format sequencing data from a DNBSEQ-T7 were first split into i7 indexed sub-libraries using "splitBarcode [https://github.com/MGI-tech-bioinformatics/splitBarcode]". Reads from Xenopus laevis were aligned to the Xenopus laevis v9.1 genome, using STAR 72 . The DGE data matrices were obtained using the modified Dropseq tools (https://github.com/ggjlab/mca_data_analysis/tree/master/preprocessing/Drop-seq_tools-1.12/) and the corresponding protocol is available at http://mccarrolllab. org/dropseq/. The DGE data containing the top 30,000 cells sorted by the total number of transcripts were obtained after preprocessing. For quality control, we filtered out cells in which fewer than 500 transcripts were detected.
Correction of RNA contamination. During scRNA-seq data analysis, we commonly assume that all RNAs are endogenous to each individual cell. It has been recognized that some contaminating nonendogenous RNAs are also present even within datasets of the highest quality 73 . While applying Microwell-seq technology, there is a probability that a certain amount of cell-free RNA from the input solution admixed with the cell in a well are also sequenced. Although the amount of cell-free RNAs is very low compared with the total RNA content, it is still important to recognize and correct this contamination. To solve this problem, we strictly removed the batch gene background. We assumed that, for each batch of experiments, cell barcodes with fewer than 500 UMIs correspond to empty beads exposed to free RNA during cell lysis, RNA capture, and washing steps. Genes with extensive-expression in all beads were considered batch genes. The batch gene background value was defined as the average gene detection level for all cell barcodes with fewer than 500 UMIs, multiplied by the median of the fold difference between the detected gene expression of a cell and the average detected gene expression for beads with fewer than 500 UMIs and then rounded to the nearest integer. We subtracted the batch gene background for each cell from the digital expression matrix before performing the cross-tissue or cross-stage comparisons. We used the background-removed matrix to perform downstream analyses.
Filter out potential doublets. scRNA-seq data are commonly influenced by technical artifacts known as doublets, in which two or more different cells receive the same barcode, resulting in an aggregated transcriptome. We used the R package DoubletFinder 74 to detect the potential doublets in each individual library. The overall doublet rate for our experiment was no more than 1.2% 12 . Approximately 5% of cells were labeled as doublets and were removed.
Cell clustering for single-cell datasets on a per-dataset basis. Seurat 75 was used as a tool for cell clustering on a per-dataset basis. The data were log 2 (counts per million (CPM)/100 + 1) transformed, and the number of UMIs and the percentage of mitochondrial gene content were regressed. A total of 2000 genes were selected through the "FindVariableFeatures" function by using the "vst" method and used as inputs for initial principal component analysis (PCA) and the number of principal components (PCs) used for nonlinear dimensional reduction (t-SNE). For clustering, we set different resolution parameters between 0.5 and 2.5 in the "FindCluster" function and chose the final cluster number by distinguishing differential genes among clusters. These parameters, including the resolution and number of PCs, were adjusted on a per-dataset basis.
Batch effect evaluation and clustering analysis for single-cell datasets on all dataset bases. The gene expression matrices for all single cells were merged together and fed to Seurat for clustering analysis. Briefly, the data were first processed by the "SCTransform" function. After that, processed data from larva were subjected to the Seurat standard procedure as described above. We chose 50 PCs for PCA and computed the neighborhood graph of cells. We then used the "FindCluster" function to cluster cells with a resolution of 0.8. Processed data from adults were fed to Scanpy 76 . We chose 100 PCs for PCA and computed the neighborhood graph of cells. We then used Leiden clustering to cluster cells with a resolution of 1 and neighbor number of 15. Finally, 57 clusters for larvae and 106 clusters for adults were produced, and marker genes were defined by the Wilcoxon rank-sum test. t-SNE was applied to visualize the single-cell transcriptional profile in 2D space. The Wilcoxon rank-sum test was used by running the 'FindAllMarkers' function in Seurat and the 'rank genes groups' function in Scanpy to find DEGs in each cluster. We annotated each cell type with marker genes selected based on an extensive literature review.
Collection of orthologous genes. Human, mouse, and zebrafish orthologous pairs were obtained from Ensembl v96 by BioMarkt. Xenopus laevis and human orthologous pairs were obtained from "Xenbase Home [http://www.xenbase.org/ common/]". In this study, we considered one-to-one orthologous and one-to-many paralogous pairs for further analysis.
Cross-species transcriptome comparison. Before transcriptome comparison, we collected human, mouse, and zebrafish cell-gene matrices and cell annotations from the previously published HCL, MCA, and ZCL datasets (https://db.cngb.org/ HCL/; http://bis.zju.edu.cn/MCA; http://bis.zju.edu.cn/ZCA). We extracted the cells in the adult stage and divided the 102 human, 104 mouse, and 135 zebrafish cell clusters into 12 major cell lineages. To reduce the effects of noise and outliers, we randomly sampled 100,000 cells for each species, and we calculated the pseudocells, which was an average of every 50 cells from the same cell type, for further analysis. The gene expression data for each species underwent normalization to the sum of transcripts and multiplication by 100,000. The combined projection was visualized with the "SAMAP.scatter" function. To compare transcriptomes across species, we applied SAMap analysis between Xenopus laevis and humans, mice, and zebrafish. The cell type pairs with mapping scores higher than 0.1 were plotted between species by the package Circlize.
Lineage-specific transcription factor (TF) analysis. In this study, Xenopus laevis TFs were defined as the one-to-one orthologs of TFs found in Xenopus tropicalis downloaded from AnimalTFDB. We applied the same method used in previous research to judge cell-type-specific TFs for Xenopus laevis. We calculated the Jensen-Shannon divergence (JSD) for every TF expression status and defined the TF specificity score as 1 − ffiffiffiffiffiffiffi ffi JSD p . The Z-score-normalized TF specificity score was used to infer the specific TFs in each lineage. With reference to lineage-specific TF data for humans, mice, and zebrafish, we identified TFs conserved in four species for each lineage.
Identification of TFs that drive cell type mapping during metamorphosis. We define g 1 and g 2 to contain TFs from stage 1 and stage 2, respectively. Let X a 1 a 2 denote the set of all cells with cross-stage edges between cell types a 1 and a 2 ; JSD was used to evaluate cell-type-specific TFs for each stage such as stage1: Y 1;g 1 ¼ 1 À ffiffiffiffiffiffiffi ffi JSD p ; We assign each TF a score: h g = T (S(Y 1,g1 ))∘ T (S(Y 2,g2 )), where S(Z) standardizes vector Z to have zero mean and unit variance, and T(Z) sets negative values in vector Z to zero in order to ignore weakly expressed genes. To be inclusive, the top 5 TFs according to h g are identified as "drive TFs".
Gene expression trajectory analysis during metamorphosis. For module analysis, we used a similar method described in Kazer et al. 77 . We identified the DEGs between each stage using the "FindMarkers" function in Seurat. The parameters were logfc.threshold = 0.1, min.pct = 0.1 and min.diff.pct = -Inf. Then, we computed averaged expression values for each gene using the "AverageExpression" function in Seurat and grouped them into 15-30 clusters using the Mfuzz package in R with the fuzzy c-means algorithm 78 .
To determine whether cluster expression varied over time, every cell was scored for the genes within the cluster using the "AddModuleScore" function in Seurat. Then, 1000 two-sided Wilcoxon rank-sum tests were performed between the distribution of scores from 150-1000 cells at the first stage and the same number of cells from each other stage. For each stage, the P values from the 1000 tests were averaged. After FDR correction, if q < 1e-10 for any stage, the module was considered to vary significantly in expression in that stage.
Finally, these clusters were grouped into four modules using one-sided Wilcoxon rank-sum tests: module 1 was defined as the genes with the highest expression at NF59, module 2 was defined as the genes with the lowest expression at NF59, module 3 was defined as the genes that were consistently downregulated during metamorphosis, and module 4 was defined as the genes that were consistently upregulated.
Gene ontology analysis. Gene ontology analysis was performed on Homo sapiens orthologs using "g:profiler [https://biit.cs.ut.ee/gprofiler/gost]", and all orthologous genes were taken as the universe. We selected the Benjamini-Hochberg FDR to compute multiple testing corrections and considered biological pathways with p values smaller than 0.05. Statistics and reproducibility. No statistical method was used to predetermine sample size. No data were excluded from the analyses and all analyses were not randomized.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The single-cell RNA-seq data generated in this study have been deposited in the NCBI Gene Expression Omnibus database under accession code "GSE195790". The processed single-cell RNA-seq data were available at "Figshare [https://doi.org/10.6084/m9.figshare. 19152839]" and on the "XCL website [http://bis.zju.edu.cn/XCL/]". All other relevant data supporting the key findings of this study are available within the article and its Supplementary Information files or from the corresponding author upon reasonable request.