Introduction

Sweet potato (Ipomoea batatas (L.) Lam.) is an important root and tuber crop worldwide that produces large amounts of starch, carbohydrates, and other nutrients. It is well established as a crucial staple crop and industrial resource in many regions with high population density, such as Africa and Asia1. Starch from sweet potatoes can also be used as a material for thickening agents, textiles, paper, chemicals, biofuel, etc. At present, 70–80% of total sweet potato production worldwide is processed into starch products. The price of sweet potato starch is approximately $2100 per ton, which is higher than that of maize, potato, and cassava starch. The sweet potato starch market is thriving: its value was estimated at $630 million worldwide in 2020 and is projected to increase at a compound annual growth rate of 3.9% from 2021 to 20262. Employing high-starch sweet potato varieties in processing enhances the economic efficiency of starch firms. A 1% increase in starch content may lead to a 5% boost in starch processing benefits3. However, in sweet potato breeding, high starch content usually leads to a decline in the fresh yield of storage roots4,5. To ensure total starch yield, it is urgent to identify the key factors and regulatory elements affecting fresh yield and starch content and to use genetic engineering to achieve both high fresh yield and high starch content.

Root and tuber crops possess a typical above- and below-ground source-sink relationship, but there are few reports on the regulatory factors affecting their source-sink synergy, fresh yield, and starch content. Furthermore, the hexaploid cultivated sweet potato (2n = 6x = 90) has a large and highly complex genome and a high degree of self-incompatibility6. In sweet potato, the individual overexpression of IbAATP (encoding a plastidic ATP/ADP transporter), IbSSI (encoding soluble starch synthase I), IbSnRK1 (encoding sucrose non-fermenting-1-related protein kinase-1), or IbVP1 (encoding an H+-pyrophosphatase) improved starch content and altered starch physicochemical properties in storage roots7,8,9,10. However, the regulatory networks governing starch accumulation remain largely unknown, and the genetic basis and molecular mechanisms underlying natural variation in starch biosynthesis are still poorly understood.

In this study, we report that the production, loading, and transport of photosynthates in leaves, as well as their unloading and allocation in storage roots, led to the divergence of starch content between sweet potato varieties. The plasma membrane H+-ATPase IbPMA1, which serves as a proton pump that facilitates the movement of photosynthates from the source to the sink, is a key factor in regulating starch accumulation in storage roots. Furthermore, a basic helix-loop-helix (bHLH) transcription factor, IbbHLH49, was found to directly activate source-sink synergy-mediated fresh yield and starch accumulation in sweet potatoes. Collectively, our data offer valuable insights and strategies to increase starch yield and provide candidate genes for developing elite high-starch sweet potato varieties.

Results

Development of a high-starch yielding F1 individual H283

To investigate the interactive mechanisms between sweet potato fresh yield and starch content, we developed a population consisting of 994 F1 individuals derived from a cross between two sweet potato varieties, ‘Xu781’ (low fresh yield and high starch content) and ‘Xushu18’ (high fresh yield and moderate starch content)11. We measured the storage root fresh yield and dry matter content (positively and significantly correlated with starch content12) of both parents and 500 F1 individuals across five years (2016–2020), averaging 0.114–1.138 kg per plant and 22.9–40.6% dry matter content (Fig. 1a and Supplementary Data 1). The F1 individual H283 showed superior fresh yield (43,200.00 kg·ha−1) and starch content (37.07% dry matter content, 23.56% starch content, and 10,177.92 kg·ha−1 starch yield), significantly outperforming the major starch-type varieties Shangshu19 (28,455.00 kg·ha−1 fresh yield, 23.42% starch content, and 6664.16 kg·ha−1 starch yield) and Xushu22 (33,856.50 kg·ha−1 fresh yield, 21.48% starch content, and 7272.37 kg·ha−1 starch yield) (Supplementary Data 1).
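The starch yields quoted above are consistent with a simple product of fresh yield and starch content. The following is a minimal sketch of that arithmetic, assuming starch yield = fresh yield × starch content / 100; the function name `starch_yield` is illustrative and does not come from the study.

```python
def starch_yield(fresh_yield_kg_ha: float, starch_pct: float) -> float:
    """Starch yield (kg/ha), assuming it is fresh yield times starch fraction."""
    return fresh_yield_kg_ha * starch_pct / 100.0


# Fresh yield (kg/ha) and starch content (%) as quoted in the text.
reported = {
    "H283": (43200.00, 23.56),
    "Shangshu19": (28455.00, 23.42),
    "Xushu22": (33856.50, 21.48),
}

for name, (fresh, starch) in reported.items():
    print(f"{name}: {starch_yield(fresh, starch):.2f} kg/ha")
```

Under this assumption the computed values agree with the reported starch yields to within rounding of the last digit.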

Fig. 1: Natural variations of starch content in sweet potato.

a The average dry matter content and fresh yield of 500 F1 individuals derived from a cross between Xushu18 (female, high fresh yield, and moderate starch content) and Xu781 (male, low fresh yield, and high starch content). HDM, high dry matter content (> 32%). MDM, medium dry matter content (28–32%). LDM, low dry matter content (< 28%). b General schematic representation of storage root development and starch accumulation in sweet potato. Co, cortex; Pp, primary phloem; Px, primary xylem; Sg, starch granules; Ep, epidermis; Sp, secondary phloem; Sx, secondary xylem; Vc, vascular cambium; Cca, cork cambium; Cor, cork. c Representative leaves and storage roots from H283 and L423 at 60, 95, and 130 days after planting (DAP). Scale bars, 5 cm. d Safranin O-fast green staining of paraffin sections from H283 and L423 storage roots at 60, 95, and 130 DAP. Starch granules were stained red. The experiment was repeated three times independently with similar results. Scale bars, 200 μm. e Starch contents in the storage roots of H283 and L423 at 60, 95, and 130 DAP. f Amylose proportion in storage roots of H283 and L423 at 60, 95, and 130 DAP. All data are means ± SD (n = 3 biological replicates). Different lowercase letters indicate significant differences at P < 0.05 based on one-way ANOVA followed by post-hoc Tukey’s test. Exact P-values are provided in the Source Data file.
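The HDM, MDM, and LDM classes defined in this legend can be expressed as a small classifier. This is a sketch only: the function name is hypothetical, and treating exactly 28% and exactly 32% as MDM is an assumption read from the stated ranges rather than something the legend spells out.

```python
def dry_matter_class(dm_pct: float) -> str:
    """Classify dry matter content (%) using the thresholds in the Fig. 1 legend."""
    if dm_pct > 32.0:
        return "HDM"  # high dry matter content (> 32%)
    if dm_pct >= 28.0:
        return "MDM"  # medium dry matter content (28-32%); boundaries assumed inclusive
    return "LDM"      # low dry matter content (< 28%)


# H283 is reported with 37.07% dry matter content.
print(dry_matter_class(37.07))
```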

Sweet potato begins to accumulate starch at 60 days after planting (DAP); accumulation then increases rapidly around 95 DAP and reaches a relative plateau around 130 DAP (Fig. 1b)13. To study the molecular mechanism of starch accumulation under high-fresh yield conditions, we used H283 and another F1 individual, L423, with a similar fresh yield (41,400 kg·ha−1) but low starch content (16.86%) for further research (Fig. 1c and Supplementary Data 1). The safranin O-fast green staining of storage roots showed that the number of starch granules in H283 was significantly larger than that in L423 (Fig. 1d). In H283, 27.1–41.0% higher starch content was detected compared with L423 in expanding storage roots at 60, 95, and 130 DAP (Fig. 1e). Moreover, H283 exhibited a higher proportion of amylose and a lower proportion of amylopectin compared to L423 (Fig. 1f). Furthermore, proteome and transcriptome profiling were conducted on H283 and L423 to identify the key factors involved in starch accumulation in sweet potato.

Active photosynthate production and transport in leaves promote starch accumulation in storage roots

Sweet potato leaves produce photosynthates, which are primarily transported to storage roots through the phloem14. A total of 8104 proteins were quantified by liquid chromatography-tandem mass spectrometry (LC-MS/MS) in the leaves of H283 and L423 at 60, 95, and 130 DAP. Among these, 5225 annotated proteins were grouped into six clusters (PL1 to PL6) based on their accumulation patterns (Supplementary Fig. 1a, c and Data 2, 3). The Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis revealed that the differentially abundant proteins in Cluster PL3 (941 proteins, high abundance in H283 but low in L423) were enriched in photosynthesis (map00195) and galactose metabolism (map00052), whereas those from Cluster PL6 (596 proteins, low abundance in H283 but high in L423) were enriched in starch and sucrose metabolism (map00500) and photosynthesis (map00195) (Supplementary Fig. 1f, i and Data 3).

Notably, the subunits PsbR (photosystem II reaction center), PsaA, PsaE, PsaF, PsaK, and PsaN (photosystem I reaction center), as well as PetF (ferredoxin-NADP reductase) and PetB (cytochrome b6), which mediates electron transfer between PSII and PSI15, were generally more abundant in the leaves of H283 than in those of L423 (Fig. 2a and Supplementary Data 4). Consistently, the photosynthetic rate of H283 was higher than that of L423 in the field (Fig. 2b). Further photosynthetic curves in mature leaves indicated that as the irradiance or CO2 concentration increased, the photosynthetic rate of both H283 and L423 increased, but the rate in H283 remained higher than that in L423 (Fig. 2c, d). These results indicate that H283 has a more active photosynthetic system compared with L423.

Fig. 2: The high-starch F1 individual H283 possesses active photosynthesis and effectively loads and transports photosynthates from leaves.

a Leaf proteome analysis indicates that H283 has active photosynthesis and efficient photosynthate loading and transport, while L423 exhibits strong starch biosynthesis. The values were normalized between 0 and 1 using MinMax normalization. Solid arrows indicate direct functional interaction or immediate neighboring steps in biochemical pathways. Dashed arrows indicate indirect interactions. PSI, photosystem I; PSII, photosystem II. PsaA-H, K, L, N, PSI reaction center subunit A-H, K, L, N; PsbA-F and O-S, PSII reaction center subunit A-F and O-S; Cyt b6f, cytochrome b6f complex; PetB, cytochrome b6; PetE, plastocyanin; PetF and PetH, ferredoxin-NADP reductases; Phe, magnesium-free chlorophyll; PQ, plastoquinone; QA and QB, plastoquinone A and B; Fru-6P, fructose-6-phosphate; Glc-6P, glucose-6-phosphate; Glc-1P, glucose-1-phosphate; ADPG, ADP-glucose; AGPL, ADP-glucose pyrophosphorylase large subunit; AGPS, ADP-glucose pyrophosphorylase small subunit; SSI, II, IV, soluble starch synthases I, II, IV; TPT, triose-phosphate/Pi translocator; SUT, sucrose transporter. b–d Photosynthetic rate (b, n = 5 biological replicates), photosynthetic rate in response to light (c, n = 3 biological replicates), and the relationship between photosynthetic rate and intercellular CO2 concentration (d, n = 3). e, f Sucrose content (e, n = 3 biological replicates) and starch content (f, n = 3 biological replicates) in the leaves of H283 and L423 at 60, 95, and 130 DAP. Different lowercase letters indicate significant differences at P < 0.05 based on one-way ANOVA followed by post-hoc Tukey’s test. (**) and (*) indicate significant differences at P < 0.01 and P < 0.05, respectively, based on a two-tailed Student’s t-test compared to L423. g Relative transcript levels of IbSUTs in the leaves of H283 and L423 at 60, 95, and 130 DAP. The values were determined by RT-qPCR from three biological replicates consisting of pools of three plants each.
h, i Total (h) and distribution (i) of 13C in one-month-old H283 and L423 plants (n = 3 biological replicates). n.s., no significance based on two-tailed Student’s t-test. The data and significance analysis are shown in Supplementary Data 5. All data are means ± SD. Exact P-values are provided in the Source Data file.
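The MinMax normalization mentioned in this legend rescales a set of values to the range 0 to 1 via x' = (x − min) / (max − min). Below is a minimal sketch of that transformation. Whether the values were normalized per protein across samples or over the whole matrix is not stated in the text, so applying it to one list of values at a time is an assumption, as is the handling of a constant list.

```python
from typing import List


def min_max_normalize(values: List[float]) -> List[float]:
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series carries no contrast, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical abundance values for one protein across three samples.
print(min_max_normalize([2.0, 4.0, 6.0]))
```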

In plants, triose phosphates (Triose-Ps) produced by photosynthesis are exported from the chloroplasts by triose-phosphate/phosphate translocators (TPTs) and then converted to sucrose in the cytoplasm16. Sucrose is normally loaded into the phloem by sucrose transporters (SUTs) for long-distance transport to sink cells17. When photosynthates are too abundant, or the sink strength is limited, sucrose transport from leaves may be blocked, resulting in starch accumulation in chloroplasts18. Here, TPTs and SUTs accumulated to higher levels in the leaves of H283 than in those of L423, whereas key proteins of the starch biosynthesis pathway19, such as ADP-glucose pyrophosphorylases (AGP small subunit [AGPS] and AGP large subunit [AGPL]) and soluble starch synthases (SSI, SSII, and SSIV), were generally more abundant in the leaves of L423 than in those of H283 (Fig. 2a and Supplementary Data 4). Consistently, sucrose content was significantly higher, but starch content was significantly lower, in the H283 leaves than in the L423 leaves (Fig. 2e, f).

Moreover, in H283, IbSUT1 and IbSUT3 expression was significantly upregulated at all three time points, IbSUT2 was significantly upregulated at 60 and 130 DAP, and IbSUT4 was significantly upregulated at 60 DAP compared with L423 (Fig. 2g). H283 may thus be more efficient in sucrose phloem loading and long-distance transport than L423. We further traced the loading of photoassimilates from leaves to storage roots using 13CO2 stable isotopic labeling followed by a stable isotope mass spectrometer and determined that H283 fixed 8.19–26.07% more 13C and its storage roots accumulated 282.73–1017.01% more 13C than L423 (Fig. 2h, i and Supplementary Data 5). Collectively, our data indicate that the high-starch individual H283 possesses active photosynthesis in leaves, making it more efficient for the production, loading, transport, and unloading of photosynthates from leaves into storage roots, which promotes starch accumulation in storage roots.

Key metabolic pathways cause starch accumulation in storage roots

In the storage roots of H283 and L423 at 60, 95, and 130 DAP, a total of 5987 proteins were quantified by LC-MS/MS. Among these, 3337 annotated proteins were divided into six clusters (PR1 to PR6) based on their accumulation patterns (Supplementary Fig. 2a, c and Data 6, 7). KEGG enrichment analysis revealed that 1111 differentially abundant proteins from Clusters PR2 and PR5 (high abundance in H283 but low in L423) were enriched in the pathways of glycolysis/gluconeogenesis (map00010), citrate cycle (TCA cycle, map00020), and oxidative phosphorylation (map00190). Differentially abundant proteins from Clusters PR3 and PR4 (low abundance in H283 but high in L423) were enriched in pentose and glucuronate interconversions (map00040), amino sugar and nucleotide sugar metabolism (map00520), and phenylalanine, tyrosine, and tryptophan biosynthesis (map00400) (Supplementary Fig. 2e–h and Data 7).

After phloem loading, the sieve element-companion cell (SE–CC) complex moves sucrose from source to sink, and subsequent phloem unloading is crucial for assimilate allocation20,21. In plants that utilize an apoplasmic phloem unloading mode, the plasma membrane-localized Sugars Will Eventually be Exported Transporters (SWEETs) facilitate the transport of sucrose from the SE–CC complex into the apoplastic space between cells22. Plasma membrane H+-ATPases (PMAs) create a pH gradient across the plasma membrane, which enables plasma membrane-localized sucrose transporters (SUCs or SUTs) to move sucrose from the apoplastic space into parenchyma cells (PCs)23,24. Simultaneously, cell wall invertases (CWINs), which are localized in the cell wall, convert sucrose into glucose and fructose, which are then taken up into PCs by sugar transporter proteins (STPs) located on the plasma membrane25,26. In Clusters PR2 and PR5, PMA1, PMA4, and STP12 were significantly more abundant in H283 storage roots compared to L423 (Fig. 3a and Supplementary Data 8). In addition, transcriptome data showed that IbSWEET1, IbSWEET2, IbPMA1, IbSUT1, IbSUT2, IbSUT3, IbSUT4, IbCWIN, and IbSTP12 were generally upregulated in H283 storage roots compared to L423 at 60, 95, and 130 DAP (Fig. 3b). Validation assays showed that CWIN activity and soluble sugar content were significantly higher, but sucrose content was much lower, in H283 than in L423 (Supplementary Fig. 3a–c). These results suggest that apoplasmic phloem unloading is more active in H283, leading to more efficient hydrolysis of sucrose and subsequent loading of soluble sugars into storage root cells.

Fig. 3: A model of carbohydrate metabolism in storage roots.

a Proteome analysis shows that H283 exhibits active and efficient photosynthate pumping into storage root cells. Moreover, the allocation of photosynthates to the storage roots of H283 and L423 differs: sucrose is primarily allocated for glycolysis/gluconeogenesis in H283, enabling starch biosynthesis, while it is primarily allocated to form cell walls in L423. Protein abundance values were normalized between 0 and 1 using MinMax normalization. Solid arrows indicate direct functional interactions or immediate neighboring steps in biochemical pathways. Dashed arrows indicate indirect interactions. The protein color is consistent with the cluster to which it belongs. SE/CC, sieve element-companion cell; SWEET, sugar will eventually be exported transporter; PMA, plasma membrane H+-ATPase; SUT, sucrose transporter; CWIN, cell wall invertase; STP, sugar transport protein; SUS, sucrose synthase; UDPG, UDP-glucose; UGD, UDP-glucose dehydrogenase; UDP-GlcA, UDP-6-glucuronic acid; CESA, cellulose synthase; Glc-1P, glucose-1-phosphate; PGM, cytosolic phosphoglucomutase; Glc-6P, glucose-6-phosphate; PGIC, phosphoglucose isomerase; Fru-6P, fructose-6-phosphate; PFPα/β, pyrophosphate-fructose 6-phosphate 1-phosphotransferase subunit alpha/beta; Fru-1,6P2, fructose-1,6-bisphosphate; FBA6, fructose-1,6-bisphosphate aldolase 6; DHAP, dihydroxyacetone phosphate; GAP, glyceraldehyde-3-phosphate; GAPC, cytosolic glyceraldehyde-3-phosphate dehydrogenase; 1-3BisPGAP, 1,3-bisphosphoglycerate; 3-PGA, 3-phosphoglycerate; PEP, phosphoenolpyruvate; PKc and PKp, cytosolic and plastidial pyruvate kinases; GPT1, glucose phosphate/Pi translocator 1; AGPL, ADP-glucose pyrophosphorylase large subunit; AGPS, ADP-glucose pyrophosphorylase small subunit; ADPG, ADP-glucose; GBSS, granule-bound starch synthase; PTST, protein target to starch; SSII, soluble starch synthase II; SBEI, 1,4-α-glucan-branching enzyme 1.
b Expression patterns of apoplasmic phloem unloading-related genes in the storage roots of H283 and L423 at 60, 95, and 130 DAP. The values were normalized between 0 and 1 using MinMax normalization. c Transmission electron microscopy of a transverse section showing the cell wall thickness of storage roots of H283 and L423 at 130 DAP. The experiment was repeated three times independently with similar results. Scale bar, 500 nm. d Cell wall components (pectin, hemicellulose, and cellulose contents) in the storage roots of H283 and L423. Data are means ± SD (n = 3 biological replicates). Different lowercase letters indicate significant differences at P < 0.05 based on one-way ANOVA followed by post-hoc Tukey’s test. Exact P-values are provided in the Source Data file.

Photosynthates transported into storage roots support essential life activities and energy storage (mainly as starch). Sucrose synthases (SUSs) in sink tissues regulate the partitioning of sucrose into cellulose and starch. SUSs reversibly cleave sucrose into fructose and UDP-glucose (UDPG), a precursor for starch and cell wall building blocks27. In Clusters PR2 and PR5, key enzymes related to glycolysis/gluconeogenesis were significantly more abundant in H283 compared with L423: SUS2, which converts sucrose to fructose and UDP-glucose; phosphoglucose isomerase (PGIC), which catalyzes the fructose-6P-to-glucose-6P (Glc-6P, the precursor of starch) conversion; cytosolic phosphoglucomutase (PGMc), pyrophosphate-fructose 6-phosphate 1-phosphotransferases (PFPα and PFPβ), fructose-1,6-bisphosphate aldolase 6 (FBA6), glyceraldehyde-3-phosphate dehydrogenase in the cytoplasm (GAPC), and pyruvate kinases (PKc and PKp, two isoenzymes located in the cytosol and plastids, respectively) (Fig. 3a and Supplementary Data 8)28,29. H283 storage roots accumulated significantly more glucose 6-phosphate (Glc-6P) and adenosine triphosphate (ATP) than L423 (Supplementary Fig. 3d, e), indicating that the active glycolysis/gluconeogenesis in H283 provides more precursors for starch biosynthesis.

Notably, in Clusters PR3 and PR4, key enzymes related to the cell wall biosynthesis pathway were significantly more abundant in L423 compared with H283: SUS5 and SUS6, which are parietally localized in sieve elements and affect callose formation within them17,30; four UDP-glucose dehydrogenases (UGD1, UGD2, UGD3, and UGD4, which synthesize UDP-glucuronic acid, a precursor of xylose, pectin, and hemicellulose); and cellulose synthases (CESA1, CESA2, and CESA3) (Fig. 3a and Supplementary Data 8). L423 storage roots had significantly higher callose (Supplementary Fig. 3f, g), pectin, hemicellulose, and cellulose contents, and thicker cell walls than H283 (Fig. 3c, d). These results indicate that sucrose is primarily allocated for glycolysis/gluconeogenesis in the H283 storage roots, but primarily for callose biosynthesis and cell wall formation in the L423 storage roots.

In vascular plants, starch is synthesized in plastids of both photosynthetic and nonphotosynthetic cells31. In H283, active glycolysis/gluconeogenesis produced sufficient Glc-6P and ATP for starch biosynthesis (Fig. 3a and Supplementary Data 8). Glc-6P was transported into amyloplasts by glucose-6-phosphate/phosphate translocator 1 (GPT1), which was more abundant in H283 than in L423 (Fig. 3a and Supplementary Data 8)32. Moreover, key enzymes involved in starch biosynthesis (e.g., AGPL and AGPS) and amylose (e.g., GBSS and PROTEIN TARGET TO STARCH, PTST)33 and amylopectin (e.g., SSII and SBEI) biosynthesis were more abundant in H283 than in L423 (Fig. 3a and Supplementary Data 8). These results indicate that although the fresh yield of H283 and L423 is similar, their allocation of photosynthates is different. More photosynthates enter the glycolysis/gluconeogenesis pathway for high-efficiency starch biosynthesis, explaining the increased starch accumulation in the H283 storage roots.

IbPMA1 is a key factor promoting source-sink synergy-mediated starch accumulation

To screen the key genes regulating starch accumulation in sweet potato storage roots, a comparison of proteome and transcriptome data of H283 and L423 was performed (Supplementary Figs. 1c, 2c, 4 and Supplementary Data 3, 7, 9). A total of 1235 genes/proteins were simultaneously up-regulated in the leaves of H283 compared with L423 at 60, 95, and 130 DAP, which were enriched in the pathways of photosynthesis (map00195), porphyrin and chlorophyll metabolism (map00860), and glycolysis/gluconeogenesis (map00010) (Fig. 4a, Supplementary Fig. 5a and Supplementary Data 10). Moreover, 426 genes/proteins were simultaneously up-regulated in the storage roots of H283 compared with L423 at 60, 95, and 130 DAP, which were enriched in the pathways of starch and sucrose metabolism (map00500), oxidative phosphorylation (map00190), glycolysis/gluconeogenesis (map00010), and citrate cycle (TCA cycle, map00020) (Fig. 4b, Supplementary Fig. 5b and Supplementary Data 10). Notably, 14 genes regulating source-sink relationships, involved in photosynthate production (IbPsaE, IbPsaN, IbPsbR, and IbTPT) and unloading (IbPMA1), glycolysis/gluconeogenesis (IbGAPC, IbPKc, and IbPKp), and starch biosynthesis (IbAGPL, IbAGPS, IbGBSS, IbPTST, IbSSII, and IbSBEI), were generally more highly expressed in H283 than in L423 (Fig. 4c and Supplementary Data 10, 11). These genes are referred to as source-sink related genes.

Fig. 4: IbPMA1 is a key factor promoting source-sink synergy-mediated starch accumulation.

a, b Proteome and transcriptome comparison of H283 and L423. TL, leaf transcriptome; PL, leaf proteome; TR, storage root transcriptome; PR, storage root proteome. c Expression patterns of genes involved in photosynthate production, unloading, allocation (in yellow), glycolysis/gluconeogenesis (in blue), and starch biosynthesis (in red) in storage roots of H283 and L423. d IbPMA1 haplotypes identified in F1 individuals. Red triangles represent seven linked SNPs. e Distribution of dry matter content by six haplotypes (Hap1-6) in F1 individuals. The n numbers are shown in Fig. 4d. Adjusted P-values for pairwise comparisons are annotated based on one-way ANOVA followed by post-hoc Tukey’s test. f Distribution of dry matter content in parents and F1 individuals based on parental genotypes. The genotypes of the parents were determined using seven linked simplex SNPs (red triangles) on Chr03:18790287-18794216, which are represented as either homozygous (0/0) (n = 221 F1 individuals) or heterozygous (0/1) (n = 217 F1 individuals); groups were compared by two-tailed Student’s t-test. Box edges in (e, f) represent the 0.25 and 0.75 quantiles, with median values shown by thickened lines. Whiskers extend to data within 1.5 interquartile ranges, and dots in (f) indicate the remaining data. g, h Phenotypes (g, scale bars, 5 cm) and safranin O-fast green staining of paraffin sections (h, scale bars, 100 μm) of storage roots in IbPMA1 transgenic and WT plants at 130 DAP. Starch granules are stained red. i, j Starch and sucrose content in leaves and storage roots of IbPMA1 transgenic and WT plants at 130 DAP (n = 3 biological replicates, two-tailed Student’s t-test). k Transcript levels of IbSUTs in leaves of IbPMA1 transgenic and WT plants. l, m Transcript levels of IbSWEETs (l), IbSUTs, IbPMA1, IbCWIN, and IbSTP12 (m) in storage root phloem and parenchyma cells of IbPMA1 transgenic and WT plants. Values are from three biological replicates consisting of pools of three plants each.
n Total and distribution of 13C in one-month-old IbPMA1 transgenic and WT plants (n = 3 biological replicates). The data and significance analysis are shown in Supplementary Data 12. All data are means ± SD. Exact P-values are provided in the Source Data file.

As sweet potato is a highly heterozygous autohexaploid, a large number of potential allelic combinations (\(C_{6}^{3}\times C_{6}^{3}\))15 are observed in a segregating population. This complexity makes it challenging to determine the genotypes of all F1 individuals. To accurately identify key genes regulating starch accumulation without affecting fresh yield, we prioritized and selected a total of 338,475 simplex (single-dose variants present in only one parent, e.g., ATTTTT × TTTTTT) and double-simplex (single-dose variants present in both parents, e.g., ATTTTT × ATTTTT) single-nucleotide polymorphisms (SNPs) based on accurate parental genotypes and low-coverage resequencing of the F1 population. Simultaneously, we applied the best linear unbiased estimate (BLUE) to standardize the fresh yield and dry matter content, which showed a high correlation with the mean values (Supplementary Fig. 6), and conducted haplotype-based association analysis on the F1 population34. We scanned a 3 kb up- and down-stream region of the 14 differentially expressed key source-sink related genes (Fig. 4c) and grouped F1 individuals with the same genotype into the same haplotype. In particular, significant variations in dry matter content were observed among six different haplotypes of IbPMA1, while the fresh yield remained constant (Fig. 4d, e and Supplementary Fig. 7). IbPMA1 encodes a plasma membrane H+-ATPase, which is suggested to function as a proton pump, potentially enhancing the transport of sucrose from the source to the sink. The dry matter content in IbPMA1Hap4 was notably lower compared with other haplotypes, with a significant difference observed when contrasted with IbPMA1Hap1 (p = 0.0135653, Tukey’s HSD test, Fig. 4e). Moreover, upon examining the parental lines, we identified the predominant heterozygous allelic variations within the IbPMA1 gene region.
Interestingly, we found that the genotypes of seven simplex SNPs within the gene region on Chr03:18790287-18794216 showed linkage (Fig. 4d). In addition, all homozygous individuals (represented as 0/0) displayed a higher average dry matter content, represented by parent Xu781, while all heterozygous individuals (represented as 0/1) exhibited a lower average dry matter content, represented by parent Xushu18 (Fig. 4f). Notably, the dry matter contents in homozygous individuals of IbPMA1 were significantly higher compared with those in heterozygous individuals (p = 0.001699, Student’s t test, Fig. 4f).
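The size of the allelic-combination space and the simplex/double-simplex marker classes described above can be illustrated with a short sketch. It reads the expression C(6,3) × C(6,3) as each hexaploid parent passing 3 of its 6 homologous copies to a gamete, which is an interpretation rather than something the text states explicitly. The function `marker_class` and its dosage encoding (number of alternative-allele copies out of six in each parent) are hypothetical names introduced for illustration.

```python
from math import comb

PLOIDY = 6

# Each parent contributes 3 of its 6 homologous copies (assumed reading of C(6,3)).
gametes_per_parent = comb(PLOIDY, PLOIDY // 2)
allelic_combinations = gametes_per_parent ** 2
print(gametes_per_parent, allelic_combinations)


def marker_class(dosage_parent1: int, dosage_parent2: int) -> str:
    """Classify a SNP by the alternative-allele dosage carried by each parent."""
    doses = sorted((dosage_parent1, dosage_parent2))
    if doses == [0, 1]:
        return "simplex"         # e.g. ATTTTT x TTTTTT
    if doses == [1, 1]:
        return "double-simplex"  # e.g. ATTTTT x ATTTTT
    return "other"


print(marker_class(1, 0), marker_class(1, 1))
```

Restricting the analysis to these two classes keeps the expected segregation simple: a simplex marker is inherited by roughly half of the progeny, which fits the near-equal group sizes reported for the 0/0 and 0/1 genotypes (n = 221 and n = 217).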

Then, IbPMA1 was cloned from H283 to investigate whether it contributes to fresh yield and starch accumulation of sweet potato. IbPMA1 contains a conserved ATPase-IIIA_H domain, and its genomic sequence contains 21 exons and 20 introns (Supplementary Figs. 8, 9e). IbPMA1 was most highly expressed in the storage roots, with expression levels continuously increasing from 60 to 90 DAP (Supplementary Fig. 9a, b). Moreover, it was significantly induced by a 175 mM sucrose treatment and peaked at 6 h (14.37-fold increase) in the leaves of H283 (Supplementary Fig. 9c). Subcellular localization indicated that IbPMA1 is localized in the cell membrane (Supplementary Fig. 9f).

We generated IbPMA1 overexpression and RNA interference (RNAi) lines in the sweet potato variety Lizixiang (Fig. 4g and Supplementary Fig. 10a–g). At 90 DAP, with increasing irradiance or CO2 concentration, the photosynthetic rate of transgenic plants generally increased. IbPMA1-OE plants maintained a higher photosynthetic rate compared to WT, whereas IbPMA1-Ri plants showed a lower photosynthetic rate (Supplementary Fig. 11a, b). IbPMA1-OE leaves accumulated less starch, whereas IbPMA1-Ri leaves accumulated less sucrose and more starch compared to WT (Fig. 4i, j). These results indicate that IbPMA1 is associated with an increase in photosynthetic capacity.

IbPMA1 transgenic and WT plants were harvested at 130 DAP (Fig. 4g). No significant differences were observed in the fresh yield and cell density of IbPMA1 transgenic and WT plants (Supplementary Fig. 10h, i). Safranin O-fast green staining of paraffin sections revealed more starch granules in IbPMA1-OE plants but fewer in IbPMA1-Ri plants compared to WT (Fig. 4h). The starch content in the storage roots of IbPMA1-OE plants significantly increased by 9.35–12.91% and amylose proportion significantly decreased by 20.99–26.13%. In contrast, these values significantly decreased by 27.53–34.61% and increased by 9.27–17.23% in IbPMA1-Ri plants compared to WT (Fig. 4i, j). The soluble sugar and sucrose content in IbPMA1-OE plants significantly increased by 15.04–27.02% and 14.43–16.97%, whereas those in IbPMA1-Ri plants significantly decreased by 12.62–20.82% and 19.18–23.36%, respectively, compared to WT (Supplementary Fig. 11c, d).

In leaves, IbSUT1, IbSUT2, IbSUT3, and IbSUT4 were significantly upregulated in IbPMA1-OE plants but downregulated in IbPMA1-Ri plants compared to WT (Fig. 4k). In storage roots, IbSWEETs and IbSUTs were highly expressed in phloem compared to PCs (Fig. 4l, m). IbSWEET1, IbSWEET2, IbSWEET4, IbSWEET12, IbSWEET15, IbPMA1, IbSUT1, IbSUT2, IbSUT3, IbSUT4, IbCWIN, and IbSTP12 were significantly upregulated in the phloem of IbPMA1-OE plants but downregulated in IbPMA1-Ri plants compared to WT (Fig. 4l, m). Except for IbSWEET4 and IbSUT4, these genes also showed higher expression in PCs of IbPMA1-OE plants and lower expression in PCs of IbPMA1-Ri plants compared to WT (Fig. 4l, m). The 13C stable isotope labeling assay showed that IbPMA1-OE plants fixed 30.75%–42.86% more 13C, and their storage roots accumulated 75.08%–110.27% more 13C, whereas in IbPMA1-Ri plants these values decreased by 18.31%–25.11% and 40.90%–56.22%, respectively, compared to WT (Fig. 4n and Supplementary Data 12). These findings establish a link between IbPMA1 and starch yield in storage roots through promoting source-sink synergy.

Candidate TFs involved in source-sink synergy-mediated starch accumulation

To further identify the key regulatory factors involved in source-sink synergy-mediated starch yield, we screened a yeast one-hybrid (Y1H) library using the promoter of IbPMA1. This effort led to the identification of nine potential transcription factors (TFs; IbMYB73, IbbHLH49, IbERF2, IbERF3, IbERF73, IbWRKY7, IbWRKY21, IbMADS1, and IbMADS23) directly targeting IbPMA1 (Fig. 5a). Further Y1H assays showed that all nine TFs could target IbSBEI. In addition, among these TFs, only IbbHLH49 could directly target IbAGPS, which is crucial for the formation of the first rate-limiting enzyme AGPase in starch biosynthesis, thus determining the biosynthesis rate and starch accumulation (Fig. 5a). Transcriptome data indicated that these nine TFs were highly expressed in H283 (Fig. 5b and Supplementary Data 11). Their expression levels were further examined in six F1 individuals with high, medium, or low dry matter content. The results showed that the nine potential TFs were generally highly expressed in the individuals with high dry matter content, but no significant difference was seen in their expression levels between medium- and low-dry matter content individuals (Fig. 5c, Supplementary Fig. 13 and Supplementary Data 1).

Fig. 5: Candidate transcription factors (TFs) involved in source-sink synergy-mediated starch accumulation in sweet potato.

a Y1H assay of TFs binding to the promoters of key functional genes involved in sucrose unloading (in yellow), glycolysis/gluconeogenesis (in blue), and starch biosynthesis (in red). Red asterisks represent Y1H interactions. The experiment was repeated three times independently with similar results. b Transcriptome data showed that the 9 TFs were highly expressed in H283. c Expression patterns of 9 TF genes in the storage roots of F1 individuals with high (HDM), medium (MDM), and low (LDM) dry matter content at 60, 95, and 130 DAP. The values were normalized between 0 and 1 using MinMax normalization. d–h EMSAs showed that the TFs IbbHLH49 (d), IbMYB73 (e), IbERF73 (f), and IbMADS1 (h) directly bound to the promoter of IbPMA1, and that IbbHLH49 (d), IbMYB73 (e), IbERF73 (f), IbWRKY7 (g), and IbMADS1 (h) directly bound to the promoter of IbSBEI. Only IbbHLH49 directly bound to the promoter of IbAGPS (d). The experiment was repeated three times independently with similar results. The red lines on the promoters represent the wild-type (wt) probes. i–k Transactivation of IbPMA1, IbAGPS, and IbSBEI promoters, respectively, by IbMYB73, IbbHLH49, IbERF73, IbWRKY7, or IbMADS1 in transfected protoplasts. REN activity level was used as an internal control to normalize LUC activity. The experiment was repeated three times independently with similar results. Data are means ± SD (n = 3). Different lowercase letters indicate significant differences at P < 0.05 based on one-way ANOVA followed by post-hoc Tukey’s test. Exact P-values are provided in the Source Data file.

Then, electrophoretic mobility shift assays (EMSAs) showed that IbbHLH49 bound to the E-boxes within the IbPMA1, IbAGPS, and IbSBEI promoters, IbMYB73 bound to the MYB-binding sites within the IbPMA1 and IbSBEI promoters, IbERF73 bound to the DRE/CRT elements within the IbPMA1 and IbSBEI promoters, IbWRKY7 bound to the W-box within the IbSBEI promoter, and IbMADS1 bound to the CArG-box in the promoters of IbPMA1 and IbSBEI (Fig. 5d–h). Transient dual-luciferase (LUC) assays in sweet potato protoplasts revealed significantly higher relative LUC activity upon the overexpression of each TF gene compared with the empty vector control (Fig. 5i–k). These efforts identified candidate TFs involved in source-sink synergy-mediated starch accumulation in sweet potatoes.

IbbHLH49 directly promotes source-sink synergy and improves fresh yield and starch accumulation

Given that IbbHLH49 exhibited strong co-expression with the starch accumulation-related genes IbPMA1, IbSBEI, and IbAGPS (Supplementary Fig. 14), and directly activated their transcription (Fig. 5d, i–k), we focused on IbbHLH49 for its potential role in sweet potato. Its coding sequence and genomic sequence were cloned from H283. IbbHLH49 belongs to clade XII of the bHLH TF family35, contains one conserved bHLH domain, and is closest to AtbHLH49 in Arabidopsis (Supplementary Fig. 15a, b). The IbbHLH49 genomic sequence contains eight exons and seven introns, which is similar to the exon-intron structure of AtbHLH49 (Supplementary Fig. 15c). The IbbHLH49 promoter region in both H283 and L423 contains various elements, including the light-response element Box4, GATA-motif, I-box, GT1-motif, and stress-response element WUN-motif (Supplementary Fig. 16). Interestingly, the promoter in H283 features an additional WUN-motif, and its TCCC-motif and GT1-motif are positioned closer to the transcription start site compared to those in L423 (Supplementary Fig. 16), suggesting a higher likelihood of being bound and regulated by its regulatory factors36. The expression level of IbbHLH49 peaked at 70 and 100 DAP in storage roots and was highest in the stems of 3-month-old field-grown plants (Fig. 6a, b). In addition, IbbHLH49 was significantly induced by a 175 mM sucrose treatment and peaked at 3 h (58.9-fold increase) in the leaves of H283 (Fig. 6c). Subcellular localization and transcriptional activation assays indicated that IbbHLH49 is a nucleus-localized transcriptional activator and that its transcriptional activation domain is located within the 1–100 N-terminal and 405–521 C-terminal amino acid residues (without the bHLH domain) (Supplementary Figs. 17, 18).

Fig. 6: IbbHLH49 directly promotes source-sink synergy and improves fresh yield and starch accumulation in sweet potatoes.

a Transcript levels of IbbHLH49 at different storage root developmental stages of H283 (n = 3). b Transcript levels of IbbHLH49 in different tissues of H283 (n = 3). L, leaf; S, stem; FR, fibrous root; PR, pencil root; SR, storage root. c Transcript levels of IbbHLH49 in response to 175 mM sucrose treatment (n = 3). d–f Photosynthetic rate (d, n = 5), sucrose content (e, n = 6), and starch content (f, n = 3), respectively, in leaves of IbbHLH49 transgenic and WT plants at 90 DAP. g ChIP-qPCR assays using leaves (90 DAP) from 35S:IbbHLH49-GFP and 35S:GFP plants with an anti-GFP antibody (n = 3). h Transcript levels of IbbHLH49 target genes in leaves (90 DAP) or storage roots (130 DAP) of IbbHLH49 transgenic and WT plants (n = 3). i Phenotypes (scale bars, 5 cm) and safranin O-fast green staining of paraffin sections (scale bars, 100 μm) of storage roots in IbbHLH49 transgenic and WT plants at 130 DAP. Starch granules are stained red. j–l Fresh yield (j, n = 10), starch content (k, n = 3), and amylopectin proportion (l, n = 3), respectively, of storage roots of IbbHLH49 transgenic and WT plants at 130 DAP. m AGPase and SBE activities in the storage roots of IbbHLH49 transgenic and WT plants at 130 DAP (n = 3). n ChIP-qPCR assays using storage roots (130 DAP) of 35S:IbbHLH49-GFP and 35S:GFP plants with an anti-GFP antibody (n = 3). o Total and distribution of 13C in one-month-old IbbHLH49 transgenic and WT plants (n = 3). The data and significance analysis are shown in Supplementary Data 13. All data are presented as means ± SD (biological replicates). Different lowercase letters indicate significant differences at P < 0.05 based on one-way ANOVA followed by post-hoc Tukey’s test. The β-Actin promoter was used as an internal reference for ChIP-qPCR. (**) and (*) indicate significant differences at P < 0.01 and P < 0.05, respectively, based on a two-tailed Student’s t-test compared to WT. Exact P-values are provided in the Source Data file.

We generated IbbHLH49 overexpression and RNAi lines in the sweet potato variety Lizixiang (Supplementary Fig. 19). At 90 DAP, the leaves of IbbHLH49-OE lines exhibited higher photosynthetic rates and maintained this increase as irradiance or CO2 concentration increased, with increased sucrose content, and decreased starch content compared with WT, but the IbbHLH49-Ri lines displayed the opposite pattern (Fig. 6d–f and Supplementary Fig. 20). Chromatin immunoprecipitation (ChIP)-qPCR and reverse transcription-quantitative real-time PCR (RT-qPCR) assays showed that IbbHLH49 directly bound to the IbTPT, IbPMA1, IbSUT2, IbSUT4, IbAGPS, and IbSBEI promoters to activate their expression in leaves (Fig. 6g, h), suggesting that IbbHLH49 overexpression promotes photosynthate production and loading in leaves.

We harvested IbbHLH49 transgenic and WT plants at 130 DAP (Fig. 6i). The safranin O-fast green staining of paraffin sections revealed more starch granules in IbbHLH49-OE lines and fewer in IbbHLH49-Ri lines than in WT (Fig. 6i). IbbHLH49-OE plants exhibited a 10.00–21.93% increase in fresh yield (measured as total weight of storage roots) per plant, whereas IbbHLH49-Ri lines displayed a 9.87–21.85% decrease (Fig. 6j and Supplementary Fig. 21b, c). The starch content in the storage roots of IbbHLH49-OE plants significantly increased by 3.06–20.52%, and the amylose proportion significantly decreased by 12.29–19.41%, whereas those in IbbHLH49-Ri plants significantly decreased by 3.84–15.97% and increased by 15.28–34.69%, respectively, compared to WT (Fig. 6k, l and Supplementary Fig. 21d, e). The activities of AGPase and SBE were significantly higher in IbbHLH49-OE plants but significantly lower in IbbHLH49-Ri plants (Fig. 6m). ChIP-qPCR and RT-qPCR assays showed that IbbHLH49 directly bound to the IbTPT, IbPMA1, IbSUT2, IbSUT4, IbAGPS, and IbSBEI promoters to activate their expression in storage roots (Fig. 6h, n), suggesting that IbbHLH49 overexpression promotes photosynthate unloading and starch biosynthesis in storage roots. The 13C stable isotope labeling assay showed that IbbHLH49-OE plants fixed 22.86%–30.64% more 13C, and their storage roots accumulated 68.05%–93.93% more 13C, whereas IbbHLH49-Ri plants decreased by 34.63%–37.43% and 47.48%–68.46%, respectively, compared to WT (Fig. 6o and Supplementary Data 13). Overall, these results indicate that IbbHLH49 plays a pivotal role in source-sink synergy-mediated fresh yield and starch accumulation in sweet potatoes.

Discussion

Many crops are grown primarily as a source of starch. A considerable part of the world’s annual starch production comes from sweet potato37. However, there is a significant negative correlation between sweet potato fresh yield and starch content. As starch demand from food and non-food industries rises, understanding the synergistic regulatory mechanism between fresh yield and starch accumulation is key to devising new strategies for enhancing sweet potato starch yield38. This study systematically uncovered the molecular mechanism of source-sink synergy-mediated high starch yield in sweet potatoes.

The sweet potato storage root, unlike the harvested organs of cereal crops, is an early-forming underground nutrient organ rather than a reproductive one. It continues to accumulate photosynthates and other metabolites throughout its entire development period. The establishment and coordinated development of a typical source-sink relationship guarantee the high starch yield of sweet potato39. As a source, leaves produce photosynthates, determine their production and transport capacity, and affect sink strength40. In this study, H283 had a more active photosynthetic system with a high abundance of key subunits of PSII and PSI as well as PetB, and its photosynthetic rate was higher than that of L423 (Fig. 2a, b and Supplementary Data 4). During CO2 assimilation, the Calvin-Benson-Bassham cycle generates triose-P at the expense of ATP and reduced nicotinamide adenine dinucleotide phosphate (NADPH) generated by photosynthetic light reactions. Triose-P is retained within chloroplasts to feed into transitory starch biosynthesis, or exported into the cytosol via TPT to produce sucrose41. In potato and Arabidopsis, a loss of TPT function results in an increased accumulation of transitory starch42,43,44. We showed that the distribution of triose-P was significantly different in the leaves of H283 and L423. TPT was more abundant in the H283 leaves and exported more triose-P into the cytosol to produce sucrose (Fig. 2a and Supplementary Data 4). However, TPT was not very abundant in the L423 leaves, leading to starch biosynthesis within chloroplasts and less sucrose transport to storage roots (Fig. 2a and Supplementary Data 4). Thus, developing the production capacity of source tissues (leaves) and promoting the distribution of triose-P to the cytosol can effectively enhance starch accumulation in sink tissues (storage roots) of sweet potatoes (Fig. 7).

Fig. 7: Working model of source-sink synergy-mediated starch accumulation in sweet potato.

Source-sink synergy: the production, loading, and transportation of photosynthetic products, particularly sucrose, in leaves, followed by their unloading and distribution in storage roots. When the source is abundant, active PMA promotes a pH gradient across the plasma membrane, enhancing apoplasmic unloading and boosting assimilate inflow, thereby increasing starch yield in sweet potatoes. In addition, IbbHLH49 directly activates source-sink-related genes in leaves and storage roots, facilitating the loading of photosynthates in the leaves and their subsequent unloading and starch biosynthesis in storage roots. This regulation promotes source-sink synergy-mediated fresh yield and starch accumulation in sweet potatoes. SE/CC, sieve element-companion cell; Suc, sucrose; SWEET, sugars will eventually be exported transporter; PMA, plasma membrane H+-ATPase; SUT, sucrose transporter; CWIN, cell wall invertase; STP, sugar transport protein; Glu, glucose; Fru, fructose; AGPS, ADP-glucose pyrophosphorylase small subunit; ADPG, ADP-glucose; GBSS, granule-bound starch synthase; SBEI, 1,4-α-glucan-branching enzyme 1.

As storage roots expand, numerous secondary phloem structures differentiate, playing a vital role in the unloading of photosynthates into the storage roots45. Phloem unloading involves symplastic and apoplastic pathways. In symplastic unloading, photosynthates move directly from the SE-CC complex to surrounding PCs through plasmodesmata (PD), driven by a concentration gradient. In apoplastic unloading, photosynthates are transported passively or actively from the SE-CC complex to the apoplastic space, and then into cells20,21. The phloem unloading pathway varies among sink types and developmental stages. For instance, an extensive apoplastic pathway is observed during the fruit development of apples, pears, and kiwifruit46,47,48. Liu et al. (2019) observed increased PD in phloem parenchyma during the storage root growth of sweet potato, with SUS activity increasing and maintaining a higher level than that of IAI (CWIN)49. While the authors inferred a shift from apoplasmic to symplasmic unloading during storage root formation, IAI activities showed a gradual increase after 80 DAP in both sweet potato varieties, Hong Xiangjiao and Beijing 553 (Fig. 6 of Liu et al., 2019), suggesting that apoplasmic unloading remains active during the starch accumulation stages of sweet potato49. However, the effect of apoplasmic unloading on sweet potato fresh yield and starch accumulation remains unclear. In our study, apoplasmic phloem unloading was more active in H283 than in L423, accompanied by increased expression of SWEETs, PMA1, SUTs, CWIN, and STP12, and more 13C accumulation in storage roots (Fig. 3a, b), suggesting that apoplasmic unloading significantly contributes to starch accumulation in sweet potato without affecting yield.

In plants, PMAs are essential proton-pumping proteins involved in apoplasmic unloading, playing crucial roles in cellular growth, sugar transport, mineral nutrient translocation, and grain filling23. Silencing the rice PMA isoform, OsA2, led to significantly decreased fresh weight and grain yield50. In potatoes, PHA1 significantly increased yield and starch content by providing the driving force for SUTs to maintain apoplastic sucrose transport during elongation51. Our data revealed that the dry matter content associated with the six main haplotypes of IbPMA1 differed significantly among the F1 population (Fig. 4e). Overexpression of IbPMA1 enhances photosynthate loading in leaves (Fig. 4i, j, k, n and Supplementary Data 12) and apoplasmic unloading in storage roots (Fig. 4i, j, l, m, n and Supplementary Data 12), resulting in increased storage root starch content without affecting fresh yield (Fig. 4i and Supplementary Fig. 9h). Notably, the dry matter contents in the homozygous individuals of linked simplex SNPs within the IbPMA1 gene region were significantly higher than in the heterozygotes (Fig. 4f). These linked simplex SNPs have the potential to serve as markers for high-starch breeding (Fig. 4d, f). These results suggest that active PMA enhances sink strength and boosts assimilate inflow, thereby increasing starch yield in sweet potatoes. The carbon assimilates unloaded into sink tissues are allocated to specific reactions. SUSs play a central role in coordinating carbon allocation to cell wall or starch biosynthesis52. In fact, SUSs are the main sink strength determinants in potato tubers and control carbon import in young tomato fruits53,54. In cotton, GhSUS2 overexpression results in a high cotton fiber yield55. BsSUS5-RNAi aspen trees (Populus tremuloides) exhibited a mild (smaller or weaker) stem phenotype due to effects on the cell wall56. 
Our results showed that SUS2 was highly expressed in storage roots of H283, which was accompanied by active starch biosynthesis, while SUS5 and SUS6 were highly expressed in storage roots of L423 with abundant accumulation of callose, pectin, hemicellulose, and cellulose (Fig. 3a, d, Supplementary Fig. 3f, g and Supplementary Data 8). Therefore, we speculate that SUS2, SUS5, and SUS6 in sweet potato have functionally diverged in regulating carbon allocation. In cassava, the structural components (cellulose and lignin) in storage roots show a negative correlation with starch accumulation. The optimal dynamic transition between these components during root development promotes starch accumulation in the parenchymal tissue57. In sweet potato, the common combination of high starch content and low fresh yield may arise from limited production and transport of photosynthates in leaves. Under a limited supply of assimilates, some genotypes may favor storage over cell division, resulting in insufficient structural components (e.g., pectin, hemicellulose, and cellulose, Fig. 3d) in storage roots. A potential breeding strategy is therefore to ensure a sufficient supply of assimilates while optimizing the balance between carbon allocation to structural components and starch accumulation in storage roots.

Glycolysis/gluconeogenesis and the TCA cycle are not only essential components of plant respiration, but also provide ATP, reducing agents, and various precursors for plant growth and development58. In rice, pfp1-3 and pfpβ mutants altered the expression of key enzymes in starch biosynthesis, leading to remarkably low grain weight and starch content59,60. The ospk1 mutant displayed dwarfism and panicle enclosure, and the ospk2 mutant showed reduced starch granules61,62. Overexpression of the key TCA cycle gene FLO16 (encoding a NAD-dependent cytosolic malate dehydrogenase) significantly improved grain weight, while the flo16 mutant showed decreased ATP content, reduced activities of starch synthesis-related enzymes, and defective starch grain formation63. In both Arabidopsis and maize, TCA metabolites regulate root development in diverse and distinct ways, highlighting their crucial roles in root growth and development64. In our study, H283 exhibited more active glycolysis/gluconeogenesis and TCA cycle compared to L423 (Fig. 3a and Supplementary Fig. 2e, h). We speculate that H283 fixed more carbon to compensate for the carbon loss caused by its higher glycolysis and TCA cycle activities (Fig. 2b, h, i and Supplementary Fig. 3b), which provided more precursors, such as ATP and Glc-6P (Supplementary Fig. 3d, e), for maintaining high starch accumulation and fresh yield formation.

In plants, the source and sink are interconnected and influence each other65. A robust source supports the potential strength of a powerful sink, while a strong sink aids in maintaining a vigorous source. Conversely, a limited sink may constrain the source, and a weak source could hinder sink development66. In agricultural production, a well-coordinated source-sink relationship is essential for enhancing crop yield and quality. Here, we identified several candidate TFs that directly activate source-sink-related genes in sweet potatoes (Fig. 5). Specifically, IbbHLH49 was found to directly activate IbTPT, IbPMA1, IbSUT2, IbSUT4, IbAGPS, and IbSBEI in leaves and storage roots (Fig. 6g, h, n), promoting photosynthate production (Fig. 6d) and loading in leaves, followed by unloading and starch biosynthesis in storage roots (Fig. 6j, k, o). These findings provide valuable source-sink synergy regulators of fresh yield and starch content that can be used to increase starch yield in sweet potatoes through genetic engineering.

In summary, our research systematically explored the correlation between sweet potato fresh yield and starch accumulation mediated by the source-sink process. This study provides valuable genetic and theoretical insights for future starch yield enhancements in sweet potatoes, as well as other starchy root and tuber crops.

Methods

Plant materials

Sweet potato varieties Xushu18 and Xu781 were hybridized to generate 994 F1 seeds, which were cultivated to produce storage roots in the current generation. Stems derived from these storage roots were used for asexual propagation in subsequent generations. Xushu18 and Xu781 and their 500 F1 individuals, and major starch-type varieties Shangshu19 and Xushu22 were planted in the field at the experimental station of Tianjin Fenghua Yulong Agricultural Development Co., Ltd. (39°96’32”N, 116°40’02”E), Baodi, Tianjin, China. The field trial employed a randomized complete block design with three replications. Each plot contained 20 plants, arranged at an 80 cm row spacing and a 25 cm plant spacing.

Leaves and storage roots from H283 and L423 with contrasting starch contents were sampled at 60, 95, and 130 DAP. Samples from five individual plants were pooled as one biological replicate and immediately frozen in liquid nitrogen. Three independent biological replicates were collected for proteome and transcriptome analyses.

Measurements of fresh yield and dry matter content

Excluding the two plants at each end of each plot, five plants were randomly selected to measure the fresh yield. A 100 g sample of mixed fresh slices was weighed, and then dried at 105 °C for 30 min, at 60 °C for 5 h, and finally at 80 °C for 48 h. The dry matter content, which is positively correlated with starch content, was used to estimate starch content using the conversion formula y = 0.86945x − 0.0634587, where y represents the starch content and x represents the dry matter content12.
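As an illustration of the conversion above, the following minimal Python sketch computes dry matter content from fresh and dry weights and applies the stated formula. It assumes that both x and y are expressed as fractions (for example, 0.30 for 30% dry matter), which the size of the intercept suggests but the text does not state; the function names and example weights are hypothetical.

```python
def dry_matter_content(fresh_weight_g: float, dry_weight_g: float) -> float:
    """Dry matter content as a fraction of fresh weight."""
    if fresh_weight_g <= 0:
        raise ValueError("fresh weight must be positive")
    return dry_weight_g / fresh_weight_g


def starch_from_dry_matter(dry_matter: float) -> float:
    """Estimate starch content with y = 0.86945x - 0.0634587.

    Assumption: x (dry matter) and y (starch) are both fractions, not percentages.
    """
    return 0.86945 * dry_matter - 0.0634587


# Hypothetical example: a 100 g fresh sample that weighs 30 g after drying.
dm = dry_matter_content(100.0, 30.0)
print(f"dry matter = {dm:.3f}, estimated starch = {starch_from_dry_matter(dm):.4f}")
```

If dry matter content is recorded in percent instead, the intercept would need to be rescaled accordingly.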

Measurements of source-sink indices

Photosynthetic rate and curve measurements were performed between 10 a.m. and 2 p.m. under sunny conditions. Photosynthetic curves were measured using the LI-6400 System (Li-Cor) with an LED chamber according to Zhang et al. 50. The flow rate and leaf temperature were maintained at 500 μmol·s−1 and 25 °C, respectively. For each light/CO2 condition, photosynthetic rate and stomatal conductance data were collected after these values reached a steady state. The light response curve was measured by varying irradiance using the LED chamber. For the CO2 response curve, leaves were measured under saturating light conditions (~1500 μmol·m−2·s−1) with an additional CO2 source to maintain the CO2 concentration.

Starch, amylose, Glc-6P, and ATP contents, as well as AGPase and SBE activities, were measured, and safranin O-fast green staining of paraffin sections was performed, according to Ren et al. 7. Sucrose and soluble sugar contents were measured according to Boxall et al. 67. Photosynthetic rate was measured according to Zhang et al. 68. Callose, cellulose, pectin, and hemicellulose contents were measured using assay kits (Comin Biotechnology Co. Ltd, Suzhou, China). In situ detection of callose was performed using aniline blue staining as previously described69,70. Three biological replicates, each consisting of three plants, were used. Storage roots and leaves were sampled at sunset, cleaned with deionized water, and dried for measurements.

Proteome analysis

The proteomic analysis was conducted by PTM BioLab Co., Ltd. (Hangzhou, China), using a tandem mass tag coupled to LC-MS/MS. The 36 proteomes of H283 and L423 were analyzed. The protein isolation and trypsin digestion were performed according to Hao et al. 71. In brief, the samples were ground and sonicated by a high-intensity ultrasonic processor (Scientz) in lysis buffer (1% [v/v] Triton X-100, 10 mM dithiothreitol, and 1% [v/v] Protease Inhibitor Cocktail). An equal volume of Tris-saturated phenol (pH 8.0) was added, vortexed, and centrifuged. Proteins were precipitated by ammonium sulfate–saturated methanol and incubated at −20 °C for 8 h. After centrifugation at 4 °C for 10 min, the protein pellet was washed with ice-cold methanol and ice-cold acetone and then redissolved in 8 M urea to determine the protein concentration using a BCA kit (Beyotime).

For trypsin digestion, protein samples were precipitated with trichloroacetic acid (TCA) at 4 °C for 2 h, washed two to three times with acetone, and resuspended in 200 mM triethylammonium bicarbonate (TEAB). Trypsin was added at a 1:50 trypsin:protein mass ratio for the first digestion overnight, and the protein solution was reduced with the addition of 10 mM DTT for 30 min at 56 °C and alkylated with 11 mM iodoacetamide (IAA) for 15 min at room temperature in darkness. Finally, trypsin was added at a 1:100 trypsin:protein mass ratio for a second 4-h digestion. The tryptic peptides were dissolved in 0.1% formic acid (solvent A) and loaded onto a homemade reversed-phase analytical column (15-cm length, 75 μm i.d.). The gradient consisted of an increase from 6% to 23% solvent B (0.1% formic acid in 98% acetonitrile) over 26 min, 23% to 35% in 8 min, and climbing to 80% in 3 min, then holding at 80% for the last 3 min, at a constant flow rate of 400 nL/min on an EASY-nLC 1000 UPLC system (Thermo Fisher, Waltham, USA). The peptides were subjected to an NSI source followed by tandem mass spectrometry (MS/MS) in a Q Exactive™ Plus (Thermo Fisher, Waltham, USA) coupled online to the UPLC. The electrospray voltage applied was 2.0 kV. The m/z scan range was 350 to 1800 for a full scan, and intact peptides were detected in the Orbitrap at a resolution of 70,000. Peptides were selected for MS/MS using an NCE setting of 28, and the fragments were detected in the Orbitrap at a resolution of 17,500. A data-dependent procedure alternated between one MS scan and 20 MS/MS scans with 15.0 s dynamic exclusion. Automatic gain control (AGC) was set at 5E4. The fixed first mass was set as 100 m/z. Peptides were separated on a nanoElute UHPLC system (Bruker Daltonics) and subjected to a capillary source, followed by mass spectrometry analysis on a timsTOF Pro instrument (Bruker Daltonics). MS/MS data were processed using the MaxQuant search engine (v.1.6.6.0). 
Tandem mass spectra were searched against the Ipomoea_batatas database (44,924 entries) concatenated with the reverse decoy database. Trypsin/P was specified as the cleavage enzyme, allowing up to two missed cleavages. The mass tolerance for precursor ions was set to 20 ppm in the First search and 20 ppm in the Main search, and the mass tolerance for fragment ions was set to 20 ppm. The false discovery rate (FDR) was adjusted to < 1%.
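The idea behind searching a concatenated target and reverse-decoy database can be sketched as follows. This is a simplified, generic illustration of target-decoy FDR estimation and not MaxQuant's actual implementation; the input format (a score and a decoy flag per peptide-spectrum match), the estimate decoys/targets, and the function name are assumptions made for this example.

```python
def score_cutoff_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR stays within max_fdr.

    psms: iterable of (score, is_decoy) pairs, higher score = better match.
    The FDR among matches scoring at or above a cutoff is estimated as
    (number of decoy hits) / (number of target hits). Tied scores are not
    treated specially in this sketch.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest point in the ranked list that still meets the limit.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Hypothetical toy data: three target hits and one decoy hit.
toy_psms = [(10, False), (9, False), (8, True), (7, False)]
print(score_cutoff_at_fdr(toy_psms, max_fdr=0.01))
```

With a strict limit the cutoff stops above the first decoy hit, so only the two best-scoring target matches in the toy data are retained.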

Transcriptome analysis

Total RNA was extracted from the leaves and storage roots of H283 and L423 using Trizol reagent (Invitrogen, CA, USA)68. Three biological replicates were conducted. Sequencing libraries were generated and sequenced using a BGISEQ-500 system as paired-end 150-bp reads according to the manufacturer’s instructions. Adapter sequences and low-quality reads were removed before mapping clean reads to the reference genome. An FDR < 0.05, as determined by DESeq268, was considered to indicate differential expression.
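DESeq2 reports Benjamini-Hochberg adjusted P-values by default, and the FDR threshold above is applied to those values. The following standalone Python sketch shows the adjustment itself for a list of raw P-values; it is meant only to illustrate the procedure, is not a substitute for DESeq2, and uses hypothetical P-values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
padj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in padj]
print(padj, significant)
```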

13C stable isotope labeling

One-month-old, non-sprawling plants were used for measurement. Plastic bags filled with 30 mL 13CO2 (99% 13C abundance) were used to cover all leaves for 2 h on a sunny day. After 48 h, labeled samples were collected, and the fresh weight of leaves, stems, and roots was measured. The samples were then dried to constant weight, chopped, and sieved through a 100-mesh sieve. 13C abundance was measured with the Isoprime100 mass spectrometer (Cheadle, UK).

Haplotype-based association analysis

The dry matter content and fresh yield of 500 F1 individuals and their two parents were phenotyped over five consecutive years. Best linear unbiased estimates (BLUEs) were used to adjust the phenotypes across years. A linear mixed model with fixed (effects over years) and random (replicate differences) effects was fitted using the R package lme472. Parameter estimation was carried out using the least squares method. BLUE values were calculated for each observation by aggregating the estimated fixed and random effects.

Leaves from 517 F1 individuals (including the phenotyped ones) and two parents were collected, flash-frozen in liquid nitrogen, and used for genomic DNA extraction using the CTAB method68. Whole-genome sequencing libraries were constructed and sequenced on the Illumina NovaSeq 6000 platform, generating 150 bp paired-end reads with an average insert size of 500 bp for further analyses (Novogene Co., Ltd; Tianjin, China).

For cost-effectiveness and accuracy in genotyping, simplex SNPs were identified using ultra-high-coverage sequencing (~200×) of the two heterozygous parents, complemented by low-coverage sequencing of the offspring (~12×). Clean sequence reads were aligned to the I. trifida reference genome (http://sweetpotato.uga.edu/gt4sp_download.shtml) with the BWA mem module73. PCR duplicates were removed using “MarkDuplicates” in GATK4. For the parents, the polyploid mode (parameters: “-T UnifiedGenotyper -glm BOTH -sample_ploidy6”) in GATK3 was employed to determine the six potential genotypes for each variant, and then the simplex (single-dose variants present in only one parent, e.g., ATTTTT x TTTTTT) and double-simplex (single-dose variants present in both parents, e.g., ATTTTT x ATTTTT) variants were selected. SNP sites with a quality below 100 were filtered out. For the F1 population, the GATK3 UnifiedGenotyper standard module was used, where the genotype was set to heterozygous (0/1) once the depth of the minor allele (e.g., the “A” allele in ATTTTT) was ≥ 1 (ref. 74). The proportion of different genotypes in the population was calculated for each SNP. SNPs for which the proportion of heterozygous genotypes (0/1) in the population ranged from 0.04 to 0.28, and SNP sites with less than 50% missing calls, were retained for subsequent analyses, taking into account the expected ratios for simplex and double-simplex SNPs and the biases introduced by skim sequencing. Using the filtered but unimputed genotype-calling data (VCF format file) as input, the Autopolyploid-Plant module in the OutcrossSeq package was used to impute missing SNPs step by step75. In the clustering analysis within a partitioning window of 500 SNPs across each chromosome, a correlation cutoff of 0.9 and a minimum of 10 SNPs in each group were required.
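The offspring genotype call and the per-SNP retention rule described above can be sketched in a few lines of Python. This is an illustrative reading of the stated thresholds, not the pipeline actually used: the function names are hypothetical, missing calls are represented as None, and the heterozygous proportion is assumed to be computed among called genotypes only, which the text does not specify.

```python
def call_genotype(minor_allele_depth, total_depth):
    """Call an F1 genotype from read depths at one simplex SNP.

    Returns "0/1" when the minor allele is supported by at least one read,
    "0/0" when the site is covered but no minor-allele read is seen,
    and None (missing) when the site has no coverage.
    """
    if total_depth == 0:
        return None
    return "0/1" if minor_allele_depth >= 1 else "0/0"


def keep_snp(genotypes, min_het=0.04, max_het=0.28, max_missing=0.5):
    """Decide whether a SNP passes the population-level filters.

    Assumption: the heterozygous proportion uses called genotypes as the
    denominator; the SNP needs fewer than 50% missing calls.
    """
    n = len(genotypes)
    if n == 0:
        return False
    called = [g for g in genotypes if g is not None]
    missing_rate = 1 - len(called) / n
    if missing_rate >= max_missing or not called:
        return False
    het_fraction = called.count("0/1") / len(called)
    return min_het <= het_fraction <= max_het


# Hypothetical depths (minor allele, total) for five offspring at one SNP.
depths = [(1, 12), (0, 10), (0, 14), (0, 0), (0, 9)]
calls = [call_genotype(m, t) for m, t in depths]
print(calls, keep_snp(calls))
```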

Consecutive SNPs and fine-scale genotype maps were employed to extract the coding regions of the candidate genes, along with their 3 kb upstream and downstream regions. Individuals sharing identical variations were grouped into identical haplotypes, retaining heterozygous alleles but not unresolved missing sites. Haplotypes containing only a single individual were eliminated using an in-house Python script. Differences in dry matter content and fresh yield among haplotypes were identified using one-way ANOVA, followed by pairwise comparisons conducted using Tukey’s Honest Significant Difference (HSD) test as a post-hoc analysis in R.
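The haplotype grouping step can be illustrated with the short Python sketch below. The in-house script is not available, so this is only a plausible reconstruction under stated assumptions: each individual is represented by a tuple of genotype calls across the candidate gene region, individuals with any unresolved missing site (None) are excluded, and all identifiers are hypothetical.

```python
from collections import defaultdict


def group_haplotypes(genotypes_by_individual):
    """Group individuals that share identical genotype calls across a region.

    genotypes_by_individual: dict mapping individual ID -> tuple of calls
    (e.g. "0/0", "0/1"), one per SNP in the region. Heterozygous calls are
    kept as they are. Assumption: individuals with a missing call (None)
    are skipped. Haplotypes carried by a single individual are dropped.
    """
    groups = defaultdict(list)
    for individual, calls in genotypes_by_individual.items():
        if any(call is None for call in calls):
            continue
        groups[tuple(calls)].append(individual)
    return {hap: sorted(members) for hap, members in groups.items() if len(members) > 1}


# Hypothetical calls at two SNPs for four F1 individuals.
example = {
    "F1_001": ("0/0", "0/1"),
    "F1_002": ("0/0", "0/1"),
    "F1_003": ("0/1", "0/1"),
    "F1_004": ("0/0", None),
}
print(group_haplotypes(example))
```

The resulting groups would then supply the per-haplotype phenotype values for the one-way ANOVA and Tukey's HSD comparisons.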

Production of transgenic sweet potato

The coding sequences of IbPMA1 and IbbHLH49 were separately inserted into the pCAMBIA1302 or pCAMBIA1300-GFP binary vector, both driven by the cauliflower mosaic virus 35S promoter. A pair of forward and reverse highly specific fragments of IbPMA1 and IbbHLH49 were inserted into the plant RNA interference (RNAi) vector pFGC5941. Constructs of IbPMA1 and IbbHLH49 were individually introduced into the sweet potato variety Lizixiang via one-step Agrobacterium-mediated and Agrobacterium-mediated transformation, respectively, as previously described68,76. Transgenic plants were grown in a field at China Agricultural University (40°02’63”N, 116°28’15”E), Beijing, China. The yield of the IbPMA1 transgenic plants was measured in 2023, and the yield of the IbbHLH49 transgenic plants was measured for two consecutive years, in 2022 and 2023.

Expression analysis

Total RNA was extracted using the Trizol reagent (Invitrogen, CA, USA). First-strand cDNA was synthesized using a PrimeScript RT reagent Kit (TaKaRa, Dalian, China). The experiments were conducted as three biological replicates, each with three plants. Relative transcript levels were determined using quantitative PCR (qPCR) with the sweet potato β-Actin (AY905538) gene as an internal control. Gene expression was quantified using the comparative CT method77.
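The comparative CT calculation can be written out as a short Python sketch. It follows the common 2^(−ΔΔCT) form, which assumes roughly 100% amplification efficiency for both the target and the reference gene; the function and argument names are illustrative, and the CT values in the usage note are made up.

```python
from statistics import mean
from typing import Sequence


def relative_expression(
    ct_target_sample: Sequence[float],
    ct_ref_sample: Sequence[float],
    ct_target_control: Sequence[float],
    ct_ref_control: Sequence[float],
) -> float:
    """Fold change of a target gene by the comparative CT (2^-ddCT) method."""
    # dCT: target CT minus internal-control CT (here, beta-Actin).
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCT: sample dCT relative to the calibrator (control) dCT.
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)
```

With hypothetical mean CT values of 22 (target) and 18 (reference) in a sample, and 24 and 18 in the control, ΔΔCT is −2 and the relative transcript level is 4-fold.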

Yeast one-hybrid assay

The coding sequences of the appropriate genes were inserted into the pB42AD vector, while promoter fragments were cloned into the pLacZi2μ vector. The vectors and the empty vector were transformed as pairs into yeast strain EGY48 by the PEG/LiAc method. Positive colonies were cultured on synthetic defined (SD)/–Trp/–Ura/+X-gal medium to screen for possible interactions according to the protocol of the Matchmaker One-Hybrid System (Clontech, Palo Alto, USA).

Electrophoretic mobility shift assays (EMSAs)

EMSAs were carried out according to the method of Zhang et al. 78. The pGEX6p-1-IbMYB73, pGEX6p-1-IbERF73, pGEX6p-1-IbbHLH49, pETM40-IbWRKY7, and pGEX6p-1-IbMADS1 plasmids were transformed into competent E. coli strain Transetta (DE3) cells. The oligonucleotide probes for EMSAs were synthesized by Invitrogen (San Diego, USA). Probes with or without a biotin label at their 5′ ends were used as binding probes or cold competitors, respectively.

Dual-LUC assay

The coding sequences of genes were cloned into the pGreenII 62-SK vector, which was used as an effector. The empty pGreenII 62-SK vector was used as a negative control. The promoter sequences were individually inserted upstream of the firefly luciferase (LUC) reporter gene into the pGreenII 0800-LUC vector to generate reporters. Sweet potato protoplasts were isolated and used for dual-LUC assays as described previously79. LUC and Renilla luciferase (REN) activity levels were measured using a Dual-LUC Reporter Assay System (Promega). LUC activity was normalized to REN activity. Three biological replicates were performed for this analysis.
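The normalization of LUC to REN activity can be illustrated with a brief Python sketch. The function names are hypothetical, and expressing the result as a fold change over the empty-vector control is an assumption about presentation rather than a step stated in the text.

```python
from statistics import mean
from typing import List, Sequence


def relative_luc_activity(luc: Sequence[float], ren: Sequence[float]) -> List[float]:
    """Return LUC/REN ratios, one per biological replicate."""
    if len(luc) != len(ren):
        raise ValueError("LUC and REN readings must be paired")
    return [l / r for l, r in zip(luc, ren)]


def fold_over_control(
    effector_ratios: Sequence[float], control_ratios: Sequence[float]
) -> float:
    """Mean LUC/REN of the effector relative to the empty-vector control."""
    # Assumption: results are reported relative to the empty pGreenII 62-SK control.
    return mean(effector_ratios) / mean(control_ratios)
```

A fold change above 1 would indicate activation of the promoter by the effector, and a value below 1 would indicate repression.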

Sequence alignment and phylogenetic analysis

Genomic DNA (EasyPure Plant Genomic DNA Kit, TransGen Biotech, Beijing, China) and total RNA (RNAprep Pure Plant Kit, Tiangen Biotech, Beijing, China) were extracted from fresh storage roots of H283. The genomic DNA and cDNA sequences were amplified using primers listed in Supplemental Data 14. Conserved domains were searched using INTERPRO. Multiple protein sequence alignment was performed using DNAMAN software (Lynnon-BioSoft, San Ramon, CA, USA). Phylogenetic analysis was conducted using the neighbor-joining method in MEGA11.0 with 1000 bootstrap iterations80. The exon-intron structures of genes were analyzed using the SPLIGN program (https://www.ncbi.nlm.nih.gov/sutils/splign).

Subcellular localization

The coding sequence of IbbHLH49 without the stop codon was cloned into the binary vector pCAMBIA1300 in-frame and upstream of the green fluorescent protein (GFP) gene sequence. The vectors and a nuclear localization marker construct (NLS-RFP) were transiently infiltrated into Nicotiana benthamiana leaf epidermal cells by Agrobacterium infiltration. The green fluorescence signal was observed using a confocal laser scanning microscope (LSM880, Zeiss, Jena, Germany) with an argon laser (488-nm excitation) after 48 h of growth.

Transactivation assay

The transactivation assay was carried out according to the Yeast Protocols Handbook (Clontech, Palo Alto, USA). The full-length IbbHLH49 coding sequence and fragments encoding amino acids 1–100, 101–404, and 405–521 were individually cloned into the vector pGBKT7 (Invitrogen, San Diego, USA). The empty pGBKT7 vector was used as a negative control, while pGBKT7-53 was used as a positive control. The constructs were transformed into the Y2H Gold yeast strain by the PEG/lithium acetate method. Positive yeast colonies were screened on SD medium lacking tryptophan (–Trp) and transferred to SD medium lacking tryptophan, histidine, and adenine (–Trp –His –Ade).

ChIP assay

The ChIP assay was performed according to Xue et al. 79. In brief, leaf or storage root tissues (approximately 2 g) of IbbHLH49-OE-13 and empty vector plants were cross-linked in 1% (w/v) formaldehyde under vacuum. The cross-linking was stopped by the addition of 0.125 M glycine. Samples were ground to a powder in liquid nitrogen and subjected to nuclei isolation. Anti-GFP antibodies (M20004, Absmart, 1:500 dilution) were used to immunoprecipitate the protein–DNA complexes, and the precipitated DNA was recovered. An equal amount of chromatin without antibody precipitation was used as an input control. ChIP DNA was analyzed by qPCR, and the ChIP values were normalized against the values of the respective input. The primers used for ChIP-qPCR are listed in Supplemental Data 14. The experiment was independently performed three times with similar results.
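One common way to normalize ChIP-qPCR values against input is the percent-input calculation, sketched below in Python. The text does not state which formula was used, so this is an assumption; the function name and the `input_fraction` parameter are illustrative, with a default of 1.0 reflecting the statement that an equal amount of chromatin served as input.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float = 1.0) -> float:
    """Percent-input enrichment from ChIP-qPCR CT values.

    `input_fraction` is the share of chromatin used as input relative to
    the amount used for immunoprecipitation (1.0 means equal amounts).
    """
    if not 0.0 < input_fraction <= 1.0:
        raise ValueError("input_fraction must be in (0, 1]")
    # Adjust the input CT to what it would be at 100% of the IP chromatin.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)
```

Under this scheme, enrichment at a promoter fragment would be judged by comparing the percent-input value of the IbbHLH49-OE line with that of the empty vector control.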

Statistical analysis

All data were analyzed using one-way ANOVA followed by post-hoc Tukey’s test or a two-tailed Student’s t-test with SPSS 26.0 (https://www.ibm.com/support/pages/downloading-ibm-spss-statistics-26). Data are shown as means ± standard deviation (SD). For presentation in heatmaps, proteomics and transcriptomics data were scaled to the range 0–1 using MinMax normalization81, that is, by subtracting the minimum value of the dataset and dividing the result by the difference between the maximum and minimum values. Heatmaps were generated using TBtools software v1.12082.
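The MinMax scaling used for the heatmaps corresponds to x' = (x − min) / (max − min). A minimal Python sketch is given below; the function name is illustrative, and whether scaling was applied per gene or across the whole matrix is not specified in the text, so the function simply scales whatever list of values it receives.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: constant input is mapped to 0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For instance, the values 2, 4, and 6 are mapped to 0, 0.5, and 1.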

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.