Constitutive expression of a pea apyrase, psNTP9, increases seed yield in field-grown soybean

To address the demand for food by a rapidly growing human population, agricultural scientists have carried out both plant breeding and genetic engineering research. Previously, we reported that the constitutive expression of a pea apyrase (Nucleoside triphosphate, diphosphohydrolase) gene, psNTP9, under the control of the CaMV35S promoter, resulted in soybean plants with an expanded root system architecture, enhanced drought resistance and increased seed yield when they are grown in greenhouses under controlled conditions. Here, we report that psNTP9-expressing soybean lines also show significantly enhanced seed yields when grown in multiple different field conditions at multiple field sites, including when the gene is introgressed into elite germplasm. The transgenic lines have higher leaf chlorophyll and soluble protein contents and decreased stomatal density and cuticle permeability, traits that increase water use efficiency and likely contribute to the increased seed yields of field-grown plants. These altered properties are explained, in part, by genome-wide gene expression changes induced by the transgene.

www.nature.com/scientificreports/ The T-DNA Insertion sites were confirmed by PCR amplification. Primers were designed to amplify the fragments that span the T-DNA region and the upstream or downstream flanking region. The expected amplicon sizes of the primer set of 35SproR/Gm11R for 14A, Gm03F/LB3 for 17B, and LB3/GM14R for 16C were 1,667 bp, 743 bp, and 489 bp respectively based on the integration sites in each transgenic line (Fig. 1b). The PCR size product from three transgenic soybean lines matched the expected product length (Fig. 1c-e). These results further confirmed the T-DNA insertion loci in the genome of each transgenic line.
We identified the genes neighboring the single insertion site for all three transgenic lines. For 17B, the two neighboring genes were Glyma.3G074700, encoding a predicted membrane protein involved in ER to Golgi vesicle-mediated transport, and Glyma.3G074600, which encodes an F-box/RNI-like superfamily protein subunit of the SCF complex that functions in ubiquitin-mediated protein degradation 29 . Neither gene was differentially expressed (DE)in the transgenic line. For 16C, the closest neighboring gene to the insertion site was Glyma.14G192500, encoding Cytochrome P450 86A1, which functions in fatty acid metabolism and suberin synthesis in the root 30 . This gene also was not DE. For 14A, there were two neighboring genes. Glyma.11G183600, whose 3′-end is 2716 bp upstream of the insert, encodes an F-box/LRR-repeat protein.
Glyma.11G183700 encodes the bZIP TF GmTGA14, whose closest Arabidopsis ortholog is TGA9. Although the 5′-end of Glyma.11G183700 is located 9583 bp downstream of the T-DNA insertion, its expression was strongly upregulated (13.5-fold) in 14A leaves. In fact, it was the most highly DE gene in leaf tissues. qRT-PCR analyses showed a 21.7-fold increase in expression of this gene, confirming transcriptome data (Supplementary Table S2). In contrast, Glyma.11G183700 expression in 17B increased 1.9-fold (Supplementary Table S2).
Seed yield performance of transgenic events expressing psNTP9 in field trials. Figure 2 showed yield performance of psNTP9 transgenic events 14A, 16C, and 17B trialed at multiple locations in the US over multiple years. In 2016, soybean field trials were conducted at three different locations of the primary soybean production regions-El Paso, Illinois; Henderson, Kentucky; and Atlanta, Indiana (Fig. 2a). Combined site analyses indicated that as compared to the parental line Williams 82, psNTP9 event 17B significantly increased soybean yield by 13% (p = 0.05); transgenic event 14A by 9% (p = 0.05). In 2017, field tests were implemented during the counter-season seed increase at three locations in Puerto Rico (Fig. 2b). As compared with Williams 82, all three events showed a significant yield increase of 28-44%. In 2018, trials were conducted at three different production regions in the US-Lawrence, Kentucky; Smithville, Missouri and Troy, Ohio; sibling null segregants of each transgenic event which went through the transformation processes, but did not contain psNTP9 gene, were included, in addition to non-transformant parent Williams 82. All three transgenic events showed yield increases of 7-13% when compared to Williams 82 (Fig. 2c), similar to results obtained from the prior two years. When compared to corresponding nulls, psNTP9 event 17B had higher yields than the 17B null plants by 12% (p = 0.05) and event 14A had a 9% increased yield over the 14A null (p = 0.05), though 16C showed no yield difference from 16C null (Fig. 2d). Multi-year field trials in the main soybean production regions indicated that lead event 17B showed significant yield efficacy by 13% and 12% over parent Williams 82 and its sibling null, respectively.

Seed yield increases in elite varieties harboring the introgression of psNTP9 event 17B.
To test whether psNTP9 gene efficacy can be transferred into different genetic backgrounds as well as to further integrate apyrase control traits (ACT) into elite varieties for commercial development, three soybean cultivars differing in maturity groups were used as recurrent parents (RPs) for introgression of lead event 17B through multiple generations of backcrossing. Field tests of introgression lines were conducted at multiple locations of primary soybean production regions both in Argentina and in the US over 2019 crop season. Figure 3a demonstrated yield results of all introgression lines of 17B × A1900. As compared to recurrent parent A1900 with relative maturity of 1.90, both introgression line 1 and 2 had significant yield increases of 25% and 34%, respectively, at the level of p = 0.05. Figure 3b shows yield performance of every introgression line of 17B × DSR262 (relative maturity 2.6). Out of all 20 introgression lines, 18 lines showed increased yields, ranging from 4 to 17%, as compared to the recurrent parent. Yield efficacy from 10 introgression lines reached the significance level of p = 0.05, and averaged 11% yield increase with the range of 8%-17%. Each of the introgression lines of 17B × A3431 had higher yields than recurrent parent A3431 (relative maturity 3.4), across 13 locations in both countries (Fig. 3c). Out of the 12 introgression lines, 9 lines showed significant yield increases (p = 0.05) averaging 8%. In practice, trait integration for further commercial development is generally concentrated on the top introgression lines of targeted backcrosses. As seen in Fig. 3d, each of the top two introgression lines deriving from ACT integrated backcrosses, demonstrated significant yield benefit over corresponding RPs, with the range from 9 to 34% (p = 0.05). On the average, yield increase of the two top lines over corresponding RP averaged 30%, 15%, and 10% in backcrosses of 17B × A1900, 17B × DSR262, and 17B × A3431, respectively.
Effects of expressing psNTP9 on other agronomic attributes. During the course of plant growth and development, thorough field surveys were made to determine whether overexpression of psNTP9 gene significantly impacts other agronomic attributes, apart from soybean seed yield. As summarized in Table 1, there were no significant differences between psNTP9 event lines and their sibling nulls in most of plant growth and phenological attributes, such as early growth vigor score, plant stand, days from sowing to initial flowering, and days of crop cycle. Test weight and seed moisture at harvest are important economic parameters for soybean production and sales. Figure 4a shows seed moisture at harvest of psNTP9 events against sibling nulls and parent line Williams 82, across 16 field trials in Argentina and US over 2018-2019. Figure 4b shows seed test weight and Atlanta, Indiana. Seed yield performance and analysis of psNTP9 events at each of the individual sites can be found in supplemental data. Combined site analysis over the three trials determined 5.25 bushels per acre as the value of 5% least squared difference (LSD 0.05 ) for mean separations. Asterisk symbol (*) next to yield delta (%) inside the figure indicates significant yield difference from parental line WM 82 at p = 0.05. All data are presented as mean of eight plot replicates over three individual sites ± standard error (SE). (b) Seed yield of psNTP9 events and parental line trialed in 2017. The field trials were implemented during the counter-season seed increase at three locations in Puerto Rico. Seed yield performance and analysis of psNTP9 events at each of the individual sites can be found from supplemental data. Combined site analysis of seed yield over the three trials determined 3.91 bushels per acre as LSD 0.05 value for mean separations. Asterisk symbol (*) next to yield delta (%) values inside the figure indicates significant yield difference from parental line WM 82 at p = 0.05. All data are presented as mean of fifteen plot replicates over three individual sites ± SE. (c) Seed yield of psNTP9 events and parental line trialed in 2018. The field trials were conducted at three different production regions in the US-Lawrence, Kentucky; Smithville, Missouri and Troy, Ohio. Seed yield performance and analysis of psNTP9 events at each of the individual sites can be found from supplemental data. Combined site analysis of seed yield over the three trials determined 3.30 bushels per acre as LSD 0.05 value for mean separations. All data are presented as mean of twelve plot replicates over three individual sites ± SE. (d) Seed yield of psNTP9 and sibling nulls trialed in 2018. The field trials were conducted at three different production regions in the US-Lawrence, Kentucky; Smithville, Missouri and Troy, Ohio. Combined site analysis of seed yield over the three trials determined 3.30 bushels per acre as LSD 0.05 value for mean separations. Asterisk symbol (*) next to yield delta (%) inside the figure indicates significant yield difference from corresponding sibling null at p = 0.05. All data are presented as mean of twelve plot replicates over three individual sites ± SE. www.nature.com/scientificreports/ Supplementary Fig. S1A). In parallel, there was also a significantly higher psNTP9 protein content in 17B over 14A in this field trial (Fig. 5b). The levels of psNTP9 transcripts and psNTP9 protein were only slightly higher in 17B than in 16C (Fig. 5a,c). In this comparison of psNTP9 levels, equal protein loading in each lane was confirmed by actin immunostaining (Fig. 5b,c). When less extract from a 17B line was loaded onto the SDS-PAGE gel (lane boxed in), there was less psNTP9 detected. The primary antibody used for these immunoblots was a highly specific monoclonal antibody (mAb), 8B6, which immuno-labeled only the 48 kDa psNTP9 in crude extracts of the leaves, and did not label any bands in extracts from WT samples or null transformants (Fig. 5b,c). The immunostaining in Fig  www.nature.com/scientificreports/ In leaves harvested from other field trials, RT-PCR assays of psNTP9 confirmed its expression in all three transgenic lines ( Supplementary Fig. S1A), and in immunoblots of leaf, flower and root tissues harvested from greenhouse-grown soybeans, the psNTP9 protein was detected in immunoblots of all three tissues of 17B by the 8B6 mAb ( Supplementary Fig. S1B). The blot also shows that psNTP9 was expressed in the flower and root tissues of the 14A transgenic line and in the flower tissue of the 16C transgenic line ( Supplementary Fig. S1B). The minimal level of purified psNTP9 detectable in immunoblots using mAb 8B6 as the primary antibody is approximately 0.1 ng ( Supplementary Fig. S2).

Leaf transcriptome analyses.
Based upon early data with greenhouse grown plants, prior to collection and analyses of data from field-grown plants, leaf tissues from WT and transgenic line 14A plants were selected to transcriptome analyses. Compared with WT, 996 genes were differentially-expressed (DE) in 14A leaf tissues In panel (a) the relative expression levels were normalized to the expression level in 14A, taken as 1.0. Data represent means ± SE (n = 3). Different letters above the bars indicate statistically significant differences between the wild type and the transgenic lines using one-way ANOVA (p ≤ 0.01) with Tukey HSD test. In panel (b) equal loading was confirmed by dual staining with anti-actin antibodies. In panel (c) the lane outlined had less protein loaded than the following four lanes, as indicated by lower actin levels. 17B/DSR262 has a slightly higher level of psNTP9 immunostaining than 16C/ A2835 and 16C/A2835, but all three have higher levels than line 14A/3431, which is below detection limit in this panel. In panel (b) the samples loaded were from the following events: 17B/AA3431 (lanes 1, 2, 3 and 6);  Table S2). Enriched GO BioProcess terms associated with DE genes (DEG) are shown in Table 2. Genes involved in RNA processing, ribosome biogenesis and protein synthesis comprised approximately 81% of up-regulated DEGs. Genes associated with fatty acid and wax biosynthesis and cuticle development were also over-represented in this category. Enriched down-regulated categories included regulation of cell division, DNA replication and transcription. Expression of genes regulating epidermal cell fate specification and stomatal complex development was also decreased.
Genes related to cuticle formation are differentially-expressed in 14A leaves. Soybean genes functioning in the synthesis of very long-chain fatty acids (VLCFA), cutin monomers and cuticular waxes, or regulation of cuticle development, were identified using Soybase genome annotation data, which includes Arabidopsis ortholog information based on top BlastP hits (TAIR10), and GO BioProcess terms related to cuticle development. Most enzymes, carrier proteins and transporters involved in these processes are encoded by gene families in Arabidopsis. Supplementary Table S3 summarizes the normalized expression of DE soybean orthologs of Arabidopsis genes involved in cuticle formation. Expression of non-DE orthologs for each gene are also included to provide a more complete understanding of how DE genes may contribute to changes in cuticle formation and properties. All DEG encoding enzymes involved in synthesis of fatty acid precursors of cutin and cuticular waxes (see Refs. 34,35 ) were up-regulated in 14A leaves. These included acyl carrier proteins ACP1 and ACP4, which serve as co-factors in fatty acid synthase (FAS) complex reactions in the plastid, enoyl-ACP reductase (ENR1), which catalyzes the final reduction step in each FAS cycle (2-C addition), and fatty acyl-ACP thioesterase B (FATB), which terminates the elongation cycle by hydrolyzing ACP, releasing free C16/C18 fatty acids. Up-regulation of long-chain acyl-CoA synthetase 1 (LACS1), which activates the free fatty acids to CoA esters, and cytosolic acyl-CoA-binding protein 6 (ACBP6), which facilitates acyl-CoA export from the plastid to the endoplasmic reticulum site of VLCFA, cutin and wax biosynthesis, was also seen. Genes important for VLCFA synthesis included cytosolic ATP citrate lyase subunit B2 (ACLB-2), which generates acetyl-CoA used in the synthesis of malonyl CoA, the 2-C donor for VLCFA elongation, and nine genes encoding all four enzymatic activities in the ER fatty acid elongase (FAE) complex. Up-regulation of the single soybean gene ECERIFERUM 2 (CER2), encoding an additional component of the FAE that facilitates elongation of fatty acids greater than 28-C 36 and provides VLC acyl-CoA for wax biosynthesis, was also noted. In most cases, DE genes were among the most highly-expressed paralogs encoding each enzyme activity.
Relatively few soybean genes involved in wax synthesis 35 were DE in 14A leaves, but several changes are noteworthy. Of the nine paralogs of VLCFA CoA reductase (CER4 37 ), down-regulation of the highly-expressed Glyma.11g185100 may result in decreased flux through the alcohol-forming branch of wax synthesis. However, none of the three highly-expressed WSD1 genes encoding the bifunctional wax ester synthase/diacylglycerolacyltransferase that catalyzes the second step of this pathway 38 were DE in 14A leaves. In contrast, up-regulation of three highly-expressed VLC aldehyde decarbonylase complex genes 39 (CER1, Glyma.03G101800; CER3, Glyma.13G091200 and Glyma.17G069100) support increased formation of VLC-alkanes, despite decreased expression of Glyma.03G101200 and Glyma.07G114200. CER3 expression in 14A leaves increased 4.4-fold, relative to WT leaves, in qRT-PCR assays, independently confirming the up-regulation of this gene (Supplementary  Table S2). Three soybean genes encoding the midchain alkane hydroxylase 1 (MAH1), which catalyzes formation of VLC-secondary alcohols and VLC-ketones in latter steps of the alkane pathway 40 , were not differentially expressed (data not shown).
Transport of cutin monomers and cuticular waxes from the ER to the epidermal surface is poorly understood 35 . Homodimers of ABCG11 half-transporters and heterodimers of ABCG11/ABCG12 function in secretion of cuticular lipids in epidermal cells of Arabidopsis stems 44 . Soybean ABCG11 genes were not DE, including the highly-expressed Glyma.09g160000 and Glyma.16g209400 genes. Soybean genes annotated as ABCG12 were not identified but BlastP searches using the Arabidopsis ABCG12 (AT1G51500) sequence identified 10 highly-similar ABCG15 (CER5) orthologs. None were DE except for Glyma.03g135525, which exhibited approximately 20-fold lower expression than the two most highly-expressed genes in this family (Supplementary  Table S3). Three soybean genes encoding apoplastic GPI-anchored lipid transfer proteins (LTPG1, LTPG2) were up-regulated in 14A leaves. Both proteins function in cuticular wax export and accumulation 45 . Expression of two cell wall-localized members of a large family of non-specific lipid transfer proteins, LTP1 and LTP5, was increased > twofold in 14A leaves but the role of LTP in cuticle formation is unclear 46 .
Transcriptional and post-transcriptional mechanisms regulating cuticle formation have been previously reviewed 35,47 . In 14A leaves, three MYB94 genes were up-regulated. Lee et al. 48 reported that MYB94 and functionally-redundant MYB96 activate a number of genes important for synthesis of VLCFA (CER6, KCR1) and cuticular waxes (CER1, CER3, WSD1), consistent with findings in the present study.
Genes that regulate stomatal development are downregulated in 14A leaves. Among Table 4). Interestingly, a number of genes encoding a signaling peptide and components of a receptor-mediated signaling cascade that negatively regulates SPCH activity were also down-regulated, perhaps as compensatory response to decreased SPCH expression.
Increased chlorophyll and protein content of 14A and 17B leaves. Total chlorophyll and protein contents were measured from completely opened V3 stage leaves of wild type and transgenic plants. The chlorophyll content of 16C leaves was not significantly different than in leaves from wild type plants. There was 40% increase in the chlorophyll content of both 14A and 17B transgenic leaves when compared to wild type leaves (Fig. 6a). There was also a 40% increase the total soluble protein content in 14A and 17B leaves and a smaller but significant 20% increase in 16C leaves, compared to wild type leaves (Fig. 6b).
Increased thickness of the cuticle and cell wall in 17B leaves. Cuticle and wall thickness of epidermal cells in WT and 17B transgenic leaves was examined by direct measurements using TEM images. Both adaxial and abaxial cell wall and cuticle thickness of epidermal cells were greater in leaves of the transgenic 17B line compared to wild-type (Fig. 7a-c). The differences between WT and 17B leaf epidermal cells for both cell wall and cuticle thickness were greatest for the abaxial side of leaves. Adaxial cuticle width was on average 14% percent greater in the 17B mutant epidermal cells compared to WT while it was 95% greater on the abaxial side of epidermal cells (Fig. 7a). Cell wall thickness showed a similar pattern with adaxial wall width in 17B epidermal cells averaging 50% greater than WT while it was 95% greater for abaxial cell walls (Fig. 7b). Because the greatest increases in cuticle and wall thickness occurred on the abaxial sides of the 17B leaves, the ratios between adaxial and abaxial cuticle and cell wall thickness were greatly reduced in mutant leaves compared to WT leaves The total soluble protein contents of 14A, 16C and 17B transgenic leaves from thirty-four-day-old greenhouse grown plants at V3 stage (the third trifoliate stage) is higher than in WT leaves. The data shown are means ± SE from 3 biological replicates (n = 6). Different letters above the bars indicate statistically significant differences between the wild type and the transgenic lines using one-way ANOVA (p ≤ 0.01) with Tukey HSD test. www.nature.com/scientificreports/ www.nature.com/scientificreports/ (WT cell wall ratio = 1.94, cuticle ratio = 2.2; 17B cell wall ratio = 1.49, cuticle ratio = 1.29). These differences in cell wall and cuticle thickness can be seen in representative TEM photos (Fig. 7c).

Reduced transpiration water loss and chlorophyll leakage in leaves of 14A and 17B.
Assays of transpiration water loss (Fig. 8a) and chlorophyll leakage (Fig. 8b) from detached leaves were used to investigate differences in cuticle permeability between WT and 14A, 17B leaves. Both transpiration water loss and chlorophyll leakage were significantly reduced in the leaves of 14A and 17B compared to WT leaves. The leaves of the 17B transgenic line had statistically significantly longer trichomes than wild-type leaves ( Supplementary  Fig. S4). Longer trichomes of 17B leaves were observed on both abaxial and adaxial surfaces of the leaves. In another study related to leaf control of transpiration water loss, initial studies indicated that the leaves of the 17B transgenic line have a higher trichome density (data not shown).

Decreased stomatal density in leaves of 14A and 17B transgenic lines. Analysis of stomatal
development on abaxial side of leaves in WT and the 14A and 17B transgenic line showed statistically significantly lower average stomatal density in 14A and 17B leaves (Fig. 9a). Similarly, measurements of the peel area covered by stomata in the abaxial side of 14A and 17B leaves showed statistically significantly less stomatal coverage compared to WT (Fig. 9b).

Discussion
In the Veerappa et al. 19 study, the seed yields in the 14A, 16C, and 17B soybean lines were remarkably higher than WT seed yields (31%, 45%, and 54% higher, respectively), when grown under ideal greenhouse conditions. However, positive yield results observed in the greenhouse are often not replicated under the more stressful field conditions, or are observed in only a minority of field trials 50 . What is remarkable about the data presented here is that the increased seed yields reported for three different transgenic soybean lines expressing the psNTP9 transgene were consistently observed over several growing seasons in diverse soil types and climate conditions. These increases, which typically ranged over 10% to as high as 44%, would all be considered of significant agricultural value. Each soybean line exhibited a single insertion of the transgene and there did not appear to be positional effects on expression of neighboring genes in either 16C or 17B lines, although the insertion site in 17B appears to be within the promoter of Glyma.03G074600. qRT-PCR results indicate this gene does not change expression in 17B, so the insertion site is probably not a strong cis-acting element in this gene. In 14A, qRT-PCR analysis confirmed there was strong upregulation of Glyma.11G183700, even though this gene is located ~ 10 kb downstream from the insert. The GmTGA14 product of this gene is closely-related to GmTGA17, whose overexpression confers www.nature.com/scientificreports/ drought and salt stress tolerance in soybean seedlings 52 . Thus, increased expression of Glyma.11G183700 in 14A may enhance abiotic and biotic stress tolerance, contributing to increased seed yield in this line. However, since this gene was not strongly up-regulated in 17B leaves, and in all four field trials both 17B and 14A events outperformed the WT line in seed yield and, in one of the four trials, all three events had higher seed yields than the WT line, the overall results of this study argue that the yield phenotypes observed in the transgenic lines were due to psNTP9 expression rather than to the insertion site effects. In the Shipman field trial, the higher yield of 17B over 14A correlated with higher expression of psNTP9, but this correlation would have to be tested in many other field trials to rigorously evaluate whether yield was consistently related to the level of psNTP9 expression. The seed yield increases in the greenhouse and in initial field trials in 14A, 16C and 17B were all measured in comparison to the Williams 82 variety of soybean, which is no longer a commercially valuable variety 53 . A critical question was whether transferring psNTP9 into the genetic backgrounds of elite varieties could also increase the yield of those soybeans. Although which line produced the highest yield per acre varied from field site to field site and from year to year, the 17B line more often produced the highest yield in most field studies, just as it did in greenhouse studies 19 , thus it was selected to be introgressed into elite lines. When field tests of relative yields from these introgressed events were carried out, once again events expressing psNTP9 had significantly higher yields than the null controls. Because trait integration for further commercial development is typically concentrated on the top introgression lines of targeted backcrosses, the average yield increase of 10-30% in the top two lines of 17B introgressed into elite lines would be considered of high potential value for commercial development.
Several of the phenotypic changes observed in the transgenic lines could help account for the increased seed yields. For example, the significantly elevated chlorophyll contents of leaves of all transgenic lines indicates a higher nitrogen (N) content, which is consistent with the observed up-regulation of important N-assimilation genes in 14A plants. Similarly, the increased protein contents of transgenic leaves would be supported by increased N assimilation and the remarkable number of up-regulated genes related to protein synthesis in 14A leaves supports this as a major biosynthetic activity that differentiates transgenic lines from WT. Nitrogen is the main nutrient remobilized from senescing leaves and plays a major role in determining seed yields and quality 54 . Increased N remobilization from leaves of transgenic lines during later stages of development may help Figure 9. Leaves from 14A and 17B transgenic soybean lines display a significant decrease in stomatal density and stomatal coverage compared to wild-type (WT). Analysis of stomata of WT, 14A and 17B leaves shows (a) lower stomatal density for 14A and 17B leaves (p < 0.05), and (b) smaller percent area covered by stomata for 14A and 17B leaves compared to WT (p < 0.001). Data represent means ± SE, with statistical differences based on the Student's t-test. www.nature.com/scientificreports/ to explain increased yields in these lines. Of course, higher chlorophyll contents would also support higher rates of photosynthesis leading to higher tissue mass, which had been previously observed in the transgenic lines 19 . A drought tolerance and increased yield phenotype in soybean plants ectopically expressing psNTP9 was previously reported 19 . These plants exhibited both an enhanced root system architecture for water uptake and decreased transpiration water loss in detached stems and leaves. In the present study, genome-wide expression profiling of WT and 14A leaves was used to further investigate molecular mechanisms that may regulate the drought tolerance phenotype. Gene expression data support enhanced cuticle development and decreased stomatal density in 14A leaves and these phenotypes were experimentally confirmed.
The cuticle plays a major role in regulating transpiration water losses from terrestrial plant tissues 35 . Gene expression profiles suggest an increased synthesis of cuticular wax in the transgenic line 14A. Increased expression of several lipid transfer proteins in this line may increase export of waxes into epidermal cell walls 45 , increasing cuticle thickness. Crops with more cuticular wax have improved water use efficiency (WUE) and increased drought tolerance and yields than non-waxy crops 47 . Increased cuticle thickness in 17B leaves, relative to WT or 14A leaves may contribute to the observed drought tolerance of this line.
Previous studies, however, have generally found that cuticle water permeability (CWP) is poorly correlated with cuticle thickness or total amount of wax or cutin. Cuticle substructure and wax composition appear to be more important 35,55 . A high proportion of non-polar wax constituents like alkanes, rather than nonaliphatic wax compounds appears to be associated with lower CWP 35 . In Arabidopsis, changes in CER1/CER3 expression 39,56 or CER3 activity 57 regulate alkane synthesis. Increased synthesis of alkanes resulted in increased cuticle and epidermal cell wall thickness and drought tolerance in this plant 57 . The coordinated down-regulation of CER4 and up-regulation of CER1/CER3 in 14A plants supports increased activities in the alkane-forming branch of wax synthesis. Alkanes are the major wax constituent in soybean leaves 58 and this alkane fraction increases markedly in response to drought stress in soybean and other plants 47,58 . Thus, the predicted alkane-enriched cuticles of 14A leaves would be expected to be less permeable than those of WT leaves, and this was shown experimentally for both 14A and 17B lines. Lower cuticle permeabilities in these plants likely contribute to their improved drought tolerance 19 .
Decreased stomatal density in 14A leaves may result from decreased expression of several SPCH and MUTE transcription factor genes. The mechanisms by which this occurs are unknown. However, reduction of stomatal density (SD) via targeted changes in the genetic network that regulates stomatal development has produced plants with enhanced WUE and drought tolerance 59 . This results from reduced stomatal transpiration, which is the major contributor to leaf transpiration 60 . The combination of reduced SD and increased trichome density has been shown to improve WUE in tomato plants 61 . Reduced SD may also impact cuticular transpiration indirectly. Márquez et al. 60 proposed that "pores" in the cuticle around stomata and trichomes are the main source of gas leakage across the cuticle. Thus, decreased SD in 17B leaves would be expected to decrease both cuticular and stomatal transpiration. Stomatal transpiration is also regulated by changes in stomatal aperture. Previous studies with Arabidopsis have shown that apyrases regulate stomatal aperture by modulating eATP levels in guard cells 62,63 and in the previous greenhouse study we showed that expression of psNTP9 in soybean increased the sensitivity of the stomatal response to ABA treatment 19 .
Enhanced cuticle development and decreased SD in psNTP9 transgenic lines would result in physical barriers that may further limit insect and pathogen attack 35,64,65 , perhaps, contributing to increased seed yields in field-grown plants.
It is remarkable that the drought-tolerance phenotype is observed in greenhouse-grown transgenic lines which did not experience prolonged drought stress. Decreased cuticular transpiration (lower cuticle permeability) and stomatal transpiration (decreased stomatal density) may increase yields by enabling the plant to better adapt to actual water stress conditions in the field, which may occur intermittently during the growing season. This attribute will be even more important with climate change 66 .
Many of the trait changes observed in the transgenic lines likely resulted from the gene expression changes that were induced by the ectopic expression of psNTP9 in these lines, and the probable links between specific gene expression changes and trait alterations were noted in the Results section. As to what molecular mechanisms could explain how psNTP9 expression could result in transcriptomic changes, a consideration of two subcellular locales in which the protein functions, i.e., the ECM and/or the nucleus, would be instructive. In peas, the psNTP9 protein has been immunolocalized both in the ECM and in the nucleus 67 , and, in soybean extracts, it co-purifies both with isolated nuclei and with wall preparations (Supplementary Fig. S3). Its primary function in the wall would be as an ecto-NTPDase, to limit the concentration of eATP, a signaling agent that plays a major role in controlling gene expression both in animals and in plants 68 . Its function in the nucleus is as yet undefined. However, the fact that it was originally purified from a chromatin fraction of the nucleus 24 , and that it induces genome wide changes in gene expression, would suggest that its NTPDase activity or, perhaps, protein-protein interactions in the chromatin could have a major impact on transcription.
The special significance of the field trial results reported here at multiple locations in multiple years is that the yields of the transgenic events were consistently enhanced over those of the controls, and the yields of transgenic lines derived from introgressions into various elite soybean varieties were significantly higher than those of the recurrent parents. To help explain the enhanced yields, our report offers plausible mechanisms that are supported by a unique combination of genetic, physiological, anatomical, and agronomic assays.

Methods
Genomic DNA isolation and Southern blot for detection of T-DNA copy number. Soybean genomic DNA was isolated from the 3-6 leaves of transgenic soybean lines by a modification of the method of Dellaporta et al. 69  www.nature.com/scientificreports/ Detection Starter Kit II following the manufacturer's instructions (Roche Applied Science). Fifteen micrograms of genomic DNA were digested with either XbaI or BamHI. Digested DNA was separated on 0.8% by electrophoresis and transferred to a positively charged nylon membrane (Amersham Hybond N). The membrane was hybridized at 42 °C with a digoxigenin-labeled Bar probe 42 °C generated using a PCR DIG Probe Synthesis Kit (Roche) and BarF and BarR primers specific to NOS promoter and NOS terminator sequences flanking the BAR gene in the T-DNA construct, respectively (Table. 1). The blot was washed with 2 × SSC, 0.1% sodium dodecyl sulfate twice for 5 min at room temperature, 0.5 × SSC, and 0.1% sodium dodecyl sulfate twice for 15 min at 68 °C. The signals were detected by anti-digoxigenin antibody and chemiluminescent substrate for alkaline phosphatase (Sigma) followed by exposure on X-ray film.

Identification of T-DNA insertion sites via PCR-based methods.
For the identification of the insertion site of the T-DNA, two PCR-based methods of thermal asymmetric interlaced polymerase chain reaction (TAIL-PCR) and inverse PCR were used. TAIL-PCR was used to determine the insertion loci of transgenic soybean by a modification of the method of Singer and Burke 70 . The PCR condition and program TAIL-PCR were carried out with three T-DNA vector-specific primers (LB 1-3) and six arbitrary degenerate primers (AD 1-6) (Supplementary Table S1). Three PCR reactions (primary, secondary and tertiary TAIL PCR) were performed. The tertiary TAIL-PCR product was cloned into the pGEM-T easy vector and sequenced by Sanger sequencing at the University of Texas Institute of Cellular and Molecular Biology core facility. Inverse PCR was also used to identify the insertion sites of T-DNA in transgenic soybean lines. The inverse PCR method was performed referring to Rønning et al. 71 with slight modifications. Ten micrograms of genomic DNA were digested with either BamHI or XbaI. The digested DNA was self-ligated with T4 DNA ligase. The ligated DNA fragments were used as the DNA templates for primary PCR with the primer set of LB1/Bar3R or NTP1R/NTP5F. Using the primary PCR product, secondary PCR was performed using the nested primer set of LB2/Bar2R or NTP6F/35SproR specific to the T-DNA construct. The primers used in inverse PCR are found in Supplementary Table S1. The PCR products were confirmed on the agarose gel. The confirmed secondary PCR product was sequenced by Sanger sequencing.
The sequences from TAIL PCR and inverse PCR were analyzed using Quick Wm82 Genome BLAST at Soy-Base (https:// www. soyba se. org/) to identify integration loci in soybean chromosomes 72 .

PCR validation of T-DNA insertions.
Primers were designed to generate an amplicon that contains a small portion of 5′ T-DNA end and 5′ flanking region or 3′ T-DNA end and 3′ flanking region. Primer sets of 35SproR/Gm11R, Gm03F/LB3, and LB3/GM14R were used for the insertion loci validation of soybean transgenic lines of 14A, 17B, and 16C, respectively (Supplementary Table S1). Untransformed soybean genomic DNA and distilled water were used as negative controls in PCR reactions. PCR was performed using genomic DNA of three transgenic plants and PCR SuperMix (Invitrogen, (Thermo Fisher Scientific) according to the manufacturer's instructions. The PCR program was as follows: 94 °C (4 min); 28 cycles of and 94 °C (30 Re: revised manuscript submitted; one query from editor sec), 53 °C (30 s), and 68 °C (1 min); and then 72 °C (7 min). Amplified products were analyzed on 1% agarose gels.
Transgenic events, elite varieties, and introgression lines. The parent variety, Williams 82, was developed by the USDA-ARS and the Illinois Agriculture Experiment Station and released in 1981. It is a late group III indeterminate variety (relative maturity 3.8). Williams 82 was selected because of its ability to facilitate plant genetic transformation. Williams 82 was also the cultivar used for soybean whole-genome shotgun sequence of Glycine max 73 . Three independent transgenic events of psNTP9, designated as 14A, 16C and 17B, were previously characterized 19 . Null siblings that did not contain psNTP9 gene were segregated from T2 population of each transgenic event and used as negative control. In order to integrate psNTP9 gene into newer varieties for commercial product development, lead events were backcrossed into several elite varieties differing in relative maturity. The soybean elite varieties chosen as recurrent parents for backcrossing, included A1900 (relative maturity 1.9), DSR262 (relative maturity 2.6), and A3431 (relative maturity 3.4). Field trial data acquisitions. In the course of plant growth and development, agronomic performance data were collected for gene efficacy evaluation, in addition to soybean seed yield at harvest. Agronomic data included days to 50% emergence, early plant stand, initial plant density, seedling vigor scoring, plant height (at V2-V3 and R6-R7), days to 50% flowering, days to 50% maturity, lodging score, shattering score, plant stand at R8, flower color, seed moisture (%) and test weight at harvest, grain weight, arthropod counts, pest damage and disease infestation. www.nature.com/scientificreports/ Field trial statistical analyses. Agronomic and yield data from multiple locations over multiple years were analyzed using a linear mixed model. The locations were analyzed individually as well as grouped by country and a global analysis of all locations combined. The individual location analysis used the following model:

Multiple year and multiple
• Notation: • y ij denotes the unique individual observation.
• µ denotes the overall mean.
• g i denotes the mean of the ith entry.
• r j denotes the effect of the jth block.
• Ε ij denotes the effect of the plot assigned the ith entry in the jth block (residual error).
• ~ iid N(0, σ 2 ) indicates random variables that are identically and independently distributed (iid) as normal with zero mean and variance σ 2 .
The combined location analysis used the following model: • Notation: • y ijk denotes the unique individual observation.
• µ denotes the overall mean.
• g i denotes the mean of the ith entry.
• l j denotes the effect of the jth location.
• rk (j) denotes the effect of the kth block within the jth location.
• (gl) ij denotes the interaction between the entries and locations.
• e ijk denotes the effect of the plot assigned the ith entry in the kth block of the jth location (residual error).
• ~ iid N(0,σ 2 ) indicates random variables that are identically and independently distributed (iid) as normal with zero mean and variance σ 2 .
The mixed model analyses of variance were conducted using SAS Proprietary Software version 9.4 (SAS Institute, 2015). The level of statistical significance was predetermined to be 5% (p = 0.05).
Transcriptome analyses of field-grown and greenhouse-grown WT and transgenic soybean lines. For RT and qRT-PCR analyses of mature leaf tissue from field-grown soybean lines (including Williams 82 WT, and the three transgenic lines, 14A, 16C and 17B), the harvested leaves were frozen, and leaf discs (0.5 cm in diameter) were excised from them, frozen and shipped frozen to the University of Texas. For each analysis, three leaf discs were sampled and ground frozen by mortar and pestle. Texas Crop Science, LLC contracted the field trials and had permission to collect the soybean leaves.
For RT-PCR and qRT-PCR, 1 μg total RNA was extracted using the Spectrum Plant Total RNA kit (Sigma) for semi-RT PCR. The RNA was converted to cDNAs by using the High Capacity cDNA Reverse Transcription kit (Applied Biosystems) with 10 × RT Random Primers (Applied Biosystems). For RT-PCR, reaction mixtures contained 25 μL of Quick-Load Taq 2 × Master Mix, 1 μL of 10 µM forward primer, 1 μL of 10 µM reverse primer and 2 μL of cDNA in final volume of 50 μL. The PCR protocol had an initial denaturation at 95 °C for 30 s, and 36 cycles of denaturation at 95 °C for 30 s, annealing at 53 °C for 1 min, and extension at 68 °C for 1 min (BAR, TUB) or 1.5 min (psNTP9). The final extension was at 68 °C for 5 min. After PCR, 1% agarose gel was used for electrophoresis. RT-PCR primers used are found in Supplementary Table S1 of Veerappa et al. 19 .
For RNA-seq and qRT-PCR analyses of DE genes of greenhouse-grown WT and transgenic plants, total RNA was extracted from frozen leaf tissues of 45-day-old (V5 developmental stage) 14A and WT plants grown in the greenhouse without supplemental lighting (14 h light). Care was taken to make sure that all the leaves harvested were the same size and developmental stage and that the plants were also healthy and at equivalent developmental stages and age. Total RNA was extracted from frozen tissues using Sigma Purelink RNA isolation kit. An Invitrogen DNaseI kit was used for digestion of genomic DNA, and a High capacity cDNA synthesis kit (AB Biosystems) was used for cDNA preparation. y ij = µ + g i + r j + e ij , r j ∼ iid N 0, σ 2 Rep and e ij ∼ iid N(0, σ 2 plot). www.nature.com/scientificreports/ qRT-PCR primers for psNTP9 are found in Supplementary Table S1. qRT-pCR primers for the reference gene ELF1B are found in Supplementary Table S2. qRT-PCR primers for the DE target genes are found in Supplementary Table S2. Primers were designed using NCBI Primer-BLAST 75 so that one primer of each pair spanned an exon-exon junction, preventing amplification of gDNA. Reactions (20 µL) contained 5 μL of cDNA (1 ng/µL), 0.4 µL of each primer (10 μM), 10 μL of Power SYBR Green master mix (Applied Biosystems, USA), and 4.2 μL nuclease-free water. qRT-PCR was conducted using a ViiA7 Real-Time PCR System (Thermofisher Scientific) as follows: 95 °C for 10 min, followed by 40 cycles of 95 °C for 30 s, 58 °C for 30 s, 72 °C for 30 s in 96-well optical reaction plates (Applied Biosystems, USA). Expression of reference gene ELF1B 74 was used to normalize target gene expression in three replicates. Relative expression was calculated using the DDCT method 76 . Dissociation curve analyses were used to check for amplification of homogenous products.
GO term enrichment analyses. Analyses of up-or down-regulated DEG lists for GO BioProcess term enrichment were conducted using the Soybase GO Term Enrichment Tool (https:// soyba se. org/ gosli mgrap hic_ v2/ dashb oard. php). Overrepresented processes were defined as being enriched ≥ fivefold (p ≤ 0.05, Bonferroni correction). Arabidopsis thaliana Col database resources within BioCyc (https:// biocyc. org/ ARA/) were used to retrieve genes, annotation and pathways information related to cuticle development, an overrepresented GO BioProcess. Soybase genome annotation (Wm82.a4.v1) data were used to identify soybean orthologs of Arabidopsis genes.
Western blot analysis of psNTP9 protein extracted from different tissues of 3 soybean lines. Tissue was harvested from either greenhouse-grown or field grown WT, 14A, 16C, and 17B soybeans, and frozen. Frozen tissues were homogenized with a bead beater or with mortar and pestle and immediately dissolved in 150 μL of protein isolation buffer containing 10 mM Tris-HCl pH 8.0, 1 mM EDTA, 0.5 mM EGTA, 1% Triton, 0.1% SDS, 140 mM NaCl, 5 mM β-mercaptoethanol. Protein quantity was measured with Bradford assay, using the BioRad reagent, and equal loading was assayed by actin immunostaining. Proteins were separated by SDS PAGE and transferred to a PVDF membrane by semi-Dry transfer method. Blots were incubated with 8B6 monoclonal antibody (GeneScript; 1:50 dilution) and goat anti-mouse fluorescent antibody (1:5000 dilution). Fluorescent images were detected using Odyssey infrared imaging system (Licor). Although the full length of each blot was imaged, only the portion of each blot above the 25 kDa marker is presented in the Results, since no immunostaining was detected below this portion in any of the transgenic lines.
Total chlorophyll measurement. Fresh leaves of thirty-four-day-old greenhouse grown wild type and transgenic plants at V3 stage (the third trifoliate stage) were ground using liquid nitrogen. The photosynthetic pigments were extracted using 80% acetone. Total chlorophyll content was determined using a spectrophotometer following the protocol of Fritschi and Ray 79 .
Determination of soluble leaf proteins. Fresh leaves of thirty-four-day-old greenhouse grown wild type and transgenic plants at V3 stage (the third trifoliate stage) wild type and transgenic plants were frozen in liquid nitrogen. The soluble leaf protein content was determined by the method of Bradford 80 . Cuticular transpiration and chlorophyll leaching assays. V5 stage leaves of well-watered Williams 82 (wild type) and independent psNTP9 transgenic lines (14A and 17B) were used for cuticular transpiration and chlorophyll leaching assays. For the cuticular transpiration assay, plants were placed in the dark for 6 h to facilitate stomatal closing, then leaves were detached and 0 h weights were recorded. Three replicates were placed in darkness at room temperature and 40% relative humidity. Leaves were then weighed at 1 h intervals for 7 h. Leaf weights at each time point were calculated as percentages of 0 h leaf weights.
For the chlorophyll leaching assay, V5 stage soybean leaves of well-watered plants were detached and immersed in 40 mL of 80% (v/v) ethanol in a 50 mL capped tube, then placed on a rocker platform in the dark. A series of 100 μL aliquots were taken after 3, 4, and 4.5 h immersion and extracted total chlorophyll was calculated as µmol chlorophyl = [7.93 × A 664 + 19.53 × A 647 ], according to Lee et al. 48 . Chlorophyll leaching at each time point was calculated as percentages of the total chlorophyll extracted at 24 h after immersion.
Purification of cell wall proteins and of nuclei from etiolated soybean seedlings. Wall proteins were purified from hypocotyl tissues of etiolated 6-d-old soybean seedlings, using the method of Kim et al. 51 . As a negative control to test for cytoplasmic contaminants in the wall extract, the preparation was tested by immunoblot analysis for the presence of actin. Nuclei were purified by the method of Chen et al. 24 , using hook tissue from 6-day-old etiolated soybean seedlings. www.nature.com/scientificreports/ Microscopic analyses of leaves. For determination of determination of cuticle and wall thickness transmission electron microscopy (TEM) was used. Leaves (V5 developmental stage) of approximately the same size (8-10 cm length) were cut from 45-day-old 17B and WT plants grown in the greenhouse. Small pieces approximately 1 × 1 cm were excised from each leaf and immersed overnight at room temperature in a mixture of EM-grade aldehydes containing 4% glutaraldehyde and 2% formaldehyde in 0.05 M cacodylate buffer, pH 7.4. After three buffer rinses, the pieces were cut into 1 mm strips which were then immersed overnight at room temperature in 0.5% Ruthenium Red in 0.1 M cacodylate buffer. The strips were then washed with cacodylate buffer and immersed in reduced osmium (2% osmium tetroxide and 2% potassium ferrocyanide in 0.1 M cacodylate buffer, pH 7.4). Osmium fixation of the strips was started in a preparative microwave and was completed with 2 h immersion on ice. Following dehydration in an ethanol series, the leaf pieces were embedded in Epon Hard 812 (Electron Microscopy Sciences, https:// www. emsdi asum. com), from which 70 nm thin sections were cut with a diamond knife, picked up on Formvar-coated grids, and imaged with a Tecnai T12 TEM operated at 80 kV. During thin sectioning, care was taken to position the diamond knife perpendicular to the leaf surfaces to insure accuracy of subsequent measurements. Measurements of cell wall and cuticle thickness were made from TEM images at instrumental magnification of 26,500× and 43,000× with Image J, using the scale embedded in each image to calibrate the line tool. Measurements were made approximately every 500 nm, typically three to five measurements were taken at equal intervals across each image for both the cuticle and cell wall. For each cell measured, the thickness of the cuticle and wall was measured by lines originating at and perpendicular to the cuticle-cell wall junction. It was observed that the wall thins near cross walls, so measurements were not taken from these regions.
For determination of stomatal density and stomatal coverage, leaf peels (20-25 per line) were taken from the abaxial surface of soybean leaves (V5 developmental stage) 14A, 17B and WT plants grown in the greenhouse. Peels were immediately transferred to glass slides with 100 µL of Arabidopsis leaf buffer (10 mM KCL, 25 mM MES, pH 6.15). The prepared slide was then imaged with a 40 × objective lens. Image analysis was conducted using ImageJ. First, the area in each peel with the stomatal layer both visible and in focus was calculated and then a measurement of the area of each stoma was taken with the polygon tool. The total area of stomata visible over total peel area determined the percent area of stomatal coverage and the number of stomata present over the total visible area of a particular peel determined the stomatal density.
For trichome length analyses, similar-sized leaves (8-10 cm) from the growing tips of wild-type and 17B plants were collected. The leaves were cut into 1 cm strips, perpendicular to the midrib, and the strips were stored in 70% ethanol overnight. The cleared leaf pieces were placed in 50% bleach until they were colorless and then transferred to water.
Leaf pieces were then held between two glass slides with large spring clips which flattened and collapsed the trichomes into a single image plane. Using light microscopy, pictures of the flattened trichome samples were captured. Measurements of trichome length were carried out using ImageJ. As many trichomes were measured as possible to get a representative distribution. Approximately 200 trichomes of each group were measured (total n = 817), representing about 70% of the total trichomes visible. Certain trichomes were difficult to follow from start to end through the entirety of the trichome and thus were not included. Statistical analysis was carried out using the student's t-test.