Characterization of glycerol-3-phosphate acyltransferase 9 (AhGPAT9) genes, their allelic polymorphism and association with oil content in peanut (Arachis hypogaea L.)

GPAT, the rate-limiting enzyme in triacylglycerol (TAG) synthesis, plays an important role in seed oil accumulation. In this study, two AhGPAT9 genes were individually cloned from the A- and B-genomes of peanut; they shared a sequence similarity of 95.65%, with 165 site differences. Overexpression of AhGPAT9 increased the seed oil content, whereas knock-down of its expression decreased it. Allelic polymorphism analysis was conducted in 171 peanut germplasm accessions: 118 polymorphic sites in AhGPAT9A formed 64 haplotypes (a1 to a64), while 94 polymorphic sites in AhGPAT9B formed 75 haplotypes (b1 to b75). The haplotype analysis showed that a5, b57, b30 and b35 were elite haplotypes related to high oil content, whereas a7, a14, a48, b51 and b54 were low oil content types. Additionally, haplotype combinations a62/b10, a38/b31 and a43/b36 were associated with high oil content, but a9/b42 was a low oil content haplotype combination. The results provide valuable clues for breeding new lines with higher seed oil content using hybrid polymerization of high-oil alleles of the AhGPAT9A and AhGPAT9B genes.

[Figure: amino acid sequence alignment of AhGPAT9A, AhGPAT9B, AtGPAT9, BrGPAT3, GmGPAT9, MtGPAT, VaGPAT3, ViGPAT3, VuGPAT9 and CiGPAT9.]
To reveal the relationships of the GPAT genes, a phylogenetic tree was constructed using GPAT proteins from different plants (Fig. 1D). All GPATs were divided into three main clades, and AhGPAT9 together with GPAT9 proteins from other plants were grouped in clade I, which consisted of ER-localized proteins, including AtGPAT9 (Arabidopsis thaliana), GmGPAT9 (Glycine max), VuGPAT9 (Vigna unguiculata), etc. The GPAT1 to GPAT8 proteins were clustered in clade II, which comprised membrane-bound proteins, while the ATS proteins were grouped in clade III, which comprised soluble proteins located in the chloroplast.
Overall, there were few differences in the primary and advanced structures of the two AhGPAT9 proteins, and they may have similar functions.
Overexpression and antisense transformation of AhGPAT9 genes. Analysis of the relative expression of the AhGPAT9 genes by quantitative real-time PCR (qRT-PCR), using AhACT11 as a reference, revealed that the two genes exhibited specific temporal and spatial expression patterns in different tissues and that the seeds exhibited the highest transcript accumulation (Fig. 2). The expression of the AhGPAT9 genes reached its maximum at 42 DAP, which was consistent with the oil accumulation rate in peanut seeds. These results suggested that AhGPAT9 may play important roles in peanut seeds.
To clarify the function of AhGPAT9 in the oil accumulation process of peanut seeds, we constructed an overexpression vector (AhGPAT9-OE) and an anti-sense expression vector (AhGPAT9-AE) (Fig. 3A) and introduced the two constructs into Agrobacterium tumefaciens, which was subsequently used to transform the FH2 cultivar. The presence of the AhGPAT9 transgene was confirmed in the T0, T1 and T2 generations by PCR (Fig. 3B). In 2016, a total of 20 AhGPAT9-OE T1 (GT1) plants and 24 AhGPAT9-AE T1 (RT1) plants were generated. The seed oil content of the 20 GT1 plants ranged from 48.79% to 57.38%, with a mean value of 52.42%. Relative to wild-type FH2, which had an average oil content of 49.76%, the oil content of the GT1 plants was increased by 5.35%. Sixteen GT1 plants had a higher seed oil content than the wild-type (Fig. 3C). In contrast, the seed oil content of the 24 RT1 plants ranged from 40.74% to 51.62%, with a mean value of 46.43%, representing a decrease of 6.70% compared with the wild-type plants. Eighteen RT1 plants showed a lower seed oil content than FH2 (Fig. 3C).
In 2017, the seed oil contents of the GT2 and RT2 transgenic plants were measured. The average oil content of seven GT2 plants derived from G1-4 was 54.58%, and it was 53.73% for seven GT2 plants derived from G9-2. Both values were significantly higher than that of the wild-type (Fig. 3D). The mean oil content of the 14 GT2 transgenic plants was 54.16%, representing an increase of 8.82% compared with wild-type FH2 (Fig. 3D). In addition, the average oil content of five RT2 plants derived from R1-1 was 45.01%, and it was 44.73% for the other five RT2 plants derived from R12-3; both groups showed significantly lower oil contents than the wild-type plants (Fig. 3E). The mean oil content of the ten RT2 plants was 44.87%, corresponding to a decrease of 9.13% compared with the wild-type plants (Fig. 3E).
Two GT2 plants (G1-4-4 and G9-2-2) and two RT2 plants (R1-1-1 and R12-3-3) were selected to produce the GT3 and RT3 transgenic lines, respectively, in 2018. We tested the expression of the AhGPAT9 gene in the transformed peanut plants (G1-4-4, G9-2-2, R1-1-1 and R12-3-3) using qRT-PCR with gene-specific primers. As shown in Fig. 3F, AhGPAT9 expression in the overexpression lines (G1-4-4 and G9-2-2) was much higher than that in wild-type FH2. In contrast, the expression levels of the AhGPAT9 gene in the anti-sense-expressing transgenic lines (R1-1-1 and R12-3-3) were lower than those in FH2 (Fig. 3F). In general, the expression levels of the AhGPAT9 gene in the overexpression lines (G1-4-4 and G9-2-2) were higher than those in lines R1-1-1 and R12-3-3 (Fig. 3F). We also measured the total lipid contents of FH2 seeds and seeds from the AhGPAT9-overexpressing lines and AhGPAT9-anti-sense-expressing lines. The oil content of the G1-4-4 lines was 54.21%, and it was 54.27% for the G9-2-2 lines, representing relative increases of 9.55% and 9.67% compared with wild-type FH2, respectively (Fig. 3G). The mean oil content of the two GT3 lines was 54.24%, which was 4.75 percentage points higher than that of the wild-type (49.49%, Fig. 3G). In contrast, the oil content of the R1-1-1 lines was 45.29%, and it was 46.31% for the R12-3-3 lines, representing relative decreases of 8.48% and 6.42% compared with wild-type FH2, respectively; the mean oil content of the two RT3 lines was 45.80%, which was significantly lower than that of wild-type FH2 (Fig. 3G). Some phenotypic traits of the GT3 and RT3 transgenic lines were also measured. Compared with wild-type FH2, there were no significant differences in the main stem height, lateral branch length, pod length or seed size in the transgenic lines (S2 Fig).
These results indicated that overexpression or antisense expression of AhGPAT9 had little influence on plant growth, but seed oil accumulation could be promoted by the overexpression and suppressed by the antisense inhibition of AhGPAT9.
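The percentage changes quoted for the transgenic lines are relative to the wild-type mean, which is easy to confuse with a difference in percentage points. The following minimal Python sketch recomputes both quantities from the T3 values reported above; the function name `relative_change` is illustrative and not part of the original analysis.

```python
def relative_change(value, reference):
    """Percent change of `value` relative to `reference`."""
    return (value - reference) / reference * 100.0

# Wild-type FH2 mean oil content (%) in the T3 comparison (Fig. 3G)
wild_type = 49.49
# Oil contents (%) of the T3 lines as reported in the text
t3_lines = {"G1-4-4": 54.21, "G9-2-2": 54.27, "R1-1-1": 45.29, "R12-3-3": 46.31}

for line, oil in t3_lines.items():
    points = oil - wild_type  # absolute difference in percentage points
    print(f"{line}: {points:+.2f} points, {relative_change(oil, wild_type):+.2f}% relative")
```

Small deviations in the last digit from the published percentages are expected, because the inputs here are already rounded to two decimals.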
Allelic polymorphism analysis of AhGPAT9 in peanut germplasm. Peanut germplasm resources exhibit great differences in oil content. The sites of sequence polymorphism in AhGPAT9A identified in 171 peanut germplasm accessions are summarized in S1 Table, and the allelic polymorphism information of AhGPAT9B is supplied in S2 Table. A total of 118 polymorphic sites were identified in AhGPAT9A, including 92 SNPs and 26 InDels, with one SNP occurring every 60 bp and one InDel every 213 bp on average; thirteen SNPs in exon regions caused amino acid changes (S1 Table). The frequency of polymorphic sites in AhGPAT9A in the 171 accessions ranged from 0.58% to 12.28%. The nucleotide diversity π (i.e., the average pairwise sequence difference between two random sequences in a sample [43]) in the sequenced region of AhGPAT9A was 0.00085, and the per-site variation frequency (Theta) in AhGPAT9A was 0.00364. Tajima's D statistic was estimated to test whether the SNPs were neutral mutations [43,44]; the D value for AhGPAT9A was -2.41743, which was extremely significant (p < 0.01), indicating that the nucleotide variation in AhGPAT9A departs from the standard neutral model. Allelic polymorphism analysis of AhGPAT9A showed that the 171 accessions could be divided into 64 types, designated haplotypes a1 to a64, among which 86 materials were of the a1 type (Fig. 4A). In AhGPAT9B, a total of 94 polymorphic sites were identified, including 72 SNPs and 22 InDels (S2 Table); on average, one SNP was detected every 77 bp and one InDel every 251 bp. The frequency of polymorphic sites in AhGPAT9B ranged from 0.58% to 13.45%. The nucleotide diversity π was 0.00129 among the 171 accessions, and the value of Theta was 0.00316. The Tajima's D value for AhGPAT9B was -1.87261, which was also statistically significant (p < 0.05).
Allelic polymorphism analysis of AhGPAT9B showed that the 171 peanut germplasm could be divided into 75 haplotypes, designated b1 to b75, among which 71 materials were of the b1 type (Fig. 4B).
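The diversity statistics above were obtained with DnaSP. As an illustration only, the sketch below re-implements the standard textbook formulas for π, Watterson's Theta and Tajima's D on a toy alignment. It is not the authors' pipeline: the function name `diversity_stats` and the toy sequences are assumptions, gaps are not handled, and the published values cannot be reproduced without the original alignments.

```python
from itertools import combinations
from math import sqrt

def diversity_stats(seqs):
    """Return (pi, theta_w, tajimas_d) for equal-length aligned sequences.

    pi and theta_w are per-site values. Needs at least four sequences and
    at least one polymorphic site; gaps are not treated specially.
    """
    n, length = len(seqs), len(seqs[0])
    # S: number of segregating (polymorphic) sites
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if n < 4 or s == 0:
        raise ValueError("need >= 4 sequences and >= 1 polymorphic site")
    # k: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima's variance approximation
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    d = (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
    return k / length, s / (a1 * length), d

# Toy alignment (hypothetical): one rare variant among four sequences
rare_variant = ["ACGTACGT"] * 3 + ["ACGTACGA"]
pi, theta, d = diversity_stats(rare_variant)
print(f"pi={pi:.4f} theta={theta:.4f} D={d:.3f}")
```

An excess of rare variants makes π smaller than Theta and D negative, which is the pattern reported for both AhGPAT9A and AhGPAT9B.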
AhGPAT9-encoded amino acid sequence analysis in peanut germplasm. Many of the polymorphisms of AhGPAT9 were identified in intron regions, and many of the mutations in coding regions were synonymous. The 64 AhGPAT9A haplotypes produced 12 protein types, designated AP1 to AP12. Sequence analysis showed that the amino acid sequences encoded by AhGPAT9A in 157 peanut accessions were identical to AP1, which was used as the wild-type. Thirteen non-synonymous substitutions detected in the exon regions of AhGPAT9A caused amino acid differences in 14 peanut accessions, and those changes resulted in 11 mutant proteins (Table 1). AP10 contained three mutated amino acids, p.E13G, p.L104S and p.K130R. AP4 contained two mutated amino acids, p.R162Q and p.T184P, resulting from the Ag.2874G > A and Ag.3035A > C substitutions, respectively, which were located in the acyltransferase domain. AP2, AP3, AP5, AP6, AP8, AP11 and AP12 contained only one mutated amino acid (Table 1). The Ag.4595G > T substitution in AP6 changed the codon for amino acid no. 346 into a stop codon, so that the product had 31 fewer amino acids than AP1. Protein spatial structure prediction was performed using the online software I-TASSER [45][46][47]. As shown in Fig. 5, the spatial structures of most mutant proteins showed little difference from AP1, while AP4, AP6, AP7 and AP11 were significantly different from AP1. Detailed information showed that AP7 lacked both the binding sites for the G3P substrate and the enzyme catalytic centers, while AP11 had five more G3P binding sites (Table 1). Although their spatial structures showed little difference from AP1, AP2 lacked binding sites for the G3P substrate and AP3 had two fewer binding sites, including the typical Arg330 site, whereas AP12 had three more G3P binding sites (Table 1). The 75 AhGPAT9B haplotypes produced seven protein types, designated BP1 to BP7.
Sequence analysis showed that the amino acid sequences encoded by AhGPAT9B in 165 peanut accessions were identical to BP1, which was used as the wild-type. Six non-synonymous substitutions detected in exon regions caused amino acid differences in six peanut accessions, and those changes resulted in six mutant proteins (Table 1). BP2 contained two mutated amino acids, p.L291F and p.L296P, resulting from the Bg.4328 T > C and Bg.4344 T > C substitutions, respectively, and these changes led to great differences in the protein spatial structure (Fig. 5). BP5 and BP6 each carried one mutated amino acid; their protein spatial structures were significantly different from that of BP1, and they lacked both the binding sites for the G3P substrate and the enzyme catalytic centers (Table 1, Fig. 5). Although BP7 lacked G3P binding sites because of the p.M109T amino acid change, its spatial structure showed little difference from BP1 (Table 1, Fig. 5).
Detection of oil content in peanut germplasm. The oil content of the 171 peanut germplasm accessions was measured from 2014 to 2017 (Table 2); the highest oil content was 57.78% and the lowest was 40.01%, showing a wide range among accessions. The broad-sense heritability ($h_B^2$) of the oil content was 84.6%, and the coefficient of variation (CV) values were all around 5%, showing that the oil content was mainly determined by genotype and was less influenced by the environment. The average oil content of the 171 peanut varieties was 49.52%, and the phenotypic distribution histograms across four consecutive years showed near normality (Fig. 6), indicating that there were some major genes controlling peanut oil traits and that AhGPAT9 may be one of the most important genes.
Table 1. Characteristics of different protein types encoded by AhGPAT9 in peanut germplasm. Protein types AP1 to AP12 were obtained from AhGPAT9A, while BP1 to BP7 were obtained from AhGPAT9B, and their amino acid differences compared with AP1 and BP1, respectively, were determined. p.F365S indicates a change from phenylalanine to serine at amino acid no. 365 in AP2. · indicates that the amino acid changes were located in the acyltransferase domain. * indicates that the amino acid was terminal. The amino acids Phe, Glu and Arg in bold format represent typical G3P binding sites in GPAT9 proteins.
In the germplasm with the combined a- and b1 (a-/b1) haplotype, a5 showed a positive phenotypic effect of 9.23% (Table 3), with an oil content of 54.05%, while the oil contents of the a7, a14 and a48 germplasm were significantly lower than the mean value of a-/b1, with negative phenotypic effects of 3.32%, 5.79% and 7.61% (Table 3), respectively.
However, among the a1/b- germplasm, the oil contents of the b57, b30 and b35 germplasm were 56.89%, 54.59% and 53.98%, respectively, which were significantly higher than the average of the a1/b- germplasm, with positive phenotypic effects of 14.54%, 9.91% and 8.68% (Table 3), respectively, while the b51 and b54 germplasm exhibited negative effects of 10.83% and 11.95%, with oil contents of 44.29% and 43.73% (Table 3), respectively. In addition, the oil contents of the germplasm with haplotype combinations a62/b10, a38/b31 and a43/b36 were significantly higher than that of the a1/b1 germplasm, with positive effect values of more than 5%, while the oil content of the a9/b42 germplasm was 45.93% (Table 3), which was significantly lower than that of the a1/b1 germplasm. Therefore, for the AhGPAT9 genes, we speculated that a5, b57, b30 and b35 were elite haplotypes associated with high oil content, while a7, a14, a48, b51 and b54 were low-oil haplotypes. Additionally, a62/b10, a38/b31 and a43/b36 were haplotype combinations associated with high oil content, but a9/b42 was a low-oil combination.
Figure 6. Phenotypic distribution histograms of seed oil content in peanut germplasm. The x-axis shows groups with different oil content ranges, and the y-axis shows the number of lines in each group.
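The text does not define the phenotypic effect explicitly. The reported values are consistent with a relative deviation of the haplotype mean from its group mean, and the Python sketch below checks that reading for the b-haplotypes. Both the formula and the group mean of 49.67% are assumptions back-calculated from the reported numbers, not values stated in the text.

```python
def phenotypic_effect(haplotype_mean, group_mean):
    """Relative deviation (%) of a haplotype mean from its group mean."""
    return (haplotype_mean - group_mean) / group_mean * 100.0

# Assumed mean oil content (%) of the a1/b- group, back-calculated from the
# reported effects; this number does not appear in the text.
A1_B_GROUP_MEAN = 49.67

# haplotype: (reported oil content %, reported phenotypic effect %)
reported = {
    "b57": (56.89, 14.54),
    "b30": (54.59, 9.91),
    "b35": (53.98, 8.68),
    "b51": (44.29, -10.83),
    "b54": (43.73, -11.95),
}

for hap, (oil, effect) in reported.items():
    computed = phenotypic_effect(oil, A1_B_GROUP_MEAN)
    print(f"{hap}: computed {computed:+.2f}%, reported {effect:+.2f}%")
```

Under this assumption all five reported effects are recovered to within rounding, which supports reading the effect values as relative rather than absolute differences.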

Discussion
GPATs are important enzymes involved in different metabolic pathways in plants, and their conserved sequences contain four acyltransferase motifs (PF01553), which are critical for catalysis and for binding to the G3P substrate. It has been suggested that motifs II and III are important for substrate binding, while motifs I and IV are responsible for catalysis [48]. GPAT9 belongs to a large glycerolipid acyltransferase family [11]. In Arabidopsis thaliana, the AtGPAT9 gene plays an essential role in the synthesis of storage lipids and membrane lipids [18,19,21,29]. In our study, peanut AhGPAT9 genes homologous to the AtGPAT9 gene were obtained; the amino acid sequences of the AhGPAT9 proteins showed high sequence similarity to AtGPAT9, and they displayed much closer evolutionary relationships with mGPAT3 and mGPAT4 of mammals (Fig. 1D), which have been confirmed to play distinct roles in adipogenesis [26]. The results suggested that AhGPAT9 genes may exhibit similar functions in the synthesis of storage lipids in peanut.
The gene expression patterns across different peanut tissues showed that transcript accumulation was highest in seeds and that the expression of the AhGPAT9 genes reached its maximum at 42 DAP, consistent with Chi's research [39]. These results showed that the expression of the AhGPAT9 genes was in accord with the lipid accumulation rate in peanut seeds, indicating potential roles of AhGPAT9 in seed development (Fig. 2). Furthermore, allele type a1 was used for gene transformation. Compared with wild-type FH2, the seed oil content of the AhGPAT9-OE transgenic plants was significantly increased. Among the a-haplotype accessions, the allele associated with the highest oil content was a5, with an oil content of 54.05% (Table 3), while that of the overexpressing lines was 54.24% (Fig. 3G). Thus, overexpression can raise the oil content of germplasm with an ordinary oil level to that of germplasm carrying the best allele. In contrast, the oil content was significantly decreased in the AhGPAT9-AE transgenic peanut lines, to 45.80% (Fig. 3G). The allele associated with the lowest oil content was a48; the oil content of the accessions carrying a48 was 45.72% (Table 3), while that of the anti-sense-expressing lines was 45.80% (Fig. 3G). Thus, down-regulation can lower the oil content of germplasm with an ordinary oil level to almost that of germplasm carrying the lowest-oil allele. The results indicated that the AhGPAT9 genes that we obtained indeed play a key role in the accumulation of seed oil in peanut. In addition, as an important part of our transgenic work, research on the copy number and integration sites of the AhGPAT9 transgenes is still in progress, and we hope that the gene effects on the target traits can be further clarified.
Recent studies have shown that the seed oil content can be decreased by 26% to 44% in Arabidopsis through AtGPAT9 knockout [19], and AtGPAT9 down-regulation also causes significant decreases in oil content in GPAT9-RNAi lines [21]. Compared with the wild-type, the overexpression of AtGPAT9 in Arabidopsis not only increased the seed oil content significantly, but also increased the TAG content in the leaves of the transgenic lines by 153.3% [21].
However, oil content is a quantitative trait with a complex underlying genetic mechanism. Lipid synthesis in plants is a complicated biological process involving multiple genes. In Arabidopsis, more than 120 enzymatic reactions and 600 genes are involved in oil accumulation [1], which is regulated by TAG synthesis pathways, carbon metabolism, FA synthesis, and even cell differentiation [1,34]. In our study, 171 peanut germplasm accessions were tested for oil content across four consecutive years, and their oil content varied from 40.01% to 57.78%, showing a wide range (Fig. 6). The high broad-sense heritability ($h_B^2$) and low coefficient of variation (CV) values obtained suggested that oil content is mainly determined by genotype, which is consistent with previous studies [38,49]. Although the results from different researchers regarding the genetic dissection of the oil content trait differ considerably, there must be major genes controlling oil content, and GPAT9 may be one of the most important genes. The allelic polymorphism analysis of AhGPAT9 in peanut germplasm will provide us with elite alleles and valuable information for breeding new lines with higher oil content.
Peanut is an allotetraploid species (AABB); thus, two GPAT9 genes were identified among 171 peanut cultivars in our study: AhGPAT9A from the A-genome and AhGPAT9B from the B-genome. A total of 118 allelic polymorphic sites were identified in AhGPAT9A, together with 94 variation sites in AhGPAT9B, including SNPs and InDels (S1 Table and S2 Table). The Tajima's D value for the SNPs in AhGPAT9A was -2.41743, whereas it was -1.87261 for AhGPAT9B; both were statistically significant (p < 0.05), indicating that the nucleotide variation in the AhGPAT9 genes departs from the standard neutral model and that AhGPAT9B may be subject to greater artificial selection pressure. Based on sequence polymorphism analysis, the AhGPAT9A genes of the 171 peanut germplasm accessions showed 64 haplotypes (a1 to a64), while 75 haplotypes were identified in AhGPAT9B (b1 to b75). In our study, 86 of the 171 accessions were of the a1 type, while 71 were of the b1 type; thus, haplotypes a1 and b1 were hypothesized to be the wild-type haplotypes of the AhGPAT9A and AhGPAT9B genes, respectively (Fig. 4). For these two genes, the 171 peanut varieties could be divided into 109 combination types, among which 39 varieties were of the a1/b1 type. The analysis of the deduced amino acid sequences of AhGPAT9 showed that the 64 AhGPAT9A haplotypes produced 12 protein types (AP1 to AP12), while the 75 AhGPAT9B haplotypes produced seven protein types (BP1 to BP7). Many allelic polymorphisms were located in intron regions, and many mutations in the coding regions were synonymous; thus, sequence analysis showed that the amino acid sequences encoded by AhGPAT9A in 157 peanut accessions were identical to AP1, including 52 a-haplotypes (S1 Table). There were 14 peanut accessions that showed a total of 11 mutant proteins (Table 1).
Regarding the proteins encoded by AhGPAT9B, sequence analysis showed that 165 peanut accessions were identified as BP1-type accessions, including 69 b-haplotypes (S2 Table), and the remaining varieties showed six mutant proteins (Table 1).
To study the relationship between the AhGPAT9 haplotypes and oil content, statistical and differential analyses were performed to explore elite high-oil or low-oil haplotypes or haplotype combinations. As a result, a5, b57, b30 and b35 showed significantly positive phenotypic effects, and they were considered elite high-oil haplotypes, while a7, a14, a48, b51 and b54 were speculated to be low-oil haplotypes. Furthermore, a62/b10, a38/b31 and a43/b36 were the best haplotype combinations for high oil content, but a9/b42 was a low-oil combination (Table 3). The deduced protein sequence analysis also suggested some correlations. For example, the Ag.4941 T > C mutation in a7 caused a p.F365S amino acid change in AP2, which lacked binding sites for the G3P substrate, so its enzyme activity might have been decreased to some extent (Table 1). The Ag.2874G > A and Ag.3035A > C mutations in a43 caused amino acid changes in the acyltransferase domain of the AP4 protein, whose spatial structure was also significantly different, and the mutations resulted in the presence of more G3P binding sites in AP4 (Table 1). However, the oil content of the different haplotypes varied significantly and showed great differences even within the same haplotype. For example, the oil content of the a1 varieties varied from 43.72% to 56.89%, while it varied from 45.72% to 54.72% for the b1 varieties, but the mean values were both around 49%, which was the average level for the entire population (S3 Table). The reason may be that oil accumulation is a complex process involving many reactions and regulatory steps [1]. In addition, cultivated peanut is an allotetraploid (AABB) species comprising two genomes with a high repetitive DNA content, and similar genes may have redundant functions [50,51].
A previous study suggested that the homologous ahFAD2A and ahFAD2B genes showed significant additive effects and exhibited multiple effect interactions in regulating the contents of palmitic acid, oleic acid and linoleic acid, and the O/L ratio [52,53]. In Brassica napus, there are three functional FAD2 genes and one non-functional FAD2 gene [54]. Among the three homologous genes of wheat, TaGW2-6B has a greater effect on the one-thousand kernel weight (TKW) than TaGW2-6A; an additive effect has been identified between them, and the combination 6A-A/6B-1 is the most effective [55]. Therefore, in the process of TAG biosynthesis, the two AhGPAT9 genes may exert an additive effect or complementary effects, or only one of them may play a role. AhGPAT9 affects oil content in peanut, but it is not the sole gene determining this trait. Similar results were found in a polymorphism analysis of the TaDREB1 gene in wheat germplasm for the dissection of the drought resistance trait [56].
Oil content is an important quality trait in peanut, but the genetic mechanism controlling peanut oil accumulation remains to be further studied. GPAT9 genes have been confirmed to encode key enzymes in TAG synthesis pathways in peanut and many other plants [18,19,21,57]. In our study, two AhGPAT9 genes derived from the A- and B-genomes were obtained, allelic polymorphism analysis was conducted in 171 peanut germplasm accessions, and a preliminary correlation analysis was performed between alleles and seed oil content. Finally, we hypothesized the existence of high-oil or low-oil haplotypes or haplotype combinations. However, how the two AhGPAT9 genes affect seed oil accumulation needs to be further confirmed. Different hybrid combinations were assembled based on the high- or low-oil haplotypes of the two AhGPAT9 genes. It is expected that the functions of AhGPAT9A and AhGPAT9B in oil accumulation will be analyzed in depth and clarified in the future, and we further hope to create new germplasm with higher oil content via the hybrid polymerization of high-oil alleles of the AhGPAT9A and AhGPAT9B genes.

Materials and methods
Plant materials. A total of 171 cultivated peanut germplasm were used for the isolation and allele genotyping of AhGPAT9. All the peanut materials used in this study were planted from May to September in 2014, 2015, 2016 and 2017 in a test field at the Agricultural Experiment Station of Shandong Agricultural University (36.15°N, 117.15°E), Tai'an, China. The young leaves of each accession were collected and stored at -80 °C for DNA extraction, and the harvested seeds were used for oil content measurement.
Detection of the oil content of peanut germplasm. The oil content was measured using a DA7250 Near Infrared Reflection (NIR) analyzer (Perten Instruments, Sweden) with the reference standard curve used in our previous study [38]. The heritability of oil content was calculated using the equation $h_B^2 = V_G / (V_G + V_E)$, where $V_G$ and $V_E$ represent the genetic and environmental variance, respectively, and each term was extracted from the ANOVA results [58]. The mean value of each accession across four years was used in the statistical analysis, and one-way ANOVA with the least-significant difference (LSD) test was performed [59].
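The text states that the variance terms were taken from the ANOVA results without giving the design. As a minimal sketch of one common way to do this, the Python function below treats years as replicates in a one-way ANOVA and derives the variance components from the mean squares. The design, the function name `broad_sense_heritability` and the toy data are assumptions, not the authors' procedure.

```python
def broad_sense_heritability(table):
    """Estimate h_B^2 = V_G / (V_G + V_E) from a genotype-by-year table.

    `table` holds one row per accession and one column per year; years are
    treated as replicates in a one-way ANOVA (an assumption of this sketch).
    """
    g, r = len(table), len(table[0])
    grand_mean = sum(sum(row) for row in table) / (g * r)
    row_means = [sum(row) / r for row in table]
    # Mean squares between genotypes and within genotypes (residual)
    ms_g = r * sum((m - grand_mean) ** 2 for m in row_means) / (g - 1)
    ms_e = sum((x - m) ** 2 for row, m in zip(table, row_means) for x in row) / (g * (r - 1))
    v_e = ms_e                          # environmental (residual) variance
    v_g = max((ms_g - ms_e) / r, 0.0)   # genetic variance component
    return v_g / (v_g + v_e)

# Hypothetical oil contents (%) for three accessions over two years
toy_oil = [[50.0, 52.0], [54.0, 56.0], [46.0, 48.0]]
print(round(broad_sense_heritability(toy_oil), 3))  # 0.882
```

In the toy data the accessions differ far more from each other than the years differ within an accession, so the estimate is high, which is the same pattern the 84.6% heritability describes for the real germplasm.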
DNA extraction, PCR amplification, and sequencing. Total genomic DNA was extracted from the leaves of each peanut accession using the hexadecyl trimethyl ammonium bromide (CTAB) method [60]. The DNA concentration and quality were estimated using a NanoDrop2000 spectrophotometer (Thermo, USA) and 1% agarose gel electrophoresis, by comparison with the relative migration and intensity of a standard 5 kb ladder (Takara Bio Inc, Japan). AtGPAT9 (AT5G60620) from Arabidopsis thaliana was used as a query against the peanut database of expressed sequence tags (dbEST) in a BLAST analysis to retrieve homologous expressed sequence tags (ESTs). Ultimately, a single homologous sequence with an open reading frame (ORF) was established. We used the homologous sequence as a query to search PeanutBase (https://peanutbase.org/). Two peanut GPAT9 DNA sequences belonging to the wild species A. duranensis (AA) and A. ipaensis (BB) were retrieved and downloaded. We performed segment cloning, and ten primer pairs (S3 Table)
Allelic polymorphism analysis of AhGPAT9. Multiple sequence alignment analyses were carried out using DNAMAN software (https://www.lynnon.com/). The amino acid sequences were analyzed with the ExPASy web server (https://web.expasy.org/translate/), and the transmembrane domains (TMDs) were predicted with TMHMM software (https://www.cbs.dtu.dk/services/TMHMM/). The protein spatial structures were predicted using I-TASSER [45][46][47]. DnaSP5.0 (https://www.ub.edu/dnasp/) [61] was used for the assessment of genetic diversity, including nucleotide diversity (π) and Tajima's D [43]. Statistical analyses were based on the phenotypic data of the average oil content over four years.
Variance analyses were performed with SPSS to determine phenotypic differences between the haplotypes, both individually and in haplotype combinations, using one-way analysis of variance (ANOVA) followed by the least-significant difference (LSD) test at the 5% significance level (P ≤ 0.05).
Expression analysis of AhGPAT9 genes in different peanut tissues. Different peanut tissues were sampled, immediately frozen in liquid nitrogen, and stored at -80 °C for total RNA isolation, which was carried out using a Quick RNA isolation kit (HuaYueYang Biotechnology, Beijing, China). First-strand cDNA synthesis was performed using the PrimeScript™ RT reagent kit with gDNA Eraser according to the manufacturer's instructions (Takara Bio Company). The expression analysis of the AhGPAT9 genes was performed by qRT-PCR using SYBR green PCR master mix in an ABI StepOnePlus Fast Real-Time PCR system (ABI, USA). The primers for the GPAT9 genes were AF (5ʹ-TGT CAG TTC AGT GTT AGG-3ʹ), AR (5ʹ-TGG TGT GTC CAG AAG GTA GG-3ʹ), BF (5ʹ-GGT TCA ATC GGA CAG AGG-3ʹ) and BR (5ʹ-AAG TAC CAA ACA TCA CAC-3ʹ). The primers ACTF (5ʹ-CAG CAG AGC GTG AAA TCG-3ʹ) and ACTR (5ʹ-GGA AGA GCA CCT CAG GAC AA-3ʹ) were used to amplify a 146 bp fragment of the reference gene actin [41,42]. The relative gene expression of AhGPAT9 was calculated using the $2^{-\Delta\Delta C_T}$ method [62]. Three replicates were used for each sample.
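For readers unfamiliar with the comparative threshold-cycle calculation named above, the following Python sketch shows the arithmetic for one sample against a calibrator, averaging three technical replicates. The Ct values, the choice of calibrator and the function name `relative_expression` are hypothetical; only the formula itself is the standard one.

```python
from statistics import mean

def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values: target and reference
    gene in the sample, then target and reference gene in the calibrator.
    """
    delta_ct_sample = mean(ct_target) - mean(ct_reference)
    delta_ct_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)

# Hypothetical Ct values: AhGPAT9 and actin in seed versus a calibrator tissue
seed_gpat9 = [22.1, 21.9, 22.0]
seed_actin = [18.0, 18.1, 17.9]
calibrator_gpat9 = [24.0, 24.1, 23.9]
calibrator_actin = [18.0, 17.9, 18.1]

fold = relative_expression(seed_gpat9, seed_actin, calibrator_gpat9, calibrator_actin)
print(f"relative expression: {fold:.2f}-fold")
```

The method assumes roughly equal amplification efficiencies for the target and reference primers, which is worth checking before applying it to new primer pairs.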
Generation of transgenic peanut and seed oil content measurement. We generated two plant transformation constructs: one AhGPAT9 overexpression vector (AhGPAT9-OE) and one anti-sense expression vector (AhGPAT9-AE). The AhGPAT9A sense coding sequence from Fenghua2 was amplified from the cDNA clone with the forward primer BF1 (5ʹ-TTG CGG CCG CAT GAT GAG GAA GAC CAA TCC CAA GTC-3ʹ), containing a BamHI restriction site, and the reverse primer NR1 (5ʹ-CGC GGA TCC TTA CTT TTC TTC CAA GCG CCG GAGC-3ʹ), containing a NotI restriction site, and was inserted between the CaMV 35S promoter and 35S terminator of the pGBVE plasmid to generate the AhGPAT9-OE construct. To produce the antisense copy, a 1,128-bp fragment near the 3ʹ end of the AhGPAT9A coding region was amplified from the Fenghua2 cDNA clone with the forward primer NF2 (5ʹ-TTG CGG CCG CAT GAT GAG GAA GAC CAA TCC CAAG-3ʹ), which contained a NotI restriction site, and the reverse primer BR2 (5ʹ-CGC GGA TCC TTA CTT TTC TTC CAA GCG CCG GAGC-3ʹ), which contained a BamHI restriction site. The AhGPAT9 anti-sense fragment and the pGBVE plasmid were digested with NotI and BamHI, and the fragment was then cloned into the pGBVE vector under the control of the CaMV 35S promoter in the antisense orientation to generate the anti-sense expression vector. The Bar gene in the expression vectors confers resistance to glufosinate (phosphinothricin, PPT) and can be used as a selectable marker for screening transgenic plants. The vectors were then transformed into Agrobacterium tumefaciens strain LBA4404 via the freeze-thaw method. The recombinant bacteria were selected and used for the transformation of the FH2 cultivar using the Agrobacterium-mediated method and the leaflet regeneration system for peanut. The seed leaflets of FH2 were excised and cultured in medium containing 4.5 mg/L 6-BA and 0.7 mg/L NAA. The culture temperature was set to 27 ± 2 °C, and culture was performed for 4 days under a cycle of 16 h light and 8 h darkness.
Then, the explants were infected with the Agrobacterium suspension (OD = 0.65) containing the expression vector for 15 min and cultured for 4 days in the dark. The calluses were transferred to medium containing 4.5 mg/L 6-BA, 0.7 mg/L NAA and 500 mg/L Cef and cultured for 28 days, then transferred to medium containing 5 mg/L 6-BA for subsequent regeneration. When the seedlings had grown to 3 cm, they were cultured in medium containing 1 mg/L PPT for 1 month, and the surviving peanut seedlings were propagated rapidly in MS medium.
To confirm transgene integration, the regenerated seedlings were cultured in medium containing 1 mg/L PPT for one month. The presence of the target constructs in the transgenic plants was confirmed by PCR on DNA isolated from the leaves of herbicide-resistant seedlings using the primers BarF (5ʹ-AAA CCC ACG TCA TGC CAG-3ʹ) and BarR (5ʹ-CAC CAT CGT CAA CCA CTA C-3ʹ). The positive overexpression and antisense expression transformants were transferred to soil, and T0 transgenic seeds were harvested. T0 seeds were sown as single seeds, and the presence of the AhGPAT9 transgene in T1 was detected by PCR using the primers BarF and BarR. Independent overexpression and anti-sense expression transgenic peanut lines were selected in the T2 generation, and single plants from the T2 lines with a higher or lower oil content and AhGPAT9 expression level than wild-type FH2 were used to obtain the T3 transgenic lines. The seed oil content was determined using seeds from the T1, T2 and T3 generations of transgenic peanut. For every experiment, the transgenic experimental lines were grown in the same growth chamber at the same time as their corresponding negative control lines. To study the effect of AhGPAT9 overexpression and antisense expression on peanut plant development, we collected the main stems and lateral branches of mature peanut plants and recorded the mature pod length, mature seed size and 100-seed weight from the transgenic lines in the T3 generation.
Scientific Reports | (2020) 10:14648 | https://doi.org/10.1038/s41598-020-71578-7 | www.nature.com/scientificreports/