Impact of sunflower (Helianthus annuus L.) plastidial lipoyl synthases genes expression in glycerolipids composition of transgenic Arabidopsis plants

Lipoyl synthases are key enzymes in lipoic acid biosynthesis, a co-factor of several enzyme complexes involved in central metabolism. Plant pyruvate dehydrogenase complex (PDH), located in mitochondria and plastids, catalyses the first step of fatty acid biosynthesis in these organelles. Among their different components, the E2 subunit requires the lipoic acid prosthetic group to be active. De novo lipoic acid biosynthesis is achieved by the successive action of two enzymes on octanoyl-ACP: octanoyltransferase (LIP2) and lipoyl synthase (LIP1). In this study, two plastidial lipoyl synthase genes from sunflower (Helianthus annuus L.) were identified (HaLIP1p1 and HaLIP1p2), sequenced and cloned in a heterologous production system (Escherichia coli). Gene expression studies revealed similar expression patterns for both isoforms, with a slight predominance of HaLIP1p1 in vegetative tissues and mature seeds. Tertiary structural models for these enzymes indicate they both have the same theoretical catalytic sites, using lipoyl-lys and 5-deoxyadenosine as docking substrates. The fatty acid profile of E. coli cells overexpressing HaLIP1p1 and HaLIP1p2 did not present major differences, and the in vivo activity of both proteins was confirmed by complementation of an E. coli JW0623 mutant in which lipoyl synthase is defective. Although no significant differences were detected in the total fatty acid composition of transgenic Arabidopsis thaliana seeds overexpressing any of both proteins, a lipidomic analysis revealed a redistribution of the glycerolipid species, accompanied with increased phosphatidylethanolamine (PE) content and a decrease in diacyglycerols (DAG) and phosphatidylcholine (PC). Depletion of the SAM co-factor caused by HaLIP1p1 and HaLIP1p2 overexpression in transgenic plants could explain this remodelling through its effects on PC synthesis.

Real-time quantitative PCR. Sunflower cDNAs from developing seeds (12,14,18,20,25, and 28 days after flowering -DAF) and other vegetative tissues (roots, stems, cotyledons, and leaves) were used to determine the levels of HaLIP1p1 and HaLIP1p2 gene expression by real time quantitative PCR (RT-qPCR). The reactions were performed in a CFX96 TM Real-Time PCR Detection System (Bio-Rad) using SYBR Green I (QuantiTect SYBR Green PCR Kit, Quiagen, Crawley, UK) and a specific primer pair designed for each gene: HaLIP1p1qpcr-F/HaLIP1p1qpcr-R; and HaLIP1p2qpcr-F/HaLIP1p2qpcr-R (Supplementary Table S1). The reaction mixture was heated to 94 °C for 10 min and then subjected to 50 PCR cycles of 94 °C for 30 seconds and 57 °C for 30 seconds, with a final extension period at 72 °C for 1 minute. Efficiency curves were drawn up using sequential of the cDNAs and the Livak method 38 was applied to calculate the relative expression of the samples. The HaActin-qpcr-F4 and HaActin-qpcr-R4 primers (Supplementary Table S1) were used to amplify the sunflower HaAct1 actin gene (GenBank Accession FJ487620) as a reference to normalize the expression values.
Expression and purification of the recombinant proteins in E. coli. HaLIP1p1 and HaLIP1p2 were cloned into the pQE-80L expression vector (Qiagen, Hilden, Germany) that contains an N-terminal (His)6-tag to facilitate protein purification. PCR primers were designed to clone the genes encoding the mature proteins without a signal peptide: HaLIP1p1-B-BamHI-F/HaLIP1p1-HindIII-R for HaLIP1p1; and HaLIP1p2-B-BamHI-F /HaLIP1p2-HindIII-R for HaLIP1p2 (Supplementary Table S1). Accordingly, BamHI and HindIII digested PCR products were cloned into the BamHI:HindIII sites of pQE-80L and the constructs (pQE-80L::HaLIP1p1 and pQE-80L::HaLIP1p2) were confirmed by sequencing. Cultures of E. coli XL1-Blue strain harbouring the recombinant plasmids were grown at 37 °C in LB media (1% Bacto Tryptone, 0.5% yeast extract, 1% NaCl, pH 7) supplemented with 50 µg/mL ampicillin and expression of the recombinant protein was induced by isopropyl β-D-1-thiogalactopyranoside (IPTG, at a final concentration of 0.5 mM) when the culture reached an OD 600nm value of 0.4. After 4 hours the cells were harvested by centrifugation (4,000 rpm for 20 min) and then resuspended in 16 mL binding buffer (20 mM sodium phosphate [pH 7.4], 500 mM NaCl, 20 mM imidazole). The cells were disrupted by sonication (70° amplitude 10 s pulses with 10 s cooling on ice, 20 cycles in total) and a cleared lysate was obtained by centrifugation of the disrupted cell solution at 13,000 rpm and at 4 °C for 10 min. The soluble fraction was used to purify the recombinant proteins using the His SpinTrap Kit (GE Healthcare, UK) according to the manufacturer's protocol.

Gel electrophoresis of His-tagged recombinant proteins and western blots.
Aliquots of the different eluted fractions (15 µL) were mixed with SDS-PAGE loading Buffer 2X(Tris-HCl 62.5 mM, DTT 50 mM, bromophenol blue 0.01%, glycerol 25%, SDS 2%) and 1 µL of β-mercaptoethanol, and then heated for 10 min at 90 °C. The proteins were separated by electrophoresis on 4-15% Mini-Protean TGX Gels (Bio-Rad) in SDS-PAGE Running Buffer (25 mM Tris-Base, 192 mM glycine, 0.1% SDS) and the gels were stained with 0.1% Coomassie R-250 in 40% ethanol and 10% acetic acid, visualizing the proteins after several washes. For western blotting, the proteins were transferred to a polyvinylidene difluoride (PVDF) membranes (TransBlot TurboTM Mini PVDF Transfer Packs Bio-Rad) in a TransBlot TurboTM Transfer Starter System (Bio-Rad) and the membranes were blocked for 1 hour with a solution of milk 5% (w/v) in Tris-buffered saline solution (TBS) with mild shaking. The membrane was then washed three times with TBS and incubated at 4 °C for 2 hours with anti-polyHistidine antibodies (1/2000 dilution: Sigma-Aldrich). The (His) 6 -tagged recombinant proteins were then visualized using the ECL Western Blotting Detection Kit (Amersham Bioscience). fatty acid analysis in E. coli. The fatty acid composition of transformed E. coli cells expressing the recombinant mature HaLIP1p1 and HaLIP1p2 proteins was analysed in 25 mL cultures of E. coli XL1-Blue cells harbouring different constructs (empty pQE-80L expression vector, pQE-80L::HaLIP1p1 and pQE-80L::HaLIP1p2), both induced and non-induced, grown at 37 °C with shaking for two hours. Then cells were harvested by centrifugation (4000 rpm for 20 min) and the pellets washed twice with distilled water (2 mL). A methylation mixture (3 mL of ~1.25 M HCl in methanol: Fluka, Germany) was added to each sample and heptadecanoic acid (17:0, 150 µg) was added as an internal standard, allowing methylation to take place for 1 hour at 80 °C. Subsequently, 1 mL heptane was added to extract the methyl esters, the upper phase was washed with 2 mL Na 2 SO 4 (6.7%) in new tubes and the resulting solvent was evaporated with nitrogen gas. Finally, methyl esters were resuspended in 200 µL heptane and gas chromatography (GC) analysis was performed as described previously 39 .
Functional complementation of an E. coli ΔlipA strain. To study the genetic complementation of HaLIP1p1 or HaLIP1p2, a kanamycin resistant E. coli lipA strain (JW0623) with a defective LA synthase was obtained from the Coli Genetic Stock Center (CGSC). Transformants with the different plasmids (empty pQE-80L expression vector, pQE-80L::HaLIP1p1 and pQE-80L::HaLIP1p2) were generated by electroporation and the cells were incubated at 37 °C overnight on LB-agar plates supplemented with 50 μg/mL ampicillin for plasmid selection and 30 μg/mL kanamycin for mutant selection. The resulting colonies were inoculated in 30 mL M9 glucose minimal medium and grown at 37 °C overnight. These bacteria were then inoculated at 0.1 OD 600nm in 5 mL tubes with M9 supplemented with ampicillin, kanamycin and 0.5 mM IPTG for plasmid induction. The mutant E. coli transformed with pQE-80L was used as a negative control, with and without LA supplementation (50 ng/mL). The OD was measured every 90 minutes over the first 9 hours, and after 24 and 30 hours, and the mean of three independent experiments was used to calculate each OD value.

Generation of transgenic plants and vector constructions.
The entire open reading frame of the sunflower plastidial lipoyl synthases was cloned into the binary pBIN19-35S vector, which contains the 35S promoter from cauliflower mosaic virus (CaMV). The construct was cloned into the restriction sites created by PCR with specific pairs of primers: HaLIP1p1-BamHI-F/HaLIP1p1-HindIII-R and HaLIP1p2-XbaI-F/HaLIP 1p2-HindIII-R (Supplementary Table S1). The CAMV-35S-F and pBIN19-R vector primers were used for PCR screening (Supplementary Table S1) and the constructs were sequenced to confirm their identity. These constructs were then used to transform the CaCl 2 competent Agrobacterium tumefaciens strain GV 3101.
A. thaliana Columbia (Col-0) ecotype plants were grown in a growth chamber under controlled growth conditions: 22 °C day/20 °C night, 60% humidity, and a 16 h, 250 micromol m −2 s −1 photoperiod. Transgenic Arabidopsis lines were generated by floral dip transformation, as described previously 40 . Seeds from transformed plants were selected by germination in MS medium supplemented with kanamycin (50 µg/mL), as described previously 41 . Insertion of the Sunflower gene into the Arabidopsis genome was screened by PCR, and the expression of both the HaLIP1p1 and HaLIP1p2 constructs in transgenic plants was confirmed by RNA extraction, cDNA synthesis and PCR. Third generation (F3) seeds of confirmed transgenic Arabidopsis were used to analyse the lipid composition.
Fatty acid analysis of transgenic Arabidopsis seeds. Three replicates of 10 mg of mature seeds from wild-type and F2 transgenic plants overexpressing HaLIP1p1 and HaLIP1p2 were used for total lipid extraction. Lipids from the seeds were extracted using glass beads in a Precellys homogenizer (Precellys 24, Ozyme) 3 cycles for 30 s at 6000 rpm, and then 1 mL hexane:isopropanol (2:1) was added. After solvent evaporation, the lipids were resuspended in chloroform:methanol (1:1) and methylated with 5 µL of TMAH solution (Sigma-Aldrich), using Decane (50 µL) to stop the reaction. Finally, 20 µL of the upper phase was used to analyse the fatty acid methyl esters (FAMEs).
Lipidomic analysis of transgenic seeds. Lipids were extracted as described elsewhere 42 with some modifications. Samples of ice-dried Arabidopsis seed (20 mg) were transferred to a 2 mL screw cap tube and the lipids were extracted with 1 mL of a chloroform/methanol (2:1) mix (1 mL) containing 1 mM BHT and 400 µL H 2 O. After a 2 hour incubation on ice, the samples were centrifuged at 13000 rpm for 5 min at 4 °C, the lower phase was transferred to new tubes and the extraction was repeated. The lower phases were dried under nitrogen and the lipids extracted were solubilized in 200 µL isopropanol. A 1:4 dilution was used to perform lipidomics based on ultra-high performance liquid chromatography (HPLC) coupled to quadrupole-time of flight mass spectrometry (LC-HRMS2), as described previously with minor variations 43 . LC was performed on an HPLC 1290 (Agilent Technologies) and 1 µL of the lipid samples was separated at 55 °C on a C18 Hypersil Gold column (100 × 2.1 mm, 1.9 μm: Thermofisher) using an elution gradient consisting of a solution of 10 mM of ammonium formate and 0.1% formic acid (ACN:H 2 O, 60:40, v/v; solvent A), and a solution of 10 mM of ammonium formate and 0.1% formic acid (IPA:ACN:H 2 0, 90:8:2, v/v; solvent B). LC-electrospray ionization (ESI)-HRMS 2 analysis were achieved by coupling the LC system to an Agilent 6538 hybrid quadrupole time-of-flight (QToF) high definition mass spectrometer (Agilent Technologies) equipped with a dual source ESI. MassHunter B.07 software was used to control the data acquisition parameters and the mass spectra were acquired using dual ESI in positive and negative ion mode.
The chromatogram builder was used applying a minimum time span of 0.10 min, a minimum height of 1.0 × 10 3 and a m/z tolerance of 5 ppm. Chromatogram deconvolution was achieved using the local minimum search algorithm. The chromatographic threshold was 30.0%, the search minimum in the retention time (rt) range of 0.05 min with a minimum relative height of 5% and a minimum ratio of peak top/edge of 2. The peak duration range was from 0.05 to 3 min. MS2 scans were paired using an m/z tolerance range of 0.05 Da and a rt tolerance range of 0.1 min. Isotopologues were grouped using the isotopic peak grouping algorithm with a m/z tolerance of 0.008 and rt tolerance of 0.3 min. A peak alignment step was performed using the join aligner module: m/z tolerance = 0.008, weight for m/z = 50, weight for rt = 50, absolute rt tolerance 2 min. The peak finder module was used with an intensity tolerance of 10%, a m/z tolerance of 0.008 and a rt tolerance of 1.0 min. The resulting peak list was filtered using the Peak list rows filter module with a minimum of 2 peaks in a row, a minimum peaks in an isotope pattern of 2 and by keeping only peaks with MS/MS spectra using MS2 peak filer. The peak list was annotated using a combination of two databases, LipidMatch 44 and LipidBlast 45 . The data were also subjected to a principal component analysis to explore the differences in lipid profiles among the seeds overexpression both sunflower seeds and the wild-type seeds.
A statistical analysis was performed using Metaboanalyst v4.0, and a Pearson-correlation analysis or an unsupervised principal component analysis (PCA) was carried out to determine the statistical significance (Supplementary Data). Heat maps were drawn up to test any significant differences between samples (p < 0.05).

Statistical analysis.
Origin Pro 8 software (OriginLab Corporation, Northampton, USA) was used for the statistical analysis. The number and nature of replicates for each figure and table are indicated in their corresponding legends. An analysis of variance (ANOVA) and a Tukey's test were used to determine the significant differences of the means at a probability level of 5% (p < 0.05).

Results
Cloning and sequence analysis of two sunflower lipoyl synthases. Two coding sequences were identified in the Heliagene database based on their homology to the Arabidopsis LIP1p protein (At5g08415). A 1077 bp (HaLIP1p1) and 1149 bp (HaLIP1p2) PCR product were amplified from 28 DAF developing sunflower seed mRNA, and both were sequenced and found to have essentially the same nucleotide sequences as those from the genome sequencing project (HaLIP1p1, HanXRQChr05g0142151; HaLIP1p2, HanXRQChr12g0358451). The minor differences detected in these sequences were the lack of three nucleotides in a non-catalytic region of HaLIP1p1 and changes in several nucleotides in the N-terminal signal peptide of HaLIP1p2. The protein sequences encoded by the cDNAs of both genes were deduced using bioinformatics tools, containing 359 (HaLIP1p1) and 382 (HaLIP1p2) residues. Chloroplast transit peptides of 61 (Lys61) and 84 (Lys84) residues at their N-terminus were identified in HaLIP1p1 and HaLIP1p2 respectively, ( Supplementary Fig. S1), with both transit peptides rich in Asn, Pro, Ser, and Thr. After proteolysis at the marked cleavage site, the mature proteins were both predicted to contain 229 residues, with a theoretical molecular mass of 33.1 kDa, and with a theoretical pI of 7.69 in the case of HaLIP1p1 and 6.19 for HaLIP1p2.
A phylogenetic tree was generated for the novel sunflower Lip1p genes based on the predicted amino acid sequences and those from other known homologous plant proteins (Fig. 1). The dendrogram shows both proteins www.nature.com/scientificreports www.nature.com/scientificreports/ in a separate group, the Asteraceae family, and very close to those from the Solanaceae family, which includes Solanum lycopersicum and S. tuberosum (tomato and potato respectively), and those from the Rosaceae that include Prunus mume (the Chinese plum) and Fragaria vesca (wild strawberry).
An alignment of the two deduced amino acid sequences with those from different phylogenetic groups (A. thaliana, Ricinus communis, Oryza sativa and Sellaginella moellendorffii) showed highly conserved domains ( Supplementary Fig. S1). Since no plant lipoyl synthases have been crystalized, we based the crystal structure of a lipoyl synthase on that from the cyanobacteria Thermosynechococcus elongatus (TeLIPA) 10 and on MtLIPA Mycobacterium tuberculosis 12 , aligning the structures so that the predicted catalytic residues in HaLIP1p1 and HaLIP1p2 were evident and marked in the previous alignment. LIP1 enzymes contain two [4Fe-4S] clusters bound by three strictly conserved Cys residues, constituting the characteristic CX 3 CX 2 C motif of this family, the "RS cluster" and the "auxiliary cluster" with a CX 4 CX 5 C motif. Within the sunflower LIP1 enzymes, the "RS cluster" is formed by: Cys123, Cys127 and Cys130 in HaLIP1p1; Cys146, Cys150 and Cys153 in HaLIP1p2 (red box in Supplementary Fig. S1); and the "auxiliary cluster" by: Cys92, Cys97 and Cys103 in HaLIP1p1; Cys115, Cys120 and Cys126 in HaLIP1p2 (green box in Supplementary Fig. S1). "RS cluster" is also ligated to a highly unusual serine residue located close to the C-terminus, which corresponds to Ser338 in HaLIP1p1 and Ser360 in HaLIP1p2 (green arrow in Supplementary Fig. S1). This serine residue forms part of a conserved C-terminal R(S/T)S motif, which is followed by an aromatic residue in MtLIPA, and that is also found in HaLIP1p1 (R 337 SSY 340 ) and HaLIP1p2 (R 359 SSY 362 ; in blue box in Supplementary Fig. S1).
Other residues characteristic of the RS protein superfamily were: (i) Asn223 and Glu225 in HaLIP1p1 and Asn246 and Glu248 in HaLIP1p2 (in orange boxes in Supplementary Fig. S1); (ii) a "GXIXGX 2 E" motif, corresponding to S 263 IMLGLGE 270 (HaLIP1p1) and S 286 IMLGLGE 293 (HaLIP1p2; orange box in Supplementary Fig. S1); and (iii) a conserved Met residue, Met265 in HaLIP1p1 and Met288 in HaLIP1p2 (as part of the "GXIXGX 2 E" motif, in Supplementary Fig. S1). Furthermore, Arg residues were also identified in the new sequences that could be involved in substrate recognition, Arg337 in HaLIP1p1 and Arg359 in HaLIP1p2 (orange arrow in Supplementary Fig. S1).
Tertiary structure prediction and molecular docking. Both lipoyl synthase tertiary structures from sunflower were modelled using the crystal X ray structure of MtLIPA as a reference 12 . MtLIPA shared 44.84% identity with HaLIP1p1 and 46.45% with HaLIP1p2, and both sunflower proteins shared 82.6% of identity such that the model generated was the same for both sequences. The predicted HaLIP1p1 and HaLIP1p2 structures consisted of a single domain with the overall folding producing a partial α8/β8 pattern TIM barrel ( Fig. 2A). In the model, the auxiliary cluster is very close to the N-terminal region and the specific RS cluster is packed between the C-and N-terminal extensions. Both clusters, the auxiliary (three coordinated Cys in green in Fig. 2B,C) and RS cluster (three coordinated Cys in red in Fig. 2B,C) are spatially separated by 13 Å (in MtLIPA), forming a gap in the structure where presumably the octanoyl chain from octanoyl-E2-PDH and the SAM interact. Furthermore, a docking model was constructed with the substrates lipoyl-lys and 5′-deoxyadenosine, substrates that were located in the gap formed between both clusters (Fig. 2B,C). The Ser residue (green in Fig. 2B,C) formed part of the R(S/T)Y motif (blue in Fig. 2B,C) and it lay close to the auxiliary cluster, enabling the ligand to access the Cys in the cluster. The residues involved in SAM recognition were located in the structure forming part of the gap where 5′-deoxyadenosine was positioned. The peptide chain of lipoyl-lys remained outside this gap, with sulphur atoms projecting into the structure and 5′-deoxyadenosine staying close the lipoyl group (Fig. 2B,C). The hypothetical residues involved in the enzyme's activity are in red (typical RS cluster involved in SAM reduction) and green (auxiliary cluster that acts as a donor of sulfur to form lipoic acid). Both substrates lipoyl-lys and 5′-deoxyadenosine are found in the gap formed between the two clusters. A Ser residue (green) forming part of R(S/T)Y motif (dark blue) is close to the auxiliary cluster, which enables the ligand to associate with the Cys of the cluster. The residues involved in SAM recognition are in orange. The peptide chain of lipoyl-lys remains outside the gap, with the sulfur atoms situated in the structure and 5′-deoxyadenosine staying close to the lipoyl group. (C) Wider angle of the active center of the protein with the substrates in a ribbon diagram.
HaLIP1p1 and HaLIP1p2, without signal peptide, could complement the lipA deficient E. coli JW0623 strain that has defective LA biosynthesis and that grows poorly in minimal medium. When this E. coli strain was transformed with HaLIP1p1 its growth rate recovered completely, initially surpassing that of the lipA deficient cells fed with LA (Fig. 4). Conversely, the HaLIP1p2 construct also complemented this auxotrophy in these experiments but with lower efficiency. In terms of the cell density in the stationary phase, the E. coli cells transformed with both LIP1p enzymes achieved similar densities to the cells that received a LA supplement and much higher than that of the lipA deficient mutant (Fig. 4). After 30 h of growth there were significant differences in the optical density of the cultures expressing sunflower LIP1p enzymes (Fig. 4), being in both cases significantly below that of the supplemented cells.
fatty acid analysis of E. coli expressing the HaLIP1p1 and HaLIP1p2 enzymes. The mature protein coding region of both sunflower lipoyl synthases was cloned into the pQE-80L inducible expression vector and the E. coli XL1-Blue strain was transformed with these vectors to achieve the heterologous expression of these enzymes. Mature HaLIP1p1 and HaLIP1p2 proteins were purified from the soluble fraction of these cell lysates using affinity columns, and both recombinant proteins had a molecular weight similar to that predicted (33.1 kDa; Supplementary Fig. S2). When the fatty acid composition of the cells expressing HaLIP1p1, HaLIP1p2 and the pQE-80L vector alone were analysed (Table 1), no significant changes in the total fatty acids per absorbance unit was obtained. However, qualitative modifications in terms of bacterial fatty acids composition were measured. Statistical significant differences were found after comparing the fatty acid profiles of the strain bearing the empty vector and the heterologous HaLIP1p2 system, though the expression of HaLIP1p1 did not produce significant changes. The expression of HaLIP1p2 changed the bacterial fatty acids distribution indeed, decreasing the contents of 14:0 and 16:0 and increasing 18:1 Δ11 , therefore modifying the ratio of unsaturated to saturated fatty acids. HaLIP1p1 (HaLIP1p1-ox) and HaLIP1p2 (HaLIP1p2-ox) was analyzed and compared to wild-type (Col-0) seeds. The overexpression of these enzymes failed to produce any significant changes in the fatty acid profile, although HaLIP1p2 produced a slight increase, no statistically significant, in the 18:3 Δ9Δ12Δ15 detected (Table 2). Hence, and in contrast to the described above modifications in transgenic bacteria, the unsaturated/saturated fatty acid ratio in both transgenic seed oils remained similar and unchanged when compared to wild-type seed oil.
Lipidomics of transgenic seeds. Lipidomic analysis of control (Col-0) and transgenic Arabidopsis seeds expressing either HaLIP1p1 or HaLIP1p2 were performed, afterwards unsupervised principal component analysis (PCA) was employed in order to determine the experimental variation. The genotypes were separated according to their score plots, providing an overview of the differences among Col-0 and the transgenic plants regarding their seed oil composition (Supplementary Fig. S3). In both analysis, Col-0 versus HaLIP1p1-ox and Col-0 versus HaLIP1p2-ox, clear separation between overexpression lines and the control highlighted their differences in terms of lipid composition. The heat maps of the statistically significant variations in lipids species were obtained, comparing the effects of HaLIP1p1 and HaLIP1p2 overexpression with the wild-type seeds. In Figs. 5 and 6 the heat maps showing the 25 species displaying the most significant differences were shown. In both cases, there    (Fig. 7). Similarly, differences have been identified in other important lipids, seed oil from HaLIP1p1-ox and HaLIP1p2-ox Arabidopsis plants presented lower percentage of PC and DAG together with total increase in PE (Fig. 8).

Discussion
Plastidial lipoyl synthase and octanoyltransferase are the enzymes involved in de novo LA biosynthesis within that organelle. LA is essential for the correct activity of the PDH complex, which catalyses the oxidative decarboxylation of pyruvate to produce acetyl-CoA, CO 2 and NADH. Acetyl-CoA is the source of carbon for many synthetic pathways, including that responsible for the de novo synthesis of fatty acids. Therefore, better understanding this primary pathway could be important to provide a basic groundwork for further biotechnological approaches to modify lipid biosynthesis and accumulation in sunflower seeds.

Sunflower lipoyl synthases.
In the present work, two lipoyl synthase genes from sunflower were isolated and cloned. Both are nuclear genes encoding proteins with a N-terminal transit signal peptide, which is presumably removed upon transport into plastids or chloroplasts (Supplementary Fig. S1). Both signal sequences are rich in Asn, Pro and in particular, Ser and Thr residues, which are very common to such domains 47,48 . Indeed, these domains can be phosphorylated to bind to 14-3-3 proteins and to form a cytosolic complex for transport towards the chloroplast with HSP70 proteins 49 . Previous studies on sunflower proteins involved in fatty acid biosynthesis identified similar phosphorylation motifs, such as in FatA-type thioesterase 50 , the HaFAD7 ω3-desaturase 51 , the condensing enzyme HaKASIII 39 , and in the two β-hydroxyacyl-[ACP]-dehydratases HaHAD1 and HaHAD2 52 .
A phylogenetic tree was generated with the deduced amino acid sequences for HaLIP1p1 and HaLIP1p2, and with the chloroplast lipoyl synthases from different taxonomic groups, using two Chlorophyta species as out-groups to root the tree (Fig. 1). The location of the sunflower LIP1p proteins within the tree revealed a common origin for both proteins and a possible recent duplication. The dendrogram shows that the LIP1p proteins closest to the sunflower enzymes were those of Solanaceae, as expected, and Rosaceae.
The alignment of amino acid sequences of HaLIP1p1 and HaLIP1p2 with homologues from different phylogenetic groups identified highly conserved residues, indicating the strong conservation of lipoyl synthases during evolution (Supplementary Fig. S1). The overall secondary structure of both sunflower lipoyl synthases adopted a α8/β8 TIM barrel pattern (Fig. 2), which with certain variations relative to the common partial (α6/β6) TIM barrel typical of the RS superfamily 53 , and with N-terminal and C-terminal extensions 12  www.nature.com/scientificreports www.nature.com/scientificreports/ lipoyl co-factor by inserting two sulphur atoms at C6 and C8 of an octanoyl chain bound to the E2 subunit of PDH. As mentioned before, the two [4Fe-4S] groups are directly involved in the reaction catalysed by this enzyme. LIP1 enzymes belongs to the radical SAM (RS) family of enzymes, which contains a [4Fe-4S] cluster bound by three strictly conserved Cys residues constituting the RS cluster characteristic of this family. Within the sunflower LIP1 enzymes, this cluster functions as a reductant in the cleavage of SAM to provide a sulphur atom and its insertion in the position Δ6 of octanoyl chain, generating 5′-deoxyadenosyl radicals. The second auxiliary cluster is specific to lipoyl synthases and it is thought to act as the direct sulphur donor for the second insertion at Δ8 [8][9][10] . In accordance with their activity and our structural model, the arrangement of the RS and the auxiliary cluster is similar in the MtLIPA and the sunflower LIP1 enzymes. The RS cluster of HaLIP1p1 and HaLIP1p2 is coordinated by a triad of Cys residues according to the CX 3 CX 2 C motif described for TeLipA 10 and MtLipA 12 . In the auxiliary cluster, three iron atoms are thought to be coordinated by three Cys residues in the CX 4 CX 5 C motif, while the fourth iron is coordinated by an unusual Ser residue 10,12 . This Ser is located near the C-terminal in the secondary structure, yet it stays close to the auxiliary cluster in our tertiary structure model. Thus, both clusters and the residues described above appear to form a structure that could nicely accommodate the substrate (octanoyl-Lys from octanoyl-E2-PDH) packed between SAM and the auxiliary cluster, the SAM lying adjacent to the RS cluster.
For molecular docking, lipoyl-lys (produced by insertion of the sulphur to form LA bound to Lys-E2-PDH) and 5′-deoxiadenosine (a radical produced after the loss of sulphur in SAM) were used as substrates (Fig. 2). Both substrates were positioned in the gap formed between the auxiliary and RS clusters. The mechanism proposed indicates that the active site is open, with the two clusters separated when no substrate is available. The octanoyl-E2-PDH substrate binds near to the N-terminal extension and the octanoyl chain is inserted into the gap between the two clusters. Subsequently, a conformational change takes place to close the two clusters, whereby the N-terminal extension and auxiliary cluster move towards the core of the partial TIM barrel. As a result, the distance between the clusters diminishes 12 . This change enables the cleavage of SAM and the first sulphur insertion at C6 of the octanoyl chain due to the action of the RS cluster. By contrast, the reductive cleavage of SAM generates methionine and a 5′-deoxyadenosyl radical. The function of the auxiliary cluster has not yet been fully elucidated but studies with 34 S-labelled lipoyl synthase from T. elongatus suggest that both the sulphur atoms are transferred from a single lipoyl synthase molecule 8 , which may indicate the auxiliary cluster acts as the sulphur donor for the second insertion at C8. This hypothesis of the scarification of the auxiliary cluster is not universal accepted 8,9,11 , although there is some evidence to support this 12 . A conserved Arg residue has been proposed to participate in the recognition of adenosyl ligands 54,55 , a phenomenon not previously observed in the RS superfamily. This Arg residue has been implicated in SAM recognition by TeLIPA 10 , and the equivalents in sunflower are Arg337 (HaLIP1p1) and Arg359 (HaLIP1p2). Furthermore,  Site-directed mutagenesis suggests this serine residue is essential for lipoyl group formation but not for the reductive cleavage of SAM 10 . An alignment of 590 lipoyl synthase sequences 56 shows that the serine ligand is conserved in over 98% of these sequences, while the remaining sequences that lack this conserved serine appear to be eukaryotic splice variants.
The RT-qPCR data revealed similar expression patterns for the two sunflower plastidial lipoyl synthases genes (Fig. 3). HaLIP1p1 is more strongly expressed in vegetative tissues and seeds from 20-28 DAF than HaLIP1p2 and thus, neither gene appears to be temporally regulated during embryo development. These results suggest that both isoforms may be redundant and neither is tissue-specific. Indeed, both genes are expressed in all the tissues analysed, in contrast to the majority of genes involved in lipid biosynthesis and oil accumulation which are normally expressed more strongly in seeds, particularly around 18-19 DAF when lipid accumulation is at its highest in developing sunflower seeds 57 . This could be explained by the fact that LIP1p genes are essential enzymes for plant development. In Arabidopsis, the process of storage takes place during stages 5-6 of seed development, a period equivalent to 12 DAF in sunflower seeds, and both plants show similar lipoyl synthase expression during seed development and maturation.

E. coli mutant complementation studies.
To confirm the activity of both sunflower LIP1 enzymes in vivo, complementation of the in E. coli JW0623 strain was assessed, which carries a null lipA mutation in the LA synthase gene (Fig. 4). This mutant grows slowly in LB medium 58 and its slow growth is further exacerbated in M9 minimal medium where no free LA is available for lipoate scavenging by LPLA. Only when LA is added to the medium can the lipA mutant recover a normal growth rate but when this mutant expressed either HaLIP1p1 or HaLIP1p2, its growth rate was also restored in M9 minimal medium without LA. Hence, both sunflower LIP1p enzymes are capable of achieving functional complementation. Despite this functional complementation, the heterologous expression of both sunflower lipoyl synthases in E. coli produced only minor, although statistically significant, changes in fatty acid composition, with only a slight increase in saturated fatty acids when HaLIP1p2 was expressed. The expression of both enzymes reduces the growth rate of bacteria (data not shown), which could be due to altered LA metabolism by the bacteria, sequestering lipoate moieties in the E2 subunit of PDH and limiting their availability to other enzymes. There are several lipoylated proteins in E. coli other than PDH, such as α-ketoglutarate dehydrogenase (2OGDH), an enzyme essential for aerobic growth and a subunit of the glycine cleavage system of single carbon metabolism 59,60 . In bacteria, LA synthesis proceeds through a two-step reaction catalysed by the LIPA lipoyl synthase and the LIPB octanoyltransferase, both homologues of sunflower lipoyl synthase (HaLIP1p1 and HaLIP1p2) and octanoyltransferases. However, LIPB can be by-passed by a LPLA in E. coli, which can use free lipoate or octanoate in an ATP-dependent reaction to activate apo-subunits 6 . Therefore, excess PDH lipoylation without an increase in fatty acid synthesis could provoke a depletion of that co-factor, thereby impeding the synthesis of other enzymes and impairing bacterial growth.
In plastids, where no ligases have been identified, lipoylation of E2-PDH proceeds exclusively through the joint activity of LIP1p and LIP2p, both enzymes specific to plastidial E2-PDH. Arabidopsis has only one lipoyl synthase gene (AtLIP1p) and its deletion is embryonic lethal, indicating that there is no alternative activity 24 . By contrast, two redundant octanoyltransferases have been identified that are simultaneously expressed in vegetative tissues and seeds, albeit in different ratios (AtLIP2p1 dominates in leaves and roots and AtLIP2p2 in siliques and flowers). Given the expression patterns observed here, we suggest that both lipoyl synthases are important not www.nature.com/scientificreports www.nature.com/scientificreports/ only for oil deposition in the seed but also, in vegetative tissues (Fig. 3), with some possible redundancy as seen for the two Arabidopsis octanoyltransferases. The strong identity between the two protein sequences (82.6% identity) would favour such redundancy.

Overexpression of sunflower lipoyl synthases in transgenic Arabidopsis plants.
Arabidopsis transgenic plants overexpressing HaLIP1p1 or HaLIP1p2 did not affect either plant development ( Supplementary  Fig. S4) or seed germination rates. Their seed fatty acid composition was not altered either (Table 2). Interestingly, lipidomic studies of transgenic seeds revealed a significant remodelling of TAG, diacylglycerides (DAG), phosphocholine (PC) and phosphoethanolamine (PE) species in Arabidopsis expressing the sunflower lipoyl synthase genes when compared with control plants. This was first show by the heat maps of variation of different species, in which were selected the 25 species displaying higher variations for HaLIP1p1 and HaLIP1p2 (Figs. 5 and 6). These figures clearly shown changes in lipid composition induced by the genes but it need further elaboration to reach general conclusions, so the most remarkable changes regarding TAG species are shown in Fig. 7 and those concerning other lipid species were summarized in Fig. 8. The expression of those genes altered the proportion of 7% of the TAG species, with important decreases of (16:0/16:0/18:2) and (18:2/18:2/18:3), specially the first one, which was barely detectable in the transgenic lines. The second one showed a decrease around 50% of the control value. The species increasing their proportion were (16:0/18:3/18:3), (16:0/18:2/20:1) and (18:2/20:1/20:1), which in all cases increased by 3 to 4-fold with respect to the control line (Fig. 7). The expression of both lipoyl synthases genes also induced variations on the content of DAG, PC and PE species. Thus, the content of DAG decreased by 45 to 70% and the PC 55 to 70%, with a concomitant increase of PE, which rose about 70-80% in both transformed plants (Fig. 8).
The changes in lipid species composition induced by sunflower plastidial lipoyl synthase enzymes in Arabidopsis are not easy to interpret. Thus, the expected effect of the overexpression of those genes is the increment of the activity of the PDH complex due to a higher availability of LA cofactor. This increase would lead to higher availability of carbon and reduction equivalents for intraplastidial fatty acid synthesis and also a depletion of the substrates necessary for the synthesis of LA, such as the SAM. The higher availability of intraplastidial reducing equivalents would favour the desaturase reaction, which is dependent of it. This was observed in the E. coli expressing HaLIP1p1( Table 1), but it was not evident when they were expressed in Arabidopsis, where there were not important changes in the desaturation ratios observed in transgenic plants with respect to control ( Table 2). The modifications detected in glycerolipid species mostly involved changes in the proportion of neutral and polar lipid species (DAG, PC and PE) and different distribution of oleic acid derivatives in TAG species. In this regard, later changes could be caused by the reduction of PC and DAG, which species closely related with TAG synthesis 1 and could provoke modifications in the contribution of the different enzymes involved in the channelling of fatty acids from PC and acyl-CoA pools to TAG, like PDCT, PDAT or LPCAT to the distribution of fatty acids in TAG species during their assembly. It should be noted that the glycerolipid species altered in the transgenic plants (DAG, PC and PE) are related. DAG is a common precursor in the biosynthesis of both PC and PE by the amino alcohol phosphotransferase enzyme (AAPT). AAPT catalyse PC biosynthesis from CDP-choline as a substrate (Diacylglycerol Cholinephosphotransferase, DAG:CPT) and PE, through CDP-ethanolamine activity (Diacylglycerol Ethanolaminephosphotransferase, DAG:EPT) (Fig. 9). Both CDP derivatives originate from phosphocholine or phosphoethanolamine due to choline-phosphate cytidyltransferase (CCT) or CTP:phosphoethanolamine cytidyltransferase (PECT), respectively. Even more, this phosphoethanolamine is converted to phosphocholine by the phosphoethanolamine N-methyltransferase, which consumes three SAM molecules, the co-factor also used by lipoyl synthases. In this context, as the physiological requirement for LA is extremely low, an excess of lipoyl synthase activity could provoke a temporal depletion of SAM when sunflower lipoyl synthases are overexpressed within the plastid. This essential co-factor is not produced in this organelle but in the cytosol using the methionine biosynthesized in plastids. Indeed, the influx of SAM into plastids could therefore reduce the pool of cytoplasmic SAM, dampening the synthesis of PC from PE (Fig. 9). This would provoke a reduction in PC with a concomitant effect on fatty acids assembly in TAG carried out by different transferases (PDCT, PDAT, LPCAT and others) and favouring the synthesis of PE from DAG. It is also remarkable that a similar phenotype has not been depicted in Arabidopsis to date.

conclusions
Lipoyl synthases are essential enzymes in LA biosynthesis, a key co-factor of several enzyme complexes involved in central metabolism, including PDH regulation. Two sunflower lipoyl synthases were cloned and studied here, containing both the highly conserved RS and auxiliary [4S-4Fe] clusters. Heterologous expression of these enzymes in E. coli did not significant affect bacterial fatty acid composition, yet both enzymes complemented the lipoyl synthase activity in a defective mutant E. coli. Arabidopsis seeds overexpressing these enzymes do not experience significant differences in their total fatty acid composition but an unexpected remodeling of their glycerolipid species. While this excess lipoyl synthase activity does not affect plastidial PDH lipoylation it may sequester SAM, which is also required for PC synthesis from DAG. Accordingly, the reduced availability of SAM could diminish the PC generated and lead to an accumulation of PE species and this unbalance content of PC and DAG affect the enzymes using these substrates (PDCT, PDAT or LPCAT) for the incorporation of fatty acids in TAG modifying their distribution.