Development of β-carotene, lysine, and tryptophan-rich maize (Zea mays) inbreds through marker-assisted gene pyramiding

Maize (Zea mays L.) is the leading cereal crop and staple food in many parts of the world. This study aims to develop nutrient-rich maize genotypes by incorporating crtRB1 and o2 genes associated with increased β-carotene, lysine, and tryptophan levels. UMI1200 and UMI1230, high quality maize inbreds, are well-adapted to tropical and semi-arid regions in India. However, they are deficient in β-carotene, lysine, and tryptophan. We used the concurrent stepwise transfer of genes by marker-assisted backcross breeding (MABB) scheme to introgress crtRB1 and o2 genes. In each generation (from F1, BC1F1–BC3F1, and ICF1–ICF3), foreground and background selections were carried out using gene-linked (crtRB1 3′TE and umc1066) and genome-wide simple sequence repeats (SSR) markers. Four independent BC3F1 lines of UMI1200 × CE477 (Cross-1), UMI1200 × VQL1 (Cross-2), UMI1230 × CE477 (Cross-3), and UMI1230 × VQL1 (Cross-4) having crtRB1 and o2 genes and 87.45–88.41% of recurrent parent genome recovery (RPGR) were intercrossed to generate the ICF1-ICF3 generations. Further, these gene pyramided lines were examined for agronomic performance and the β-carotene, lysine, and tryptophan contents. Six ICF3 lines (DBT-IC-β1σ4-4-8-8, DBT-IC-β1σ4-9-21-21, DBT-IC-β1σ4-10-1-1, DBT-IC-β2σ5-9-51-51, DBT-IC-β2σ5-9-52-52 and DBT-IC-β2σ5-9-53-53) possessing crtRB1 and o2 genes showed better agronomic performance (77.78–99.31% for DBT-IC-β1σ4 population and 85.71–99.51% for DBT-IC-β2σ5 population) like the recurrent parents and β-carotene (14.21–14.35 μg/g for DBT-IC-β1σ4 and 13.28–13.62 μg/g for DBT-IC-β2σ5), lysine (0.31–0.33% for DBT-IC-β1σ4 and 0.31–0.34% for DBT-IC-β2σ5), and tryptophan (0.079–0.082% for DBT-IC-β1σ4 and 0.078–0.083% for DBT-IC-β2σ5) levels on par with that of the donor parents. In the future, these improved lines could be developed as a cultivar for various agro-climatic zones and also as good genetic materials for maize nutritional breeding programs.

Maize, an important cereal, sustains millions of people worldwide as a source of protein, vitamins, minerals, oils, and dietary fibre. The crop is cultivated widely across diverse agroecologies and has the highest genetic yield potential among the cereals. It is grown in more than 160 countries, with a total production in 2019 of 1.05 billion tonnes globally and 28.90 million tonnes in India 1 . Maize is a rich source of provitamin A and non-provitamin A carotenoids. The carotenoids are synthesized in the maize endosperm via the carotenoid biosynthesis pathway, which originates from the isoprenoid precursor geranylgeranyl pyrophosphate, supplied by the MEP pathway 2 . Through a series of enzyme-mediated reactions, phytoene, the first carotenoid compound, is synthesized and enzymatically converted to lycopene. Lycopene is the branch point of the pathway, and further conversion depends on how the lycopene molecule is cyclized. An asymmetric cyclization produces α-carotene, and a symmetric cyclization yields β-carotene; together these form the provitamin A carotenoids in maize 3 .
Among the provitamin A carotenoids, β-carotene has the highest provitamin A potential due to the presence of two β-ionone rings. β-carotene is hydroxylated to produce β-cryptoxanthin and then zeaxanthin and ABA, which are non-provitamin A carotenoids 4 . Hence, in normal maize, the conversion of β-carotene to non-provitamin A carotenoids lowers the provitamin A content of the grain, which contributes to micronutrient deficiency, particularly vitamin A deficiency (VAD). Maize is also a staple food in many sub-Saharan and Latin American countries; VAD therefore poses a serious threat to these populations, particularly pregnant women and infants, resulting in complications such as blindness and growth retardation 5,6 . In 2018, a study conducted by UNICEF revealed that children aged between 6 and 59 months from East Asia and the Pacific regions received the highest two-dosage of vitamin A supplements with 75% from the African countries and 59% from the South Asian countries 7 . Therefore, there is a pressing need to alleviate this micronutrient deficiency, and since carotenoids accumulate naturally in the endosperm, the edible part of the maize kernel, maize is an ideal crop for biofortification.
Several studies have identified genes that influence β-carotene levels by directly or indirectly modifying the carotenoid biosynthesis pathway. The LcyE and crtRB1 genes were shown to directly influence β-carotene levels in the maize endosperm 8,9 . Precise manipulation of the crtRB1 gene has been shown to increase the β-carotene concentration in previous studies 10,11 . Yan et al. identified crtRB1 as the gene responsible for the hydroxylation of β-carotene, along with three polymorphisms that influence the variation in carotenoid concentration. The polymorphism in the 3′TE region with the favorable allele (543 bp) increases the β-carotene concentration in maize [9][10][11] .
Maize also contains two protein fractions, viz., zein and non-zein, of which the zein proteins are predominant. However, zein proteins lack essential amino acids such as lysine and tryptophan, and diets that rely on them can therefore lead to Protein Energy Malnutrition (PEM). Several natural mutants (i.e., opaque 2 (o2) 12 , floury 2 (fl2) 13 , opaque 7 (o7) 14 , opaque 6 (o6) 15 , floury 3 (fl3) 14 ) have been shown to increase these essential amino acids in maize, of which o2 has been the most widely studied. The o2 mutant is known to decrease the zein fraction and increase the non-zein fraction, which is naturally high in essential amino acids [16][17][18] . The large genetic variation present in maize makes it an ideal crop for nutritional improvement, specifically with regard to micronutrient deficiencies. Marker-assisted backcross breeding (MABB) has been shown to be a promising technique for introgressing nutritionally important genes in many crops, including maize 19 . Nutritional traits, viz., provitamin A, higher protein content, and high Zn, Fe, and Se content, have been improved in maize through the MABB technique 8,[20][21][22][23] .
Several studies in India and other parts of the world have successfully introgressed either crtRB1 or o2 into popular elite lines and improved the β-carotene, lysine, and tryptophan contents [24][25][26][27][28][29][30] . When both genes are pyramided together, the time required to improve the plants for each trait individually is reduced, and the result is a superior genotype with several favourable nutritional traits. This has now become possible due to advances in technology as well as the identification of new molecular markers and integrated techniques developed for efficient selection 26,28,[31][32][33][34] . Considering these, this study was planned to develop an intercross population and pyramid crtRB1 and o2 simultaneously in the background of elite genotypes.

Results
Transfer of crtRB1 and o2 genes into UMI1200. A total of 27 and 23 F1s were produced in Cross-1 and Cross-2, respectively, and their heterozygosity was confirmed via foreground markers associated with the crtRB1 and o2 genes. The healthy F1s from both crosses were backcrossed with the recurrent parent to produce 106 and 232 BC1F1 lines, and heterozygosity was again confirmed in the BC1F1 lines using foreground markers. All the heterozygous positives were subjected to background selection with 112 and 106 polymorphic markers. … Tables S1, S2). The two BC3F1 lines (DBT 2-1-4-7-1-9 and DBT 5-1-14-5-8-7) from Cross-3 and Cross-4 having maximum RPGR were used to develop the intercross population (designated as DBT-IC-β2σ5).
In all the IC generations (ICF1–ICF3), the same markers were used to ensure that the final products were double homozygotes for both the crtRB1 and o2 genes. In the ICF2 generation, lines from both the DBT-IC-β1σ4 and DBT-IC-β2σ5 populations were subjected to the chi-square test. The results revealed that the populations segregated in the expected Mendelian ratio of 1:2:1 without any significant distortion for both markers. Similar results were also obtained by Veldboom and Lee 35 and Lu et al. 36 . The selected double positive lines were then used to produce the ICF3 generation, in which the double homozygotes were confirmed using the crtRB1 3′TE gene-specific and umc1066 markers. In this way, we were able to stack the nutritionally important genes and develop lines that were improved for β-carotene, lysine, and tryptophan levels. Similar studies were reported by other researchers 26,28,31,33,34,37 . However, in our study, gene stacking was achieved by intercrossing homogeneous lines that already had enhanced levels of β-carotene, lysine, and tryptophan. This shortened the breeding cycle and allowed us to produce, in a short time, a homogeneous population that was highly similar to the recurrent parent. Moreover, we were able to improve UMI1200 and UMI1230, which are the parents of the popular maize hybrid CO6 that is most suited to the climatic regions of South India. Recovery of the recurrent parent genome was also assessed in both the ICF2 and ICF3 generations using a total of 148 polymorphic SSR markers. A high RPG% was obtained in the ICF2 generation itself because the improved lines used to produce the intercross population had low levels of unwanted linkage drag. Once the ICF3 generation was developed, we were able to identify three lines in each cross combination that were double homozygotes and had a high recovery of the recurrent parent genome.
These results are in accordance with earlier reports 19,26,32 . Analysis of kernel opaqueness in the ICF2 generation showed that all the seeds had 25% opaqueness in both cross combinations. This was because the lines used to produce the intercross population had already been established for 25% opaqueness using the lightbox screening method. Consequently, the progenies of the ICF3 generation also showed only 25% opaqueness. These results are in accordance with previous findings 24,29,33,38,39 .
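The segregation check reported above for the ICF2 generation can be illustrated with a small goodness-of-fit sketch. The genotype counts used in the example calls are hypothetical and are not data from this study, and the function names are illustrative only. The statistic is compared with 5.991, the standard chi-square critical value for 2 degrees of freedom at the 5% level.

```python
# Minimal sketch of a chi-square goodness-of-fit test against a 1:2:1 ratio.
# All counts passed to these functions in the examples are hypothetical.

CRITICAL_DF2_P05 = 5.991  # chi-square critical value, df = 2, alpha = 0.05


def chi_square_1_2_1(n_hom_donor, n_het, n_hom_recurrent):
    """Return the chi-square statistic for observed counts versus 1:2:1."""
    observed = [n_hom_donor, n_het, n_hom_recurrent]
    total = sum(observed)
    if total == 0:
        raise ValueError("at least one plant must be scored")
    expected = [total * ratio for ratio in (0.25, 0.50, 0.25)]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_mendelian_ratio(n_hom_donor, n_het, n_hom_recurrent):
    """True when the deviation from 1:2:1 is not significant at the 5% level."""
    statistic = chi_square_1_2_1(n_hom_donor, n_het, n_hom_recurrent)
    return statistic < CRITICAL_DF2_P05


if __name__ == "__main__":
    # Hypothetical marker scores for 100 ICF2 plants.
    print(chi_square_1_2_1(22, 55, 23))
    print(fits_mendelian_ratio(22, 55, 23))
```

A statistic below the critical value means the marker shows no significant segregation distortion, which is the outcome the text reports for both crtRB1 3′TE and umc1066.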
Morphological trait evaluation in the ICF3 generation for both the DBT-IC-β1σ4 and DBT-IC-β2σ5 populations revealed that the improved lines had more than 90% similarity to the recurrent parent, without any major differences in important yield characters such as SPY and EW. This showed that the important phenotypic and yield characters of the recurrent parent were largely recovered in the pyramided lines along with the desired genes. The lines DBT-IC-β1σ4-10-1-1 and DBT-IC-β1σ4-9-21-21 from the DBT-IC-β1σ4 population and the lines DBT-IC-β2σ5-9-51-51 and DBT-IC-β2σ5-9-52-52 from the DBT-IC-β2σ5 population were found to have the highest similarity to the respective recurrent parents as far as the yield characters were concerned. Similar results were also reported by former researchers 28,34,40 .
The evaluation of nutritional contents showed that the ICF3 lines had improved levels of β-carotene, lysine, and tryptophan in comparison with their normal recurrent parents. In the DBT-IC-β1σ4 population, DBT-IC-β1σ4-9-21-21 and DBT-IC-β1σ4-4-8-8 had the highest levels of β-carotene, lysine, and tryptophan respectively. In the DBT-IC-β2σ5 population, DBT-IC-β2σ5-9-51-51 and DBT-IC-β2σ5-9-53-53 had the highest levels of β-carotene, lysine, and tryptophan respectively. Similar results were also obtained in earlier studies 26,28,33,37 . The improved lines obtained from the ICF3 generation in both cross combinations not only carry the target donor genes with elevated nutrient levels but also have a high recovery of the recurrent parent genome and the highest phenotypic similarity to the recurrent parents, making them valuable genetic materials for further hybrid synthesis and other genetic studies.
The present study has resulted in the development of improved lines possessing two genes (crtRB1 and o2) responsible for β-carotene, lysine, and tryptophan through a marker-assisted gene pyramiding (MAGP) strategy. The pyramided lines in the UMI1200 and UMI1230 backgrounds recorded levels of β-carotene, lysine, and tryptophan on par with the donor parents and agronomic performance on par with the recurrent parents. In the future, the promising improved lines could be developed as cultivars for various agro-climatic zones and could also serve as good genetic materials for maize nutritional breeding programs.

Materials and methods
Plant genetic materials. The maize inbreds UMI1200 and UMI1230, well-adapted to tropical and semi-arid regions in India, were selected as the recurrent parents. Because of their good combining ability, both were utilized to develop the CO6 hybrid. Seeds of the inbreds were obtained from the Department of Plant Genetic Resources, Centre for Plant Breeding and Genetics, Tamil Nadu Agricultural University, Coimbatore. VQL1 (possessing o2, associated with high lysine and tryptophan contents) and CE477 (possessing crtRB1, associated with high β-carotene content) were selected as donor parents. VQL1 was obtained from Vivekananda Parvatiya Krishi Anusandhan Sansthan (VPKAS), Almora, India, whereas CE477 was obtained from the International Maize and Wheat Improvement Center, Mexico.
Foreground and background selection. Foreground selection was done using markers closely linked to the crtRB1 and o2 genes. The crtRB1 gene, located on chromosome 10, was selected using the InDel marker crtRB1 3′TE 9 , whereas the o2 gene, located on chromosome 7, was selected using the simple sequence repeat (SSR) marker umc1066 41 . Background selection was done to examine the recurrent parent genome recovery (RPGR). It was performed using 248 SSR markers with known chromosomal positions distributed across all ten maize chromosomes. All primer sequences were obtained from the maize genome database (www.maizegdb.org) and synthesized by Eurofins Ltd., Bangalore, India.
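The text does not state the formula used to compute RPGR from the SSR scores. The sketch below uses one commonly applied convention as an explicit assumption: a marker homozygous for the recurrent parent allele counts fully, a heterozygous marker counts half, and unscored markers are ignored. The genotype codes "R", "H", and "D" and the function name are hypothetical, and the marker calls in the example are not data from this study.

```python
# Minimal sketch of recurrent parent genome recovery (RPGR) from SSR scores.
# Assumed convention (not stated in the source):
#   RPGR% = (R + 0.5 * H) / (R + H + D) * 100
# where R = markers homozygous for the recurrent parent allele,
#       H = heterozygous markers,
#       D = markers homozygous for the donor allele.

VALID_CALLS = ("R", "H", "D")


def rpgr_percent(marker_calls):
    """Return RPGR in percent for one plant, skipping unscored markers."""
    scored = [call for call in marker_calls if call in VALID_CALLS]
    if not scored:
        raise ValueError("no scorable markers for this plant")
    n_recurrent = scored.count("R")
    n_het = scored.count("H")
    return 100.0 * (n_recurrent + 0.5 * n_het) / len(scored)


if __name__ == "__main__":
    # Hypothetical plant scored at 100 polymorphic SSR loci, plus 3 failed calls.
    plant = ["R"] * 80 + ["H"] * 16 + ["D"] * 4 + ["-"] * 3
    print(rpgr_percent(plant))
```

Ranking the foreground-positive plants of each backcross generation by this value is one way to pick the individuals carried forward, under the stated assumption about how heterozygous loci are weighted.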
DNA extraction and PCR amplification. Genomic DNA was isolated from two-week-old plants following the method of Murray and Thompson 42 . The PCR analysis for the crtRB1 gene-specific marker crtRB1 3′TE (65F: ACA CCA CAT GGA CAA GTT CG; 62R: ACA CTC TGG CCC ATG AAC AC; 66R: ACA GCA ATA CAG GGG ACC AG) was carried out in a 10 μl reaction containing 2 μl of 20 ng template DNA, 2 mM of MgCl2, 1 mM of dNTPs, 2 μM of primer pair, and 1.5 U of Taq polymerase. The screening followed the 'touch down' technique: an initial denaturation for 5 min at 94 °C, followed by 19 cycles of denaturation for 45 s at 94 °C, annealing for 30 s at 62 °C with a reduction of 0.5 °C in every cycle down to 54 °C, and extension for 1 min at 72 °C, followed by …
Marker aided transfer of crtRB1 and o2 genes. Four crossing programs, UMI1200 × CE477 (Cross-1), UMI1200 × VQL1 (Cross-2), UMI1230 × CE477 (Cross-3), and UMI1230 × VQL1 (Cross-4), were initiated to develop nutrient-rich lines using UMI1200 and UMI1230 as recurrent parents and CE477 and VQL1 as donors (Fig. 2). The F1s from all four crosses were verified with foreground markers for the presence of the crtRB1 and o2 genes in heterozygous form and then backcrossed with UMI1200 or UMI1230 to produce BC1F1. The BC1F1 lines having crtRB1 (Cross-1 and 3) and o2 (Cross-2 and 4) in heterozygous form were selected with foreground markers. The foreground positives from BC1F1 were subjected to background selection using polymorphic SSR markers to identify the plants with maximum recovery of the recurrent parent genome. Similarly, another two rounds of backcrossing followed by foreground and background selection generated BC3F1 lines having crtRB1 (Cross-1 and 3) and o2 (Cross-2 and 4) with maximum recovery of the recurrent parent genome. The final lines were crossed to produce intercross F1s (ICF1) to combine the crtRB1 and o2 genes in a single plant. The heterozygous form in ICF1 was confirmed by foreground markers, and the plants were then selfed for two generations to produce ICF3. The ICF2 and ICF3 generations were subjected to foreground and background selection.
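The intercross step above determines how many plants are expected to carry the target genotypes, which in turn guides how many progeny need to be screened. The sketch below works this out under two assumptions: that each BC3F1 parent is heterozygous at its own target locus and carries no donor allele at the other, and that the two loci assort independently (they lie on different chromosomes according to the text). The allele symbols ("C"/"c" for crtRB1, "O"/"o" for o2) and the function name are hypothetical.

```python
# Minimal sketch of expected genotype frequencies in the intercross scheme.
# Assumptions (labelled, not taken from the source): the BC3F1 parents are
# heterozygous only at their own target locus, and crtRB1 and o2 assort
# independently.
from fractions import Fraction
from itertools import product


def offspring_distribution(parent1, parent2):
    """Single-locus cross; parents are two-letter genotypes such as 'Cc'."""
    distribution = {}
    for allele1, allele2 in product(parent1, parent2):
        genotype = "".join(sorted(allele1 + allele2))  # 'cC' becomes 'Cc'
        distribution[genotype] = distribution.get(genotype, 0) + Fraction(1, 4)
    return distribution


# ICF1: crtRB1 line (Cc, OO) crossed with o2 line (cc, Oo).
crt_icf1 = offspring_distribution("Cc", "cc")
o2_icf1 = offspring_distribution("OO", "Oo")
p_double_het_icf1 = crt_icf1["Cc"] * o2_icf1["Oo"]

# ICF2: selfing a double heterozygote (Cc, Oo).
crt_icf2 = offspring_distribution("Cc", "Cc")
o2_icf2 = offspring_distribution("Oo", "Oo")
p_double_hom_icf2 = crt_icf2["CC"] * o2_icf2["oo"]

if __name__ == "__main__":
    print("Expected double heterozygotes in ICF1:", p_double_het_icf1)
    print("Expected favourable double homozygotes in ICF2:", p_double_hom_icf2)
```

Under these assumptions, one in four ICF1 plants is a double heterozygote and one in sixteen ICF2 plants is homozygous for the favourable allele at both loci, while each locus on its own follows the 1:2:1 ratio in ICF2.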
Observation of kernel modification via lightbox screening. The o2o2 genotype, which is associated with increased lysine and tryptophan content, is also associated with the undesirable character of kernel softness, which can be visualized as opaqueness in the kernels. Based on the opaqueness, the kernels can be categorized into five levels: 0%, 25%, 50%, 75%, and 100%. Usually, 25% and 50% opaque kernels are selected since they are certain to contain the o2 allele in the homozygous recessive state. The other categories contain the o2 gene in either heterozygous or homozygous dominant condition and are heavily susceptible to unfavourable irregularities. A lightbox apparatus is used to differentiate the level of kernel opaqueness as an indirect measure of kernel softness. Hence, the dual selection technique of lightbox screening and foreground selection ensures the o2o2 genotype in the population. The ICF2 and ICF3 generation lines were subjected to lightbox screening, and the lines exhibiting 25% opaqueness were selected to fix the o2 allele in the homozygous recessive state.

Characterization of ICF3 lines for morphological traits. The newly developed intercross lines from both cross combinations were planted along with the donor and recurrent parents. The plants were maintained with a plant-to-plant distance of 20 cm, row spacing of 60 cm, and a row length of 3 m. Good agronomic practices were followed during the growing period of the crop. The experiment was laid out in a Randomized Block Design (RBD) with three replications. Five plants were selected at random for the morphological trait evaluation. The recovery percentage of the recurrent parents was calculated according to previous researchers 29,33 . The plants were examined for agronomic performance by measuring 14 morphological characters, viz., days to tasseling (DT, in days), days to silking (DS, in days), plant height (PH, cm), ear height (EH, cm), tassel length (TL, cm), number of tassel branches (NTB), leaf length (LL, cm), leaf breadth (LB, cm), ear length (EL, cm), number of kernel rows per ear (NKRE), number of kernels per row (NKR), ear weight (EW, g), 100 kernel weight (KW, g), and single plant yield (SPY, g). All the characterizations were done according to the descriptors suggested by the International Board for Plant Genetic Resources 43 .
Analysis of β-carotene, lysine, and tryptophan. The kernels from the ICF3 generation were examined for β-carotene, lysine, and tryptophan. β-carotene was extracted following the method of Kurilich and Juvik 44 and measured by High-Performance Liquid Chromatography (HPLC). The final samples were eluted on a C30 column using a mobile phase consisting of acetonitrile: dichloromethane: methanol in the ratio of 75:20:5 at a flow rate of 0.4 ml/min. The standard curve was constructed based on three different dilutions (1, 10, and 100 ppm) of standard β-carotene obtained from M/s Sigma Aldrich, USA. The lysine and tryptophan contents were measured following the colorimetric method 45 .

Table 4. Lysine, tryptophan, and β carotene levels of the ICF 3