Introduction

Genome-wide association (GWA) mapping is a powerful approach to identify genetic loci associated with complex traits in natural populations. The approach has been successfully applied in plants such as Arabidopsis thaliana (Atwell et al. 2010; Baxter et al. 2010), maize (Kump et al. 2011) and rice (Huang et al. 2010, 2012; Zhao et al. 2011; Norton et al. 2014; McCouch et al. 2016; Bettembourg et al. 2017) for identifying important agronomic, disease resistance and ionomic (the elemental composition of biological samples) loci. GWA mapping for ionomic traits in plants has been commonly used to perform QTL analysis related on a single trait in individual experiments (Atwell et al. 2010; Baxter et al. 2010; Li et al. 2010; Norton et al. 2014, 2018; Dimkpa et al. 2016; Yang et al. 2018). However, these studies have not always identified QTLs for a trait from several experiments. There are several reasons why a number of ionomic QTLs have not been consistently detected, including a different range of phenotypic values in experiments due to large environmental effects such as different geographic locations and climate data (temperature and humidity) (Pinson et al. 2015; Huang and Salt 2016), and complex genetic architecture such as distributed allelic variants, each of which have small effects (Korte and Farlow 2013). These reasons may reduce the power of statistical association tests in GWA mapping using single-trait analysis in only one experiment.

Recently, multi-trait approaches have been developed to improve QTL detection by increasing the statistical power with correlated traits and multiple experiments (Korte et al. 2012; Lippert et al. 2014; Zhou and Stephens 2014; Loh et al. 2015). Zhou and Stephens (2014) developed a genome-wide efficient mixed-model association (GEMMA) software for testing multiple traits for each single genetic marker with a multivariate linear mixed model (mvLMM), which controls population stratification and accounts for relatedness between individuals. It was developed from the efficient mixed-model association (EMMA) algorithm for single-trait analysis (Kang et al. 2008) with which identified QTLs could be compared with multi-trait analysis based on the EMMA algorithm.

Central to effective GWA mapping is the population that is used. The Rice Diversity Panel 1 (RDP1) is a rice panel representing the broad range of rice varieties from >70 countries (Eizenga et al. 2014). This panel was initially genotyped using 44,100 SNPs (Zhao et al. 2011), subsequently it was genotyped with 700,000 SNPs (McCouch et al. 2016) and in the latest iteration 5,231,435 SNPs have been imputed on this panel by comparing the 700,000 SNPs with whole-genome sequence data of the 3000 sequenced rice cultivars (Wang et al. 2018).

Manganese (Mn) is an essential trace element for plants and humans. It is an important cofactor or activator of many enzymes, and is involved in photosynthesis in plants (Marschner 1995; Soetan et al. 2010). Mn deficiency in plants can cause a reduction in growth and yield (Marschner 1995; Hebbern et al. 2005), whereas if the Mn concentration is elevated, it can become toxic to plants (Ducic and Polle 2005; Millaleo et al. 2010). Mn homoeostasis in the embryo is required for efficient seed germination (Eroglu et al. 2017). Mn deficiency in humans is rare; however, it can lead to a range of health impacts, including severe birth defects, impaired reproductive functions, skeletal defects and asthma (Bashir et al. 2013), while overexposure can lead to neurological disorders (Crossgrove and Zheng 2004; O’Neal and Zheng 2015). The recommended daily intake of Mn for an adult is 2.3 mg day−1, while the recommended tolerable levels are 11 mg day−1 (Institute of Medicine 2001). Rice grain concentrations of Mn are variable, but a recent dietary study of rice consumers in West Bengal, India, demonstrated that rice alone can contribute between 0.82 and 4.21 mg day−1 for an adult, 35.7–183% of the recommend daily amount of Mn (Halder et al. 2020).

The mechanisms of Mn uptake, transport, accumulation and detoxification have been studied in plants (Ducic and Polle 2005; Millaleo et al. 2010; Socha and Guerinot 2014). For rice, a number of Mn-transporter genes have been identified such as natural resistance-associated macrophage protein 3 (OsNRAMP3), OsNRAMP5 and OsNRAMP6 (Ishimaru et al. 2012; Sasaki et al. 2012; Yang et al. 2013, 2014; Peris-Peris et al. 2017), yellow stripe-like protein 2 (OsYSL2) and OsYSL6 (Koike et al. 2004; Ishimaru et al. 2010; Sasaki et al. 2011) and cation diffusion facilitator/metal tolerance protein 8.1 (OsMTP8.1), OsMTP8.2, OsMTP9 and OsMTP11 (Chen et al. 2013, 2016; Ueno et al. 2015; Takemoto et al. 2017; Zhang and Liu 2017; Ma et al. 2018). In addition to these genes, a number of studies have identified QTLs of grain Mn concentration in rice grains based on biparental mapping (Stangoulis et al. 2007; Lu et al. 2008; Ishikawa et al. 2010; Norton et al. 2010, 2012a; Du et al. 2013; Zhang et al. 2014). For example, QTLs have been detected on chromosomes 1 (Stangoulis et al. 2007; Lu et al. 2008), 1, 2, 7 and 12 (Ishikawa et al. 2010), 10 and 11 (Norton et al. 2010), 3, 5, 7, 8 and 9 (Norton et al. 2012a), 3, 6, 8 and 9 (Du et al. 2013) and 2, 3, 4, 6, 7, 8, 11 and 12 (Zhang et al. 2014).

Knowledge of natural genetic variation that regulates Mn concentration in grains among rice landraces and cultivars is limited. To address this, we conducted GWA mapping of grain Mn concentration in the RDP1 from four-field experiments in Arkansas and Texas, USA. The aims of this study were (1) to compare the impact of increasing marker density on detecting loci in GWA mapping for grain Mn concentration, and (2) to identify QTLs and candidate genes associated with grain Mn concentration across experiments in multiple locations and years using both single-trait and multi-experiment GWA analyses.

Materials and methods

Sample data

A total of 389 rice accessions from the RDP1 (Zhao et al. 2011; McCouch et al. 2016) consisting of 57 aus (AUS), 78 indica (IND), 100 temperate japonica (TEJ), 96 tropical japonica (TRJ) and 14 aromatic (ARO) as well as 44 admixtures were used in this study (Supplementary Table S1). There were two major varietal groups, Indica (AUS and IND) and Japonica (TEJ, TRJ and ARO) (McCouch et al. 2016).

The experimental design, planting methods and rice growth conditions were described in Norton et al. (2012b). Briefly, the RDP1 was grown in two locations under either flooded or unflooded cultivation. The locations were Stuttgart, Arkansas (USDA-ARS Dale Bumpers National Rice Research Center) and Beaumont, Texas (Texas A&M AgriLife Research Center), USA. In Arkansas, the rice cultivars were grown in the same location using nearby fields in 2006 and 2007, the field layout in both years was a randomised complete block design (RCBD) with two replications, with identical field management practices wherein fields were flooded when plants were at the five-leaf stage then drained before harvest (datasets referred to as ArFl06 and ArFl07, respectively). Seeds of each cultivar were sown with a seed drill to ~2-cm depth in a single row 5-m long with spacing of 25 cm between the plants and 50 cm between the rows. Fields were flush-irrigated twice before a permanent flood was applied to the fields ~2–3 weeks after seedling emergence. In Texas, three replications of the RDP1 accessions were grown in 2009 using two different water-treatment conditions: flooded and unflooded (datasets referred to as TxFl09 and TxUnfl09, respectively), with all other field management practices remaining the same. The experiment was set up in a RCBD. Five seeds per cultivar were drill-seeded 2-cm deep into 13-cm length lines, hereafter called hillplots. Five hillplots were planted per row with 61 cm between hillplots within each field row, and 25 cm between rows.

Accessions were represented by one hillplot per replication. The 10-cm depth flood was applied when the average plant height was ~18 cm and maintained until harvest, whereas the unflooded treatment received regular flush irrigations (once or twice a week) to keep the root zone damp but aerated.

For the Arkansas field experiments, rice grains for three plants per row for each of the two replications were collected. Seed collection was done by hand and threshed with an Almaco small-bundle thresher to obtain the seeds for the grain Mn determination. For the Texas field experiment, 20 fully mature seeds per hillplot were dehulled, from which three seeds were randomly selected for analysis of grain Mn.

The concentrations of Mn were determined in the harvested grains using inductively coupled plasma mass spectrometry (ICP-MS) described in Norton et al. (2012b, 2014) and Pinson et al. (2015). In brief, three whole grains of dehusked rice (c. 0.05 g) were digested with 1.0 ml of concentrated nitric acid and heated. The temperature was ramped up from ambient to 110 °C over a period of 12 h. An internal standard of indium (final concentration of 20 µg l−1) was added to each sample. Samples were diluted to 10.0 ml and analysed on a PerkinElmer (Waltham, MA, USA) Elan DRCe ICP-MS for Mn. To control for drift, the samples were combined and used as a matrix-matched standard and measured every nine samples.

Phenotypic analysis

Phenotypic variances for Mn concentrations were calculated and parsed using two-way ANOVA conducted in R (version 3.3.0) (R Core Team 2016). Across the two field locations (Arkansas and Texas), four-field experiments were conducted, designated as ArFl06, ArFl07, TxFl09 and TxUnfl09. Across the Arkansas experiments (ArFl06 and ArFl07), the phenotypic variance was parsed into proportions estimated by genotypes, years, and interaction between genotype and year. For the Texas experiments (TxFl09 and TxUnfl09), the phenotypic variance was parsed into genotypes, water treatments and genotype-by-water-treatment interaction effects.

The average Mn concentration (Supplementary Table S1) of each accession per experiment and treatment was used for the GWA mapping. Prior to GWA mapping, the trait data were visualised to assess normality.

Genotypic data and analysis

The rice accessions in the RDP1 have three publicly available SNP datasets consisting of 36,901 (44 K) SNPs (Zhao et al. 2011), 700,000 (700 K) SNPs (McCouch et al. 2016) and 5,231,435 (5.2 M) SNPs (Wang et al. 2018). The 44 K and 700 K SNP datasets were generated by genotyping using 44 K SNP array and High-Density Rice Array (HDRA), respectively (Zhao et al. 2011; McCouch et al. 2016), whereas the 5.2 M SNP dataset, which contains no missing data, was generated by imputing from the set of the intersection of 700 K and 18 M SNPs (missing data <5% and minor allele frequency (MAF) > 1%) with 4.8 M SNPs of the 3000 Rice Genome Project (Wang et al. 2018).

The SNPs in each dataset were initially filtered using PLINK version 1.9 (Chang et al. 2015), whereby SNPs were removed when the percentage of missing genotype data for a single SNP exceeded 20% (the 5.2 M SNP dataset had no missing data, due to being imputed) and MAF was less than 5%.

GWA mapping with single-trait analysis

GWA mapping was performed using the three SNP datasets based on LMMs from EMMAX (version beta-7Mar2010) (Kang et al. 2010) using the PIQUE (Parallel Identification of QTLs using EMMAX) pipeline (https://github.com/tony-travis/PIQUE). Phenotype–genotype association was analysed for all accessions (ALL) and four subpopulations (AUS, IND, TEJ and TRJ) in the four-field experiments. Due to low accession numbers (<30) from the aromatic (ARO) subpopulation and the mixed genetic background of admixtures, these accessions were not analysed as separate subpopulations. Population structure was estimated by performing a principal component analysis on the informative SNP data and the eigenvectors for the first four principal components were included in the model as fixed effects for the analysis of the whole (ALL) population (Price et al. 2010) (note: population structure was not included in the analysis of the subpopulations). Relatedness (K matrix) between accessions was estimated by calculating pairwise identity-by-state (IBS) using the SNP data and was included in the models as random effects. For the ALL population GWA analyses, relatedness was estimated using the accessions that had phenotype data in each experiment. For the subpopulation analyses, relatedness was estimated using accessions from each subpopulation for which phenotype data were collected. The significance threshold for the association between SNP and traits was set at P value < 0.0001, a value previously used for this population (Famoso et al. 2011; Norton et al. 2014). To further filter these SNPs for false-discovery rates (FDR), the P values calculated by the GWA mapping analysis were adjusted using the Benjamini-Hochberg method (Benjamini and Hochberg 1995). To be reported as a SNP significantly associated with the trait, the SNP had to both meet the P value < 0.0001 and meet the criteria of a 5% FDR. Manhattan plots were used to visualise SNP positions on chromosomes with –log10(P) and Q–Q plots were used to visualise observed versus expected value probabilities using the qqman package in R version 3.3.0 (Turner 2014).

GWA mapping with multi-experiment analysis

Multi-experiment analysis of GWA mapping for grain Mn concentration for the three flooded-field experiments (ArFl06, ArFl07 and TxFl09) was performed. For this analysis, each environment was viewed as one trait. A total of 303 rice accessions (all accessions common among the three experiments) were used for the analysis with the 3,430,260 filtered SNPs (MAF > 0.05%) using the mvLMM in the GEMMA version 0.97 (Zhou and Stephens 2014). The mvLMM accounts for both population stratification and relatedness among samples to control confounding factors. The eigenvectors of the first four principal components were calculated using the smartpca programme in EIGENSOFT (Patterson et al. 2006) and included in the model as fixed effects. One eigen decomposition of the centred relatedness matrix (the n-by-n relatedness matrix; n = the number of samples) for random effects was computed from all filtered SNPs using the relatedness matrix function in GEMMA. The null hypothesis is that SNP effects of a single SNP in all experiments are zero, whereas the alternative hypothesis is nonzero effects of at least one SNP tested by a Wald test. P values of all association tests were presented with Manhattan plots and observed P values against expected P values were presented by Q–Q plots using the qqman package in R (Turner 2014). The guideline of reliability for significant SNPs was 0.0001 (Famoso et al. 2011; Norton et al. 2014). SNPs were also tested to a 5% FDR based on the Benjamini-Hochberg procedure (Benjamini and Hochberg 1995), as previously described.

Clustering significant SNPs and comparing QTLs on rice chromosomes

The grouping function CLUMP was used in PLINK to define candidate regions in the ALL analysis based on the 5.2 M SNP. Index SNPs were identified with P value < 0.0001 (Norton et al. 2014) and neighbouring SNPs were clumped with P value < 0.01 (default value) and squared allele frequency correlation (r2) >0.5 (applying the criteria from Butardo et al. (2017) based on the 700 K SNP dataset) with the index SNPs of each peak within 500 kb, which was the LD-decay average of all accessions in the RDP1 (Zhao et al. 2011). The candidate regions/QTLs were then mapped and compared with previously reported QTLs based on physical genome positions on the 12 rice chromosomes.

Local linkage disequilibrium decay analysis

To determine LD blocks in subpopulations that supported the significant peaks in the ALL analysis, a subset of the 5.2 M SNP data surrounding (1 Mbp) a significant peak was extracted using PLINK. Two methods were used: (1) local LD decay was estimated at r2 = 0.2, where r2 values were calculated using PLINK and estimated by binning the average r2 values of 10-kb windows (Biscarini et al. 2016; Norton et al. 2018); (2) r2 values in each SNP pair in each region 500 kb upstream and downstream were calculated and visualised as a local Manhattan plot against a LD heatmap using the LD.heatmap package in R version 3.3.0 (Shin et al. 2006), and then LD blocks were estimated using r2 ≥ 0.6 (high LD) (Ripke et al. 2014; Yano et al. 2016).

Candidate gene identification

Within each candidate region, positional genes were identified based on genes identified in the Rice Genome Annotation Project (version 7, http://rice.plantbiology.msu.edu). Retrotransposons and transposon genes were excluded. Genes located within candidate regions were examined and used to identify potential positional functional candidate genes, e.g., genes involved in the uptake, transport and accumulation of elements, associated with Mn. In addition, protein sequences (http://rice.plantbiology.msu.edu) of the list of candidate genes that were not matched with genes previously related to Mn were compared with protein database using BLASTp (https://blast.ncbi.nlm.nih.gov) to investigate gene-sequence homology with other species, in which genes were reported and characterised with functions involving Mn. In addition to gene validation, the gene expression profiles across a range of rice organs and tissues of all identified candidate genes obtained from RiceXPro (http://ricexpro.dna.affrc.go.jp, Sakai et al. 2013) were used to confirm the validity of candidate genes.

Differential gene expression of candidate genes was determined based on the gene expression analysis conducted by Campbell et al. (2020). This dataset is transcriptomic data from shoots of young plants from 91 accessions from the RDP1. Initially, the data were screened to identify which of the proposed candidate genes were expressed. Low- and high-grain Mn accessions were identified based on being in the highest 20% and lowest 20% for grain Mn concentration, for the three flooded experiments. Then only low- or high-grain Mn concentration accessions were selected for further analysis if they were low or high in at least two of the experiments. A total of 14 accessions were identified as high and 18 identified as low-grain Mn accessions for which transcript data were available (Supplementary Table S2). The expression of candidate genes was examined for evidence of differential expression based on this grouping. An ANOVA was used to determine if the gene expression was different between the two groups.

Estimation of phenotypic variance explained by significant SNPs

To determine the effect size of the QTLs, two approaches were taken. Either the smallest P-value/index SNPs or the most significant SNP located in candidate genes based on the 5.2 M SNP dataset was analysed. The proportion of phenotypic variance explained by each SNP was estimated using linear models, correcting for population structure and contrasting with the population-structure effects for all accessions (Zhao et al. 2011). ANOVA was used to contrast the linear models. For subpopulations, the phenotypic variance distribution of a significant SNP was estimated using a simple linear model without correcting for population structure.

Effect sizes by index SNPs in multi-experiment analysis

Effect sizes in each index SNP of QTLs newly identified in multi-experiment analysis were observed in the individual experiments estimated by the mvLMM model.

Results

Variation of grain Mn concentration in the RDP1

In Arkansas, grain Mn concentration for the accessions in 2006 and 2007 ranged from 21.4 to 62.7 mg kg−1 and from 20.6 to 68.5 mg kg−1, with means of 34.6 and 40.8 mg kg−1, respectively (Fig. 1a and Table 1). There were significant differences (P < 2 × 10−16, df = 321) in grain Mn concentration among genotypes, years and a significant interaction between years and genotypes that explained 39, 15 and 16% of the phenotypic variance, respectively. In Texas, grain Mn concentration for the accessions in 2009 under flooded and unflooded conditions ranges from 10.6 to 33.5 mg kg−1 and from 16.4 to 63.8 mg kg−1, with means of 20.9 and 34.8 mg kg−1, respectively (Fig. 1b and Table 1). The grain Mn concentration under the unflooded condition was significantly higher (1.7 times, P < 2 × 10−16, df = 367) than the concentration under the flooded condition. The rice grain Mn concentrations were affected by genotypes, water treatments and their interaction, which explained 14%, 61% and 11% of the phenotypic variance, respectively.

Fig. 1: Grain Mn distributions in all rice accessions.
figure 1

a Distribution of grain Mn concentration in Arkansas under flooded condition in 2006 and 2007. b Distribution of grain Mn concentration in Texas under flooded and unflooded conditions in 2009.

Table 1 Grain Mn concentration (mg kg−1) in the RDP1 accessions at each field experiment.

To compare the grain Mn accumulation among subpopulations, only those subpopulations with at least 30 accessions (AUS, IND, TEJ and TRJ) were studied. There was a significant difference in grain Mn concentration among the subpopulations (Fig. 2). In the three flooded-field experiments, the Japonica (TEJ and TRJ) subgroups had higher average grain Mn concentration than the Indica (AUS and IND) subgroups. In contrast, the TRJ subpopulation had the lowest average grain Mn concentration in TxUnfl09.

Fig. 2: Distribution of grain Mn concentration in rice in four subpopulations in four-field experiments.
figure 2

The horizontal black bar is the median of grain Mn concentration.

The accessions screen at these field sites are known to vary in the length of time to heading (Norton et al. 2012b); therefore, a correlation analysis was conducted to determine if there was a relationship between heading date and grain manganese concentration. For Arkansas 2007 and flooded experiment in Texas, there was no correlation between flowering time and grain Mn concentration. However, at the Arkansas 2006 experiment, there was a significant weak positive correlation (r = 0.235, P < 0.001) between grain manganese and flowering time, while at the Texas unflooded field site, there was a significant weak negative correlation (r = −0.278, P < 0.001) between grain manganese concentration and flowering time.

Density of SNPs among all accessions and subpopulations

To obtain high-quality SNPs in each SNP dataset, SNPs were filtered with genotype missing >20% and MAF < 0.05 (Supplementary Table S3). After SNP filtering, for example, average SNP density of 11.40, 0.99 and 0.11 kb per SNP was observed for the 44 K, 700 K and 5.2 M SNP datasets, respectively, for the ArFl06 dataset.

For subpopulations, it is noteworthy that the final number of filtered SNPs was lower in the TEJ and TRJ subpopulations compared to the AUS and IND subpopulations (Supplementary Table S3). For example, the SNP density in the TEJ subpopulation was 1 SNP per 0.41 kb, whereas the SNP density in the IND subpopulation was 1 SNP per 0.17 kb, when using the 5.2 M SNP dataset.

Single-trait GWA mapping for grain Mn concentration

Using the three SNP datasets, GWA mapping for grain Mn concentration was performed for all accessions (Fig. 3a and Supplementary Figs. S1–S3) and for the four subpopulations using the 5.2 M SNP dataset only (Supplementary Figs. S4–S7) in the four-field experiments.

Fig. 3: Genome-wide association-mapping results for grain Mn concentration in rice using single-trait analysis in all accessions grown in Arkansas under flooded condition in 2006.
figure 3

a Manhattan (left) and Q–Q (right) plots are presented for the 44 K, 700 K and 5.2 M SNP datasets. The blue horizontal line represents the –log10(P) threshold at 4. The red dot indicates SNP loci that passed 5% FDR. b Location of QTLs associated with grain Mn concentration in rice for four-field experiments based on the 5.2 M SNP dataset using single-trait analysis and previously reported QTLs. The four-field experiments are ArFl06—light blue, ArFl07—blue, TxFl09—orange and TxUnfl09—brown. Previous reported QTLs are displayed in purple with the letter indicating the study they were detected in; a = Stangoulis et al. (2007), b = Lu et al. (2008), c = Norton et al. (2010), d = Ishikawa et al. (2010), e = Norton et al. (2012a), f = Du et al. (2013) and g = Zhang et al. (2014). Known Mn-transporter locations are indicated by horizontal black lines, whereas the locations of candidate genes are indicated by horizontal red lines.

Increasing the SNP density increased the number of significant SNPs associated with the trait in analyses of all accessions and in subpopulation analysis (Fig. 3a, Supplementary Figs. S1–S7 and Supplementary Table S4). For example, no significant SNPs for grain Mn in the ALL analysis in ArFl06 were identified using the 44 K dataset, while 6 and 16 SNPs were significant using the 700 K and 5.2 M SNP datasets, respectively (Supplementary Table S4). In addition, there were no significant SNPs associated with Mn accumulation in several subpopulations based on the 44 K SNP dataset, whereas a number of significant SNPs were identified based on the 700 K and 5.2 M SNP datasets. For example, in the TEJ subpopulation in ArFl07, no significant SNPs were detected using the 44 K SNP dataset, while 6 and 11 significant SNPs for grain Mn were detected using the 700 K and 5.2 M SNP datasets, respectively (Supplementary Table S4).

Identification of grain Mn QTLs and candidate genes based on single-trait analysis

Based on the high-density SNP dataset (5.2 M SNPs), a number of candidate regions/QTLs in the four experiments (Supplementary Table S5) were mapped on rice chromosomes and compared with previously reported QTLs (Fig. 3b). QTLs were further focused on when SNPs within QTLs passed the 5% FDR (Table 2). Consequently, there were three QTLs on chromosome 3 and two QTLs on chromosome 7 that were significantly associated with grain Mn concentration under flooded and unflooded conditions that met the criteria (Table 2). Based on overlap regions from the CLUMP analysis in experiments, these QTLs on chromosome 3 were at 5.33–6.14, 6.39–7.23 and 7.02–7.87 Mbp. For two of these QTL regions, there are a number of candidate genes, including LOC_Os03g11010 (OsNRAMP2), LOC_Os03g11734 (OsFRDL1) and LOC_Os03g12530 (OsMTP8.1). On chromosome 7, the two overlapping QTL regions were at 7.21–8.06 and 7.78–8.57 Mbp. For the first of the two QTL regions, there was a good candidate gene: LOC_Os07g12900 (OsHMA3) but this gene is outside the candidate region for the second. The expression profiles of all candidate genes under normal growth conditions were obtained from the RiceXPro database (Supplementary Figs. S8–S11).

Table 2 Significant QTLs on chromosomes 3 and 7, and candidate genes for grain Mn concentration in all rice accessions under flooded and unflooded conditions based on P < 0.0001 and passing 5% FDR using the 5.2 M SNP dataset.

All four candidate genes mentioned above were identified as being expressed in shoots (Campbell et al. 2020). Of these four genes, two of the candidate genes (OsMTP8.1 and OsHMA3) were found to be differentially expressed between the low-grain Mn and high-grain Mn accessions (Supplementary Fig. S12a, b). The expression of LOC_Os03g12530 (OsMTP8.1) was higher in the accessions with low-grain Mn compared to the accessions with high-grain Mn, while the expression of LOC_Os07g12900 (OsHMA3) was higher in the accessions identified as having high-grain Mn compared to the low-grain Mn accessions.

Identification of grain Mn QTLs in subpopulations and candidate genes based on single-trait analysis

Due to the complex population structure in the RDP1, the estimation of linkage disequilibrium (LD) decay for single QTL across the whole panel is difficult. Therefore, to estimate the size of QTL regions based on LD, QTL analysis was conducted in the individual subpopulations (AUS, IND, TEJ and TRJ). In QTLs that were detected for the whole population and one of the subpopulations, the subpopulation analysis was used to estimate local LD.

For grain Mn QTLs in subpopulations, one significant QTL on chromosome 7 was identified in only the TEJ subpopulation based on the 5.2 M SNP dataset (Fig. 4a, Supplementary Figs. S4–S7 and Supplementary Table S6) that was concordant with significant SNPs on chromosome 7 in all analyses at the 5% FDR. To determine the accurate genomic position of the QTLs, local LD was analysed with two approaches, LD decay and LD heatmap. The QTL was identified at ~8.26 Mbp in the TEJ subpopulation (Fig. 4a). The average local LD decay between 7 and 9 Mbp on chromosome 7 was high at >1 Mbp (r2 > 0.2) (Fig. 4b). The result was concordant with LD heatmap that showed a large LD block at approximately 1.23 Mbp from 7.64 to 8.87 Mbp (r2 ≈ 0.6) (Fig. 4c). One candidate gene, OsNRAMP5 (~8.87 Mbp), was found to be located within the QTL. OsHMA3 at 7.40 Mbp, which was identified as a candidate gene for the QTL detected here in the ALL analysis, is just before this block, while OsNRAMP1, which is at 8.97 Mbp, is just after it (Fig. 4c). In this QTL, the significant SNP mlid0048878287 (8.78 Mbp, P = 8.11E−07), which located close to OsNRAMP5 and OsNRAMP1, explained approximately 8 and 29% of phenotypic variance in ALL and TEJ, respectively. Rice accessions with the TT genotype at this SNP had high Mn accumulation in grains compared to the rice accessions with the CT and CC genotypes (Fig. 4d). The expression profile of OsNRAMP5 and OsNRAMP1 under normal growth conditions was obtained from the RiceXPro database (Supplementary Figs. S13, S14). Both OsNRAMP5 and OsNRAMP1 were identified as being expressed in shoots of rice plants (Campbell et al. 2020). Of these two genes, OsNRAMP1 was found to be differentially expressed between the low-grain Mn and high-grain Mn accessions (Supplementary Fig. S12c). The expression of LOC_Os07g15460 (OsNRAMP1) was higher in the accessions identified as having grain Mn compared to the low-grain Mn accessions.

Fig. 4: Genome-wide association-mapping results for grain Mn concentration in the temperate japonica subpopulation based on the 5.2 M SNP dataset, as well as local linkage disequilibrium analysis and SNP allele effects.
figure 4

a Manhattan (left) and Q–Q (right) plots are presented in four-field experiments as ArFl06: Arkansas flooded 2006, ArFl07: Arkansas flooded 2007, TxFl09: Texas flooded 2009 and TxUnfl09: Texas unflooded 2009. The blue horizontal line represents the –log10(P) threshold at 4. The red dot indicates SNP loci that passed 5% FDR. b LD decay and c local Manhattan plot (top), as well as LD heatmap (bottom) for grain Mn concentration in ArFl07 on chromosome 7 at 7–9 Mbp and 6.5–9.5 Mbp, respectively. d The effect of SNP alleles on grain Mn concentration for the QTL on chromosome 7 with the SNP mlid0048878287 (8,781,883 bp) in all accessions (left) and the temperate japonica subpopulation (right) in ArFl07.

Multi-experiment GWA mapping for grain Mn concentration and candidate genes

To increase the power of GWA mapping, a single GWA mapping was conducted for grain Mn concentration of 303 accessions for the three flooded-field experiments (ArFl06, ArFl07 and TxFl09) based on the 5.2 M SNP dataset (MAF > 0.05, 3,430,260 filtered SNPs) using the mvLMM in the GEMMA software. A total of 64 SNPs were significantly associated with grain Mn concentration. Eight QTLs across the 12 rice chromosomes were identified (Fig. 5a and Supplementary Table S7). Two of these QTLs on chromosome 3, 5.97–6.95 and 6.63–7.51 Mbp, including OsFRDL1 and OsMTP8.1 (Figs. 3b, 5b and Supplementary Table S7), were consistent with the QTLs identified based on single-trait analysis. However, a total of 6 QTLs for grain Mn not detected by single-trait analysis were identified using multi-experiment analysis (Fig. 5 and Supplementary Table S7). The six QTLs of interest were at 1.16–1.38 Mbp on chromosome 3, 2.40–3.33 and 3.41–4.27 Mbp on chromosome 4, 0.39–1.00 Mbp on chromosome 9 and 11.39–12.30 and 25.61–25.62 Mbp on chromosome 11. All of these QTLs were novel for grain Mn concentration.

Fig. 5: Genome-wide association-mapping results for grain Mn concentration in rice based on the 5.2 M SNP dataset using multi-experiment analysis.
figure 5

a Manhattan (left) and Q–Q (right) plots are presented. The blue horizontal line represents the –log10(P) threshold at 4. The red dot indicates SNP loci that passed 5% FDR. The blue arrows point to new QTLs based on multi-experiment analysis. b Location of QTLs associated with grain Mn concentration in rice based on the 5.2 M SNP dataset using multi-experiment analysis and previously reported QTLs. QTLs in ArFl06–ArFl07–TxFl09 are presented in red. Previous reported QTLs are displayed in purple with the letter indicating the study they were detected in; a = Stangoulis et al. (2007), b = Lu et al. (2008), c = Norton et al. (2010), d = Ishikawa et al. (2010), e = Norton et al. (2012a), f = Du et al. (2013) and g = Zhang et al. (2014). Known Mn-transporter locations are indicated by horizontal black lines, whereas the locations of candidate genes are indicated by horizontal red lines.

Comparison of the effect sizes of index SNPs for the putative QTLs in each experiment estimated by the mvLMM showed that they were various (Table 3). For example, the QTL on chromosome 3 had similar small positive SNP effects in all experiments, whereas the two QTLs on chromosome 4 had negative SNP effects in ArFl07 compared to other experiments.

Table 3 New putative QTLs for grain Mn concentration based on the 5.2 M SNP dataset using multi-experiment analysis.

Discussion

This study has identified QTLs for grain Mn in rice. Some of those co-localise with previously identified QTLs and known genes involved in Mn accumulation in rice, while some are novel putative QTLs. One of the key objectives of QTL mapping is the identification of stable QTLs (e.g., those are detected in multiple environments). Using a multi-experiment GWA mapping approach, we have been able to identify these stable QTLs.

The environmental factors (different years, locations and water-management treatments) and genetic composition of the accessions affected the concentration of Mn in rice grains. In Arkansas, the average grain Mn concentrations between 2006 and 2007 were significantly different and year explained ~15% of the phenotypic variance. In Texas, the Mn concentration in grains under non-flooded condition significantly increased when compared to the rice cultivation under flooded condition with flooding explaining ~61% of the variation. This is in agreement with Pinson et al. (2015) who reported that water-treatment effects had a higher impact for element accumulation in rice grains than year effects, and the average grain Mn concentration under unflooded condition was greater than the average grain Mn concentration under flooded condition among 1763 rice accessions grown in Texas in 2007 and 2008. Senewiratne and Mikkelsen (1961) found that Mn concentration in rice leaves was 7.7-fold higher under unflooded condition as compared with flooded condition. One genetic factor that could have an influence on grain element concentrations is flowering time. As this population comprised a wide range of different accessions, the flowering window (the time from the first accession flowering to the last) is quite large (Norton et al. 2012b). During this time, the environmental conditions can change, which may affect the availability and therefore the accumulation of manganese. However, in this study, only at two sites were relationships between flowering time and grain manganese concentration overserved, and in both cases, the relationships explained only a small component of the variation.

The genetic differences among subpopulations also affected grain Mn concentrations such as higher grain Mn concentration in the TEJ and TRJ subpopulations grown under flooded conditions compared with the AUS and IND subpopulations (Fig. 2). In another study under flooded conditions, Japonica subgroup accessions had higher Mn concentrations in their rice grains than Indica subgroup accessions (Yang et al. 2018). Pinson et al. (2015) have shown that although water-management treatments had a high impact, genetic backgrounds in the 1763 rice accessions were a major factor for grain element accumulation in both flooded (average broad-sense heritability (H2) of 16 elements: 0.49, Mn: 0.58) and unflooded (average H2: 0.57, Mn: 0.70) conditions.

The efficiency of GWA mapping depends on several factors such as the proportion of variation explained by genotype (heritability), the underlying population structure within the panel, sample size and marker density. McCouch et al. (2016) suggested that increasing SNP density increases the ability to detect genetic loci. In this study, we tested the impact of marker density while using the same rice accessions, the same phenotype data and the same statistical modelling approach to account for population structure and kinship (Kang et al. 2010). We demonstrated that the number of markers covering the genome affects the efficacy of GWA mapping (Fig. 3a and Supplementary Figs. S1–S3). Our results revealed that higher marker density increases the number of significant loci associated with the trait. A similar observation has been made in a recent study, where GWA mapping for root cone angle in rice was conducted using 15,000 and 300,000 SNPs (Bettembourg et al. 2017). Wang et al. (2018) also showed that increasing from 700 K to 4.8 M SNPs in GWA mapping for the grain amylose content in 326 indica accessions provided increased confidence in QTLs, as well as revealing new ones. In the present study, increasing marker density improved the identification of genetic loci using GWA mapping. However, at this stage, it is unknown what the optimal marker density for GWA mapping in rice is.

From single-trait analyses using 5.2 M SNPs, five QTLs on chromosomes 3 and 7 were found to affect grain Mn concentration, and the QTL sizes ranged from 789 to 852 kb (Table 2). Some of these QTLs co-localise with QTLs previously identified in rice (Fig. 3b). For example, Norton et al. (2012a) detected grain Mn QTLs using the Bala × Azucena mapping population, located similarly with QTLs detected in this study (chromosome 3 at approximately 3.49–6.65 Mbp and chromosome 7 at ~7.12–9.14 Mbp). Zhang et al. (2014) detected QTLs for grain Mn under flooded growing conditions in a TeQing-into-Lemont backcross introgression population on chromosome 3 (4–6 Mbp) and chromosome 7 (10–14 Mbp), with the chromosome 7 QTL being identified also in an independent population of Lemont × TeQing recombinant inbred lines. To further narrow down a QTL on chromosome 7, Liu et al. (2017) characterised a major QTL for grain Mn accumulation in recombinant inbred lines from the cross of 93-11 (low-grain Mn) with PA64s (high-grain Mn) grown in two environments. A major QTL located on the short arm of chromosome 7 was fine-mapped between two markers (L8857 and L8906), a 49.3-kb region encompassing the known Mn transporter, OsNRAMP5 (Liu et al. 2017). Recently, Shrestha et al. (2018) conducted GWA mapping for shoot Mn toxicity in 271 RDP1 accessions based on the 700 K SNP dataset. Numerous significant SNPs were identified in a large region on the top of chromosome 7. Although they did not report an exact QTL size, both OsNRAMP5 and OsNRAMP1 were identified as candidate genes. These results reveal that our identified QTLs based on GWA mapping with the high SNP density were smaller than the comparable genomic regions when using other mapping approaches. As a result, the identified QTLs contained a smaller number of positional candidates, which means the identification of genes underpinning the QTLs should be easier.

Within the grain Mn QTLs on chromosomes 3 and 7, six genes are proposed as contributing to the natural variation observed in grain Mn concentration in the RDP1. The candidate genes were highly expressed in roots, shoots, reproductive organs or embryo and endosperm tissues (Supplementary Figs. S8–S11 and S13, S14). On chromosome 3, three candidate genes were identified as OsNRAMP2 (LOC_Os03g11010), OsFRDL1 (LOC_Os03g11734) and OsMTP8.1 (LOC_Os03g12530). While the function of OsNRAMP2 is unknown in rice, OsNRAMP2 has high structural similarity with an Mn transporter from Eremococcus coleocola (Mani and Sankaranarayanan 2018). OsNRAMP2 in rice is also an orthologous gene with AtNRAMP2 in Arabidopsis (Thomine et al. 2000) that is a trans-Golgi network-localised Mn transporter in roots under Mn deficiency (Gao et al. 2018). OsFRDL1 is a good candidate gene as a knockout of OsFRDL1 in rice resulted in lower leaf Fe concentration, and higher accumulation of Zn and Mn in leaves of rice (Yokosho et al. 2009). OsMTP8.1 has been shown to be involved in Mn homoeostasis achieved by sequestering excess Mn into vacuoles of rice (Chen et al. 2013, 2016), and to be an orthologous gene with AtMTP8 involving in the localisation of Mn and Fe in Arabidopsis seeds (Chu et al. 2017). On chromosome 7, there were three candidate genes: OsHMA3 (LOC_Os07g12900), OsNRAMP5 (LOC_Os07g15370) and OsNRAMP1 (LOC_Os07g15460). OsHMA3 is a known tonoplast-localised transporter for Zn and Cd in rice roots, but it is reported that the overexpression of OsHMA3 affected Mn concentration in roots and shoots (Sasaki et al. 2014). OsNRAMP5 is a major transporter for Mn as well as for Fe and Cd in rice (Ishimaru et al. 2012; Sasaki et al. 2012; Yang et al. 2014; Liu et al. 2017). Although OsNRAMP1 is an Fe transporter that is involved in Cd accumulations in rice (Takahashi et al. 2011), a phylogenetic analysis of NRAMP sequences in plants showed that OsNRAMP1 was most similar to OsNRAMP5 (Vatansever et al. 2016). Sheartha et al. (2018) also identified a QTL for Mn toxicity in rice using GWA mapping that encompassed both OsNRAMP1 and OsNRAMP5. Therefore, OsNRAMP1 is possibly involved in Mn transport or cross-talk between Fe and Mn homoeostasis (Vatansever et al. 2016).

Due to genetic similarity within the TEJ subpopulation, local LD for the identified QTL on chromosome 7 was analysed and estimated to define their candidate regions. The average LD decay from 7 to 9 Mbp in the TEJ subpopulation was high at >1 Mbp with the threshold of r2 = 0.2 (Fig. 4b). To determine if this large LD decay was specific to the TEJ subpopulation, the LD decay in the other subpopulations was determined (Supplementary Fig. S15). The LD decays across the other subpopulations with only the average LD decay in the IND subpopulation being lower. In addition to LD heatmap, the estimated LD distance in the region (9017 SNPs at 6.5–9.5 Mbp) in the TEJ subpopulation was 1.23 Mbp from 7.64 to 8.87 Mbp, indicating few historical recombination events. It was similar to a large LD block in the AUS (23,041 SNPs) subpopulation, whereas several LD blocks in the IND (13,731 SNPs) and TRJ (6513 SNPs) subpopulations were observed (Supplementary Fig. S15).

For multi-experiment analysis, conducting GWA mapping for grain Mn concentration with the phenotypic values of the three flooded-field experiments (ArFl06, ArFl07 and TxFl09) using the mvLMM, there were two QTLs that had previously been detected in the single-site analysis and six newly identified QTLs (Fig. 5). Similarly, Korte et al. (2012) reanalysed the flowering time data of Li et al. (2010) in 459 A. thaliana accessions grown over two seasons in each of two different locations using MTMM (Multi-trait mixed model) to reveal new QTLs. Three detected loci were involved in the differential flowering response to different environments that were not detected in the individual screens. Indeed, multi-trait analysis is an efficient tool for detecting loci/QTLs associated with multiple traits, because of the increased power obtained from additional data from correlated traits or a single trait in multiple experiments (Korte et al. 2012; Zhou and Stephens 2014). Thus, this approach should be used to identify stable QTLs, and is potentially beneficial in terms of GWAS of complex traits. The validation of the new QTLs could be further studied for identification of candidate genes underlying these QTLs that may contribute to the ultimate grain Mn concentration in rice.

While gene expression data were not collected for the plants grown in this experiment, recently transcriptomic analysis for 91 of the RDP1 accessions was conducted (Campbell et al. 2020). This database consists of gene expression data from shoots, and can be used to determine if genes are differentially expressed between accessions. For candidate genes discussed, all were found to be expressed in shoots with OsMTP8.1, OsHMA3 and OsNRAMP1 differentially expressed between the low- and high-grain Mn accessions (Supplementary Fig. S12). Differential gene expression means that these genes are very good candidates for the trait as this expression difference could be driving the QTLs. However, future analysis of gene expression between low and high Mn-accumulating accessions during grain filling will give a further insight into the role these genes play in the Mn accumulation in the grain.

Conclusion

This study uses data from multiple field experiments (locations, years and irrigation treatments) to conduct GWA mapping for a grain elemental trait, Mn concentration, in rice. We have demonstrated that multi-experiment analysis has a number of potential benefits, including the identification of QTLs not detected in individual analyses. Future study would be required to validate these genes, and identify the alleles that are responsible for variation in Mn accumulation in rice grains.