Genome-wide discovery of DNA polymorphisms by whole genome sequencing differentiates weedy and cultivated rice

Chai, Chenglin; Shankar, Rama; Jain, Mukesh; Subudhi, Prasanta K.

doi:10.1038/s41598-018-32513-z

Article
Open access
Published: 21 September 2018

Chenglin Chai¹, Rama Shankar², Mukesh Jain² (ORCID: orcid.org/0000-0002-7622-1083) & Prasanta K. Subudhi¹

Scientific Reports volume 8, Article number: 14218 (2018)

Abstract

Analyzing genome-level DNA polymorphisms between weedy and cultivated rice is crucial for elucidating the molecular basis of weedy and agronomic traits, which in turn can enhance our ability to control weedy rice and to use it for rice improvement. Here, we present the genome-wide genetic variation between a weedy rice accession, PSRR-1, and two cultivated rice accessions, Bengal and Nona Bokra, belonging to the japonica and indica subspecies, respectively. The total number of SNPs and InDels in PSRR/Bengal was similar to that in Nona Bokra/Bengal, but was three times greater than that in PSRR/Nona Bokra. There were 11546 large-effect SNPs/InDels affecting 5673 genes, which most likely differentiated weedy rice from cultivated rice. These large-effect DNA polymorphisms resulted most often from stop codon gain and least often from start codon loss. Analysis of the molecular functions and biological processes of genes carrying weedy rice-specific SNPs/InDels indicated that most of these genes were involved in protein modification/phosphorylation, protein kinase activity, and protein/nucleotide binding. By integrating previous QTL mapping results with the DNA polymorphism data, the candidate genes for seed dormancy and seed shattering were narrowed down. The genomic resource generated in this study will facilitate discovery of functional variants for weedy and agronomic traits.

Introduction

Weedy rice (Oryza sativa f. spontanea Rosh.), commonly known as red rice, is one of the most noxious weeds in rice-growing areas worldwide^1,2. It competes with cultivated rice for natural resources, leading to significant yield loss³. Unintended mixing of weedy rice and cultivated rice grains during harvesting reduces grain quality and marketability. The infestation of red rice in the Southern rice belt (a region that includes four southern U.S. states, i.e., Arkansas, Louisiana, Mississippi and Texas, where a significant portion of the nation’s rice crop is grown) results in a loss of 50 million dollars annually⁴. The management of weedy rice is particularly troublesome for rice growers. The persistence of weedy rice in rice fields can be attributed to early flowering, heavy seed shattering, and intense seed dormancy, which ensure the continued presence of weedy rice seeds in the soil seed bank⁵.

Genetic studies have indicated multiple mechanisms of weedy rice evolution, with possible contributions from ancestral cultivated rice and wild rice, depending on the geographic region^{6,7,8,9,10,11,12,13,14,15,16}. Recent genome-wide analyses of DNA polymorphisms have further confirmed these possible origins^17,18,19,20. It was suggested that weedy rice from Northeast Asia has evolved locally from japonica or indica varieties²⁰ or as hybrids between modern indica/indica or japonica/japonica varieties¹⁷, whereas aus, indica, and wild rice have contributed to the evolution of weedy rice from South Asia²¹. Similarly, mitochondrial genome analysis has suggested that Korean weedy rice evolved from cultivated rice¹⁸. There are two major genetically distinct groups of weedy rice in the United States, namely straw hulled (SH) and black hulled with long awns (BHA), which are believed to have originated from domesticated indica and aus rice backgrounds, respectively¹³. Recent studies involving both morphological data and whole genome sequences of weedy, cultivated, and wild rice have supported the evolution of US weedy rice by de-domestication^12,19. It is evident from these studies that US weedy rice has diverse origins shaped by evolutionary forces, and that a few genetic changes in domesticated backgrounds led to the emergence of weedy attributes^13,19. Although primarily considered a destructive weed, weedy rice can be a valuable genetic resource for improving agronomically important traits including blast disease resistance²², rapid seedling growth⁸, photosynthetic rate and water use efficiency²³, flowering time²⁴, cold tolerance²⁵, seed shattering²⁶, and seed dormancy²⁷.

Rice, which feeds more than half of the world’s population, was domesticated from its wild ancestor approximately 10,000 years ago. During rice domestication, non-shattering and non-dormant rice accessions were selected to avoid yield loss and asynchronous germination, respectively. However, a certain degree of seed shattering is preferred for easy grain threshing, and likewise shallow seed dormancy is required to prevent pre-harvest sprouting (PHS), which adversely affects yield and grain quality^28,29. Therefore, understanding the genetic basis of these traits has generated a great deal of interest among plant geneticists, breeders, and weed scientists.

The degree of seed shattering varies greatly among rice accessions. Wild rice (O. rufipogon and O. nivara) and weedy rice shed seeds very easily, while the majority of cultivated rice shows no or limited shattering³⁰. Within cultivated rice, indica cultivars generally shatter more easily than japonica cultivars²⁶. The shattering trait in rice is controlled by the formation of an abscission layer^31,32. Several genes responsible for seed shattering have been cloned in rice. Sh4 is a major quantitative trait locus (QTL) that explained 69% of the phenotypic variation between indica rice and the wild rice O. nivara, and it encodes a transcription factor (TF) of the trihelix family. SH4 promotes hydrolysis of the abscission zone (AZ), and a nonsynonymous single-nucleotide polymorphism (SNP) in its Myb3 DNA-binding domain leads to an incomplete AZ and reduced seed shattering^33,34. A recent study on African rice has revealed a role of SH4 in controlling grain length^35,36. Another seed shattering QTL, qSH1, encodes a homeobox TF, and a SNP in its 5′ regulatory region causes a failure in abscission layer formation³⁷. A recessive shattering locus, sh-h, encoding a C-terminal domain phosphatase-like protein, has been shown to repress AZ formation³⁸. A transcription factor, SHAT1, which is required for AZ development and functions downstream of Sh4 and qSH1, has also been identified³⁹. Several research groups have identified QTLs for seed shattering on all chromosomes except 9 and 10 using populations derived from crosses between cultivated rice and different weedy rice accessions^8,40,41,42. Genetic and genomic studies on seed shattering using crosses between cultivated rice and wild rice (O. rufipogon) have also been conducted^43,44.
Recently, our laboratory reported 3–5 QTLs controlling seed shattering that explained 38–45% of the phenotypic variation in two recombinant inbred line (RIL) mapping populations involving the US weedy rice accession PSRR-1 and two US japonica varieties²⁶. Although the largest QTL on chromosome 4 overlapped with Sh4, the presence of the non-shattering SNP allele in the weedy rice accession suggested involvement of a linked locus²⁶ or alternative genetic mechanisms⁴⁵.

Seed dormancy refers to the inability of viable seeds to germinate under favorable conditions⁴⁶. Seed dormancy, established and maintained during seed maturation, is gradually broken during dry storage due to after-ripening⁴⁷. It is a complex trait controlled by multiple genes with a strong influence of environmental factors⁴⁸. Although seed dormancy plays a critical role in environmental adaptation of wild species and is a trait of agronomic importance, its molecular basis is not yet clearly elucidated. Genetic and molecular analyses in Arabidopsis have revealed the role of chromatin modification in controlling seed dormancy through cloning and functional analysis of HUB1 (also known as RDO4), LDL1, and LDL2^47,49. DOG1, encoding a protein of unknown function, has been suggested to play a key role in the onset of seed dormancy⁵⁰. LDL1 and LDL2 act redundantly in repressing seed dormancy-related genes including DOG1. In rice, a QTL for seed dormancy, Sdr4, was shown to contribute substantially to the difference in seed dormancy between japonica and indica cultivars⁵¹. Sdr4 expression was positively regulated by OsVP1, a global regulator of seed maturation, and Sdr4 acted as an intermediate regulator of dormancy in the seed maturation program. A few studies have been conducted to detect QTLs for seed dormancy in weedy rice^27,52, and a pleiotropic gene, Rc, was responsible for both seed dormancy and pericarp color⁵³. Another gene controlling endosperm-imposed dormancy was involved in gibberellin synthesis⁵⁴. Previously, we detected 6–7 QTLs for seed dormancy in two RIL populations developed from crosses involving a weedy rice accession (PSRR-1), and these QTLs accounted for ~50% of the total phenotypic variation²⁷. One of the QTLs overlapped with Sdr4; however, the nucleotide polymorphisms associated with the variation in seed dormancy could not be validated in our materials.

Based on our previous QTL mapping results on seed shattering and seed dormancy^26,27, we continued to pursue the genetic basis underlying these two important agronomic traits by taking advantage of next-generation sequencing technology. More importantly, we report here the genome-wide genetic variation of weedy rice to generate genomic resources for discovery of genes associated with both weedy and agronomic traits. The objectives of the current study were (i) to identify genome-wide nucleotide polymorphisms between two rice cultivars and weedy rice, which will be useful for improving agriculturally important traits, and (ii) to identify candidate genes for seed shattering and seed dormancy by integrating our rice whole-genome re-sequencing data with QTL mapping data.

Results and Discussion

Two cultivated rice accessions (O. sativa) (a tropical japonica cultivar, Bengal, and an indica cultivar, Nona Bokra) and one straw hulled (SH) weedy rice accession (O. sativa) (PSRR-1) with contrasting phenotypes of seed shattering and seed dormancy^26,27,55 were selected for the analysis of genomic variation. PSRR-1 showed a higher degree of shattering than Bengal and Nona Bokra. Both PSRR-1 and Nona Bokra are intensely dormant, whereas Bengal is non-dormant.

Genome re-sequencing and reads mapping

We obtained a total of 307,009,538 paired-end reads and 287,967,294 high quality (HQ) filtered reads from the three genotypes. The percentage of HQ filtered reads ranged from 92% to 95% (Table 1) and all HQ filtered reads were used for mapping. About 92–99% of these reads were successfully mapped to the Nipponbare reference genome, covering 92–97% of the rice genome. Of the total reads, 94–97% were mapped to unique locations in the rice genome (Table 1). All the uniquely mapped reads were used for downstream data analysis. The Illumina FASTQ files for PSRR-1, Bengal, and Nona Bokra were submitted to the Sequence Read Archive (SRA) at NCBI with the accession numbers PRJNA413818, PRJNA413821, and PRJNA413822, respectively.

Table 1 Summary of mapping information of the three rice accessions in this study.

Identification of SNPs and InDels

SNPs and InDels between Bengal and Nona Bokra (Bengal/Nona Bokra), PSRR-1 and Bengal (PSRR/Bengal), and PSRR-1 and Nona Bokra (PSRR/Nona Bokra) were identified (Table 2). Overall, PSRR/Bengal and Bengal/Nona Bokra showed similar numbers of DNA polymorphisms, which were about 2~3 times higher than in PSRR/Nona Bokra. The total numbers of SNPs for PSRR/Bengal and Bengal/Nona Bokra were 1,704,184 and 1,414,468, respectively, while that for PSRR/Nona Bokra was 632,302. The number of InDels was considerably lower than that of SNPs for each of the comparisons. The total numbers of InDels were 85,016 and 102,242 for PSRR/Bengal and Bengal/Nona Bokra, respectively, whereas the number was 36,163 for PSRR/Nona Bokra. Furthermore, the densities of SNPs and InDels for PSRR/Bengal and Bengal/Nona Bokra were also similar, approximately two to three times higher than that of PSRR/Nona Bokra. Among cultivated rice, genetic differentiation between the indica and japonica subspecies is well established. Our study showed that PSRR-1 was genetically much closer to the indica cultivar ‘Nona Bokra’ than to the japonica cultivar ‘Bengal’ based on both the total number and the density of genome-wide SNPs and InDels. This observation, as well as earlier studies^3,11,56, is in clear agreement with recent reports regarding evolution of straw hulled US weedy rice from indica cultivars through de-domestication¹⁹. Our high-density genetic markers across the whole rice genome could be useful in both theoretical and applied genetics, such as genotyping, linkage disequilibrium studies, gene cloning, and marker-assisted breeding.
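The fold differences quoted above can be checked directly from the reported totals. The short sketch below is only an arithmetic illustration; the function name `fold_over_baseline` is ours and is not part of the original analysis.

```python
# Totals reported in the text for each pairwise comparison.
snps = {
    "PSRR/Bengal": 1_704_184,
    "Bengal/Nona Bokra": 1_414_468,
    "PSRR/Nona Bokra": 632_302,
}
indels = {
    "PSRR/Bengal": 85_016,
    "Bengal/Nona Bokra": 102_242,
    "PSRR/Nona Bokra": 36_163,
}


def fold_over_baseline(counts, baseline="PSRR/Nona Bokra"):
    """Divide each comparison's count by the count of the baseline comparison."""
    base = counts[baseline]
    return {pair: round(n / base, 2) for pair, n in counts.items() if pair != baseline}


snp_fold = fold_over_baseline(snps)
indel_fold = fold_over_baseline(indels)
print("SNP fold difference:", snp_fold)
print("InDel fold difference:", indel_fold)
```

All four ratios fall between 2 and 3, in line with the "2~3 times" statement in the text.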

Table 2 Frequency of SNPs and InDels detected in PSRR-1, Bengal, and Nona Bokra.

Nonrandom genomic organization of DNA polymorphisms

The genomic organization of DNA polymorphisms was investigated among PSRR-1 and two cultivars (Bengal and Nona Bokra) across all 12 rice chromosomes. The number of identified SNPs and InDels displayed considerable variations across chromosomes (Fig. 1). For both PSRR/Bengal and Bengal/Nona Bokra, the total number of DNA polymorphisms (SNPs and InDels) on each chromosome was proportional to the size of the chromosome (Fig. 1A, B, Supplementary Tables S1, S2). The SNPs and InDels were most abundant in chromosomes 1, 2, and 3, and less abundant in chromosomes 9, 10, and 12. However, PSRR/Nona Bokra showed different pattern of DNA polymorphism distribution among chromosomes: the SNPs were most abundant in chromosomes 1, 2, and 5 and InDels in chromosomes 1, 2, and 11; while SNP and InDel were scarce in chromosomes 7, 10, and 12 (Fig. 1A,B; Supplementary Tables S1, S2).

The distributions of SNPs and InDels within chromosomes in PSRR-1, Bengal, and Nona Bokra were not uniform (Fig. 2; Supplementary Tables S3, S4). Overall, more DNA polymorphisms were distributed in PSRR/Bengal, compared with those between PSRR/Nona Bokra. The number of high-density (≥250) SNP regions of 100 kb for PSRR/Bengal and PSRR/Nona Bokra were 2962 and 1132, respectively (Fig. 2A; Supplementary Table S3). Similarly, 51 and 582 low-density (≤5) SNP regions of 100 kb were detected for PSRR/Bengal and PSRR/Nona Bokra, respectively. Interestingly, we found 1783 and 244 SNP “hotspots” with extremely high density (≥1000 SNPs/100 kb) for PSRR/Bengal and PSRR/Nona Bokra, respectively. The InDels were not evenly distributed within chromosomes (Fig. 2B; Supplementary Table S4). We found 434 and 77 InDel rich (≥40) regions for PSRR/Bengal and PSRR/Nona Bokra, respectively. Likewise, low-frequency InDel regions were also detected. The significantly differential distribution of DNA polymorphisms has been documented in many plants including rice^57,58,59,60.
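The window-based density classes described above can be illustrated with a minimal sketch. It assumes non-overlapping 100 kb windows and 1-based positions on a single chromosome; the function names and the toy positions are hypothetical, and only the thresholds (≥250, ≤5, and ≥1000 SNPs per 100 kb) come from the text.

```python
from collections import Counter

WINDOW = 100_000  # 100 kb windows, as in the text


def window_counts(positions, window=WINDOW):
    """Count variants per non-overlapping window (1-based positions, one chromosome)."""
    return Counter((pos - 1) // window for pos in positions)


def classify_windows(counts, n_windows, high=250, low=5, hotspot=1000):
    """Tally windows against the density thresholds quoted in the text."""
    tallies = {"high": 0, "low": 0, "hotspot": 0}
    for w in range(n_windows):
        n = counts.get(w, 0)  # windows without any variant count as zero
        if n >= high:
            tallies["high"] += 1
        if n >= hotspot:
            tallies["hotspot"] += 1
        if n <= low:
            tallies["low"] += 1
    return tallies


# Toy data (hypothetical): 300 SNPs in window 0, 3 in window 1, none in window 2.
toy_positions = list(range(1, 301)) + [100_001, 150_000, 200_000]
toy_counts = window_counts(toy_positions)
print(classify_windows(toy_counts, n_windows=3))
```

Under these definitions a hotspot window is also counted as a high-density window, since any window with at least 1000 SNPs also has at least 250.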

Analysis of SNPs and InDels

We further investigated the total numbers of transitions (Ts) and transversions (Tv) for PSRR/Bengal, PSRR/Nona Bokra, and Bengal/Nona Bokra (Fig. 3A). The total numbers of Ts (A/G and C/T) were significantly higher than those of Tv (A/C, A/T, C/G, and G/T) for all three pairs. The total number of each type of Ts and Tv was similar in PSRR/Bengal and Bengal/Nona Bokra, but was 2~3 times higher than that in PSRR/Nona Bokra. Overall, PSRR/Bengal and Bengal/Nona Bokra had similar frequencies of each type of SNP, which were 2~3 times higher than those of PSRR/Nona Bokra. The frequencies of A/G were at a similar level to those of C/T in all cases. However, the frequencies of the Tv types were not at a similar level; the frequency of C/G was lower than those of the other three types of Tv, which were similar to one another. The Ts/Tv ratio of PSRR/Bengal (~2.5) was slightly higher than those of PSRR/Nona Bokra and Bengal/Nona Bokra (~2.4) (Fig. 3B); this pattern differed from that of the total counts, in which PSRR/Bengal grouped with Bengal/Nona Bokra. The higher Ts/Tv ratio (termed transition bias), which has been reported in rice and maize^61,62, is caused by a higher frequency of Ts mutations than Tv mutations (due to a conformational advantage in case of mispairing) and by better tolerance of Ts changes, which are less likely than Tv changes to alter protein structure or function^59,63. Our results were consistent with previous reports from rice and other plants^60,64.
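The Ts/Tv classification used above follows the standard definition: A/G and C/T substitutions are transitions, and all other single-base substitutions are transversions. A minimal sketch, with hypothetical function names and toy input, is:

```python
# Transitions are purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) changes.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def is_transition(ref, alt):
    """Return True if the ref/alt pair is an A/G or C/T substitution."""
    return frozenset((ref.upper(), alt.upper())) in TRANSITIONS


def ts_tv_ratio(snps):
    """Take (ref, alt) single-base pairs and return (Ts count, Tv count, Ts/Tv)."""
    ts = tv = 0
    for ref, alt in snps:
        if ref.upper() == alt.upper():
            continue  # not a substitution, skip
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    return ts, tv, ts / tv


# Toy input (hypothetical): five transitions and two transversions.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "G"),
            ("A", "C"), ("G", "T")]
print(ts_tv_ratio(toy_snps))
```

The toy list is chosen so that the ratio equals 2.5, the value reported for PSRR/Bengal; it does not represent the actual data.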

The length distributions of InDels identified among three rice accessions were analyzed (Supplementary Fig. S1). The size of insertions ranged from 1 to 16 for PSRR/Nona Bokra and 1 to 19 for PSRR/Bengal and Bengal/Nona Bokra. In all cases, the number of insertions was negatively correlated with the length of the insertions, i.e., most insertions (73~74%) involved single nucleotide followed by two nucleotides (14~15%) and three nucleotides (4~6%) and so on in a decreasing order, which led to the majority of insertions (97~98%) being 1 to 4 nucleotides in length. The longest insertion (16 or 19 nucleotides long) was only 0.003~0.007% of the total insertions. The deletions among the three rice accessions ranged from 1 to 32 nucleotides long and showed a similar pattern of distribution as insertions with the largest proportion of deletions (66~68%) with one nucleotide and the smallest proportion (~0.002%) with 32 nucleotides (Supplementary Fig. S1). Although the length distribution of InDels observed in this study was consistent with previous studies^59,60, the maximum length of InDels was greater in this study, which may be due to use of different rice accessions.
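As an illustration of how such a length distribution can be tabulated, the sketch below derives InDel lengths from VCF-style REF/ALT alleles and reports the share of each length separately for insertions and deletions. The function names and the toy alleles are hypothetical; the actual tabulation procedure used in the study is not described in the text.

```python
from collections import Counter


def indel_length(ref, alt):
    """Positive for insertions, negative for deletions (VCF-style REF/ALT alleles)."""
    return len(alt) - len(ref)


def length_shares(indels):
    """Return ({length: percent} for insertions, {length: percent} for deletions)."""
    ins, dels = Counter(), Counter()
    for ref, alt in indels:
        d = indel_length(ref, alt)
        if d > 0:
            ins[d] += 1
        elif d < 0:
            dels[-d] += 1

    def as_percent(counter):
        total = sum(counter.values())
        if total == 0:
            return {}
        return {k: 100 * v / total for k, v in sorted(counter.items())}

    return as_percent(ins), as_percent(dels)


# Toy alleles (hypothetical): three 1-bp and one 2-bp insertion; one 1-bp and one 3-bp deletion.
toy_indels = [("A", "AT"), ("A", "AT"), ("A", "AT"), ("A", "ATG"),
              ("AT", "A"), ("ATGC", "A")]
ins_pct, del_pct = length_shares(toy_indels)
print(ins_pct, del_pct)
```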

Annotation of DNA polymorphisms

The location and nature of DNA polymorphisms are known to influence gene expression and function, which govern various biological processes^26,34,51. We conducted genome-wide annotation of the SNPs and InDels identified in different genomic regions (Fig. 4). In general, the patterns of SNPs and InDels in different genomic regions were quite similar for all comparisons, though the number of variants in PSRR/Bengal and Bengal/Nona Bokra was much higher than that in PSRR/Nona Bokra. A genic region was defined as the region between the transcription start site and the end of the 3′ UTR^65,66. For all three pair-wise comparisons, SNPs occurred more frequently in noncoding regions (including intergenic regions, 5′ UTRs, 3′ UTRs, and introns) than in coding regions (Fig. 4A). The high frequency of genetic variants in noncoding regions could result from weaker pressure from natural selection and/or domestication in these regions⁶⁷. However, DNA polymorphisms in these regions have been reported to play an important role during evolution and domestication. For example, causal mutations responsible for agriculturally important traits such as seed shattering³⁷ and pre-harvest sprouting²⁹ occurred in an intergenic region and an intron, respectively. For InDels, the highest frequencies were also found in intergenic regions and the lowest in coding regions for all pair-wise comparisons (Fig. 4B). Since large-effect genetic variants can produce non-functional proteins, leading to various phenotypic changes during evolution, we investigated large-effect SNPs and InDels among the three rice genotypes in this study. Large-effect variants include disruption of splice sites, loss of the translation start codon, and introduction of a premature stop codon. Non-synonymous SNPs and large-effect InDels accounted for only 5~6% and 1~2% of total polymorphisms, respectively (Fig. 4C,D).
Large-effect SNPs and InDels were most frequent in the coding regions (CDS) and least frequent in the intergenic regions. For all the comparisons, large-effect SNPs in the intergenic regions resulted mostly from stop codon gain (~64% for PSRR/Bengal and Bengal/Nona Bokra, and 73% for PSRR/Nona Bokra) and least from start codon loss (5% for PSRR/Bengal and Bengal/Nona Bokra, and 4% for PSRR/Nona Bokra) (Fig. 4A,C). In contrast, the frequencies of large-effect InDels in the intergenic regions were relatively even among the different variant categories (Fig. 4B,D). The patterns of large-effect InDels in PSRR/Bengal and Bengal/Nona Bokra were quite similar: the highest percentage of large-effect InDels occurred either at the 5′ end of introns (splice donor site) or through gain of stop codons, and the lowest percentage occurred through loss of start codons. For PSRR/Nona Bokra, however, the greatest portion of large-effect InDels was at the 3′ end of introns (splice acceptor site) and the smallest portion was caused by loss of stop codons.

Validation of SNPs and InDels

The reliability of DNA polymorphisms on a global scale is a prerequisite for various genome-wide studies. To experimentally validate the SNPs and InDels identified in this study, we sequenced PCR-amplified DNA fragments harboring 27 randomly selected variants, including seven SNPs, eight insertions, and 12 deletions. About 92% of the selected variants were validated successfully by this approach (Supplementary Table S5). This high validation rate indicated that the identified DNA polymorphisms are highly reliable. Since PSRR-1 and Nona Bokra have been demonstrated to be reservoirs of genes for improving agronomic traits as well as for understanding the domestication process^{25,26,27,57,68,69}, the genome-wide DNA polymorphism resources will be useful in future studies.

Functions of large-effect DNA polymorphism

In this study, we identified 11546 nonsynonymous/large-effect SNPs/InDels that were specific to PSRR-1 (not present in Bengal or Nona Bokra), affecting about 5673 genes (Supplementary Table S6). In order to investigate their putative functions affected in weedy rice PSRR-1 compared to Bengal and Nona Bokra, eukaryotic orthologous group (KOG) analysis was conducted (Fig. 5A). Besides genes with general and unknown functions, genes involved in ‘signal transduction’, ‘amino acid transport and metabolism’, and ‘lipid transport and metabolism’ were significantly enriched. The functions of genes were further investigated by gene ontology (GO) analysis (Fig. 5B,C). Genes involved in biological processes such as ‘protein modification/ phosphorylation’ were over-represented (Fig. 5B). Analysis at molecular function level revealed that genes involved in ‘protein kinase activity’ and ‘protein/nucleotide binding’ were significantly represented (Fig. 5C). Those large-effect SNPs/InDels and non-synonymous SNPs might be, to some extent, responsible for the contrasting phenotypes (including seed shattering and dormancy) between weedy rice and cultivated rice.

We next explored the biological significance of these DNA variants. Rice is sensitive to cold, and low temperature stress negatively affects early establishment and eventual grain yield⁷⁰. PSRR-1 was tolerant to cold stress at the germination stage, while Bengal was susceptible to low temperature²⁵. A recent genome-wide association study on cold tolerance at the germination stage revealed 42 cold tolerance QTLs and corresponding candidate genes⁷¹. By searching the PSRR-1-specific SNPs/InDels, we identified two SNPs responsible for nonsense and missense mutations in two cold tolerance candidate genes (LOC_Os01g02750, a putative protein kinase, and LOC_Os05g36240, an unknown protein), respectively, both of which are on that candidate gene list (Supplementary Table S6, Supplementary Figs S2 and S3). These candidate genes need to be functionally characterized for their role in improving cold tolerance. Using a similar approach, we also found a candidate gene (LOC_Os11g45980, encoding an NBS-LRR type disease resistance protein) for blast disease resistance in rice (Supplementary Table S6, Supplementary Fig. S4). This candidate gene was close to one of the QTLs associated with blast disease resistance⁷² and harbors a PSRR-1-specific nonsynonymous SNP (Supplementary Table S6, Supplementary Fig. S4). Bengal is reported to be susceptible to blast disease⁷³, while nearly 50% of US weedy rice accessions are resistant to this disease⁷². It will be interesting to investigate whether PSRR-1 is resistant to blast disease and, if so, to explore the function of this candidate gene.
However, to fully understand the biological functions of all the enriched genes harboring PSRR-1-specific SNPs/InDels, which may account for the many contrasting traits of agronomic importance between PSRR-1 and cultivated rice, it is imperative to identify and characterize those traits and to incorporate other data, such as QTL mapping, transcriptomic profiling, and genetic complementation for each trait, to narrow down the candidate genes and confirm their role in the expression of those desirable traits.

When genes harboring SNPs/InDels in promoter regions were considered for GO enrichment analysis, similar functional categories were represented (Supplementary Fig. S5). However, when KOG analysis was done for genes harboring nonsynonymous SNPs/InDels and SNPs/InDels in the promoters of genes in the QTL regions (Sh4 region for shattering and qSD7-1 for seed dormancy), the enrichment patterns differed between the two traits (Supplementary Fig. S6). For seed dormancy, Bengal was contrasted against PSRR-1 and Nona Bokra, which were both dormant. Similarly, GO enrichment analysis was done for genes affected in the Sh4 region of PSRR-1 in relation to Bengal and Nona Bokra (both non-shattering types). Significant enrichment was observed for genes involved in ‘translation, ribosomal structure and biogenesis’ and ‘transcription’ for seed dormancy, whereas genes involved in ‘amino acid transport and metabolism’ and ‘coenzyme transport and metabolism’ were enriched for seed shattering. GO enrichment analysis revealed carbohydrate metabolism and dephosphorylation biological processes for both traits. Substrate-specific and transmembrane transporter activities were important for seed dormancy (Supplementary Fig. S7), whereas meiotic cell cycle, cellular response to stimulus, beta xylanase, and glucosyl transferase were enriched for the seed shattering QTL (Supplementary Fig. S8).

Candidate genes for seed shattering and seed dormancy

Previously we have identified QTL for both seed shattering and seed dormancy using two recombinant inbred line (RIL) populations developed from the crosses involving rice cultivars (Bengal and Cypress) and the same weedy rice accession PSRR-1 used in this study^26,27. Although the major QTL for seed shattering qSH4 overlapped with known shattering gene Sh4, the presence of non-shattering SNP allele in the weedy rice suggested that another gene nearby might be responsible for the shattering phenotype in weedy rice²⁶. Similarly, one of the major QTL for seed dormancy overlapped with known dormancy gene Sdr4, but the non-dormant allele in weedy rice indicated involvement of other gene(s)²⁷. To further explore the genetic basis of these two traits, we narrowed down the candidate genes by linking large-effect DNA polymorphisms with predicted functional and agronomic relevance using the next generation sequencing (NGS) data of the parents (Bengal, Nona Bokra, and PSRR-1).

For seed shattering, qSH4 was mapped to a region on chromosome 4 between two SSR markers, RM5506 and RM127, which is about 1.2 Mb in physical size (Chr4: 33,307,270–34,529,722 bp) and harbors 254 genes²⁶. By filtering out low-impact genetic variants, we identified 15 non-synonymous/large-effect SNPs between PSRR-1 (shattering phenotype) and the two non-/reduced shattering cultivars (Bengal and Nona Bokra), which were distributed in 8 genes (Table 3). None of the listed genes was among the genes previously reported to control seed shattering, pod dehiscence, or fruit shedding, suggesting a new genetic mechanism. More experimental evidence and/or bioinformatics prediction, including expression profiling and genetic complementation, will be needed to identify and functionally characterize the candidate genes.
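The interval-based narrowing described above can be sketched as a simple filter. The interval coordinates are those quoted in the text; everything else (the function names, the record layout, and the toy variants) is hypothetical, and the sketch assumes that a candidate variant is one where PSRR-1 carries an allele different from both cultivars. It omits the separate step of keeping only non-synonymous or large-effect variants.

```python
QSH4_START, QSH4_END = 33_307_270, 34_529_722  # Chr4 interval quoted in the text


def interval_size_mb(start, end):
    """Physical size of an inclusive interval, in megabases."""
    return (end - start + 1) / 1e6


def psrr_specific(variants, chrom="Chr4", start=QSH4_START, end=QSH4_END):
    """Keep variants inside the interval where PSRR-1 differs from both cultivars."""
    kept = []
    for v in variants:
        in_region = v["chrom"] == chrom and start <= v["pos"] <= end
        differs = v["PSRR1"] != v["Bengal"] and v["PSRR1"] != v["NonaBokra"]
        if in_region and differs:
            kept.append(v)
    return kept


# Toy variants (hypothetical positions and alleles).
toy_variants = [
    {"chrom": "Chr4", "pos": 33_400_000, "PSRR1": "T", "Bengal": "C", "NonaBokra": "C"},
    {"chrom": "Chr4", "pos": 33_500_000, "PSRR1": "A", "Bengal": "A", "NonaBokra": "G"},
    {"chrom": "Chr4", "pos": 10_000_000, "PSRR1": "T", "Bengal": "C", "NonaBokra": "C"},
]
print(round(interval_size_mb(QSH4_START, QSH4_END), 2))
print([v["pos"] for v in psrr_specific(toy_variants)])
```

The computed interval size is about 1.2 Mb, matching the size stated in the text. Only the first toy variant passes: the second matches Bengal, and the third lies outside the interval.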

Table 3 Unique SNP/InDel and candidate genes for seed shattering in qSH4 QTL region.

For seed dormancy, the major QTL qSD7-2^BR was mapped to a 4.5 Mb region between two molecular markers²⁷. Twenty-one genetic variants were unevenly distributed among 11 genes (Table 4). At least two of these genes may play a major role in the control of seed dormancy. The nonsynonymous SNP in the 13th exon of LOC_Os07g10490, which is annotated as a zeta-carotene desaturase, caused an amino acid change from arginine in the dormant genotypes ‘PSRR-1’ and ‘Nona Bokra’ to glutamine in the non-dormant cultivar ‘Bengal’ (Supplementary Fig. S9). Zeta-carotene desaturase is a key enzyme in carotenoid biosynthesis, and carotenoids serve as precursors in ABA biosynthesis. Mutation of this gene causes ABA deficiency, leading to decreased seed dormancy and a pre-harvest sprouting phenotype. The other gene, Rc, which showed a 14-bp deletion within exon 6 in cultivated rice⁷⁴, was also identified in our study. The 14-bp deletion was found in the Rc gene of the non-dormant white-pericarp cultivar Bengal, but not in that of the dormant red-pericarp weedy rice PSRR-1 or the indica cultivar Nona Bokra. The Rc gene was recently reported to play a pleiotropic role in controlling both seed dormancy and pericarp color⁵³. Using a different approach, we also identified a few other candidates that merit further investigation before a gene can be unambiguously associated with seed dormancy. More importantly, this study demonstrated that combining mapped QTLs with whole genome sequence data could be a reliable approach for gene identification.

Table 4 Unique SNP/InDel and candidates in seed dormancy QTL (qSD7-2) region.

Conclusions

Weedy rice is a valuable genetic resource for rice improvement because of its fitness advantage, early flowering, and biotic and abiotic stress tolerance. Despite the morphological similarity of weedy rice to cultivated rice, the differences between them at the whole-genome level shed some light on genome organization in weedy rice. The high degree of similarity between weedy rice and the indica cultivar revealed by genome-wide DNA polymorphisms suggested that weedy rice might have originated from indica rice. The majority of SNPs/InDels were present in intergenic regions. Among non-synonymous and large-effect SNPs and InDels, gain of a stop codon was more prevalent than loss of a start codon. By combining our earlier QTL mapping results with the NGS data, we narrowed down the candidate genes for two QTLs, qSH4 and qSD7-2. The genome-wide DNA polymorphisms reported here will facilitate discovery of functional variants associated with important agronomic traits. The genomic resources generated in this study will accelerate both molecular genetics and molecular breeding investigations in rice.

Materials and Methods

DNA sample preparation and sequencing

Genomic DNA was extracted from leaves of two-week-old seedlings of two cultivated rice accessions (Bengal and Nona Bokra) and a weedy rice accession (PSRR-1) using the Qiagen DNeasy kit (Qiagen Inc., Valencia, CA, USA). Bengal is a medium-grain, high-yielding, non-dormant japonica rice cultivar with reduced seed shattering released by the Louisiana State University Agricultural Center⁷⁵. Nona Bokra is a salt-tolerant landrace from India belonging to the indica subspecies, with tall plant stature, red pericarp, non-shattering habit, and strong seed dormancy. Nona Bokra has been used to map seed dormancy QTLs⁵⁵. PSRR-1 was collected from the Rice Research Station at Crowley, LA and was purified by single plant selection for two generations before its use for developing mapping populations and sequencing. It has light green leaves, vigorous growth, long auricles and ligules, straw-hulled medium grains, lax open panicles, and pubescent leaves. PSRR-1 is extremely susceptible to shattering and has a higher intensity of both hull and pericarp dormancy compared to Bengal^26,27. The quality and quantity of DNA samples were analyzed with a Bioanalyzer 2100 (Agilent Technologies, Singapore) and a Qubit 2.0 Fluorometer (Invitrogen Life Technologies, Eugene, Oregon), respectively. The libraries were prepared using the Illumina TruSeq DNA sample preparation kit (Illumina, USA) according to the manufacturer’s protocol, and paired-end sequencing was performed on an Illumina HiSeq 2000 at the Virginia Bioinformatics Institute, Blacksburg, VA, to generate 101-bp reads. The raw data were filtered using the built-in standard Illumina pipeline.

Read quality checking and read mapping

The filtered reads from the Illumina pipeline were further processed using NGS QC Toolkit (v2.3.3; http://www.nipgr.res.in/ngsqctoolkit.html) to remove primer/adapter sequences and low-quality reads (Phred quality score <30), and only high-quality reads (Phred quality score ≥30) were used for mapping⁷⁶. The high-quality filtered reads were mapped to the rice reference genome (MSU7 version; http://rice.plantbiology.msu.edu/index.shtml) using the Burrows-Wheeler Alignment (BWA) software (v0.7.12; http://bio-bwa.sourceforge.net/)⁷⁷. Coverage of the reference genome was estimated using SAMtools (v1.1; http://samtools.sourceforge.net/)⁷⁸.
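To illustrate what a Phred-based read filter does, the following minimal Python sketch decodes a FASTQ quality string and decides whether a read counts as high quality. It is not the NGS QC Toolkit code. The Phred+33 encoding, the function names, and the rule that at least 70% of bases must reach the cut-off are assumptions made for illustration; the text above states only the cut-off of 30.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(symbol) - offset for symbol in quality_string]


def is_high_quality(quality_string, cutoff=30, min_fraction=0.7):
    """Return True if enough bases reach the Phred cut-off.

    min_fraction is an assumed parameter: the share of bases in the read
    that must have a score >= cutoff for the read to be kept.
    """
    scores = phred_scores(quality_string)
    if not scores:
        return False
    passing = sum(1 for score in scores if score >= cutoff)
    return passing / len(scores) >= min_fraction


if __name__ == "__main__":
    # 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
    for quality in ("IIIIIIII", "IIII####"):
        print(quality, is_high_quality(quality))
```

A Phred score Q corresponds to an error probability of 10^(-Q/10), so a cut-off of 30 keeps bases with an estimated error rate of at most 1 in 1000.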

Detection and analysis of SNPs and InDels

FreeBayes software (v0.9.21; https://github.com/ekg/freebayes) was used to identify SNPs and InDels using three criteria: a minimum variant frequency of 90%, an average quality of the SNP base of ≥30, and a minimum read depth of 10. Additional filtering of SNPs and InDels was performed where three or more SNPs/InDels occurred in any 10-bp window⁷⁹. The frequency of SNPs/InDels in each 100-kb interval on each rice chromosome was calculated to reveal the genome-wide distribution of polymorphisms. Circos⁸⁰ was used to visualize the distribution of DNA polymorphisms on the rice chromosomes. The distribution of DNA polymorphisms in different genomic regions was evaluated by integrating the positions of the DNA polymorphisms with the GFF annotation file. Identification, genomic distribution, and annotation of DNA polymorphisms (synonymous/nonsynonymous SNPs) and of large-effect SNPs/InDels were analyzed with SnpEff (v4.1k)⁸¹ using default parameters. The 2-kb regions upstream of genes were used for the promoter analysis.
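The filtering and binning steps described above can be sketched in a few lines of Python. This is an illustrative sketch, not the FreeBayes interface or the pipeline used in the study. The `Variant` record and its field names are hypothetical, and the sketch assumes that all variants falling in a dense 10-bp cluster are discarded.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int             # 1-based position on the chromosome
    alt_fraction: float  # fraction of reads supporting the variant
    base_quality: float  # average quality of the variant base
    depth: int           # read depth at the site


def passes_thresholds(v, min_fraction=0.90, min_quality=30, min_depth=10):
    """Apply the three calling criteria stated in the text."""
    return (v.alt_fraction >= min_fraction
            and v.base_quality >= min_quality
            and v.depth >= min_depth)


def drop_clustered(variants, window=10, max_in_window=2):
    """Drop variants lying in any window-bp stretch with more than
    max_in_window variants (assumed reading of the cluster filter)."""
    ordered = sorted(variants, key=lambda v: (v.chrom, v.pos))
    clustered = set()
    start = 0
    for end, v in enumerate(ordered):
        # Shrink the window until it spans fewer than `window` bp on one chromosome.
        while ordered[start].chrom != v.chrom or v.pos - ordered[start].pos >= window:
            start += 1
        if end - start + 1 > max_in_window:
            clustered.update(range(start, end + 1))
    return [v for i, v in enumerate(ordered) if i not in clustered]


def density_per_bin(variants, bin_size=100_000):
    """Count variants per (chromosome, bin index); bin 0 covers positions 1..bin_size."""
    return Counter((v.chrom, (v.pos - 1) // bin_size) for v in variants)


if __name__ == "__main__":
    calls = [Variant("Chr1", 100, 1.0, 35, 20), Variant("Chr1", 103, 0.95, 33, 15),
             Variant("Chr1", 108, 0.92, 31, 12), Variant("Chr1", 150_000, 1.0, 40, 30)]
    kept = drop_clustered([v for v in calls if passes_thresholds(v)])
    print(density_per_bin(kept))
```

Per-bin counts of this kind are the sort of table a tool such as Circos can plot as a density track along each chromosome.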

Gene ontology and KOG analysis

Gene ontology (GO) enrichment analysis was carried out using the BiNGO plug-in (v2.44; https://www.psb.ugent.be/cbd/papers/BiNGO/Home.html) available in Cytoscape (v3.2.1; http://www.cytoscape.org/), with a P-value cut-off of ≤0.05. The rice GO information for the biological process and molecular function categories available in BiNGO was used for the enrichment analysis. Genes were classified into eukaryotic orthologous groups (KOG) by searching the gene sequences against the KOGnitor database available at the National Center for Biotechnology Information (NCBI; https://www.ncbi.nlm.nih.gov/).
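For readers unfamiliar with GO enrichment, the sketch below computes an over-representation P-value for a single GO term with a one-sided hypergeometric test. The choice of test is an assumption: the text above states only the P-value cut-off, not the statistical test or any multiple-testing correction that was applied. The function name and the gene counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, annotated_genes, selected_genes, selected_annotated):
    """One-sided P-value for observing at least `selected_annotated` genes
    carrying a GO term among `selected_genes` drawn without replacement from
    `total_genes`, of which `annotated_genes` carry the term."""
    upper = min(annotated_genes, selected_genes)
    tail = sum(
        comb(annotated_genes, k) * comb(total_genes - annotated_genes, selected_genes - k)
        for k in range(selected_annotated, upper + 1)
    )
    return tail / comb(total_genes, selected_genes)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the call.
    p_value = hypergeom_enrichment_p(total_genes=1000, annotated_genes=50,
                                     selected_genes=100, selected_annotated=12)
    print(f"P = {p_value:.4g}; enriched at 0.05: {p_value <= 0.05}")
```

When many GO terms are tested at once, the raw P-values are normally adjusted for multiple testing before the cut-off is applied.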

Mapping of SNPs/InDels on QTLs

Two major-effect QTLs (qSH4 and qSD7-2^BR) have been reported for seed shattering and seed dormancy, respectively^26,27. The large-effect SNPs/InDels and nonsynonymous SNPs present in these two QTLs were identified based on their co-localization with the genomic coordinates of the QTLs.
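Co-localization by genomic coordinates amounts to an interval lookup, which can be sketched as follows. The interval names and coordinates are placeholders invented for illustration; they are not the mapped positions of qSH4 or qSD7-2^BR, and the function is not the authors' code.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def variants_in_qtls(variants, qtls):
    """Group (chrom, pos) variants by the QTL interval that contains them."""
    hits = {qtl.name: [] for qtl in qtls}
    for chrom, pos in variants:
        for qtl in qtls:
            if chrom == qtl.chrom and qtl.start <= pos <= qtl.end:
                hits[qtl.name].append((chrom, pos))
    return hits


if __name__ == "__main__":
    # Placeholder intervals and variant positions, not real QTL coordinates.
    qtls = [Interval("QTL_A", "Chr4", 1_000_000, 1_500_000),
            Interval("QTL_B", "Chr7", 2_000_000, 2_200_000)]
    variants = [("Chr4", 1_200_000), ("Chr4", 1_600_000), ("Chr7", 2_000_000)]
    print(variants_in_qtls(variants, qtls))
```

Genes carrying at least one hit inside an interval would then form the narrowed candidate list for that QTL.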

Validation of SNPs and InDels

For validation, primers were designed from the 400-bp flanking sequences of 27 randomly selected SNPs/InDels. The fragments were amplified by polymerase chain reaction (PCR) from the genomic DNA of Bengal, Nona Bokra, and PSRR-1 using Phusion® High-Fidelity DNA Polymerase (New England Biolabs, Ipswich, USA). The PCR products were purified using either the DNA Clean and Concentrator^TM-5 (ZYMO Research, Irvine, USA) or the Gel Extraction Kit (OMEGA Bio-tek, Norcross, USA). The purified PCR products were sequenced at the Genomic Facility of Louisiana State University.

References

1. Estorninos, L. E., Gealy, D. R., Gbur, E. E. & Talbert, R. E. Rice and red rice interference. II. Rice response to population densities of three red rice (Oryza sativa) ecotypes. Weed Sci. 53, 683–689 (2005).
2. He, Z. et al. Seed-mediated gene flow promotes genetic diversity of weedy rice within populations: implications for weed management. PLoS ONE 9, e112778 (2014).
3. Gealy, D., Tai, T. & Sneller, C. Identification of red rice, rice, and hybrid populations using microsatellite markers. Weed Sci. 50, 333–339 (2002).
4. Smith, R. J. Jr. How to control hard-to-kill weeds in rice. Weeds Today 10, 12–14 (1979).
5. Delouche, J. et al. Weedy rices: origin, biology, ecology and control. FAO Plant Production and Protection Paper 188, FAO Rome. 144 pp (2007).
6. Grimm, A., Fogliatto, S., Nick, P., Ferrero, A. & Vidotto, F. Microsatellite markers reveal multiple origins for Italian weedy rice. Ecol. Evol. 3, 4786–4798 (2013).
7. Sun, J. et al. Introgression and selection shaping the genome and adaptive loci of weedy rice in northern China. New Phytol. 197, 290–299 (2013).
8. Thurber, C. S., Jia, M. H., Jia, Y. & Caicedo, A. L. Similar traits, different genes? Examining convergent evolution in related weedy rice populations. Mol. Ecol. 22, 685–698 (2013).
9. Song, B. K., Chuah, T. S., Tam, S. M. & Olsen, K. M. Malaysian weedy rice shows its true stripes: wild Oryza and elite rice cultivars shape agricultural weed evolution in Southeast Asia. Mol. Ecol. 23, 5003–5017 (2014).
10. Song, Z. J. et al. Genetic divergence of weedy rice populations associated with their geographic location and coexisting conspecific crop: Implications on adaptive evolution of agricultural weeds. J. System. Evol. 53, 330–338 (2015).
11. Londo, J. P. & Schaal, B. A. Origins and population genetics of weedy red rice in the USA. Mol. Ecol. 16, 4523–4535 (2007).
12. Kanapeckas, K. L. et al. Escape to ferality: the endoferal origin of weedy rice from crop rice through de-domestication. PLoS ONE 11, e0162676 (2016).
13. Reagon, M. et al. Genomic patterns of nucleotide diversity in divergent populations of U.S. weedy rice. BMC Evol. Biol. 10, 180 (2010).
14. Qiu, J. et al. Genome re-sequencing suggested a weedy rice origin from domesticated indica-japonica hybridization: a case study from southern China. Planta 240, 1353–1363 (2014).
15. Zhang, J. et al. Cytoplasmic-genetic male sterility gene provides direct evidence for some hybrid rice recently evolving into weedy rice. Sci. Rep. 5, 10591 (2015).
16. De Wet, J. & Harlan, J. Weeds and domesticates: evolution in the man-made habitat. Econ. Bot. 29, 99–108 (1975).
17. He, Q., Kim, K. W. & Park, Y. J. Population genomics identifies the origin and signatures of selection of Korean weedy rice. Plant Biotechnol. J. 15, 357–366 (2017).
18. Tong, W., He, Q. & Park, Y. J. Genetic variation architecture of mitochondrial genome reveals the differentiation in Korean landrace and weedy rice. Sci. Rep. 7, 43327 (2017).
19. Li, L. F., Li, Y. L., Jia, Y., Caicedo, A. L. & Olsen, K. M. Signatures of adaptation in the weedy rice genome. Nat. Genet. 49, 811 (2017).
20. Qiu, J. et al. Genomic variation associated with local adaptation of weedy rice during de-domestication. Nat. Commun. 8, 15323 (2017).
21. Huang, Z. et al. All roads lead to weediness: Patterns of genomic divergence reveal extensive recurrent weedy rice origins from South Asian Oryza. Mol. Ecol. 26, 3151–3167 (2017).
22. Liu, Y. et al. QTL analysis for resistance to blast disease in U.S. weedy rice. Mol. Plant-Microbe Interact. 28, 834–844 (2015).
23. Gao, Q. et al. Photosynthetic and water physiological characteristics of weedy rice in northern China. J. Appl. Ecol./Zhongguo sheng tai xue xue hui, Zhongguo ke xue yuan Shenyang ying yong sheng tai yan jiu suo zhu ban 24, 3131–3136 (2013).
24. Thurber, C. S., Reagon, M., Olsen, K. M., Jia, Y. & Caicedo, A. L. The evolution of flowering strategies in US weedy rice. Am. J. Bot. 101, 1737–1747 (2014).
25. Borjas, A. H., De Leon, T. B. & Subudhi, P. K. Genetic analysis of germinating ability and seedling vigor under cold stress in US weedy rice. Euphytica 208, 251–264 (2016).
26. Subudhi, P. K. et al. Mapping of seed shattering loci provides insights into origin of weedy rice and rice domestication. J. Hered. 105, 276–287 (2014).
27. Subudhi, P. K. et al. Genetic architecture of seed dormancy in U.S. weedy rice in different genetic backgrounds. Crop Sci. 52, 2564–2575 (2012).
28. Cai, H. W. & Morishima, H. Genomic regions affecting seed shattering and seed dormancy in rice. Theor. Appl. Genet. 100, 840–846 (2000).
29. Fang, J. et al. Mutations of genes in synthesis of the carotenoid precursors of ABA lead to pre-harvest sprouting and photo-oxidation in rice. Plant J. 54, 177–189 (2008).
30. Lee, G. H., Kang, I. K. & Kim, K. M. Mapping of novel QTL regulating grain shattering using doubled haploid population in rice (Oryza sativa L.). Intl. J. Genomics 2016, 2128010 (2016).
31. Roberts, J. A., Elliott, K. A. & Gonzalez-Carranza, Z. H. Abscission, dehiscence, and other cell separation processes. Ann. Rev. Plant Biol. 53, 131–158 (2002).
32. Lin, Z. et al. Origin of seed shattering in rice (Oryza sativa L.). Planta 226, 11–20 (2007).
33. Li, C., Zhou, A. & Sang, T. Genetic analysis of rice domestication syndrome with the wild annual species, Oryza nivara. New Phytol. 170, 185–193 (2006a).
34. Li, C., Zhou, A. & Sang, T. Rice domestication by reducing shattering. Science 311, 1936–1939 (2006b).
35. Liu, H. & Yan, J. Rice domestication: an imperfect African solution. Nat. Plants 3, 17083 (2017).
36. Wu, W. et al. A single-nucleotide polymorphism causes smaller grain size and loss of seed shattering during African rice domestication. Nat. Plants 3, 17064 (2017).
37. Konishi, S. et al. An SNP caused loss of seed shattering during rice domestication. Science 312, 1392–1396 (2006).
38. Ji, H. et al. Inactivation of the CTD phosphatase-like gene OsCPL1 enhances the development of the abscission layer and seed shattering in rice. Plant J. 61, 96–106 (2010).
39. Zhou, Y. et al. Genetic control of seed shattering in rice by the APETALA2 transcription factor shattering abortion1. Plant Cell 24, 1034–1048 (2012).
40. Bres-Patry, C., Lorieux, M., Clément, G., Bangratz, M. & Ghesquière, A. Heredity and genetic mapping of domestication-related traits in a temperate japonica weedy rice. Theor. Appl. Genet. 102, 118–126 (2001).
41. Gu, X. Y., Kianian, S. F., Harel, G. A., Hoffer, B. L. & Foley, M. E. Genetic analysis of adaptive syndromes interrelated with seed dormancy in weedy rice (Oryza sativa). Theor. Appl. Genet. 110, 1108–1118 (2005).
42. Qi, X. et al. More than one way to evolve a weed: parallel evolution of US weedy rice through independent genetic mechanism. Mol. Ecol. 24, 3329–3344 (2015).
43. Kwon, S. J. et al. Genetic analysis of seed-shattering genes in rice using an F_3:4 population derived from an Oryza sativa × Oryza rufipogon cross. Genet. Mol. Res. 14, 1347–1361 (2015).
44. Xie, X. et al. Levels and patterns of nucleotide variation in domestication QTL regions on rice chromosome 3 suggest lineage-specific selection. PLoS ONE 6, e20670 (2011).
45. Thurber, C. S. et al. Molecular evolution of shattering loci in U.S. weedy rice. Mol. Ecol. 19, 3271–3284 (2010).
46. Lin, S. Y., Sasaki, T. & Yano, M. Mapping quantitative trait loci controlling seed dormancy and heading date in rice, Oryza sativa L., using backcross inbred lines. Theor. Appl. Genet. 96, 997–1003 (1998).
47. Zhao, M., Yang, S., Liu, X. & Wu, K. Arabidopsis histone demethylases LDL1 and LDL2 control primary seed dormancy by regulating DELAY OF GERMINATION 1 and ABA signaling-related genes. Front. Plant Sci. 6, 159 (2015).
48. Gu, X. Y., Kianian, S. F. & Foley, M. E. Multiple loci and epistases control genetic variation for seed dormancy in weedy rice (Oryza sativa). Genetics 166, 1503–1516 (2004).
49. Liu, Y., Koornneef, M. & Soppe, W. J. The absence of histone H2B monoubiquitination in the Arabidopsis hub1 (rdo4) mutant reveals a role for chromatin remodeling in seed dormancy. Plant Cell 19, 433–444 (2007).
50. Bentsink, L., Jowett, J., Hanhart, C. J. & Koornneef, M. Cloning of DOG1, a quantitative trait locus controlling seed dormancy in Arabidopsis. Proc. Natl. Acad. Sci. USA 103, 17042–17047 (2006).
51. Sugimoto, K. et al. Molecular cloning of Sdr4, a regulator involved in seed dormancy and domestication of rice. Proc. Natl. Acad. Sci. USA 107, 5792–5797 (2010).
52. Gu, X. Y., Turnipseed, E. B. & Foley, M. E. The qSD12 locus controls offspring tissue-imposed seed dormancy in rice. Genetics 179, 2263–2273 (2008).
53. Gu, X. Y. et al. Association between seed dormancy and pericarp color is controlled by a pleiotropic gene that regulates abscisic acid and flavonoid synthesis in weedy red rice. Genetics 189, 1515–1524 (2011).
54. Ye, H., Beighley, D. H., Feng, J. & Gu, X. Y. Genetic and physiological characterization of two clusters of quantitative trait loci associated with seed dormancy and plant height in rice. G3 3, 323–331 (2013).
55. Marzougui, S. et al. Mapping and characterization of seed dormancy QTLs using chromosome segment substitution lines in rice. Theor. Appl. Genet. 124, 893–902 (2011).
56. Vaughan, L. K. et al. Is all red rice found in commercial rice really Oryza sativa? Weed Sci. 49, 468–476 (2001).
57. Nordborg, M. et al. The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol. 3, e196 (2005).
58. Zhang, W. et al. The pattern of insertion/deletion polymorphism in Arabidopsis thaliana. Mol. Genet. Genomics 280, 351–361 (2008).
59. Subbaiyan, G. K. et al. Genome-wide DNA polymorphisms in elite indica rice inbreds discovered by whole-genome sequencing. Plant Biotechnol. J. 10, 623–634 (2012).
60. Jain, M., Moharana, K. C., Shankar, R., Kumari, R. & Garg, R. Genomewide discovery of DNA polymorphisms in rice cultivars with contrasting drought and salinity stress response and their functional relevance. Plant Biotechnol. J. 12, 253–264 (2014).
61. Morton, B. R. Neighboring base composition and transversion/transition bias in a comparison of rice and maize chloroplast noncoding regions. Proc. Natl. Acad. Sci. USA 92, 9717–9721 (1995).
62. Batley, J., Barker, G., O’Sullivan, H., Edwards, K. J. & Edwards, D. Mining for single nucleotide polymorphisms and insertions/deletions in maize expressed sequence tag data. Plant Physiol. 132, 84–91 (2003).
63. Wakeley, J. The excess of transitions among nucleotide substitutions: new methods of estimating transition bias underscore its significance. Trends Ecol. Evol. 11, 158–162 (1996).
64. Agarwal, G. et al. Comparative analysis of kabuli chickpea transcriptome with desi and wild chickpea provides a rich resource for development of functional markers. PLoS ONE 7, e52443 (2012).
65. Hindorff, L. A. et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl. Acad. Sci. USA 106, 9362–9367 (2009).
66. Li, X. et al. Genic and nongenic contributions to natural variation of quantitative traits in maize. Genome Res. 22, 2436–2444 (2012).
67. Barreiro, L. B., Laval, G., Quach, H., Patin, E. & Quintana-Murci, L. Natural selection has driven population differentiation in modern humans. Nat. Genet. 40, 340–345 (2008).
68. Subudhi, P. K. et al. A chromosome segment substitution library of weedy rice for genetic dissection of complex agronomic and domestication traits. PLoS ONE 10, e0130650 (2015).
69. Puram, V. R. R., Ontoy, J., Linscombe, S. & Subudhi, P. K. Genetic dissection of seedling stage salinity tolerance in rice using introgression lines of a salt tolerant landrace Nona Bokra. J. Hered. 108, 658–670 (2017).
70. Cruz, R. P. et al. Avoiding damage and achieving cold tolerance in rice plants. Food Energy Secur. 2, 96–119, https://doi.org/10.1002/fes3.25 (2013).
71. Shakiba, E. et al. Genetic architecture of cold tolerance in rice (Oryza sativa) determined through high resolution genome-wide analysis. PLoS ONE 12, e0172133, https://doi.org/10.1371/journal.pone.0172133 (2017).
72. Liu, Y. et al. QTL analysis for resistance to blast disease in U.S. weedy rice. Mol. Plant-Microbe Interact. 28, 834–844 (2015).
73. Wamishe, Y., Cartwright, R. & Lee, F. Management of Rice Diseases in Arkansas rice production hand book (ed. Hardke, J.), 123–137 (University of Arkansas Division of Agriculture Cooperative Extension Service MP192, 2013).
74. Sweeney, M. T., Thomson, M. J., Pfeil, B. E. & McCouch, S. Caught red-handed: Rc encodes a basic helix-loop-helix protein conditioning red pericarp in rice. Plant Cell 18, 283–294 (2006).
75. Linscombe, S. D. et al. Registration of ‘Bengal’ rice. Crop Sci. 33, 645–646 (1993).
76. Patel, R. K. & Jain, M. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS ONE 7, e30619 (2012).
77. Li, H. & Durbin, R. Fast and accurate short-read alignment with Burrows-Wheeler Transform. Bioinformatics 25, 1754–1760 (2009).
78. Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
79. Jhanwar, S. et al. Transcriptome sequencing of wild chickpea as a rich resource for marker development. Plant Biotechnol. J. 10, 690–702 (2012).
80. Krzywinski, M. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).
81. Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6, 80–92 (2012).

Acknowledgements

We thank Ms. Teresa De Leon for technical assistance in this project. This research was supported by the United States Department of Agriculture-National Institute of Food and Agriculture (Grant No. 2006-35320-16555) to P.K. Subudhi. The manuscript is approved for publication by the Director of the Louisiana Agricultural Experiment Station, USA, as manuscript number 2018-306-32090.

Author information

Chenglin Chai
Present address: Noble Research Institute, LLC, 2510 Sam Noble Parkway, Ardmore, OK, 73401, USA
Chenglin Chai and Rama Shankar contributed equally.

Authors and Affiliations

School of Plant, Environmental, and Soil Sciences, Louisiana State University Agricultural Center, Baton Rouge, LA, 70803, USA
Chenglin Chai & Prasanta K. Subudhi
School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067, India
Rama Shankar & Mukesh Jain

Contributions

P.K.S. designed the study. C.C. and R.S. conducted the experiment, contributed to the data analysis, and generated all the figures and tables. M.J. supervised the data analysis. C.C. and P.K.S. wrote the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Mukesh Jain or Prasanta K. Subudhi.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Combined Supplemental information except Table S6

Supplementary Dataset Table S6

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Chai, C., Shankar, R., Jain, M. et al. Genome-wide discovery of DNA polymorphisms by whole genome sequencing differentiates weedy and cultivated rice. Sci Rep 8, 14218 (2018). https://doi.org/10.1038/s41598-018-32513-z

Received: 26 October 2017
Accepted: 10 September 2018
Published: 21 September 2018
DOI: https://doi.org/10.1038/s41598-018-32513-z
