Domestication represents a unique opportunity to study the evolutionary process. The elimination of seed dispersal traits was a key step in the evolution of cereal crops under domestication. Here, we show that ObSH3, a YABBY transcription factor, is required for the development of the seed abscission layer. Moreover, selecting a genomic segment deletion containing SH3 resulted in the loss of seed dispersal in populations of African cultivated rice (Oryza glaberrima Steud.). Functional characterization of SH3 and SH4 (another gene controlling seed shattering on chromosome 4) revealed that multiple genes can lead to a spectrum of non-shattering phenotypes, affecting other traits such as ease of threshing that may be important to tune across different agroecologies and postharvest practices. The molecular evolution analyses of SH3 and SH4 in a panel of 93 landraces provided unprecedented geographical detail of the domestication history of African rice, tracing multiple dispersals from a core heartland and introgression from local wild rice. The cloning of ObSH3 not only provides new insights into a critical crop domestication process but also adds to the body of knowledge on the molecular mechanism of seed dispersal.
Crop domestication is a process of reshaping wild species to be adapted for cultivation and to meet human needs1. Several morphological characteristics, such as plant architecture, seed size and dispersal, commonly change during domestication2,3,4,5,6,7,8,9,10,11,12,13,14,15. Cereal crops, which are globally critical foods, present perhaps the most classical change: the elimination of the primary seed dispersal mechanism, known as shattering2. It is thought that wild progenitors of modern cereal crops shed their seeds on maturation to ensure effective reproduction, whereas cultivars retain seed on the plant, which avoids yield loss and makes harvesting more efficient. Recently, the genes controlling the seed dispersal of several cereal crops, such as Asian rice, wheat, sorghum and barley, have been characterized, promoting our understanding of the genetic mechanisms of the elimination of seed shattering13,14,15. In addition, investigating the evolutionary history of the genes controlling the loss of seed shattering provides valuable insight into the geography of artificial selection shaping domestication and the natural selection patterns on these traits in the wild. For example, a detailed analysis of the molecular evolution of the Non-brittle rachis 1 (BTR1) and Non-brittle rachis 2 (BTR2) genes identified the origin of cultivated barley15.
African rice (Oryza glaberrima Steud.) was gradually domesticated from the African wild species (Oryza barthii), with a peak genetic bottleneck at ~3,000 years ago16 that coincides in timing with an abundance of archaeological findings17. The crop is well adapted for cultivation in West Africa, and possesses traits for increased tolerance to biotic and abiotic stresses including high temperature, drought, soil acidity and weed competitiveness. In areas with the most adverse ecological conditions, O. glaberrima is favoured by farmers for its adaptability and resistance to multiple constraints17,18,19,20,21. Recent genomic studies of O. glaberrima and O. barthii pointed to one domestication event for African rice22, and suggested the geographical routes of the spread19. Scholars have attributed this spread to the translocation of the Mande homeland and migrations to the coast23 that have largely been due to climate change24. However, the genetic mechanism and evolutionary history of major domestication genes of African rice, knowledge of which can assist in tracing dispersal routes after the onset of domestication, are largely unknown. The simple domestication history of O. glaberrima makes it a good model to study the evolution of domestication traits. As loss of seed dispersal is one of the most important domestication traits, unravelling the genetic mechanism and evolutionary history of genes controlling this trait in O. glaberrima may provide insight into the routes through which the crop diversified and adapted to new agroecologies and cultural systems.
Our previous study indicated that a single nucleotide polymorphism (SNP) in the shattering 4 gene (SH4, an orthologue of grain length 4 (GL4)) resulted in a premature stop codon and led to loss of seed shattering during African rice domestication25. However, this SNP mutation does not exist in some non-shattering African rice varieties, implying that there might be another gene or another mutation of SH4 controlling the non-shattering trait. To identify the new gene or mutation responsible for the loss of seed shattering in cultivated African rice, we developed an F2 segregating population derived from a cross between the African wild rice accession W1411, which exhibits a shattering phenotype, and a non-shattering cultivar of African cultivated rice, IRGC104165, containing the wild allele of SH4 (Supplementary Fig. 1).
To distinguish precisely the differences in abscission layer anatomy between W1411 and IRGC104165, we observed longitudinal sections of spikelets using confocal microscopy. We found that the W1411 samples exhibited a complete abscission layer between the seed pedicel and the spikelet, which can be seen in a longitudinal section as continuous lines of abscission cells between the vascular bundle and the epidermis (Fig. 1a). Conversely, IRGC104165 samples had a wider and partially developed abscission layer (Fig. 1b). Further careful comparison of the abscission layers of W1411 and IRGC104165 spikelets showed that the former consist of mostly one layer of small, thin-walled cells, while the latter consist of three or four layers of cells. On the palea side, the abscission layer cells were partially developed, and a very irregularly developed abscission layer existed on the lemma side. The fracture surface of rachilla was investigated using scanning electron microscopy (SEM). We found that W1411 samples had a smooth fracture surface (Fig. 1c–e), whereas IRGC104165 samples had a smooth surface only at the peripheries in the transverse plane on the palea side (Fig. 1f–h). These results indicated that the loss of seed shattering in IRGC104165 resulted from the irregular development of the seed abscission layer.
A genetic linkage analysis of 168 F2 individuals derived from the cross between W1411 and IRGC104165 suggested that seed shattering was controlled by a single gene lying on the long arm of chromosome 3. We designated this gene as Oryza barthii seed shattering 3 (ObSH3) (Fig. 2a). Using a total of 2,650 recessive homozygote plants with the non-shattering phenotype from the F2 population, we delimited ObSH3 between the SNP29 and SNP31 markers (Fig. 2b). In this fine mapping region, the genomic sequence of IRGC104165 was 17-kb without a predicted open reading frame (ORF). By contrast, the genomic sequence of W1411 was 63-kb, with a 45.5-kb insertion compared with that of IRGC104165, which contained six predicted ORFs (ORF1–ORF6) (Fig. 2c; Supplementary Fig. 2). The quantitative rtPCR analyses showed that of the six ORFs, only ORF3 was expressed in the abscission layer region (Supplementary Fig. 3). A sequence analysis of 5′- and 3′-rapid amplification of cDNA ends (RACE) products indicated that the ORF3 cDNA in W1411 is 1,187-bp long, with an ORF of 561-bp, a 325-bp 5′ untranslated region (UTR) and a 301-bp 3′ UTR (Supplementary Fig. 4). ORF3 was predicted to encode a transcription factor belonging to the YABBY family, which plays important roles in the development of plant lateral organs such as leaves and floral organs26,27,28. Therefore, we focused on ORF3 as a candidate for ObSH3.
To confirm this hypothesis, we knocked out the ORF3 gene of W1411 using the CRISPR (clustered regularly interspaced short palindromic repeats)–Cas9 genome editing system. We selected a unique target site in the coding region of the ORF3 gene that did not have any similar and potentially off-target sites in the rice genome. In the T0 generation, more than ten heterozygous transgenic plants were screened by sequencing analysis. The transgenic plants that showed sequence variations in the target region were self-pollinated to generate T1 generations and genotyped using the primers flanking the target region. Two knockout lines with homozygous mutations that included a 1-bp insertion (Cas9-1) and a 5-bp deletion (Cas9-2) in the target region were selected. These targeted mutations caused frameshifts (and therefore loss of function), and all exhibited loss of seed shattering as well as an incomplete abscission layer, similar to that of IRGC104165 plants (Fig. 2d–g; Supplementary Fig. 5). These results indicated that ORF3 is ObSH3, and is essential for abscission zone development below the grain.
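The link between the two edits and loss of function rests on reading-frame arithmetic: an insertion or deletion whose length is not a multiple of three shifts every downstream codon. The following Python sketch illustrates that rule for the two lines described above; the function name and the signed-length encoding (positive for insertions, negative for deletions) are illustrative assumptions, not part of the study's analysis.

```python
def causes_frameshift(indel_bp: int) -> bool:
    """Return True if an indel of this signed length (bp) shifts the reading frame.

    Positive values denote insertions and negative values deletions
    (an encoding assumed here for illustration).
    """
    return indel_bp % 3 != 0


# Indel sizes reported in the text for the two homozygous knockout lines.
edits = {"Cas9-1": +1, "Cas9-2": -5}

frameshifts = {line: causes_frameshift(size) for line, size in edits.items()}
for line, shifted in frameshifts.items():
    print(f"{line}: frameshift = {shifted}")
```

By the same rule, an in-frame change such as a 3-bp deletion would not shift the frame, which is why indel length is usually checked before a line is selected as a knockout.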
We also generated a construct (pSH3) by placing a 12-kb genomic fragment from W1411, covering the ORF3 gene only, into the vector pCAMBIA1300. Owing to the recalcitrance of the African variety IRGC104165 to regenerate shoots from callus, we introduced the pSH3 construct into a chromosome segment substitution line, GIL116. This line carries a small region containing the sh3 locus from African rice under the Asian cultivated rice Oryza sativa var. Taichong 65 genetic background. GIL116 plants exhibited a partially developed abscission layer and shed their seeds less easily than Taichong 65 plants (Fig. 2h–i). All 15 of the independent transgenic lines showed abscission layers and easy shedding traits similar to those of Taichong 65, but without changes in agronomic traits such as grain length and grain weight (Fig. 2j; Supplementary Fig. 6). These results further confirmed that the ORF3 gene was SH3.
SH3 encodes 186 amino acids, and was predicted to contain a nuclear localization signal using the PSORT program (http://psort.hgc.jp/). To confirm this prediction, we developed a constitutively expressing construct by fusing the full-length SH3 to the carboxyl terminus of green fluorescent protein (GFP). The construct was transiently expressed in onion epidermal cells. The GFP signal was detected in the nucleus, which is in accordance with the prediction that ObSH3 is a transcription factor (Fig. 3a).
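The 186-residue length is consistent with the transcript dimensions obtained from the RACE analysis (a 1,187-bp cDNA with a 561-bp ORF, a 325-bp 5′ UTR and a 301-bp 3′ UTR). The short Python sketch below reproduces that arithmetic; the variable names are illustrative, and the only assumption is that the quoted ORF length includes the stop codon.

```python
# Segment lengths (bp) quoted in the text for the ORF3 cDNA of W1411.
five_prime_utr = 325
orf = 561
three_prime_utr = 301

# The 5' UTR, ORF and 3' UTR together make up the full-length cDNA.
cdna_length = five_prime_utr + orf + three_prime_utr

# An ORF is a whole number of codons. Assuming the quoted ORF length
# includes the stop codon, the protein is one residue shorter than the
# codon count.
assert orf % 3 == 0
codons = orf // 3
protein_length = codons - 1

print(cdna_length, codons, protein_length)
```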
To further examine the SH3 protein, we retrieved PSI-BLAST results using the full-length protein sequence of SH3 as a query against the non-redundant protein database (https://www.ebi.ac.uk/Tools/sss/psiblast/). A phylogenetic analysis of these putative homologues indicated that SH3 was most closely related to genes found in other monocots, including maize (B4FY22), barley (M0YM09) and Brachypodium (I1GPY5). An amino acid sequence analysis also showed that the zinc finger domain and the YABBY domain of ObSH3 orthologues were highly conserved in monocots (Supplementary Fig. 7).
The expression pattern of the ObSH3 gene was examined by real-time quantitative PCR and with a β-glucuronidase (GUS) reporter gene driven by the ObSH3 promoter. We found by quantitative rtPCR that ObSH3 was mainly expressed at the abscission layers of seeds, leaves and stems, but not in the roots (Fig. 3b). The GUS signal, which was intense in the pedicel abscission zone and the apiculus of the spikelet hull, correlated well with the quantitative rtPCR results (Fig. 3c; Supplementary Fig. 8). RNA in situ hybridization results corroborated the expression of ObSH3 at the abscission layer of the spikelet pedicel (Fig. 3d), which is consistent with its role of controlling abscission layer cellular development.
Our previous study showed that a SNP in SH4 caused both loss of seed shattering and reduced grain sizes in O. glaberrima25. To compare the genetic effect of seed shattering between plants with mutations in sh3, sh4 or both (sh3 and sh4), we developed three near-isogenic lines (NIL-sh3, NIL-sh4 and NIL-sh3/sh4) under the Asian cultivated rice O. sativa var. Taichong 65 genetic background. These near-isogenic lines contained a very small sh3 (NIL-sh3) or sh4 (NIL-sh4) region from African cultivated rice O. glaberrima. Longitudinal sections of spikelets at the anthesis stage were compared using confocal microscopy. We found that Taichong 65 samples showed a nearly complete abscission zone between the grain and the pedicel. Both NIL-sh3 and NIL-sh4 exhibited partially developed abscission layers. The double mutant NIL-sh3/sh4 samples had no abscission layer (Fig. 4a). Consistent with the severity of abscission layer loss, the seeds of the NIL-sh3/sh4 double mutant were harder to shed than the seeds of NIL-sh3 and NIL-sh4 (Fig. 4b), indicating that combinations of these mutations can tune the threshability of rice.
To elucidate the evolutionary history of seed shattering in O. glaberrima, we first investigated the genetic relationship of 93 O. glaberrima and 94 O. barthii accessions using previously published resequencing data16,22. Our results were consistent with those of previous studies16,22, in which O. glaberrima can be partitioned into the following four geographical quadrants with landraces sharing some genetic proximity: northwest (OG-NW) and southwest (OG-SW) coastal populations, northeast (OG-NE) and southeast (OG-SE) inland populations. O. barthii can be grouped into four populations (OB-I, OB-II, OB-III and OB-V) as well as one admixture population (OB-IV) (Fig. 5a; Supplementary Figs. 9–11). Results from a previous study16 demonstrated that after domestication, O. glaberrima diversified into east and west populations, and those later diversified into north and south populations. Our phylogenetic analysis of 93 O. glaberrima and 94 O. barthii accessions matched that result and supported the findings of another study22 that demonstrated that the OB-V population shared most ancestry with O. glaberrima. While details of the domestication trajectory of African rice are beyond the scope of this work, we point out that the OG-NE accessions, although less represented in number, are spread across several clades in the phylogeny, and share more alleles with O. barthii according to the ADMIXTURE plot for K = 7 (where K is the number of populations assumed for analysis). This result is consistent with the hypothesis that O. glaberrima was domesticated in the middle Niger River delta of Mali and then spread across West Africa23,29,30.
A spatial analysis of the distribution of SH3 and SH4 or combined genotypes showed that most accessions with both mutations occur in arid regions north of 11° N (Fig. 5b). However, the addition of the phylogeny and ADMIXTURE analysis (Fig. 5a) demonstrates that this double mutation form evolved twice: once in the NE and once in the NW. Principal component analysis (PCA) confirmed the isolation of NE double mutation accessions (Supplementary Fig. 12). Both the NE and NW innovations could have arisen from exploiting standing variation during artificial selection for stronger resistance to shattering. For O. glaberrima plants that carry either of the mutations but not both, there is widespread occurrence of the SNP mutation in SH4, whereas there is geographical and phylogenetic restriction of the SH3 deletion. This result suggests that the SH3 deletion is a derived form that is only maintained in certain agroecologies, possibly unpopular elsewhere because of the additional phenotype of smaller grain size that reduces yield25. That is, O. barthii carrying the SH4 mutation are far more widespread (Fig. 5c). The OG-SE population also has the SH4 mutation only, as do the adjacent OB-V accessions in the phylogeny (Fig. 5a).
The prevalence of either or both non-shattering genotypes in the wild progenitor of African rice, O. barthii, is of great significance regarding how domestication is generally characterized. It may be that several of the classical traits that became fixed over time, and that are part of the “domestication syndrome”2,34, are in fact also under natural selection in the same direction35. Only in recent years have large numbers of wild relatives of crops been sequenced and evaluated for domestication-associated mutations. The loss of the seed shattering trait is a textbook example of a change that is expected to have dire consequences on fitness in the wild. The ecological significance of non-shattering phenotypes in the wild merits future investigation, especially as diversity panels of wild relatives of crops are being assembled for future crop improvement36,37. Likewise, the agroecological and cultural significance of having these different non-shattering genotypes merits future investigation, especially as rice breeding programmes in Africa must consider the adoptability of new varieties into existing farming regimens. What advantage does complete removal of the grain abscission zone have for farmers in arid climates? Harvest lengths and storage time before threshing may be location-specific practices set to optimize threshability for different varieties. A better understanding of the ecological context of these genotypes, spanning pre-harvest and postharvest abiotic and biotic risks, presents opportunities to explain the process of co-evolution between crop species and people.
Plant materials and growth conditions
For the cloning of ObSH3, W1411 and IRGC104165 plants were used. W1411, an African wild rice accession (O. barthii), was collected from Sierra Leone. IRGC104165, a cultivar of African rice (O. glaberrima), was collected from Guinea. The F2 segregating population was derived from the cross between W1411 and IRGC104165. The other cultivars and wild-rice accessions used in this study are listed in Supplementary Table 1. All plants were grown in field conditions in Beijing or Sanya, Hainan province, China.
Evaluation of shattering
The degree of shattering was measured after each panicle was shaken gently by hand. Plants with >80% of the grains removed were classified as shattering, whereas those with few grains removed were classified as non-shattering. The panicles of plants were harvested 35 days after heading and were kept at room temperature. Each panicle was attached vertically upside down to a digital force gauge (FGP-1; Nidec-Shimpo), and each grain was pulled down using forceps. The maximum tensile strength measured at the moment when the grain detached from the pedicel was recorded in gram-force (gf) units. For each plant, a total of 50 grains from three panicles were measured.
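The two measurements described above, the proportion of grains shed on gentle shaking and the per-grain breaking tensile strength, lend themselves to a simple tabulation. The Python sketch below is one way such records might be summarized; the function names are hypothetical, only the >80% shattering criterion and the 50-grain sample size come from the text, and because the text does not give a numeric cut-off for "few grains removed", the sketch reports only whether the shattering criterion is met.

```python
from statistics import mean


def is_shattering(grains_shed: int, grains_total: int) -> bool:
    """Apply the >80% criterion from the text to one hand-shaken panicle."""
    if grains_total <= 0:
        raise ValueError("grains_total must be positive")
    return grains_shed / grains_total > 0.80


def mean_breaking_force(forces_gf, expected_n: int = 50) -> float:
    """Mean breaking tensile strength (gf) over the grains measured for one plant.

    expected_n defaults to the 50 grains per plant stated in the text.
    """
    forces = list(forces_gf)
    if len(forces) != expected_n:
        raise ValueError(f"expected {expected_n} measurements, got {len(forces)}")
    return mean(forces)


# Hypothetical readings, for illustration only (not data from the study).
example_forces = [10.0] * 50
print(is_shattering(90, 100), mean_breaking_force(example_forces))
```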
The primers used in this study are listed in Supplementary Table 2.
Histological analysis and SEM
Approximately ten spikelet samples were gathered at the flowering stage. A longitudinal section was made through the junction between the flower and pedicel by hand cutting, and the sections were stained with acridine orange. Sections were observed using an Olympus FV1000 laser scanning microscope with 488 nm and 543 nm laser lines. For SEM, the pedicel junctions after detachment of mature seeds were fixed in 2.5% glutaraldehyde solution, then gold plated, and observed using a Hitachi S-2460 scanning electron microscope.
Fine mapping of ObSH3
An F2 population was bred from the cross between W1411 and IRGC104165. DNA was extracted from fresh leaves according to the cetyl-trimethyl-ammonium bromide method38. The genomic location of the ObSH3 locus was defined by two molecular markers, SNP29 and SNP31. Details of the markers used for fine mapping are given in Supplementary Table 2.
Preparation of constructs for gene editing and ObSH3 promoter:GUS fusion
Constructs for CRISPR–Cas9-based genome editing of ObSH3 (the target sites are shown in Supplementary Fig. 4) were prepared as previously described39. Rice transformation was performed using the Agrobacterium-mediated method. To construct the ObSH3 promoter:GUS fusion plasmid, an ~2-kb DNA fragment comprising the promoter sequence of ObSH3 in W1411 was amplified and cloned into the pCAMBIA1301-GUS-nos vector. The SH3 coding sequence together with the 2,208-bp upstream and 696-bp downstream flanking region was amplified using the primers pSH3-2F and pSH3-2R from W1411 by PCR and recombined with the pCAMBIA1300 vector digested by Kpn I and Sal I to generate the pSH3 vector. Transgenic plants were generated by Agrobacterium-mediated transformation and then transformed into ZH17. Relevant PCR primer sequences are given in Supplementary Table 2.
5′- and 3′-RACE and quantitative real-time PCR
Total RNA was extracted using TRIzol reagent (Qiagen). We conducted 5′- and 3′-RACE using a SMARTer RACE 5′/3′ kit (TaKaRa) following the manufacturer’s instructions. The product of first-strand cDNA was used as the template for the PCR. Real-time (rt)PCR was performed using an ABI Prism 7900 Sequence Detection System (Applied Biosystems). Quantitative rtPCR was carried out using SYBR Green Master Mix (Bio-Rad). Three replicates were performed. Rice ACTIN1 was used as the internal control. Primers used for quantitative real-time PCR are listed in Supplementary Table 2.
Subcellular localization of ObSH3
The coding sequences of ObSH3 were amplified from W1411 to generate the 35S::ObSH3–GFP vector. We bombarded the resulting plasmid into onion epidermal cells using a helium biolistic device (Bio-Rad PDS-1000). The bombarded tissues were examined using a confocal laser scanning microscope (Olympus FV1000).
RNA in situ hybridization
Young panicles from W1411 were fixed in formaldehyde–acetic acid–ethanol fixation solution, subjected to a series of dehydration and infiltration, and embedded in paraffin. The tissues were sliced into 8–10-µm sections with a microtome (Leica RM2265). A 258-bp gene-specific region of ObSH3 cDNA was amplified by PCR to generate sense and antisense RNA probes. Digoxigenin-labelled RNA probes were prepared using a DIG Northern Starter Kit (catalogue no. 2039672; Roche) according to the manufacturer’s instructions. Primers used for the probes are listed in Supplementary Table 2.
Identification of SH3 genotype in O. glaberrima and O. barthii accessions
To determine the genotype of the SH3 deletion in O. glaberrima and O. barthii accessions, the sequence of the bacterial artificial chromosome (BAC) spanning the SH3 deletion (GenBank accession no. KF284072) was downloaded from the NCBI GenBank database, as the genomic region containing the SH3 gene was missing in the O. glaberrima CG14 genome assembly22. Raw sequencing reads of 93 O. glaberrima and 94 O. barthii accessions were aligned onto the BAC sequence using Burrows–Wheeler Aligner (BWA)40 v.0.7.10. The presence or absence of the SH3 deletion in O. glaberrima and O. barthii accessions was determined using the Eukaryotic Pan-genome Analysis Toolkit (EUPAN)41 v.0.43 and manual curation using Integrative Genomics Viewer42 (IGV) v.2.4.
Evolutionary analysis of SH3 and SH4
Previously published resequencing reads of 93 O. glaberrima16 and 94 O. barthii22 accessions were downloaded from the NCBI Short Read Archive (SRA) database. Raw sequencing reads were aligned onto the O. glaberrima CG14 genome assembly22 using BWA40 v.0.7.10. PCR duplicates in the aligned reads were masked using the MarkDuplicates function of Picard Tools v.1.128 (https://broadinstitute.github.io/picard/). SNP calling and filtering were performed using Genome Analysis Toolkit (GATK)43 v.3.4. In total, 6,640,731 SNPs were identified and used in the evolutionary analysis.
The phylogenetic tree of 93 O. glaberrima and 94 O. barthii accessions was inferred using SNPs from the 12 assembled chromosomes. To eliminate the effect of rare genetic variants, only SNPs with minor allele frequency (MAF) values greater than 0.05 were retained for the analysis. In total, 2,216,796 SNPs were used for the phylogenetic reconstruction. Due to computational limitations, the approximate maximum likelihood tree of O. glaberrima and O. barthii accessions was constructed using FastTree44 v.2.1.8 and the GTR+CAT approximation model with 20 rate categories. The phylogenetic tree of O. glaberrima and O. barthii was plotted and annotated using the interactive tree of life (iTOL) online tool v.3 (ref. 45).
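The MAF filter described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the pipeline used in the study: genotypes are encoded as alternate-allele counts (0, 1 or 2) per diploid accession with None for missing calls, and the function names and example SNP identifiers are hypothetical. Only the 0.05 threshold comes from the text.

```python
def minor_allele_frequency(genotypes):
    """MAF of one biallelic SNP from alternate-allele counts (0/1/2; None = missing)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0  # no usable calls: treat the site as monomorphic
    alt_freq = sum(called) / (2 * len(called))  # two alleles per diploid accession
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(snps, threshold=0.05):
    """Keep SNPs whose MAF is strictly greater than the threshold."""
    return {
        snp_id: calls
        for snp_id, calls in snps.items()
        if minor_allele_frequency(calls) > threshold
    }


# Hypothetical genotype calls, for illustration only.
snps = {
    "snp_a": [0, 0, 1, 2, None, 0],  # 3 alternate alleles out of 10 called
    "snp_b": [0] * 19 + [1],         # 1 alternate allele out of 40 called
}
kept = filter_by_maf(snps)
print(sorted(kept))
```

Computing the frequency over called genotypes only is a design choice made here so that missing data do not deflate the estimate; real tools may also apply a separate missingness filter.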
The population structure of 93 O. glaberrima and 94 O. barthii accessions was inferred using ADMIXTURE46 v.1.23. SNPs from unanchored scaffolds and contigs of O. glaberrima CG14 assembly were removed for the analysis. In addition, only SNPs with MAF values greater than 0.05 were retained for the analysis to eliminate the effect of rare genetic variants. SNP pruning was performed for the SNP dataset using plink47 v.1.90 with the parameter “--indep-pairwise 400 50 0.3”. In total, 193,454 SNPs were inputted into the ADMIXTURE program. The ancestry of each population was inferred from K = 2 to K = 8 (Supplementary Table 3). The result of the ADMIXTURE analysis was plotted using a custom R script.
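The three values passed to the pruning option are a window size (400 variants), a step (50 variants) and an r² threshold (0.3). The Python sketch below is a simplified greedy illustration of window-based pruning with those defaults; it is not PLINK's implementation. In particular, it always drops the later SNP of a correlated pair, which is an assumption made for brevity, and the example genotype vectors are hypothetical.

```python
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic site carries no correlation signal
    return cov * cov / (var_x * var_y)


def ld_prune(genotypes, window=400, step=50, r2_max=0.3):
    """Return indices of SNPs kept after greedy window-based pruning.

    genotypes: list of per-SNP dosage vectors in genomic order.
    Simplification: the later SNP of each pair with r^2 > r2_max is dropped.
    """
    removed = set()
    for start in range(0, len(genotypes), step):
        end = min(start + window, len(genotypes))
        in_window = [i for i in range(start, end) if i not in removed]
        for i, j in combinations(in_window, 2):
            if i in removed or j in removed:
                continue
            if r_squared(genotypes[i], genotypes[j]) > r2_max:
                removed.add(j)
    return [i for i in range(len(genotypes)) if i not in removed]


# Hypothetical dosages: SNP 1 duplicates SNP 0, SNP 2 is only weakly correlated.
example = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [1, 1, 1, 0, 2, 1],
]
print(ld_prune(example))
```

Pruning of this kind reduces the redundancy among nearby markers before model-based ancestry inference, which is why the input to ADMIXTURE is much smaller than the full SNP set.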
PCA of 93 O. glaberrima and 94 O. barthii accessions was performed using the smartpca program implemented in the EIGENSOFT package48 v.6.0.1. All the SNPs in the O. glaberrima and O. barthii SNP dataset (6,640,731) were inputted into the smartpca program. After filtering out SNP positions that were missing substantial numbers of genotype calls, smartpca used 2,161,274 SNPs to perform the PCA. PCA was performed for O. glaberrima, and O. barthii individuals were projected onto the principal component space of O. glaberrima to better elucidate the relationship of these two species (Supplementary Table 4). The results of the PCA were plotted such that O. glaberrima or O. barthii accessions with different genotypes of SH3 and SH4 genes were indicated by different colours using a custom R script.
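Projection means that the principal axes are estimated from one group only, and the other group is then placed on those axes using the first group's centring. The Python sketch below illustrates the idea for the first principal component using power iteration; it is a toy illustration with hypothetical data and function names, not a reimplementation of smartpca, and it omits any per-SNP scaling that dedicated tools may apply.

```python
def first_pc(rows, iterations=200):
    """Estimate column means and the unit PC1 loading vector of a samples x SNPs matrix."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centred = [[r[j] - means[j] for j in range(m)] for r in rows]
    v = [1.0 / (j + 1) for j in range(m)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        # One power-iteration step: v <- X^T (X v), then renormalize.
        scores = [sum(c[j] * v[j] for j in range(m)) for c in centred]
        w = [sum(centred[i][j] * scores[i] for i in range(n)) for j in range(m)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            break  # no variance along the current direction
        v = [x / norm for x in w]
    return means, v


def project(rows, means, loading):
    """Place samples on PC1 using the reference group's means and loading."""
    return [
        sum((r[j] - means[j]) * loading[j] for j in range(len(means)))
        for r in rows
    ]


# Hypothetical dosage matrices: the reference group defines the axis,
# and the other group is projected onto it.
reference = [[0, 0, 2], [0, 0, 2], [2, 2, 0], [2, 2, 0]]
others = [[1, 1, 1], [2, 2, 0]]

means, loading = first_pc(reference)
reference_scores = project(reference, means, loading)
projected_scores = project(others, means, loading)
print(reference_scores, projected_scores)
```

Because the axes are fitted on the reference group alone, variation that exists only within the projected group does not shape the axes, which is the point of projecting one species onto the other's component space.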
The global positioning system coordinates of 93 O. glaberrima accessions were downloaded from a previous study16. Two O. glaberrima accessions (IRGC103993 and IRGC104573) with missing calls at the causative SNP of SH4 (ref. 25) were excluded from the biogeographical analysis. The countries of origin of 94 O. barthii accessions were downloaded from a previous study22. The map of the biogeographical analysis was drawn using a custom R script.
Further information on experimental design is available in the Nature Research Reporting Summary linked to this article.
The R script that supports the findings of this study is available in the Dryad Digital Repository (https://doi.org/10.5061/dyrad.qh5r649).
The data that support the findings of this study are available in the Dryad Digital Repository (https://doi.org/10.5061/dyrad.qh5r649). The gene sequence of ObSH3 has been deposited in GenBank with the following accession code: MH159201.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
We thank the International Rice Research Institute for providing the wild rice and cultivated rice samples. This research was supported by the Ministry of Agriculture of China (2016ZX08009-003) and the National Key R&D Program for Crop Breeding (2016YFD0100901). The funders had no role in the study design, data collection and analyses, decision to publish, or preparation of the manuscript.
About this article
Science China Life Sciences (2018)