Barley SIX-ROWED SPIKE3 encodes a putative Jumonji C-type H3K9me2/me3 demethylase that represses lateral spikelet fertility

The barley inflorescence (spike) comprises a multi-noded central stalk (rachis) with tri-partite clusters of uni-floretted spikelets attached alternately along its length. Relative fertility of lateral spikelets within each cluster leads to spikes with two or six rows of grain, or an intermediate morphology. Understanding the mechanisms controlling this key developmental step could provide novel solutions to enhanced grain yield. Classical genetic studies identified five major SIX-ROWED SPIKE (VRS) genes, with four now known to encode transcription factors. Here we identify and characterise the remaining major VRS gene, VRS3, as encoding a putative Jumonji C-type H3K9me2/me3 demethylase, a regulator of chromatin state. Exploring the expression network modulated by VRS3 reveals specific interactions, both with other VRS genes and genes involved in stress, hormone and sugar metabolism. We show that combining a vrs3 mutant allele with natural six-rowed alleles of VRS1 and VRS5 leads to increased lateral grain size and greater grain uniformity.

at the two developmental stages assayed, AP (awn primordium, 5mm-spike) and WA, (white anther, 10mm-spike). (a-f) qRT-PCR validation of (a) HOX-2, (b) TCP FAMILY TRANSCRIPTION FACTOR, (c) LONELY GUY-LIKE, (d) CYTOKININ OXIDASE/DEHYDROGENASE, (e) TERMINAL FLOWER 1-like and (f) TREHALOSE 6 PHOSPHATE PHOSPHATASE. HvActin was used for normalization. X-axis shows the inflorescence developmental stages in which the qRT-PCR was carried out. Y-axis shows the relative expression level based on ∆Ct (cycle threshold) calculation. Mean ± S.E of three biological replicates is shown. Significant differences among the samples were established using Student's t-test (twotailed). Flow diagram with the main steps (centre) and tools (right side) employed to perform the RNA-Seq analysis.

Plant Ontogeny
Phenotypic evaluation of the spikes from four F2 populations, Bowman*BW419(int-a.1), Bowman*BW902(vrs3.f), Barke*BW419(int-a.1) and Barke*BW902(vrs3.f), showed that the homozygous vrs3 individuals from the BW902 populations exhibited greater levels of awn development on the lateral spikelets from the two basal spikelet node positions compared to the vrs3 homozygous individuals from the BW419(int-a.1) population (Basal spikelet node 1: χ 2 = 16.87, 5 d.f. P = 0.005, Basal spikelet node 2: χ 2 = 12.47,5 d.f. P = 0.025). Across all four positions sampled along the spike, the homozygous vrs3 spikes from the Bowman populations showed a greater association with lateral awn development than those from the Barke populations, illustrating the genotype dependence of the phenotypic penetrance.
Analysis of lateral spikelet grain fill showed a similar trend to lateral awn development. The Bowman*BW419 and Barke*BW419 populations showed no significant association between the basal spikelet node internodes and lateral spikelet grain fill, suggesting that at this position, the BW419(int-a.1) homozygous vrs3 F2 spikes resemble a two-rowed phenotype (BW419Ba (χ 2 =2.5, 1d.f., P = 0.293) and BW419Bo ( χ 2 =6.45, 1d.f., P = 0.057)). However, the BW902 populations showed a significant association between the basal spikelet node and lateral spikelet grain fill, indicating that BW902(vrs3.f) exhibits a stronger mutant phenotype with lateral spikelet grain fill persisting further down the spike. Across the extended mutant series, lateral spikelet grain fill varied considerably amongst individuals (max. 56.5%, min. 0.6%). Variation in lateral grain fill was also observed between mutant alleles at the same nucleotide position. Mutations that were generated and remain present in different genetic backgrounds show different phenotypic penetrance, i.e. int-a. 27  Across all mutants, no significant difference in days to ear emergence (flowering) between mutant and wild type was identified. However, multivariate analysis suggested that lateral spikelet grain fill was enhanced in vrs3 mutant lines whose genetic background resulted in earlier ear emergence.

Phylogeny
Comparative sequence analysis identified orthologues of VRS3 across grass species. The rice VRS3 partial orthologue OsJMJ706 (LOC_Os10g42690) was previously functionally characterised as an H3K9me2/me3 demethylase 1 . Both sequence homology and conservation of synteny therefore suggest that this is also the function of VRS3. Reciprocal BLAST of VRS3 identified a paralogue within Barley (HORVU6Hr1G067480 = MLOC_53868.2), which was also found in other grass species, and the chromosomal locations of the paralogues are consistent with a duplication event in a common ancestor to the Poaceae rather than separate intraspecific duplications. Although sharing a syntenic chromosomal location with VRS3, phylogenetically JMJ706 is atypically positioned ancestrally, possibly indicative of directional selection at this locus. Genome wide association studies have associated JMJ706 with increased primary and secondary branching in the rice inflorescence 2 . Panicle branching represents a key yield-related trait in rice that may explain the increased sequence diversity at this locus. The phylogeny presented here is consistent with that of a detailed phylogenetic study of JmjC domain containing proteins across plant species, which placed Arabidopsis (AT5G46910), rice and maize loci within the plant-specific PKDM8, H3K9 demethylase clade 3 and REF6 and ELF6 within the sister PKDM9, H3K27me3 clade. The H3K9 methylation marks H3K9me2 and H3K9me3 are repressive marks associated with gene silencing.

Barley Atlas gene expression
An RNA-seq based atlas of gene expression from 16 tissues from the six-row cultivar Morex 4,5 is available at http://camel.hutton.ac.uk/barleyGenes_JLOC2/. The atlas provides transcript abundance data (FPKM values) of predicted genes from the whole-genome-shotgun sequence from the six-row barley cv. Morex. FPKM values for VRS1-VRS5 genes across the tissue series are shown in Supplementary Fig. 3

. (For review, login: JHI-barleyGenes, Password: JHI-barleyGenes)
RNA-seq bioinformatic analysis. A total of 914,782,574 pairs of reads with the read length of 150bp were generated from 32 samples. For quality control and removal of adapters we used FASTQC (version 0.11.3) and Trimmomatic (version 0.30) 6 . The minimum phred quality score was 20 and the minimum length of the reads accepted after trimming was 75 base pairs. Reads that remained unpaired after QC were excluded from further analysis. Nearly 95% of the reads remained for downstream analysis.
Quantification of the transcripts was carried out using Salmon version 0.7.2 7 . Transcripts from both high and low confidence genes (see 5) were used as the transcript reference, and k-mer of 31 and quasi-mapping mode were chosen for building the index in Salmon. In the Salmon quantification step, --seqBias -gcBias were specified to adjust for sequence specific and GC content biases.
The transcript quantification output from Salmon was imported and summarized from transcriptlevel to gene-level read counts using the R package tximport version 1.2.0 8 . countsFromAbundance was set as default. The summarized gene read counts were then converted to count per million reads (CPM). Genes with CPM>1 in at least 6 samples were considered as expressed and were selected for further analysis. The data were then normalized using weighted trimmed Mean of M values (TMM) method to account for the raw library sizes using function calcNormFactors (calculate normalization factors to scale the raw library sizes) in EdgeR version 3.16.5 9 . Principle Component Analysis (PCA) 10 was carried out to examine batch effects and no technical issues were identified. The normalized CPM values were then transformed into log2-reads-per-million (log2CPM), which were used to estimate the relationships between mean and variance and generate weights for variance adjustments using the voom function in Limma package version 3.30.9 11,12 . After that, a general linear model was set up using genotype and development stages (Bowman WT and BW902(vrs3.f), awn primordium (AP, spike length 5mm) and white anther (WA, spike length 10mm) stages as factors and contrast groups BW902.AP vs WT.AP and BW902.WA vs WT.WA were set to examine the differentially expressed genes between specific subgroups. The p values were adjusted for multiple testing using Benjamini & Hochberg (BH) 13 method. The final lists of differentially expressed genes for each contrast were produced using the following criteria: 1) adjusted p <0.05) |log 2 fold change| ≥ 0.5 and 3) the expression values of that gene ≥ 2TPM in at least 6 samples. All P values quoted in the manuscript to interpret and analyse the data are adjusted P values by BH method to control false discovery rate. The complete list of differentially expressed genes between Bowman and BW902(vrs3.f), including those that lay within the introgression in BW902(vrs3.f) (491 genes), is shown in Supplementary Data 1b and 1c. The level of expression of these genes across the 2 comparisons is shown in Supplementary Data 1a. Genes present in the introgression regions are identified and annotated in Supplementary Data 1b and 1c. Biological interpretation of the differentially expressed genes focused on those outside of the introgression (364 genes) to focus on the downstream effects of the mutation.
Gene Ontology enrichment analysis. 187 of the 491 differentially expressed genes have GO terms associated (Supplementary Data 1d). Gene ontology enrichment analysis was performed with GOEAST 14 . Customized analyses were run in default (hypergeometric test, 0.1 for false discovery rate, minimum 5 of associated genes in reference (m value) for display). The customized gene annotation dataset was the list of 23,636 annotated genes with GO codes associated. Some GO terms were found overrepresented (Supplementary Data 1e).
Quantitative PCR validation. RNA-seq results for key genes were independently validated by qRTPCR. Plants of Bowman WT and BW902(vrs3.f) were grown in the glasshouse in a randomised design, under the same conditions as for the RNA-seq experiment. Three biological repetitions per genotype with 10 spikes for the AP (5mm spikes) stage and 8 spikes for the WA (10mm spikes) stage each were harvested. RNA was isolated as described in the Online Methods. Reverse transcription and cDNA synthesis were performed using SuperScript III Reverse Transcriptase (Invitrogen). Quantitative PCR was carried out Universal Probe Library (Roche) hydrolysis probes on the Applied Biosystems StepOnePlus machine. qRT-PCR primer sequences are listed in Supplementary Table 4.Three technical repeats were performed for each cDNA-primer combination per sample. qRT-PCR data were obtained using SDS2.3 software (Applied Biosystems) and analysed by 2 -ΔCT method. Significant differences among the samples were calculated using Student's t-test (two-tailed). The qRT-PCR validation of differentially expressed genes identified in the RNA-seq experiment are given in Supplementary Fig. 5.

Biological analysis.
While some of the differentially expressed (DE) genes between Bowman and BW902(vrs3.f) are known to control inflorescence morphology, we detected additional DE genes which may be relevant to the vrs3 phenotype. Apart from VRS1, other HD-Zip class I genes (HvHOX) proteins that have diverse functions in regulating adaptive plant growth in conjunction with phytohormones 15 Fig. 5d).
Molecular control of hormone metabolism was perturbed between wild type and mutant. In addition to predicted changes in jasmonate, cytokinin metabolism and signalling in BW902(vrs3.f) , we found APETALA2-like/Ethylene Responsive Factor (AP2-like/ERF) genes downregulated. Genes that belong to AP2-like/ERF family have been reported to play an important role in the spikelet meristem determinacy and floral organ identity in cereals 19 . In our data, we detected HORVU7Hr1G037180, an ERF transcription factor that was significantly downregulated in both stages (LFC= -0.694 adjusted P= 0.004 (AP); LFC= -0.568adjusted P= 0.006 (WA)). The function of the putative ortholog of this gene in rice (OsAP2/EREBP-132) (DREB2B) seems to be related to drought resistance and heat-shock stress tolerance 20,21 . We also found HORVU5Hr1G045290, another AP2like/ERF, with significant DE in WA spikes (LFC= -0.614; adjusted P= 0.023) possibly suggesting that it acts further downstream in the developmental pathway. Arabidopsis and rice orthologs of this gene belong to the WRINKLED 1 family, a key regulator of seed oil biosynthesis and seed storage metabolism in Arabidopsis and maize where it is involved in glycolysis and fatty acid biosynthesis in the plastids 22,23 . In maize, expression of one WRI1 orthologs (ZmWRI1a) has been detected in immature and mature tassels 23 .
Sugar metabolism also appears influenced by VRS3. While we observed a significant reduction in the expression of T6PP in AP BW902(vrs3.f) inflorescences, we found three sugar transporter SWEET genes were significantly DE in BW902(vrs3.f) spikes at WA stage: HORVU5Hr1G076770 and HORVU7Hr1G054710 were significantly downregulated (LFC= -1.106, adjusted P= 5.03 10 -4 and LFC= -1.147, adjusted P= 0.004, respectively) while HORVU3Hr1G107780 (LFC= 0.736; adjusted P= 0.002) was significantly upregulated. In the barley atlas dataset all three are highly expressed in developing inflorescence/caryopsis tissues. SWEET genes (also known as saliva or MtN3) encode transmembrane proteins involved in sugar transport. In plants, they participate in a wide range of the biological processes, such as host-pathogen interactions, reproductive development, senescence, and abiotic stress responses 24 . HORVU5Hr1G076770 and HORVU7Hr1G054710 are orthologs of Xa13/Os8N3/OsSWEET11 in rice, where supressed expression causes reduced fertility or sterility due to compromised microspore development, resulting in the gradual degeneration of immature pollen 24,25 . Here, downregulation was significant in WA BW902(vrs3.f) spikes, the opposite of what may be expected based on the data from rice. However, the SWEET gene HORVU3Hr1G107780 was significantly upregulated at the same stage, consistent with the effect of SWEET genes in rice inflorescence 26 . The contrasting regulation we observe here suggests that different SWEET genes may play different roles in determining barley spike architecture and pollen fertility.
Stress metabolism and signalling genes were also DE. We observed thionin-encoding genes (also known as PR13 family) to be significantly downregulated exclusively in WA BW902(vrs3.f) spikes. These genes are known to be involved in plant defence. Differences in the expression of thionins throughout inflorescence development has been described previously 27,28 and downregulation in WA BW902(vrs3.f) spikes may indicate a specific role in certain stages of barley spike development. Glutathione S-transferase (GST family), a group of genes related to stress signalling were also significantly downregulated in BW902(vrs3.f). HORVU4Hr1G081170 (LFC= -1.225, adjusted P= 0.003 (AP); LFC= -0.927, adjusted P= 4.68 10 -4 (WA)) and HORVU4Hr1G081150 (LFC= -0.7811 adjusted P= 0.04 (AP); LFC= -0.801, adjusted P= 0.004 (WA)) putative orthologs of OsGSTF14 and OsGSTF15 in rice respectively, are strongly expressed in the panicle 29 . The lower expression found in the vrs3 inflorescence may suggest that these genes could act as a regulators of spike development and meristem determinacy in barley.
We detected DE of genes involved in meristem determinacy such as MADS-box TFs in BW902(vrs3.f).

Field /glasshouse trial
Genotyping of F2 individuals used allele specific KASP assay designed to diagnostic polymorphisms within the Vrs1, VRS5(Int-c) and Vrs3 genotypes. 27 different genotype combinations segregated as would be expected for an F2 population with three segregating genes. These combinations could be divided into five different phenotypic classifications (Supplementary Fig. 6).

Field-grown plants
Each plot comprised a paired row of a 20 plant F3 family derived from a single F2 plant homozygous for the VRS1, VRS5(INT-C) and VRS3. At Zadoks growth stage 89 (hard-grain), a sample of five plants was harvested per plot. The main spike of each plant was separated into central and lateral grain. The remaining spikes on the five plants were threshed as a bulk to determine grain size variation across all tiller types.
We observed significant increases in grain width in the 666 genotype compared to the 662 genotype across the three grain fractions, central, lateral and whole plant (Supplementary Fig. 6). In contrast, to the glasshouse environment, we found only a significant increase in lateral grain area. The lateral:central grain area ratio marginally increased under field conditions in the 666 allele combination compared to the 662 combination, 0.91 compared to 0.90, but in this environment the difference was not significant. Average grain size was smaller in the glasshouse environment than the field indicative of more limiting conditions. This may suggest that the incorporation of vrs3 within the spring six-rowed barley model would be most beneficial in environments where grain fill is impaired such as high temperatures or drought post-anthesis.