Unravelling miRNA regulation in yield of rice (Oryza sativa) based on differential network model

Rice (Oryza sativa L.) is one of the essential staple food crops and tillering, panicle branching and grain filling are three important traits determining the grain yield. Although miRNAs have been reported being regulating yield, no study has systematically investigated how miRNAs differentially function in high and low yield rice, in particular at a network level. This abundance of data from high-throughput sequencing provides an effective solution for systematic identification of regulatory miRNAs using developed algorithms in plants. We here present a novel algorithm, Gene Co-expression Network differential edge-like transformation (GRN-DET), which can identify key regulatory miRNAs in plant development. Based on the small RNA and RNA-seq data, miRNA-gene-TF co-regulation networks were constructed for yield of rice. Using GRN-DET, the key regulatory miRNAs for rice yield were characterized by the differential expression variances of miRNAs and co-variances of miRNA-mRNA, including osa-miR171 and osa-miR1432. Phytohormone cross-talks (auxin and brassinosteroid) were also revealed by these co-expression networks for the yield of rice.

For RNA sequencing (RNA-seq) library construction, 3 μg of total RNA of each sample was used for library preparation using TruSeq Stranded Total RNA Sample Preparation kit (Illumina, San Diego, USA). RNA was fragmented into small pieces and then first-strand cDNA was synthesized with Super Script Ⅱreverse transcription (Invitrogen). After purifying, the second strand cDNA library was synthesized, following several rounds of PCR amplification. The clean reads of RNA-seq were mapped to the rice reference genome (Nipponbare-Reference-IRGSP-1.0) using Tophat 2 (version 2.0.13) (Trapnell et al., 2010).
Transcript reconstruction was conducted by Cufflinks software (version 2.2.1) (Trapnell et al., 2012). And DESeq2 was used to make read counts and to identify differentially expressed genes (DEGs). A False Discovery Rate (FDR) was determined with threshold of 0.05 and |log2(fold change)| ≥1) to recognize the significance of the gene expression difference. All the small and RNA sequencing data were deposited in the NCBI Short Read Archive (SRA)

Differential edge-like transformation (DET)
The differential co-expression analysis is to see if or not the expression correlation of a gene-pair (e.g. two genes or molecules) changes between control and case samples (Zeng et al., 2014;Yu et al., 2015). Thus, the Pearson correlation coefficient (PCC) between genes I and j in control or case samples can be calculated.
On one hand, to evaluate the correlation in one sample, the pseudo-correlation or correlation-like expressionof such a gene-pair in a sample k can be defined as: Pseudo-correlation in control condition: Pseudo-correlation in case condition: Where, C (.) represents the Z-transform of one sample, i.e., sample k, by the mean and variance in such sample's group (e.g. a control sample and control group). Comparing (1) and (2), clearly for one gene-pair, the mean of its pseudo-correlation in all control (or case) samples just equals to the Pearson correlation coefficienton control (or case) group (Zeng et al., 2014;.
On the other hand, to evaluate the correlation change of one sample compared to multiple controls, the delta expression of a gene-pair can be computed as: Pseudo-correlation in control condition: · C C Delta-correlationin case condition: Where, C (.) represents the Z-transform of one sample by the mean and variance in such sample's group (e.g. a control sample againstthe control group); and C' (.) represents the Thus, to provide a more comprehensive approach to evaluate the network, i.e. the edges / gene-pairs, in one sample, we here introduce a new differential edge-like transformation (DET) such as: Edge-like correlation in control condition: Edge-like correlation in case condition: Where, λ , λ , λ , λ , and there are similar numbers of control and case samples.
DET transforms the expression of genes to the edge-like correlation of gene-pairs in one sample, and the mean of edge-like correlation of a gene-pair in all control and case samples is just the Pearson correlation coefficienton all samples, so that this measurement has equivalent numerical meaning for any control or case sample. In addition, according to the following theory 1, such a transformation also has other two compatiblefeatures with previous methods (e.g. above pseudo-correlation and delta-correlation): (i) When the sample numbers are extremely unbalance between control and case, the edge-like correlation will approach to delta correlation; (ii) When the genes have the equivalent expression variances between control and casegroups, the edge-like correlation will be a linear combination among psesudo-correlation and delta-correlation; otherwise, the edge-like correlation will have weighted variants for control and case samples, respectively.
Totally, DET can provide a more general approach to evaluate the network, i.e. the edges / gene-pairs, in one sample, under more complex practical conditions, such as: the sample numbers are extremely unbalance, or the genes have differential variances / co-variances.
Theoretical Result: Assume that gene i and j have one-sample's representative expressions x and x in control with samples, and y and y in case with samples, the edge-like correlation of a gene-pair between i and j in one sample can be obtained as a transformation: (ii) When 1 and and , Where, λ ; the expression average and variance for gene i (or gene j) on control samples are and (or and ); and conveniently, the sample in case has these similar variables and annotations; and and (or and ) are the mean and variance of gene i (or gene j) in all control and case samples.

Proof:
According to the relation between sample means in control, case and all, there are: Then following results hold, there will be 1 and 0, and Under this condition, the edge-like correlation will approach to delta-correlation (with a scale).
Similarly, when 0, it can be dealt as ∞.
(ii) When 1, there will be and , and Similarly, there is, correlationsis proposed in this study (Fig. 1).
In addition, if and , the above measurements can be further reduced to Where, λ .
Thus, under this condition, the edge-like correlation will be a linear combination among pseudo-correlation and delta-correlation and their combinations, similarly for control or case samples.

Gene regulatory networks with DET
Based on the theory of differential edge-like transformation (DET), a new differential network model by combining the miRNA-mRNA regulatory network and its edge-like In one condition or in one tissue, there is usually only one sample, the conventional correlation-based approaches (e.g. WGCNA) will not be applicable. By contrast, we applied DET to capture the significant gene expression and correlation changes simultaneously in a dynamical and network manner. Assume that the (miRNA & mRNA) expression data ..x iM ] are an expression profile of a miRNA or a mRNA across a group of tissues, (i) the tissue-specific expression of molecule i on one tissue k can be obtained by: , mean std (ii) the tissue-specific edge-like correlation of molecules i and j on one tissue k can be obtained by (given molecule i is a miRNA and molecule j is a target mRNA): ρ , mean std · mean std Simply, the differential molecule or tissue-specific molecule on one tissuek can be selected by

Mann-Whitney U test between
, and , , … , , , , , … . And the differential molecule-pairs or sample-specific molecule association on one sample k can be selected by Mann-Whitney U test between ρ , and ρ , , … , ρ , , ρ , , … . For any miRNA, its average PCC (i.e. the edge-like correlation) with other relevant miRNAs or mRNAs in control or case is defined as AP control and AP case , then a factor as PCC-induced key-associated score for this miRNA is computed as |AP case -AP control |.

Supplemental References
Gardner  corresponding P-values (left). Heatmaps of the miRNAs expression in "grey" and "yellow" with expression patterns of selected candidate miRNAs (right).
(C) Networks of "grey" and "yellow" module. The miRNAs co-expression relationships are colored in "grey" and "yellow", respectively.  -miR396b-5p, (B) osa-miR171a, (C) osa-miR812m. The thickness of lines represents the correlation extent of miRNAs and their targets and the thicker, the more negative correlation between them. Rectangles: miRNAs, circles: genes, triangles: TF. red color, up-regulation and green, down-regulation.

Supplementary Table S1
Details of sample-specific miRNAs (SmiRNAs) in different tissues at the key yield stage analyzed using differential edge-like transformation (DET) analysis with their expression levels in each sample. A set of 150 sequencing screened yield related miRNAs from literatures.