NBBt-test: a versatile method for differential analysis of multiple types of RNA-seq data

Rapid development of transcriptome sequencing technologies has resulted in a data revolution and emergence of new approaches to study transcriptomic regulation such as alternative splicing, alternative polyadenylation, CRISPR knockout screening in addition to the regular gene expression. A full characterization of the transcriptional landscape of different groups of cells or tissues holds enormous potential for both basic science as well as clinical applications. Although many methods have been developed in the realm of differential gene expression analysis, they all geared towards a particular type of sequencing data and failed to perform well when applied in different types of transcriptomic data. To fill this gap, we offer a negative beta binomial t-test (NBBt-test). NBBt-test provides multiple functions to perform differential analyses of alternative splicing, polyadenylation, CRISPR knockout screening, and gene expression datasets. Both real and large-scale simulation data show superior performance of NBBt-test with higher efficiency, and lower type I error rate and FDR to identify differential isoforms and differentially expressed genes and differential CRISPR knockout screening genes with different sample sizes when compared against the current very popular statistical methods. An R-package implementing NBBt-test is available for downloading from CRAN (https://CRAN.R-project.org/package=NBBttest).

www.nature.com/scientificreports/ When a DSB is repaired by Cas9 in a non-homologous end-joining (NHEJ) way, a repairing error occurs with a high probability due to an insertion/deletion mutation that is likely to cause a codon frameshift, which leads to a premature stop codon 17 . As a result, a targeted locus would be efficiently knocked out. A recently developed lentiviral delivery method has made it possible to create large-scale genome-wide CRISPR/Cas9 knockout libraries targeting ~ 10 4 genes 18 . These libraries allow both negative and positive selection screens to be conducted in mammalian cell lines in a cost-effective manner 16,[18][19][20][21] . All genome-wide CRISPR screens use cell growth as a phenotypic measure. Either a screen is referred to as positive selection screen in which gene knockout results in a selective advantage for cells such as drug or toxin resistance or a screen is a negative selection screen in which a gene knockout results in a selective disadvantage in that cell such as decreased proliferation in cancer cells.
In CRISPR/Cas9 knockout screens, genes targeted by set of multiple sgRNAs constitute a mutant pool. Cells that carry sgRNA targeting genes resisting to strong selection pressure would be enriched and the signals (RNA) becomes strong so that they are easily detected, while signals from negative selection are weak because of depletion during screen. So, a lot of genes including those promoting cell growths and housekeeping genes are negatively selected in screen. Detection of the genes targeted by sgRNAs and targeting sites could be resolved by high-throughput sequencing using libraries.
Therefore, ACP events, AS events and sgRNAs are all detectable by sequencing the transcriptome using RNA-seq where the short sequencing reads derived from different structural elements of genes can be mapped to corresponding regions and annotated using a given reference genome. There are many methods for differential expression analysis of RNA-seq data at the gene level, most of which use similar information across gene expression profiles to estimate dispersion [22][23][24] . For example, popular methods such as edgeR [23][24][25] and DESeq2 22 model estimate of dispersion for each gene by using a common estimate across all genes with similar expression strength as a weighted conditional likelihood. The first version of DESeq 26 determines dispersion estimates by modeling the dependence of the dispersion on the average expression value over all samples. BBSeq 27 models the dispersion on the mean with absolute deviation of dispersion estimates by excluding the influence of outliers. DSS 28 is a Bayesian approach to estimate the expression dispersion of genes that accounts for the heterogeneity of dispersion values for different genes. As an improvement to DESeq, DESeq2 uses shrinkage estimation for dispersions and fold changes to improve the stability and interpretability of estimates 22 . On the other hand, ShrinkBayes 29 and baySeq 30 estimate priors using a Bayesian model over all genes, and then provide posterior probabilities for differential expression. While these methods are suitable for determining gene-level expression profiles, they have severe limitations when used for differential splicing, differential polyadenylation, and differential sgRNA target analysis because RNA-seq count data for ACP, AS and CRISPR screens do not provide gene-level dispersion and their dispersion estimates are not accurate at sub-gene level. For this reason, a variety of methods exclusively for detecting differential splicing events have been developed. These methods fall under count-based methods and isoform resolution methods. Count-based methods include DEXSeq 31 , DSGseq 32 , SplicingCompass 33 , rMATS 34,35 , rDiff-parametric, and SeqGSEA 36 , while Cufflink 37 and DiffSplice 38 are examples of isoform resolution methods. For identifying differential CRISPR knockout screen, model-based analysis of gene-wide CRISPR/Cas9 knockout (MAGeCK) 18 has recently been developed. MAGeCK performs at either sgRNA target level or gene level. The other algorithms such as RNAi gene enrichment ranking (RIGER) 39 , redundant siRNA activity (RSA) 40 , and permutation-based non-parametric analysis (PBNP) 17 can be used to identify differential hits at gene level.
In summary, the current methods use a common dispersion estimator from the gene expression profile to model the dispersion estimate of each gene in RNA-seq data, which is not an ideal strategy because the range of gene expression is large across the genome and the gene expression profiles widely vary among different type of transcriptome measurement experiments. Hence the use of a common dispersion estimator strongly impacts the estimates of individual dispersions of genes, which affects the stability and interpretability of estimates. In addition, most of the current methods do not really consider the effects of small samples on differential analysis and we found from our heatmap analysis that many differentially expressed genes or isoforms detected by these statistical methods have larger null variation than biological variation. To address these issues, we here develop a novel statistical framework to do differential analysis of multiple types of RNA-seq experimental data. This new method named as negative beta binomial t-test (NBBt-test) expands the prior approaches of Beggerly et al. 41 and Tan et al 42 . NBBt-test is a versatile method for identifying differential splicing events, differential adenylation events, differential CRISPR knockout screens and differentially expressed genes. This methodology incorporates t-test, fold-change and F-test so that it works well for both small and large sample sized datasets. We compared NBBt-test with the existing statistical methods using both experimental and large-scale simulated RNA-seq and CRISPR FACS datasets and found that our methodology is very robust with very low type I error rate or FDR, high power, high efficiency. An R-package implementing NBBt-test is available for downloading from CRAN (https:// CRAN.R-proje ct. org/ packa ge= NBBtt est).

Results
Inflation-shrinkage variable. To address variance inflation and shrinkage issues raised in small-sample experiments, we here offer inflation-shrinkage variable (ρ) to express statistical effects of small samples. ρ gi is defined as geometric mean of ϕ gi and ζ gi 42 : To avoid infinity due to zero count, we modify ϕ gi as (1) ρ gi = ζ gi ϕ gi . www.nature.com/scientificreports/ X gAi = x gAi1 , ..., x gAin A and X gBi = x gBi1 , ..., x gBin B . We also modify ζ gi as ϕ gi is used to measure overlap between datasets A and B for sgRNA i targeting gene g or isoform i resulted from differentially splicing or adenylating within gene g and ζ gi is used to measure homogeneity or dispersion of data within conditions where X gi = 1 2 X gAi , X gAi = 1 m A S gAij and S gAi = m A j=1 x gAij where x gAij is the count of sgRNA i within gene g in sample j in condition A, S gAi is the sum of count of sgRNA i in condition A, X gAi and X gBi are, respectively, means of sgRNA i within gene g in conditions A and B. For the count data of RNA reads, x gAij ≥ 0 and x gBij ≥ 0 . If X gAi = X gBi = 0 , then X gi = 0 and ζ gi = 0 . Both ϕ gi and ζ gi are defined in [ 0, ∞ ]. For a given two-condition experiment, Eq. (2) indicates that if X gAi = {x gAi1 , . . . , x gAin A } and X gBi = {x gBi1 , . . . , x gBin B } do not overlap, then ϕ gi > 1 , otherwise, ϕ gi < 1 . ζ gi < 1 defines large within-condition variation or noise while ζ gi > 1 implicates that the noises are small, and observations are relatively consistent across replicates within conditions. ϕ gi and ζ gi are independent variables because ϕ gi depends on maximum and minimum values of two datasets, while ζ gi is determined by means and variances of these two datasets, maximum and minimum values are independent of means and variances. Hence, a measure for both data gap between two conditions and small null variation can be measured by ρ 2 gi = ϕ gi ζ gi , that is, if ϕ gi > 1 and ζ gi > 1 , then ρ 2 gi > 1 . For example, X A = {4764, 4602, 4538} and X B = {7877, 7524, 7871} have gap and good homogeneity (small dispersion within conditions). Our calculation shows ϕ = 1.579 > 1 and ζ = 1.691 , and ρ = 1.633 , indicating that data are relatively consistent across all replicates within conditions, which is well agreeable with the observations. Another example is X A = {1390, 1482, 1561} and X B = {1540, 1270, 1217}. One can see that these two datasets overlap but also have poor homogeneity (large noises). Our calculation shows ϕ = 0.902 < 1 , ζ = 0.194 << 1 and hence ρ = 0.418 < 1 , which is again agreeable with the observations. Equation (2) shows that ϕ gi is similar to fold change and from Eq. (3), ζ gi is similar to F-statistic.
Model and estimation of parameters. Different from DESeq2 22 , NBBt-test begins with a count submatrix with n g isoforms of gene g for rows and m k replicates in condition k for columns. The matrix entries indicate the number of sequencing reads that have been unambiguously mapped to a gene. For the sake of convenience, we begin with CRISPR read count data. Suppose we select G genes of interest. Gene g (g = 1, …, G) has n g sgRNAs to hit a DNA sequence in a screen experiment. Let x gij be a normalized or adjusted count (e.g., RPKM (Reads Per Kilobase Million) or FPKM (Fragments Per Kilobase Million) or TPM (Transcripts Per Kilobase Million)) of RNA reads within gene g targeted by sgRNA i (i = 1, …, n g ) in experiment (biological replicate experiment) j (j = 1, …, m k ) in condition k. For differential analysis, we here just consider two conditions such as treatment and control groups, so k = 1, 2. For the convenience, our NBBt-test is restricted to sgRNAs targeting a gene, in other words, count data of multiple RNA reads within a gene are used as a sub-matrix with rows for RNA within gene g targeted by a sgRNA and columns for samples or replicates. NBBt-test first works with a submatrix and iterates G sub-matrices. Similar to DESeq2 22 , NBBt-test also assumes that the count of RNA reads follows a negative binomial distribution with a specific number r of failures to RNA sequencing reads in a RNA species and probability p of this RNA species (also called RNA isoform) to be sequenced. For the count data of CRIPSR knockout screen, p is estimated by proportion of sequencing RNA sequences from gene g targeted by a sgRNA. Different from DESeq2, we are interested in p instead of r. We here use p gij = x gij /X g to initially estimate the proportion where X g is the largest total count over all n g sgRNAs among m replicate experiments. Using X g instead of X gj to calculate p gij is because a set of sgRNAs are already designed before the experiment and hence difference among replicate experiments results from technical noise instead of biological system error. Using X gj to calculate p gij would increase proportion of technical noise due to the fact that X gj contains noise among m replicates. Our simulation also shows that p gij = x gij X g is better than p gij = x gij X gj (results not shown). In the negative binomial distribution,p gij follows a beta distribution 43 : p ∼ Beta (α, β) . Mean and variance of the proportion for a sgRNA targeting a gene are given by parameters α and β 41 . To consider the case that RNA experimental sample sizes are limited, we use weights to optimally estimate parameters across replicates 41,42 (see Supplementary Statistical methods for detail). Due to m j=1 w gj = 1 , sum of weighted means over all replicates in a condition is expectation of the mean. With the weights, the proportion of sgRNA i targeting gene g in a condition is estimated by p gi = m j=1 w gjpgij and the variance is also unbiasedly estimated using weights 41,42 (see Supplementary Statistical methods for detail). For poly(A) RNA-seq or splicing/exon RNA-seq, since RNA in an experiment is abstracted from a library, different libraries would have different total RNA amounts, that is, RNA amount is fixed by library size. To remove difference due to library sizes, p gij is defined as p gij = x gij X gj where X gj = n g i=1 x gij . Since we have weights for parameters ( α, β , p, and V), we use an iteration algorithm to optimally estimate these parameters driven by estimating weights. However, this iteration algorithm is not sensitive to count size, in all RNA-seq data many RNA isoforms or sgRNAs have small read counts. Small counts would result in high similarity of proportions among few replicates, which leads variances to be much smaller than differences between means so that the t-statistics are inflated. To avoid occurrence of this phenomenon, we propose another alternative estimate of p variance: (2) ϕ gi = max min X gAi + 1 max X gBi + 1 , min X gBi + 1 max X gAi + 1 where X gi = m j=1 x gij and X g = n g i=1 X gi . The variance for proportion (p) of sgRNA i targeting gene g or RNA isoform i within gene g would be given by choosing a bigger one from these two variances estimated by iteration algorithm (see Supplementary Statistical methods for detail) and by Eq. (4). Equation (4) shows that the lower bound of V gi is 1 T-test for differential expression of RNA isoforms or sgRNAs. With p Agi , p Bgi , V Agi , V Bgi , and ρ given in conditions A and B, a new t-statistic is defined as t α gi = ρ g ω α t gi for differential screens of the ith sgRNA targeting gene g or t α gi = ρ gi ω α t gi for differential expression of the ith RNA isoform of gene g where t gi is t-statistic of Baggerly et al. (see Supplementary Statistical methods for details), ρ g is gene-wise inflation-shrinkage variable at gene level and ρ gi is isoform-wise inflation-shrinkage variable at RNA isoform level. ω α is a null ρ given by simulation under significance level of α (see Supplementary statistical methods for details). Different from variance and fold change shrinkages of DESeq2, ρ/ω α has inflation or shrinkage function. ω α is due to false positive findings under significance level of α by multiple null simulations and hence is used as a threshold for an observed ρ . This means that t α gi would be inflated ( t α gi > t gi ) when ρ > ω α , t α gi would be shrunken ( t α gi < t gi ) when ρ < ω α and t α gi = t gi when ρ = ω α . In practice, when two datasets have a gap and are consistent across replicates, we have ρ > ω α , however, if an isoform or a sgRNA has outlier and overlapped datasets or big noises, then ρ < ω α . Only in the small-sample experiments, the first case suggests that in probability these two datasets more possibly come from different distributions than from a distribution while in the second case, the two datasets are more likely sampled from a distribution. Hence, in most cases, true differences between conditions would be inflated so that true positives would be found and the differences between two conditions due to noises would be strongly shrunken to be very small so that false positive findings would be excluded.
Although CRISPR sgRNAs and RNA isoforms occur at sub-gene level, a set of CRISPR sgRNAs targeting a set of DNA sequences of a gene was already designed before experiment, the variation of RNA reads of sgRNAs is fixed or due to a fixed effect, while the variation of isoform RNA reads is uncertain, depends on gene structure such as alternative poly(A) sites, alternative splicing sites in UTR, exons, introns, in other words, due to the random effect. This factor has not been considered in the current methods. To use the information on this factor, we introduce gene-wise ρ g to adjust the difference in sgRNA counts targeting a gene between conditions A and B. RNA isoforms are derived from alternative splicing sites or alternative polyadenylation sites, hence, amounts of RNA isoforms are uncertain at sub-gene and gene levels, or depend on response of the gene to conditional effect, that is to say, variation of an RNA isoform is not fixed at either splicing sites or poly(A) sites and at gene level but due to random effect. For this reason, we introduce isoform-wise ρ gi to adjust difference between two conditions. T-test for differential expression of genes. To test for differential expression of genes or differential screen of genes targeted by a set of sgRNAs is the main purpose of NBBt-test. For doing so, we use sum of the read counts over all the designed sgRNAs targeting the DNA sequence of a gene as the read count of this gene. For CRISPR and RNA isoform data, we use the proportion ( p gj ) of read count of gene g in replicate experiment j to the total read count across all genes in a condition as initial estimate of parameter p of a negative binomial distribution: p gj = X gj /X j where X gj = n g i=1 x gij and X j = G g=1 X gj ( j = 1, 2, . . . , m, g = 1, 2, . . . , G ). After normalizing the data, the total count over all genes is the same for all replicates, that is, X 11 = · · · = X 1m 1 = X 21 = · · · = X 2m 2 = X . For normalized RNA-seq data without RNA isoforms, X gj is count of RNA reads for gene g in replicate j. We still assume that p gj follows beta distribution with alpha ( α ) and beta ( β) . Therefore, we use the iteration algorithm with weights (see Supplementary statistical methods) to estimate p g and V g in a condition. With the estimated parameters, a new t-statistic for differential expression or screen of gene g is defined as t α g = ρ g ω α t g where t g is t-statistic of Baggerly et al. (see Supplementary statistical methods for detail),ρ g is gene-wise inflation-shrinkage variable and ω α is a threshold for ρ g . Therefore, at gene level, t α g is inflated with ρ g > ω α or shrunken with ρ g < ω α .

Estimation of omega ( ω α ).
In new t-test definitions and in Eqs. (S27) and (S38) in Supplementary Statistical Methods, ω α is a null ρ-value detected at significance level of α , used as a threshold of ρ gi . The ω α can be estimated by using null data by following the steps: Step 1: Perform simulation using negative binomial distributions with parameters given by inputting dataset to produce a null count dataset consisting of the same sgRNAs or RNA isoforms and the same number of replicates in each condition with the original real dataset.
Step 4: Sort the K ρ values from smallest to largest ( ρ 1 < ρ 2 , · · · , < ρ k ′ <, · · · , ρ K−1 < ρ K ) and choose a ρ value at k ′ K ≥ 0.85 : ρ(α) = ρ k ′ where k′ is in the order sequence of K ρ values. k ′ K ≥ 0.85 suggests that at least 85% false positives selected at significance level of α would be controlled. However, if K < 7, only K′=K has K′/K ≥ 0.85. In this case, we choose mean over K ρ values:ρ(α) = 1 K ( K k=1 ρ k ) . Another case that no null RNA species has p value < α also possibly occurs. In this case, we choose the largest ρ value among all K RNA species: www.nature.com/scientificreports/ Step 5: repeat from step 1 to step 4 for s times and average ρ values over s simulated null datasets: To verify the estimate of ω α , we simulated null expression data of 10,000 genes in two conditions in three cases: each condition has 3, 6 and 15 replicates and calculated ρ and applied the above estimation method to estimate ω α . The result is shown in Fig. 1d. Our simulation result shows that in the case of 3 replicates, ω α is 0.35 larger than ρ , in the case of 6 replicates, ω α is 0.15 larger than ρ , while when the sample size = 15, ω α is only 0.05 larger than ρ . This result is expected because the smaller the sample size, the larger the probability of gap occuring between two samples and the larger the ω α . When sample size is larger than or equal to 15, probability of gap occurring between two samples is zero when effect size is less than or equal to 30 (Fig. 1c).
The number of iterations of the simulation depends on the number of RNA species and α . This is akin to controlling false discovery rate because number of p-values < α is determined by gene number and α . For example, 100 genes would have 5 false positives with 5 ρ values expected by α = 0.05, 1000 genes would have 50 false positives with 50 ρ values expected by α = 0.05 and 10,000 genes would produce 500 false positives with 500 ρ values expected by α = 0.05. In this sense, we expect that we will exclude 85% of 500 false positives detected at level of α in 10,000 genes so ω α would greatly control the false discovery rate.
Statistical effects of small samples. In small-sample experiments, data tend to have either smaller or larger null variances along with means ( Fig. 1a). For example, in a dataset of 10,000 experiments with two conditions randomly sampled from a negative binomial distribution NB(100, 50), 70% of the experiments with 4 replicates (n = 4) per sample had smaller variances than those with 15 replicates (n = 15) per sample and 30% of experiments with 4 replicates per sample had much larger variances than those in two samples with 15 replicates per sample (Fig. 1b). If mean-distances between two samples are the same or approximate in these two cases, then 70% of t-statistics would be inflated due to small standard errors and 30% would be significantly shrunken by larger standard errors. The t-statistic inflation phenomenon has been noticed by Baldi and Long 44 , Tusher et al 45 , Cui and Churchill 46 in high-throughput data. Another statistical effect of small samples is that there is a big chance in high-throughput experiments that a data gap event occurs between experimental conditions. For Comparative benchmarks. We used both simulated and real RNA-seq datasets to test the performance of NBBt-test for differential analysis in comparison to the current popular methods such as edgeR 23 Differential polyadenylation RNA isoforms. We used the count data of polyadenylated (Poly (A)) RNA reads 3,4 from Jurkat T-cells of mouse non-stimulated (NS) and stimulated (48 h) with plate-bound antibodies to perform DESeq2, edgeR, and NBBt-test and used a Venn diagram to compare the findings of these methods (Fig. 2a). The differentially expressed poly(A) isoforms detected by NBBt-test show highly consistent www.nature.com/scientificreports/ expression across all replicates within each condition and distinct differences between the conditions (Fig. 2b,e,g and h), that is, Figs. 2b,e,g and h in heatmaps clearly show that differences in expression of all isoforms detected by NBBt-test between conditions 48 h stimulation and NS are definitely larger than those within conditions, while those detected by DESeq2 only (Fig. 2c), edgeR only (Fig. 2d), and commonly by both DESeq2 and edgeR ( Fig. 2f) were not consistent across replicates within a condition because the differences in expression of many isoforms within conditions are larger than those between 48 h stimulation and NS.

Validation of qPCR experiments.
To validate that NBBt-test indeed performs better than DESeq2 and edgeR in identifying differentially expressed (DE) poly(A) RNA isoforms, we used qPCR results from the previous study 42 to verify the predicted results obtained by applying these three methods to poly(A) RNA isoform data of genes TESK2, BC11B, UBL3, MST123, CD47, and KIAA0465. Gene TESK2 is a housekeeping gene with a single poly(A) site and the difference Ct between cell stimulation and rest statuses was not significant at α < 0.05 . Except for gene BC11B, the poly(A) RNA isoforms and qPCR result of all the other genes had the same expression directions (up-or down-expression) ( Table 1). Gene UBL3 has three poly(A) sites and was detected by qPCR to have differential expression ( ��Ct = 0.54) . DESeq2 did not find DE isoforms at all three poly(A) sites using RNA-seq data, while edgeR and NBBt-test identified DE isoform at the third site (30,338,534 bp), therefore, the differential expression value of gene BC11B was due to differential expression of isoform at this poly(A) site. Gene MST123 also has three poly(A) RNA isoforms and was detected by qPCR to have stronger differential expression ( Ct = 2.42). DESeq2 found DE isoform at the first poly(A) site 38,270,255 bp, NBBttest identified the DE isoforms at the first and second poly(A) sites (38,270,255 bp and 38,270,363 bp), while edgeR detected DE isoforms at all these three sites. At the third poly(A) site, both NBBt-test and DESeq2 did not find DE isoform, hence it did not contribute to differential expression of the gene. Thus, stronger DE value of MST123 is likely attributed to differential expression at the first two poly(A) sites. Gene CD47 has two poly(A) sites. DESeq2 and NBBt-test found that these two isoforms were differentially expressed between cell stimulation and rest statuses, but edgeR did not find them. Therefore, differential expression of CD47 might be attributed to differential expression of these two isoforms. In gene KIAA0465, all the three methods found the first isoform differentially expressed. For the housekeeping gene, TESK2, both edgeR and DESeq2 identified differential expression while the NBBt-test did not find it. If a gene has significant qPCR difference (ΔΔCt) between cell stimulation and rest statuses under α = 0.05 and an RNA isoform within this gene is found to be differentially expressed by two or all these three methods under FDR = 0.05, then this isoform is determined to be very possibly truly differentially expressed, otherwise, DE of this isoform is false. Using this way, true DE isoforms were inferred and listed in the last column of Table 1. Compared findings of a statistical method to true DE isoforms in the last column in Table 1, we found that the ratio of true findings of NBBt-test is 100% (11/11) and the true finding ratios of DESeq2 and edgeR tests are 81.8% (9/11) and 72.7% (8/11), respectively.
Identification of differential splicing events. After mapping Arabidopsis RNA-seq data 50 to TAIR10 genome using STAR aligner 51 , we ran Spladder 11 on the bam files to produce read coverage data of RNA isoforms.  We used NBBt-test to test for differential splicing events at 3′UTR, 5′UTR, skipping exon, intron retention and multiple skipping exons. As an example, NBBt-test found a skipping exon between 2,244,500 and 2,244,750 bp within gene AT3G07090 in the heat shock samples (Fig. 3b) compared to control samples (Fig. 3a), while DEX-Seq did not find it. We performed DEXSeq 31 and NBBt-test on exon count data obtained by applying HTseq 52 to map RNA-seq reads 53 to D.melanogaster genome. As examples, we here just displayed exon expression profiles of genes Ald, Ant2, lmpL3, and bmm using NBBplot: pasilla counts of knockdown and control replicates in DE exon E1 (red box) within gene Ald identified by DEXSeq only are not separated (Fig. 4a). However, in gene lmpL3 (Fig. 4c), DE exon E2 (red box) found by NBBt-test only shows clear separation replicate counts between control and pasilla groups. DE exons E1, E2 and E3 (red boxes) in gene Ant2 (Fig. 4b) and exon E2 (red box) within gene bmm (Fig. 4d) identified by both DEXSeq and NBBt-test show significantly more counts of reads in the three pasilla knockdown replicates than in the four control replicates. The differential expression of exons E1, E2, and E3 within gene Ant2 and of E2 within gene bmm were validated by qPCR 53 . Similarly, Fig. 5 displays big differences between NBBt-test and DEXSeq in the identification of differentially expressed exons within Arabidopsis genes, AT5G26780 and AT1G0914055. Gene AT5G26780 has 20 exons, except for exons E12, E17, E19, and E20, NBBt-test found that the other 16 exons showed differential expression between control and heat shock groups (Fig. 5a), while in DEXSeq findings, only exon E17 had differential expression (Fig. 5b). However, the observed data show that exon E17 had no difference between the control and heat shock groups. Figure 6b shows that gene AT5G26780 had higher read coverage in the control group (186, 205, 207) than in the heat shock group (54,102,57), which are well consistent with read counts of exons shown in Fig. 5a or b. In gene AT1G09140 (Fig. 6a), NBBt-test found that all 13 exons had differential expression (Fig. 5c) and the read coverage data (Fig. 6a) show this result that three heat shock(HS) replicates had many more read counts (3422, 3652, 4274) than three replicates (973, 877, 1325) of control group (CTL) but DEXSeq was able to find only three DE exons (E3, E12 and E13) (Fig. 5d) while NBBt-test found them all (Fig. 5c). Differential splicing events of these two Arabidopsis genes were previously validated by qPCR experiment 54 .
Differential CRISPR screen analysis. For the differential CRISPR knockout screens, we used datasets from Evers et al 55 that have two spike-in datasets RT112 and UMUC3 where there are 93 genes each targeted by 10 sgRNAs and 48 of 93 genes were known to be essential genes. Along with NBBt-test, the other four statistical methods were chosen to perform differential analysis of CRISPR knockout screens. These methods are edgeR 23 18 . DESeq2, edgeR and Baggerly are general methods for differential analysis of RNA-seq data while MAGeCK was specifically developed for testing differential CRISPR knockout screens at either gene or sgRNA levels. For RT112 data, the essential genes are clearly separated from non-essential genes ( Supplementary Fig. S1a), receiver operating characteristic (ROC) curves showed that NBBt-test had the best performance among these five methods, while the edgeR had the poorest performance (Fig. 7a). Figure 8a shows that NBBt-test had the highest F1-score (see Methods section), followed by MAGeCK, suggesting that NBBt-test had the highest precision and the highest recall among these methods. Baggerly and ibb methods reported the lowest scores. However, for UMUC3 data, www.nature.com/scientificreports/ there is no significant difference in the performance among these methods ( Fig. 7b) but NBBt-test still had the highest F1-score across − log10(FDR) = 1 to 2 even though the F1-scores are less than 0.7. MAGeCK and DESeqa2 had the same F1-score (Fig. 8b). This is probably due to low quality of the UMUC3 data where 135 of 486 essential sgRNA targets are false and mixed up with the non-essential sgRNA targets ( Supplementary  Fig. S1b). This is also seen at gene level ( Supplementary Fig. S2): For RT112, all these five methods had a perfect performance ( Supplementary Fig. S2a) but in UMUC3 data, except that MAGeCK had a perfect performance, NBBt-test had the best performance among the other methods ( Supplementary Fig. S2b). We created heatmaps (Fig. 9a,b) to display − log 2 FDR profiles of these five statistical methods where in the essential column, gene had FDR = 0 (red) if it is essential, otherwise, FDR = 1(yellow). The heatmaps show that NBBt-test is the best method to find essential genes based on RT112 (Fig. 9a) and UMUC3 (Fig. 9b) data.

Simulation comparison.
To fully evaluate our NBBt-test, we designed multiple scenarios to simulate poly(A) count data, splicing junction count data and FACS data of CRISPR screens as described in the Methods section. For the count data of poly(A) RNA reads, we designed four scenarios to simulate stimulated and normal statuses each having 3 replicates with 30% technical noise (or outliers). We used the simulated datasets in these four scenarios to compare NBBt-test to DESeq2, edgeR, and the Baggerly methods. The results show that NBBt-test had the best performance (ROC curves), the highest F1-score, the lowest type I error rate and the lowest observed FDR among these four methods (Fig. 10). We also simulated data of two samples each having 5 replicates with 30% technical noise of 13,409 poly(A) sites scattered in 9,294 genes and each having 3 replicates without technical noise in four scenarios (a: 10% of poly(A) sites were positively or negatively responded to 100U T-cell stimulation effect; b: 10% of poly(A) sites were responded to 300U stimulation effect; c: 30% of poly(A) sites were responded to 100U stimulation effect. d: 30% of poly(A) sites were responded to 300U stimulation effect where U is a uniform variable, 0 < U ≤ 1 ). We used the first set of simulated datasets to compare NBBt-test, DESeq2, edgeR and Baggerly. The results are summarized in Supplementary Fig. S3. In these four scenarios with technical noise, NBBt-test had the highest performance among these five methods ( Supplementary Fig. S3), the highest F1-scores, the lowest type I error rates and the lowest observed FDRs in all these four scenarios ( Supplementary Fig. S3). For the second set of simulated data, both NBBt-test.RNA (test for , both DEXSeq and NBBt-test identified exons E1, E2 and E3 differently expressed between control and pasilla knockdown. In gene lmpL3 (c), exon E2 was found by NBBt-test only to have differential expression. In gene bmm (d), exon E2 was found by both DEXSeq and NBBt-test to be differentially expressed. Genes sesB/Ant2 and bmm were validated by RT-PCR 53 . Data from Brooks et al. 53 . Red box represents differential expression exon and green box represents no differential expression exon. www.nature.com/scientificreports/ differential expression of RNA species or gene expressions), NBBt-test.isform (test for differential expression of RNA isoforms) also had the highest performance (ROC), the highest F1-score, the lowest type I error rate and the lowest observed FDR in all four scenarios ( Supplementary Fig. S4) among these methods. We also simulated FACS data of CRISPR knockout screen to compare NBBt-test, edgeR, DESeq2, Baggerly, and MAGeCK in 8 scenarios (see Methods section). Likewise, NBBt-test still had much better performance than the other four methods in all these 8 scenarios ( Supplementary Fig. S5). Except for scenarios g and h, NBBt-test had the highest F1-score across − log10(α ) in the other 6 scenarios (Supplementary Fig. S6) and had type I error rate of 0 from α = 0.0 to α = 0.1 except that NBBt-test had type I error < 0.005 in scenarios a and b ( Supplementary  Fig. S7). MAGeCK, edgeR and DESeq2 also showed type I error rate = 0.0.
In another experiment, we simulated count data of splicing events in two samples using negative binomial generator in R environment based on pasilla exon splicing count data where there are 69,733 exons distributed within 14,206 Drosophila genes. Control sample has 4 biological replicates and knockdown sample has 3 biological replicates. In the all four scenarios, NBBt-test outperformed DEXSeq and had F1-score = 100% and the observed FDR = 0 from FDR = 0 to 0.1 ( Supplementary Fig. S8), while DEXSeq had F1-score = 44% and the observed FDR = 0.65 from FDR = 0 to 0.1 (Supplementary Fig. S8).

Discussion
NBBt-test is designed to identify differential isoforms or CRISPR screen sgRNAs by taking advantage of three important features of RNA-seq data. In the first case, we assume that RNA sequencing is a negative binomial event with a probability (p) that follows beta distribution with parameters, α and β, which is generally accepted in RNA sequencing 23,24 . We employed a modified algorithm of Baggerly et al. 41 to estimate parameters, α, β and p. The second feature is that RNA-Seq experiments are mostly conducted in very small samples (mostly less than 8 biological replicates), while small samples easily result in data gaps between two different conditions. In theory, two count datasets that are separated with a gap have higher probability that they come from two different distributions than those that overlap. For this reason, we introduced a gene-specific or isoform-specific variable ρ into the beta t-test to control false discoveries. ρ is used to measure the overlap between two count datasets and data homogeneity. If two count datasets overlap more and/or have bigger within-condition variances, then ρ < 1, Figure 5. NBBplots for differential expression of exons within genes AT5G26780 and AT1G09140 in Arabidopsis. Top part in each panel is expression (count of reads in y-axis) of exons within the gene where three red lines stand for three replicates of control in the Arabidopsis and three blue lines represent three replicates of heat shock. Bottom part is a map of exons and introns. Red box shows differential expression of the exon between heatshock and control samples found. (a) Differential expressions of exons E1-11, E14-E16, E18 within gene AT5G26780 between heat shock and control samples were identified by NBBt-test. (b) Exon E17 within gene AT5G26780 was identified by DEXseq to have differential expression between heat shock and control samples. (c) All 13 exons within gene AT1G09140 were identified to be differentially expressed between heat shock and control samples by NBBt-test. (d) Differential expression of exons E3, E12-E13 within gene AT1G09140 between heat shock and control samples were identified by DEXSeq. Genes AT5G26780 and AT1G09140 were validated to have exons differential expression by experiment. www.nature.com/scientificreports/ otherwise, ρ > 1 for separating and homogeneous data. Thus, ρ shrinks t-values for overlapped count datasets and inflates t-values for clearly separated count datasets with small noises. To consider sample size effect, we set a threshold ω α for ρ . That is, t-statistic is inflated with ρ > ω or shrunken with ρ < ω α . As the result, most of the t-values with ρ < ω α are compressed into small values while those with ρ > ω α are enlarged. Since p-value only depends on t-value given the degree of freedom, p-values with inflated t-values become smaller while those with shrunken t-values become larger, so very few false positives would be found. Threshold ω α is an estimate of null ρ value. ω α is a dependent of sample size and data quality. Our simulation shows that the smaller the sample size is, the larger the ω α is than ρ (Fig. 1d). However, when the sample size reaches 15 replicates, ω α is only slightly greater than ρ . These results are expected because in the case of sample size < 15, the probability that data gap occurs between conditions is inversely proportional to sample size (Fig. 1c), then ρ is smaller than ω α . ρ < ω α leads to a result that the new t-statistics < the old t-statistics, implicating that t-statistics is shrunken and p-value increases. In the case of sample size ≥ 15, however, the data gap is vanished, ρ ∼ = ω α , the new t-test is reduced to the old one. For the third feature, we used a gene-level statistic to furthermore correct bias of t-statistics. We used the sum count over all isoforms as count of a gene. We created a new t-statistic to test for differential expression of genes or isoforms. The new t-test includes three parts: classic t-test, fold change (FC), and F-like test. In Eq. (2), ϕ gi is a strict fold change or nonparametric U-test for two overlapping data. Both are two components of ρ and in Eq. (3), ζ gi is similar to F-statistic. ϕ gi controls false discoveries or false positives by measuring data gaps between conditions while ζ gi controls noise or within-group variations. These are the reasons why NBBttest outperforms the other differential methods. We can predict that even though ρ loses false discovery control in large samples (sample size m > 15), t-test would still have higher statistical power than the other compared methods. Therefore, t-test can find differential isoforms/sgRNAs or genes in either small samples or large samples with low type I error and high power. CRISPR knockout screen count data. Evers et al. 55 compared performances of CRISPR knockout screening, shRNA, and CRISPRi in discriminating hits from non-hits in functional genetic screens and published their  selected from COP9 signalosome (7 genes), proteasome (10 genes), nuclear pore complex (5 genes), ribosomal complex (20 genes) and RNA polymerase complex (4 genes) were considered essential for both cell lines. From the non-essential genes identified by Hart et al. 13,28 . Genes that are expected to have very consistent phenotype  www.nature.com/scientificreports/ are defined as non-essential 55 . Each gene was hit by a set of 10 sgRNAs. These two CRISPR knockout screening count datasets were downloaded from Evers et al. 55 .
Pasilla RNA-seq count data. The mammalian proteins, NOVA1 and NOVA2 (collectively named here as NOVA1/2) are perhaps the best-characterized splicing regulators to date. NOVA1/2 encodes RNA binding proteins with three KH-domains that recognize clusters of YCAY repeats 53 . The gene, pasilla (PS), the Drosophila melanogaster ortholog of mammalian NOVA1 and NOVA2, is well-studied for its splicing regulation. To study the impact of splicing regulators on splicing events, the gene pasilla, a splicing regulator in D. Melanogaster, was knocked down with RNAi. To explore PS-regulated exons, RNA-seq was used to identify splicing events that changed upon depletion of PS knocked down with RNAi. Libraries were prepared from RNA extracted from seven biologically independent D. melanogaster cell samples: three control samples and four PS knockdown samples and deep sequenced on an Illumina Genome Analyzer II (GAII), partly using single-end and partly paired-end patterns at various read lengths 53 . The RNA-seq sequence reads are available for download from the NCBI Gene Expression Omnibus (http:// www. ncbi. nlm. nih. gov/ geo) using accession numbers GSM461176-GSM461181. RNA-seq count data were downloaded from https:// genome. cshlp. org/ conte nt/ 21/2/ 193/ suppl/ DC1.
qPCR data for validating differential poly(A) isoforms. qPCR data for validating differential poly(A) isoforms 42 were obtained by isolating total RNA from resting and stimulated (48 h  www.nature.com/scientificreports/ Simulation study. Simulation of poly(A) RNA count data. Since RNA sequencing was assumed to follow negative binomial distribution (NB), we used the NB pseudorandom generator to create poly(A) RNA isoform count datasets. Simulations were performed in R-environment by using NB(µ + τ,s) and NB(µ , s) where µ and s are respectively mean and dispersion parameter (here is variance) for per poly(A) RNA isoform solved from experimental poly(A) RNA isoform count data from Jurkat T-cell in two conditions (resting and stimulating statues) each having 3 replicated libraries, where 18,290 isoforms scattered in 9572 genes were collected. We set two levels (A = 100 and 300) of treatment effect impacting differential expression of poly(A) isoforms and linearly and randomly assigned effect size τ = UA to 10% and 30% null poly(A) isoforms where U is uniform variable ( U ∈ (0, 1] ), so τ is linearly distributed from > 0 to A. In addition, 30% of the null poly(A) isoforms were assigned with outliers (technical). Equal sample sizes were set as 3 and 5 replicate libraries.
Simulation of exon count data. Similarly, we still assumed that exon RNA sequencing follows negative binomial distribution. In R-environment, we used NB pseudorandom generator to create exon RNA isoform count datasets. The simulation was conducted by using parameters per exon RNA isoform solved from experimental exon RNA isoform count data from D. Melanogaster in two conditions (control and PS knockdown) where control has 4 replicate libraries and PS knockdown has 3 replicate libraries and 69,733 exons were scattered in 16,206 annotated Drosophila genes. We set two levels (A = 100 and 300) of PS knockdown effect on differential splicing events and linearly and randomly assigned the effect size τ = UA to null exon isoforms with 10 and 30% probabilities, where U is uniform variable ( U ∈ (0, 1] ), switch ratio = 0. Thus, we simulated 4 scenarios: Scenario a: knockdown effect size A = 100U, 10% of exons were differentially spliced; Scenario b: knockdown effect size A = 100U, 30% of exons were differentially spliced; Scenario c: Knockdown effect size A = 300U, 10% of exons were differentially spliced; Scenario d: Knockdown effect size A = 300U, 30% of exons were differentially spliced.   15,19,20,57 interfere gene function by targeting Cas9 nuclease programmed by a sgRNA to the coding region of a gene of interest in a genome, and followed by error-prone repair through the cellular non-homologous end-joining pathway, while CRISPR interference (CRISPRi) and CRISPR activation (CRISPRa) screens 58 repress or activate gene transcription by exploiting a catalytically dead Cas9 to recruit transcriptional repressors or activators to their transcription start sites, as directed by sgRNAs. There are two distinct strategies for phenotypic selection: Fluorescence activated cells sorting (FACS)-based screens in which cell populations are separated based on a fluorescent reporter signal that is a function of the phenotype. Another strategy is growth screens in which the pooled screens are conducted to select genes with growth phenotype by comparing cell populations at an early time point with cells grown in the absence or presence of selective pressures, such as drugs or toxins. To evaluate these differential methods for identifying differential hits of sgRNAs, we here simulated data of CRISPR FACS screens using CRISPulator tool 59 in 8 scenarios consisting of combinations of three parameters each with two levels: FACS bins (0.1, 0.25), noises (0.5, 1.0), phenotype proportions (0.2, 0.4). The FACS bin is percent of cells in "low" and "high" population. Noise indicates deviation of Gaussian distribution, for example, noise = 0.5 means σ = 0.5, a normal distribution with deviation of 0.5. Phenotype proportion is fraction of genes with negative and positive phenotypes; for example, phenotype proportion = 0.2 means that 20% of genes show negative and positive phenotype and 80% of genes show null (or neutral) phenotype. In each scenario, we set number of genes = 1000, coverage (number of sgRNAs per gene) = 10 and two conditions each having equal sample size of 4. After simulation, we implemented MAGeCK 18 to map the RNA sequence reads to a reference genome, annotate genes with phenotypes, and create hit count data.
Computational programs and pipeline analysis of RNA-seq data. Figure 11a summarized workflow of NBBt-test. As a differential analysis tool, NBBt-test can perform differential analysis of poly(A) RNA-seq, CRISPR knockout screen RNA-seq, splicing RNA-seq, differential exons, RNA-seq(shRNA-seq, mRNA-seq). Figure 11b displays an overview of the computational procedure of the NBBt-test.  www.nature.com/scientificreports/ Analysis pipeline for poly(A) RNA-seq data. The paired end RNA reads were mapped to the mouse genome (mm9) using the paired-end mapping module of bwa 60 with default alignment parameters. Non-mapping reads were remapped to the UCSC KnownGene reference and then projected back to mm9. Individual reads were condensed to tags based on their 3′ coordinate by sliding 20-nucleotide window using the frequencyweighted median 3′ coordinate as the tag identifier. Tags were then filtered using a progressive filtering strategy assessing adenosine and guanine composition in the five, ten, and fifteen bases followed the tag-mapping site and then assigned to individual transcription units. For each transcription unit, the aggregate tags mapping to the unit were ranked based on frequency. Tags were extracted from the highest frequency to the lowest until the extracted tags excessed 90% of the aggregate frequency for the gene or isoform. The remaining tags were discarded 42 . All reads derived from an annotated transcription unit were considered as an individual entity. After that, counts of reads were obtained for an RNA isoform. This pipeline analysis was implemented by polyapipeline (Github).
Analysis pipeline for CRISPR knockout screen RNA-seq data. Data analysis pipeline for CRISPR screen RNA-seq data was conducted by performing MAGeCK 18 following the third demo: going through a public CRISPR/Cas9 screening dataset as described in the tutorial (https:// sourc eforge. net/p/ mageck/ wiki/ demo/# the-second-demo-start ing-from-raw-fastq-files). Briefly, MAGeCK requires a library file that lists three columns: sgRNA ID, sequence, and gene without comma in a txt file. So before running MAGeCK, user should make a library.txt file following provided library format. In the current version, MAGeCK can automatically determine the trimming length and sgRNA length, so user does not need to consider trimming RNA sequences.
In this pipeline, we used two commands, count and test. The count command produces a count table from MAGeCK containing two columns for sgRNA and gene and the rest of the columns for count data. The test command is used to compare two conditions to test for differential genes targeted by sgRNAs or differential sgRNA-targeted genes.
Analysis pipeline for splicing RNA-seq data. Similar to the general RNA sequence analysis pipeline, for splicing RNA-seq data we performed FastQC 61 to check for sequencing data quality, trimmed adapters and contaminations, filtered or removed overrepresented sequences or low-quality sequences, and corrected for errors to obtain clean sequencing data using the Galaxy server 62 . We used three pipelines to map splicing RNAseq reads to the reference genome as described below.
Star-spladder for splicing events. Firstly, STAR 51 was used to map RNA reads from samples onto the reference genome using gff3 annotation file and generate sorted bam files. In python 2.7 environment, spladder 11 was carried out on these sorted bam files with C = 3 (confidence = 3) and gff3 annotation file and ran build model to align and annotate 3′UTR splicing events, 5′UTR splicing events, (cassette) exon skipping events, intron retention, multiple exons skipping and mutual exclusive exons into splicing graph representation. Spladder outputs merge_graphs_confirmed.gff3, .icgc.txt.gz, .pickle, counts.hdf5 files of each splicing type. More detailed instructions for running Spladder and setting parameters can be found at https:// splad der. readt hedocs. io/ en/ latest/ gener al. html and https:// splad der. readt hedocs. io/ en/ latest/ splad der_ modes. html.

STAR-DEXSeq for differential exons.
Like STAR -spladder, at step 1, STAR was performed to map RNA reads onto the reference genome using gff3 annotation file to generate sorted sam files. At step 2, in python 2.7 environment, dexseq_prepare_annotation.py was run on annotation.gff3 to DEXSeq annotation file (e.g., DEXseq.gff). The next step was to run dexseq_count.py on sam files and DEXseq.gff to output count data files with genes and exons annotated. This step was done by using THseq 52 .
RNA-seq data analysis for differential gene expression. After running star to build start index on a reference genome for mapping RNA-seq data, we started with STAR to run alignReads runModel with setting genes.gtf in hg19, bedGraph, and GeneCounts. STAR outputted gene count data file.
Benchmarking. For poly(A) RNA-seq count data and general RNA-seq count data, we compared the per- www.nature.com/scientificreports/ F1-score. F 1 -score is a measure of a test's accuracy in statistics and widely used in the field of information retrieval for measuring search, document classification, and query classification performance 66 . F1-score is defined as where where tp = true positives, fn = false negatives, and fp = false positives. F1-score is a harmonic mean of the precision and recall. F1-score reflects balance between the precision and the recall of the tests and hence also reaches its best value at 1 (perfect precision and recall) when all true positives are found or fn = 0 and fp = 0. N-score. N-score: Many heatmaps were made by using z-score: a standard normal value: z = x−x σ where x and σ are respectively average expression value and standard deviation of a gene over all replicates. z-score separates data into two groups: positive and negative and are in range of − 4 to 4 or less for expression of all genes, so it can visualize difference in expression of genes in two conditions. However, it just is because all data are separated into positive and negative groups, heatmap can clearly display difference in expression between two conditions even though the difference in expression of some genes between two conditions are not enough distinct. Therefore, z-score heatmap may show us false difference between two conditions. To avoid this false difference visualization, we here gave n-score for heatmap: Let expression value(x) of a gene or an isoform be log2 transformation value, we then use y = 2 × to re-convert into the original expression value (y) and use n ij = y ij max(y) max(y i ) to build n-score where max(y) is maximum among all genes and all replicates of two groups and max(y i ) is maximum among all replicates of two groups in gene i. n has the same order of magnitude for all genes detected in high-throughput experiment but it does not have positive or negative value. Therefore, unlike z-score, n-score heatmap cannot visualize distinct difference in expression of genes between two conditions if the difference between conditions is not at an order level.
NBBt-test package. NBBt-test can be implemented by R package, NBBttest. Package NBBt-test was recruited by CRAN and can be installed into R console (Mac) or RGui (PC) or Rstudio. In addition, NBBttest can also be downloaded from github (https:// github. com/ Yuande/ NBBtt est) where README.md provides a detail instruction for user to run NBBt-test. Using NBBt-test one can check quality of count data, do multiple differential analyses of alternative splicing, alternative polyadenylation, CRISPR knockout screening, or gene expression, visualize the results and use NBBplot to display differential expression of exons within a specified DE gene identified by NBBt-test between two conditions across all replicates. NBBt-test package has two heatmap functions, myheatmap and myheatmap2. Function myheatmap uses row z-score across the sample individuals to show differential expression of genes or isoforms selected by NBBt-test between two conditions in one or multiple datasets. Function myheatmap2 uses n-score to show differential expression of genes or isoforms selected by NBBt-test between two conditions in one or multiple datasets.

Data availability
Source data used in this study are publicly available from corresponding original publications as follows: Pasilla RNA-seq count data from Brooks et al. 53 with Gene Expression Omnibus (GEO) accession numbers, GSM461176-GSM461181; CRISPR knockout screen data was downloaded from Evers et al. 55 with Short Read Archive (SRA) accession code, SRP072947; Poly(A) RNA-seq count data and qPCR data from Tan et al. 42 ; and the Arabidopsis heat shock RNA-seq data from Gulledge et al. 50  www.nature.com/scientificreports/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.