A probability-based approach for the analysis of large-scale RNAi screens


We describe a statistical analysis methodology designed to minimize the impact of off-target activities upon large-scale RNA interference (RNAi) screens in mammalian cells. Application of this approach enhances reconfirmation rates and facilitates the experimental validation of new gene activities through the probability-based identification of multiple distinct and active small interfering RNAs (siRNAs) targeting the same gene. We further extend this approach to establish that the optimal redundancy for efficacious RNAi collections is between 4 and 6 siRNAs per gene.
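To illustrate what a probability-based, gene-centered analysis might look like, the following is a minimal sketch, not the implementation used in this work. It assumes that siRNAs have already been ranked by activity and that enrichment of a gene's siRNAs near the top of the ranking is assessed with a hypergeometric test. The function names `hypergeom_sf` and `gene_scores` and the toy ranking are hypothetical and introduced only for illustration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n items are drawn from N, of which K belong to the gene."""
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible configurations contribute nothing to the sum.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def gene_scores(ranked_genes):
    """Score each gene from a ranked list of siRNAs (most active first).

    ranked_genes holds the target gene of each siRNA, in rank order.
    Returns {gene: (smallest p-value, number of siRNAs at that cutoff)}.
    """
    N = len(ranked_genes)
    totals = {}
    for g in ranked_genes:
        totals[g] = totals.get(g, 0) + 1

    seen = {}
    best = {}
    for rank, g in enumerate(ranked_genes, start=1):
        seen[g] = seen.get(g, 0) + 1
        # Probability of seeing at least this many of the gene's siRNAs
        # within the top `rank` positions by chance alone.
        p = hypergeom_sf(seen[g], N, totals[g], rank)
        if g not in best or p < best[g][0]:
            best[g] = (p, seen[g])
    return best


if __name__ == "__main__":
    # Toy ranking (assumed data): all three siRNAs for gene A are most active.
    ranking = ["A", "A", "A", "B", "C", "B", "C", "B", "C"]
    for gene, (p, n_active) in sorted(gene_scores(ranking).items(), key=lambda x: x[1][0]):
        print(gene, n_active, p)
```

In this sketch, a gene whose siRNAs cluster at the top of the ranking receives a small p-value supported by several distinct siRNAs, whereas a gene with a single highly ranked siRNA (a pattern more consistent with an off-target effect) scores less favorably.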

Figure 1: Analysis of genome-wide siRNA data.
Figure 2: Gene-centered analysis of large-scale RNAi data.


We thank L. Miraglia for helpful discussions and oversight of screens, J. Zhang for excellent technical assistance, S. Batalov (Genomics Institute of the Novartis Research Foundation) and P. Aza-Blanc (Burnham Institute) for the identification of negative control siRNA sequences, E. Lader (Qiagen) for facilitating collaboration, D. Elleder (Salk Institute) for providing the MLV supernatant, N.R. Landau (New York University, School of Medicine) for providing pNL43-luc-r+e, and N. Somia (University of Minnesota) for the gift of pCMVgp. R-language implementation of the RSA algorithm was provided by B. Zhou (Genomics Institute of the Novartis Research Foundation). This work was supported by the Novartis Research Foundation and a grant from the US National Institutes of Health (1 R01 AI072645-01).

