Wisdom of crowds for robust gene network inference

Reconstructing gene regulatory networks from high-throughput data is a long-standing challenge. Through the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we performed a comprehensive blind assessment of over 30 network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae and in silico microarray data. We characterize the performance, data requirements and inherent biases of different inference approaches, and we provide guidelines for algorithm application and development. We observed that no single inference method performs optimally across all data sets. In contrast, integration of predictions from multiple inference methods shows robust and high performance across diverse data sets. We thereby constructed high-confidence networks for E. coli and S. aureus, each comprising 1,700 transcriptional interactions at a precision of 50%. We experimentally tested 53 previously unobserved regulatory interactions in E. coli, of which 23 (43%) were supported. Our results establish community-based methods as a powerful and robust tool for the inference of transcriptional gene regulatory networks.
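As a rough illustration of what integrating predictions from multiple methods can look like, the sketch below combines ranked edge lists by average rank and then measures precision against a gold standard. The abstract does not specify the integration rule, so rank averaging is an assumption here, as are the function names `community_ranking` and `precision_at_k`, the rule for edges a method did not report, and the toy gene names.

```python
from collections import defaultdict


def community_ranking(predictions):
    """Combine ranked edge lists from several methods by average rank.

    predictions: list of lists; each inner list holds (regulator, target)
    edges ordered from most to least confident by one method.
    Assumption: an edge a method did not report gets rank len(list) + 1.
    """
    all_edges = set()
    for ranked in predictions:
        all_edges.update(ranked)

    totals = defaultdict(float)
    for ranked in predictions:
        position = {edge: i + 1 for i, edge in enumerate(ranked)}
        missing_rank = len(ranked) + 1
        for edge in all_edges:
            totals[edge] += position.get(edge, missing_rank)

    n_methods = len(predictions)
    # Lower mean rank = higher confidence; ties are broken by edge name.
    return sorted(all_edges, key=lambda e: (totals[e] / n_methods, e))


def precision_at_k(ranked_edges, gold_standard, k):
    """Fraction of the top-k predicted edges that appear in the gold standard."""
    top = ranked_edges[:k]
    if not top:
        return 0.0
    return sum(1 for edge in top if edge in gold_standard) / len(top)


# Toy input: three hypothetical methods ranking a handful of edges.
method_a = [("tfA", "g1"), ("tfA", "g2"), ("tfB", "g3")]
method_b = [("tfA", "g1"), ("tfB", "g3"), ("tfC", "g4")]
method_c = [("tfB", "g3"), ("tfA", "g1"), ("tfA", "g2")]

community = community_ranking([method_a, method_b, method_c])
gold = {("tfA", "g1"), ("tfB", "g3")}

print(community)
print(precision_at_k(community, gold, 2))
print(precision_at_k(community, gold, 4))
```

In this toy case the two edges that all three methods agree on rise to the top of the combined list, which is the intuition behind cutting a community network at a chosen precision such as 50%.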

Figure 1: The DREAM5 network inference challenge.
Figure 2: Evaluation of network inference methods.
Figure 3: Analysis of community networks compared to individual inference methods.
Figure 4: E. coli and S. aureus community networks.


We thank all challenge participants for their invaluable contribution; R. Norel and J. Saez Rodriguez, who participated in different aspects of the organization and scoring of DREAM5; and P. Carr, M. Reich, J. Mesirov and the rest of the GenePattern team for providing software and support. This work was funded by the US National Institutes of Health (NIH) National Centers for Biomedical Computing Roadmap Initiative (U54CA121852), Howard Hughes Medical Institute, NIH Director's Pioneer Award DP1 OD003644 and a fellowship from the Swiss National Science Foundation to D.M. Challenge participants acknowledge: grants ANR-07-BLAN-0311-03 and ANR-09-BLAN-0051-04 from the French National Research Agency (A.-C.H., P.V.-L., F.M., J.-P.V.); the Interuniversity Attraction Poles Programme (IAP P6/25 BIOMAGNET), initiated by the Belgian State, Science Policy Office, the French Community of Belgium (ARC Biomod) and the European Network of Excellence PASCAL2 (V.A.H.-T., A.I., L.W., Y.S., P.G.); the European Community's 7th Framework Program, grant no. HEALTH-F4-2007-200767 for the APO-SYS program, and a doctoral fellowship from the Edmond J. Safra Bioinformatics Program at Tel Aviv University (G.K., R.S.); the Irish Research Council for Science Engineering and Technology for financial support under the EMBARK scheme, and the Irish Centre for High-End Computing for provision of computational facilities and technical support (A. Sîrbu, H.J.R., M.C.); the US National Cancer Institute grant U54CA132383 and US National Science Foundation grant HRD-0420407 (Z.O., Y.Z., H.W., M.S.); and the Sardinian Regional Authorities (A.F., A.P., N.S., V.L.). V.A.H.-T. is recipient of a fellowship from the Fonds pour la formation à la Recherche dans l'Industrie et dans l'Agriculture (F.R.I.A., Belgium); Y.S. is a postdoctoral fellow of the Fonds voor Wetenschappelijk Onderzoek - Vlaanderen (FWO, Belgium); P.G. is Research Associate of the Fonds National de la Recherche Scientifique (FNRS, Belgium).

Author information

D.M., J.C.C., D.M.C., R.J.P., M.K., J.J.C. and G.S. conceived the challenge; R.J.P. and G.S. performed team scoring; N.M.V. and K.R.A. performed experimental validation; D.M., J.C.C., R.K., R.J.P. and G.S. performed research; D.M., J.C.C., R.K., N.M.V., R.J.P., K.R.A., M.K., J.J.C. and G.S. analyzed results; D.M., J.C.C., R.K., M.K., J.J.C. and G.S. wrote the paper; and challenge participants performed network inference and provided method descriptions.

Correspondence to Gustavo Stolovitzky.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Notes 1–10 (PDF 19429 kb)

Supplementary Data 1

DREAM5 network inference challenge (expression data, experiment descriptions, gene names, gold standards and evaluation scripts) (ZIP 37067 kb)

Supplementary Data 2

DREAM5 method scores (AUPR, AUROC and overall score) and summaries (XLS 57 kb)

Supplementary Data 3

DREAM5 network predictions of all individual methods and the community (ZIP 77113 kb)

Supplementary Data 4

E. coli, S. aureus and S. cerevisiae community networks (all predictions) (ZIP 5322 kb)

Supplementary Data 5

E. coli and S. aureus community networks at 50% precision cutoff (ZIP 192 kb)

Supplementary Data 6

E. coli and S. aureus network modules (XLS 595 kb)

Supplementary Data 7

E. coli experimental support for tested interactions (XLS 390 kb)

