Quantify and control reproducibility in high-throughput experiments


Ensuring the reproducibility of results from high-throughput experiments is crucial for biomedical research. Here, we propose a set of computational methods, INTRIGUE, to evaluate and control reproducibility in high-throughput settings. Our approaches are built on a new definition of reproducibility that emphasizes directional consistency when experimental units are assessed with signed effect size estimates. The proposed methods are designed to (1) assess the overall reproducibility of multiple studies and (2) evaluate reproducibility at the level of individual experimental units. Using simulations, we demonstrate the proposed methods in detecting unobserved batch effects. We further illustrate their versatility in transcriptome-wide association studies (TWAS): in addition to reproducibility quality control, they are also suited to investigating genuine biological heterogeneity. Finally, we discuss potential extensions of the proposed methods to other vital areas of reproducible research (for example, publication bias and conceptual replications).
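
As a toy illustration of directional consistency, and not the INTRIGUE model itself, the sketch below flags an experimental unit as sign-consistent when all of its signed effect size estimates across studies point in the same direction. The function name `sign_consistent` and the example numbers are hypothetical; the proposed methods rely on a Bayesian hierarchical model (Extended Data Fig. 2) and account for estimation uncertainty, which this naive check ignores.

```python
def sign_consistent(estimates):
    """Return True if all nonzero estimates share the same sign.

    Naive illustration only: it compares point estimates and ignores
    their standard errors.
    """
    signs = {1 if b > 0 else -1 for b in estimates if b != 0}
    return len(signs) <= 1


# Hypothetical signed effect estimates for three units in two studies.
units = {
    "gene_A": [0.8, 1.1],    # same direction in both studies
    "gene_B": [0.5, -0.4],   # opposite directions
    "gene_C": [-1.2, -0.9],  # same direction in both studies
}

flags = {name: sign_consistent(betas) for name, betas in units.items()}
for name, ok in flags.items():
    print(name, "consistent" if ok else "inconsistent")
```

A check of this kind treats a small estimate that happens to flip sign the same way as a large one, which is one reason a model-based assessment is preferable in practice.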


Fig. 1: Accuracy and performance of the proposed methods in simulations.
Fig. 2: Highly reproducible TWAS signals identified from the height GWAS data in the UK Biobank and the GIANT consortium.
Fig. 3: Tissue-consistent and -specific height TWAS signals identified from whole blood and skeletal muscle tissues.

Data availability

All processed data for simulations and real data analysis are available at https://github.com/ArtemisZhao/INTRIGUE/intrigue_paper. GWAS summary statistics for the UK Biobank and the GIANT consortium are available at https://doi.org/10.5281/zenodo.3629742. eQTL data for TWAS analysis are available at https://gtexportal.org/home/datasets.

Code availability

The source code for the software implementation (in R and C/C++), simulation studies and real data processing is provided at https://github.com/ArtemisZhao/INTRIGUE. A Docker image that duplicates the complete computational environment for reproducing the reported results can be freely downloaded from https://hub.docker.com/r/xqwen/intrigue.




This work was supported by National Institutes of Health grant nos. R35GM138121, R01DK108805 and R01DK119380.

Author information




Y.Z., M.G.S. and X.W. conceived the ideas. Y.Z. and X.W. designed the experiments. Y.Z. and X.W. developed methods, implemented software and performed analyses. Y.Z., M.G.S. and X.W. wrote the manuscript.

Corresponding author

Correspondence to Xiaoquan Wen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Lin Tang was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Proportion estimates from batch effect affected high-throughput experiments with no genuine biological signals.

Each simulated dataset consists of 1,000 genes. No gene is differentially expressed between the case (N = 20) and control (N = 20) samples. In each replication dataset, 500 genes are affected by unobserved batch effects of various magnitudes (η/σ). The figure shows the estimates of \((\pi_{\mathrm{IR}}, \pi_{\mathrm{R}})\) from the CEFN and META models for all magnitudes of batch effects examined. The reproducible proportions across all datasets remain close to 0, while the estimates of the irreproducible proportions increase monotonically as the batch effects become stronger.
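
The simulation design described in this caption can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' simulation code (which is available from the repository listed under Code availability). In particular, the caption does not say how the batch effect enters the data; here it is assumed, purely for illustration, that each affected gene receives a shift of +η or −η (sign drawn at random in each dataset) on all case samples. The names `simulate_dataset` and `sign_agreement` are hypothetical.

```python
import random
import statistics


def simulate_dataset(n_genes=1000, n_affected=500, n_case=20, n_ctrl=20,
                     eta=1.0, sigma=1.0, rng=None):
    """Simulate one replicate with no true differential expression.

    Assumption (not from the paper): for each of the first n_affected
    genes, a batch shift of +eta or -eta, with a random sign, is added
    to every case sample. Returns per-gene effect estimates, computed
    as case mean minus control mean.
    """
    rng = rng or random.Random()
    effects = []
    for g in range(n_genes):
        shift = rng.choice([-eta, eta]) if g < n_affected else 0.0
        case = [rng.gauss(shift, sigma) for _ in range(n_case)]
        ctrl = [rng.gauss(0.0, sigma) for _ in range(n_ctrl)]
        effects.append(statistics.fmean(case) - statistics.fmean(ctrl))
    return effects


def sign_agreement(a, b):
    """Fraction of genes whose estimates share a sign in both replicates."""
    return sum((x > 0) == (y > 0) for x, y in zip(a, b)) / len(a)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
rep1 = simulate_dataset(eta=2.0, rng=rng)
rep2 = simulate_dataset(eta=2.0, rng=rng)

mean_abs_affected = statistics.fmean(abs(x) for x in rep1[:500])
mean_abs_unaffected = statistics.fmean(abs(x) for x in rep1[500:])
agree_affected = sign_agreement(rep1[:500], rep2[:500])

print(mean_abs_affected, mean_abs_unaffected, agree_affected)
```

Under this assumed mechanism, the affected genes show large apparent effects within each replicate, yet their directions agree across replicates only about as often as chance would predict. That combination of strong but directionally inconsistent signals is the kind of pattern an irreproducible proportion is meant to capture.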

Extended Data Fig. 2 A directed acyclic graph representation of the proposed Bayesian hierarchical model.

The estimated effects \(\hat{\beta}_{i,j}\) are observed; \(\bar{\beta}_{i}\) and \(\beta_{i,j}\) are latent random variables; \(\omega\) and \(k\) (or \(r\)) are hyperparameters.

Supplementary information

Supplementary Information

Supplementary Table 1 and Notes.

Reporting Summary


About this article


Cite this article

Zhao, Y., Sampson, M.G. & Wen, X. Quantify and control reproducibility in high-throughput experiments. Nat Methods 17, 1207–1213 (2020). https://doi.org/10.1038/s41592-020-00978-4
