Establishing statistical validity for study findings goes beyond a consideration of P values alone (R. Nuzzo Nature 506, 150–152; 2014). In the era of big data, we now have many biological measures available for assessing how likely findings are to be true positives.

This more-comprehensive approach has long been used by epidemiologists to address concerns about bias and causality, for example in investigations of possible components of hypothetical disease-causing pathways (L. H. Kuller et al. Am. J. Epidemiol. 178, 1350–1354; 2013). One way of inferring a causal association is to apply Hill's criteria, which weigh several features of the link between exposure and disease, such as dose response and temporality (A. B. Hill Proc. R. Soc. Med. 58, 295–300; 1965).

Advances in genomics and systems biology enhance our capacity for such investigations. We can now determine whether findings operate in a specific genotype context or fit biologically plausible pathways or networks — as was done in a re-evaluation of results from a genome-wide association study for multiple sclerosis (International Multiple Sclerosis Genetics Consortium Am. J. Hum. Genet. 92, 854–865; 2013).