Science is a way of distinguishing ideas we know to be false from those that remain tenable. Large challenges lie ahead as we apply the scientific method to understanding biochemical systems, cellular organization and the functions of complex organs such as the brain.

“It seems to me that the method of most rapid progress in such complex areas . . . is going to be to set down explicitly at each step just what the question is and what all the alternatives are, and then to set up crucial experiments to try to disprove some,” wrote John R. Platt in his essay “Strong Inference” (Science 146, 347–353, 1964). In it, Platt attributed the rapid successes of the early molecular biologists to their choice of the simplest problems, their logical rigor and their habit of systematically pairing simple, rapidly disprovable hypotheses with decisive experiments. More than 46 years after this call to scientific clarity, these problems are, in many ways, still there for the solving.

The intervening years have seen substantial growth in the research community and its funding, and a massive increase in computing power, but none of this would have convinced Platt that we are better equipped to generate, clearly state and experimentally dispose of competing hypotheses. Quite the contrary: he regarded quantitative measurement and calculation as secondary to the scientific method, warning that “we substitute correlations for causal studies and physical equations for organic reasoning.” And ultimately, as he put it, “any conclusion that is not an exclusion is insecure and must be rechecked.”

This emphasis on logical exclusion distinguishes biological advances from resource and method projects. But the rigorous approach extends readily to the complex projects we are considering: the complexity of a research project does not change the basic requirement for inference, so long as the results are intended to be understood by human brains. A model or predictor aids secure inference when it is treated as a falsifiable hypothesis with falsifiable sub-hypotheses. We would therefore expect authors to publish the conditions under which the model or predictor is not valid, tests demonstrating those conditions, and hypotheses drawn from the model or predictor together with the tests designed to disprove them.

There are several benefits to separating the logical gems that authors are prepared to have tested by others from their setting, the consistent observations and rhetoric that are not directly part of the scientific work of the paper. These benefits are: allowing peer referees to do their job and readers to understand the work; making clear the caveats and the limits on applying the results in other fields; and curbing the proliferation of useless observational studies while reducing duplication and wasted effort.

It may also become possible to trace the direct influence of the research independently of the publications that describe it. To do this, each of the two components, hypotheses and experiments, would need to be coded with unique identifiers and cited separately. Such an extreme cultural change may not be needed if publications are carefully structured, however. It should come as no surprise that a study providing strong inferences will be both well used and highly cited.