The invisibility of negative results is a problem underlined by the media storm over the publication, in a reputable psychology journal, of a paper supporting extrasensory perception (see The New York Times, 5 January 2011). Although individual reports might be statistically valid in isolation, their conclusions could still be questionable: other tests of the same hypothesis must also be taken into account.
Say a study finds no statistically significant evidence for a hypothesis at the predetermined significance level (P = 0.05, for example) and, like most studies with negative results, it is never published. If 19 other similar studies are conducted, then 20 independent tests at the 0.05 significance level are expected to yield, on average, one positive result by chance alone even if the hypothesis is false; the probability of at least one such hit is 1 - 0.95^20, or about 64%. A positive result obtained in one of those 19 studies, viewed in isolation, would then appear statistically valid and so support the hypothesis, and would probably be published.
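The arithmetic behind this scenario can be checked with a short sketch. It assumes that all 20 studies are independent and that the hypothesis is in fact false, so that every positive result is a chance hit; the function names are illustrative only.

```python
def expected_false_positives(n_studies: int, alpha: float) -> float:
    """Mean number of chance 'hits' when every study tests a false hypothesis."""
    return n_studies * alpha


def prob_at_least_one_false_positive(n_studies: int, alpha: float) -> float:
    """Probability that one or more of n independent studies is a chance hit."""
    # Each study avoids a false positive with probability (1 - alpha);
    # independence lets us multiply those probabilities together.
    return 1.0 - (1.0 - alpha) ** n_studies


if __name__ == "__main__":
    n, alpha = 20, 0.05  # the figures used in the scenario above
    print("Expected chance hits:", expected_false_positives(n, alpha))
    print("P(at least one chance hit):",
          round(prob_at_least_one_false_positive(n, alpha), 3))
```

With 20 studies at the 0.05 level, the expected number of chance hits is one and the probability of at least one is roughly 0.64, so a single published positive result is unsurprising even if the effect does not exist.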
Statistical corrections are routinely made for multiple testing within a study, but they are important across studies too. The difficulty lies in determining the number of parallel investigations of the same hypothesis. Perhaps different disciplinary research societies could help bring these covert experiments to light.
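As one illustration of what a correction across studies could look like, the sketch below applies a Bonferroni adjustment. This is only one of several possible corrections and is an assumption here, not a method prescribed above; it also presumes that the number of parallel studies is known, which is exactly the difficulty noted. The function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_studies: int) -> float:
    """Per-study significance level that keeps the overall error rate near alpha."""
    if n_studies < 1:
        raise ValueError("n_studies must be at least 1")
    return alpha / n_studies


def familywise_error_rate(per_study_alpha: float, n_studies: int) -> float:
    """Chance of at least one false positive across n independent studies."""
    return 1.0 - (1.0 - per_study_alpha) ** n_studies


if __name__ == "__main__":
    n, alpha = 20, 0.05  # assumed: 20 parallel studies of one hypothesis
    threshold = bonferroni_threshold(alpha, n)
    print("Per-study threshold:", threshold)
    print("Overall error rate at that threshold:",
          round(familywise_error_rate(threshold, n), 4))
```

Under these assumptions each of the 20 studies would need to reach P < 0.0025 for the overall chance of a spurious positive to stay just below 0.05.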