It is both encouraging and disheartening to hear that major science publishers intend to roll out the CrossCheck plagiarism-screening service across their journals (see page 167).

What is encouraging is that many publishers are not only tackling plagiarism in a systematic way, but have agreed to do so by sharing the full text of their articles in a common database. That agreement was not a given, considering the conservatism of some companies, yet it was a necessary step for the service to function: the iThenticate software used by CrossCheck works by comparing submitted articles against a database of existing articles. CrossCheck's 83 members have already made available the full text of more than 25 million articles.
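The principle behind such screening can be illustrated with a toy sketch. The word n-gram Jaccard measure and the flagging threshold below are illustrative assumptions only; iThenticate's actual matching algorithm is proprietary and far more sophisticated.

```python
# A minimal sketch of full-text similarity screening, assuming a simple
# word-trigram overlap measure (NOT how iThenticate actually works).

def ngrams(text, n=3):
    """Return the set of word n-grams in a lower-cased text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(submitted, published, n=3):
    """Jaccard overlap between the n-gram sets of two documents."""
    a, b = ngrams(submitted, n), ngrams(published, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def screen(submission, database, threshold=0.25):
    """Return IDs of database articles whose overlap with the
    submission exceeds the (hypothetical) flagging threshold."""
    return [doc_id for doc_id, text in database.items()
            if similarity(submission, text) >= threshold]
```

Any article flagged this way would still need human review, since high overlap alone says nothing about intent or severity.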

What is disheartening is that plagiarism seems pervasive enough to make such precautions necessary. In one notable pilot of the system on three journals, the publisher had to reject 6%, 10% and 23% of accepted papers, respectively.

Granted, there are reasons to believe that such levels of plagiarism are exceptional. Previous studies of samples on the physics arXiv preprint server (see Nature 444, 524–525; 2006) and of PubMed abstracts (see Nature doi:10.1038/news.2008.520; 2008) found much lower rates. But the reality is that data are sorely lacking on the true extent of plagiarism, whether its prevalence is growing substantially and what differences might exist between disciplines. The hope is that the roll-out of CrossCheck will eventually yield reliable data on such questions over wide swathes of the literature, while also acting as a powerful deterrent to would-be plagiarists.

In the process, editors and publishers must remember that plagiarism comes in many varieties and degrees of severity, and that responses should be proportionate. For example, past studies suggest that self-plagiarism, in which a researcher copies his or her own words from a published paper, is far more common than plagiarism of the work of others. Arguably, self-plagiarism can sometimes be justified, as when a researcher is bringing similar ideas before readers of journals in a different field. Plagiarism of any kind can also involve honest errors or mitigating circumstances, such as a scientist with a poor command of English paraphrasing some sentences of the introduction from similar work.

Such examples underscore that plagiarism-detection software is an aid to, not a substitute for, human judgement. One rule of thumb used by Nature journals and others in considering an article's degree of similarity to past articles — in particular, for small amounts of self-plagiarism in review articles — is whether the paper is otherwise of sufficient originality and interest.

Nature Publishing Group is a member of CrossCheck and has been testing the service on submissions to its own journals. It has noted only trace levels of plagiarism in research articles, which are spot-checked, and often only in the supplementary methods. Plagiarism has been more common in submitted reviews, all of which are tested. This is particularly true of clinical reviews, although even there the rates remain far below the 1% mark, and in most instances involved some level of self-plagiarism.

Although the ability to detect plagiarism is a welcome advance, addressing the problem at its source remains the key issue. More and more learned societies, research institutions and journals have in recent years adopted comprehensive ethical guidelines on plagiarism, many of which carefully distinguish between different levels of severity. It is crucial that research organizations in all countries, and particularly the mentors of young researchers, instil in their scientists the accepted norms of the international scientific community when it comes to plagiarism and publication ethics.