Look out plagiarists — you are being watched. Credit: I. LOGAN/GETTY IMAGES

Major science publishers are gearing up to fight plagiarism. The publishers, including Elsevier and Springer, are set to roll out software across their journals that will scan submitted papers for identical or paraphrased chunks of text that appear in previously published articles. The move follows pilot tests of the software that have confirmed high levels of plagiarism in articles submitted to some journals, according to an informal survey by Nature of nine science publishers. Incredibly, one journal reported rejecting 23% of accepted submissions after checking for plagiarism.

Over the past two years, many publishers (including Nature Publishing Group) have been trialling CrossCheck, a plagiarism checking service launched in June 2008 by CrossRef, a non-profit collaboration of 3,108 commercial and learned society publishers. The power of the service — which uses the iThenticate plagiarism software produced by iParadigms, a company in Oakland, California — is the size of its database of full-text articles, against which other articles can be compared. Publishers subscribing to CrossCheck must agree to share their own databases of manuscripts with it. So far, 83 publishers have joined the database, which has grown to include 25.5 million articles from 48,517 journals and books.

Catching copycats

As publishers have expanded their testing of CrossCheck in the past few months, some have discovered staggering levels of plagiarism, ranging from self-plagiarism and the copying of a few paragraphs to the wholesale copying of other articles. Taylor & Francis has been testing CrossCheck for 6 months on submissions to three of its science journals. In one, 21 of 216 submissions, or almost 10%, had to be rejected because they contained plagiarism; in the second journal, that rate was 6%; and in the third, 13 of 56 articles (23%) were rejected after testing, according to Rachael Lammey, a publishing manager at Taylor & Francis's offices in Abingdon, UK.

The three journals were deliberately selected because they had seen instances of plagiarism in the past, says Lammey. "My suspicion is that when we roll this out to other journals the numbers would be significantly lower." Mary Ann Liebert, a publishing company in New Rochelle, New York, has found that 7% of accepted articles in one of its journals had to be rejected following testing, says Adam Etkin, director of online and Internet services at the company.

CrossRef's product manager for CrossCheck, Kirsty Meddings, based in Oxford, UK, says that publishers are now checking about 8,000 articles a month, but many say that they have few hard statistics on the levels of plagiarism they are finding. Most are delegating CrossCheck testing to journal editors, and have not yet compiled detailed results. "We leave the use of the service to the discretion of the editor-in-chief of the journal, with some choosing to check every submission, but most use it only to check articles they consider suspicious," says Catriona Fennell, director of journal services at Elsevier in Amsterdam. "We are seeing a really wide variety of usage."

Publishers are unsure whether plagiarism is on the increase, whether it is simply being discovered more often, or both. "Not so many years ago, we got one or two alleged cases a year. Now we are getting one or two a month," says Bernard Rous, director of publications at the Association for Computing Machinery in New York, the world's biggest learned society for scientific computing, which is in the early stages of implementing CrossCheck. "There probably is more plagiarism than people have been aware of," adds Lammey.

Casting the net wider

The levels of plagiarism uncovered by CrossCheck have been more than enough to persuade publishers to embrace the software. "As you can see, CrossCheck is having an effect both on the papers we review and those we accept for publication, and with this in mind, we're keen to roll this trial out to our other journals," says Lammey. Most of the publishers interviewed by Nature said they had similar plans.

Using the CrossCheck software brings extra costs and overheads for journals. Publishers seem to find the fees, which start at $0.75 per article checked and decrease with volume, reasonable. The bigger overhead, they say, is the time needed for editors to check papers flagged by the software as suspiciously similar.

Establishing plagiarism requires "expert interpretation" of both articles, says Fennell. The software gives an estimate of the percentage similarity between a submitted article and ones that have already been published, and highlights text they have in common. But similar articles are sometimes false positives, and some incidents of plagiarism are more serious than others.
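
To make the idea of a "percentage similarity" concrete, here is a minimal sketch in Python. It is not how iThenticate works, which the publishers interviewed do not describe; it assumes a simple stand-in technique, counting the overlapping five-word sequences ("shingles") that a submission shares with a published text. The function names and the choice of five words are illustrative assumptions only.

```python
import re


def shingles(text: str, n: int = 5) -> set:
    """Return the set of overlapping n-word sequences found in the text."""
    # Lower-case and keep only letters and digits, so punctuation and
    # capitalisation do not hide a match.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def percent_similarity(submission: str, published: str, n: int = 5) -> float:
    """Percentage of the submission's shingles that also occur in the published text."""
    sub = shingles(submission, n)
    if not sub:
        # Text shorter than n words: nothing to compare.
        return 0.0
    return 100.0 * len(sub & shingles(published, n)) / len(sub)


def shared_passages(submission: str, published: str, n: int = 5) -> list:
    """Word sequences common to both texts, as strings an editor could highlight."""
    common = shingles(submission, n) & shingles(published, n)
    return sorted(" ".join(s) for s in common)


# Hypothetical example texts, invented for illustration.
published = "The samples were run on a two percent agarose gel for one hour"
submission = "The samples were run on a two percent agarose gel and then stained"

score = percent_similarity(submission, published)
passages = shared_passages(submission, published)
```

A score like this only flags overlap. As the example of a shared methods sentence suggests, deciding whether the overlap amounts to misconduct still falls to an editor.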

Self-plagiarism of materials and methods can sometimes be valid, for example, says Fennell. "There are only so many different ways you can describe how to run a gel," she says. "Plagiarism of results or the discussion is a greater concern." Sorting out acceptable practice from misconduct can often take a lot of time, says Lammey.

Overall, publishers say that they are delighted to have a tool to police submissions. "We are using CrossCheck on about a dozen journals, and it has spotted things that we would otherwise have published," says Aldo de Pape, manager of science and business publishing operations at Springer in Rotterdam, the Netherlands. "Some were very blatant unethical cases of plagiarism. It has saved us a lot of embarrassment and trouble."