Publication rates are rising in line with accelerating scientific progress, boosted by buoyant funding and advances in enabling technologies. At the same time, the lag from bench to journal is shrinking, and the pressure to publish mounts as competition and the risk of duplicated research increase. The old adage 'publish or perish' is ever more pertinent, and it is not surprising that sloppiness, plagiarism and even fraud rear their ugly heads. Ethics can fall by the wayside all too easily in today's intense research atmosphere.

None of these issues is new, and it is hard to quantify whether the number of cases uncovered is rising faster than research output itself (Nature 435, 737–738 (2005)). Nevertheless, alarm bells have been ringing in editorial offices, and the Nature journals have been considering for some time what more can be done to ensure that only solid data reaches our presses. By far the most prominent problem is that scientists do not take the time to understand complex data-acquisition tools and are occasionally seduced by the ease with which image-processing programs can manipulate data in a manner that amounts to misrepresentation. The intention is usually not to deceive but to make the story more striking by presenting clear-cut, selected or simplified data, an approach we have dubbed 'data beautification'. The Journal of Cell Biology has looked at the problem systematically and estimates that up to 20% of accepted papers contain some questionable data, a rate that has not decreased since the journal instituted an editorial data-screening process (J. Cell Biol. 166, 11–15 (2004); Nature 434, 952–953 (2005)).

We select our referees with the utmost care and will usually include at least one expert who specializes in the relevant technological platforms. Referees assess data quality, but it would be impractical to expect them to screen for anything but the most overt data manipulation. Even an elaborate peer-review process such as ours does not completely filter out illicitly manipulated data. We are therefore working, in consultation with the community, on methods to improve the screening process, and will report in more detail on the various steps that we are developing.

Firstly, Nature journals have encouraged voluntary declarations of author contributions for the past year (http://www.nature.com/ncb/journal/v7/n3/full/ncb0305-203.html). This adds further accountability, and take-up of this facility is increasing.

Secondly, we will improve our guidelines to authors and referees by outlining more explicitly what data processing is acceptable and what is not. We will also ask authors to be prepared to provide additional raw data for inspection at the request of referees and editors.

Finally, we are investigating the implementation of improved data-screening procedures at the editorial level (possibly partially automated).

Nevertheless, science must not degenerate into a game of 'what can we get away with?' in pursuit of rapid publication. Responsibility and accountability apply at every level, from graduate student to institute director. Corresponding authors carry responsibility for evaluating the primary data and for confirming that the published data is real and properly processed. Editors and referees evaluate the importance and quality of data submitted for publication, but they cannot and should not be expected to view every submission as potentially fraudulent. Editors are not 'data police' and neither are referees. Additional data screening will be useful for filtering out 'beautified' data, but will catch only the most naive cheats. Well-crafted fraud is essentially impossible to detect without assessing the primary data, or indeed being present throughout the experimentation.

It is important to remember that fraud is ultimately doomed, in that data is only accepted as dogma after independent post-publication verification. Valuable time and effort may be wasted, but a fraudster will rarely bask in the glory of a publication for more than a year or two. A significant problem at the heart of the scientific tradition of reproducibility is that publication pressures select strongly against the replication of data. The community needs to find approaches that give academic credit for this essential exercise.

Nature has on occasion commissioned independent tests after publication of extremely surprising claims, such as the infamous 'memory of water' paper of 1988 and, very recently, Hwang's cloned dog 'Snuppy'. In the aftermath of the Hwang case, we have consulted the cloning community on whether independent confirmation should be a requirement for publication (Nature 439, 243 (2006)). However, the time delays involved suggest that this would be unrealistic as a routine requirement. Moreover, unless the validating scientists receive proper academic credit for this work, it will be hard to encourage.

Hwang's two fraudulent papers, which reported the efficient generation of human stem-cell lines from cloned embryos, illustrate that fraud is not confined to junior researchers, and so whistle-blowers should be both encouraged and protected. Several countries have official bodies for this purpose (for example, the US Office of Research Integrity, the DFG ombudsman in Germany, the Danish Committees on Scientific Dishonesty and the UK Panel for Health and Biomedical Research Integrity). A renewed emphasis on research ethics is desirable, and the Hwang case underlines the importance of teaching ethics in graduate school, as a number of universities already recognize. Later this year, the UK government will release a new ethics code drafted by a working group headed by the chief scientific adviser, David King, although it remains to be seen whether this will provide concrete help or merely a general framework (http://www.cst.gov.uk/cst/reports/#11).

In the meantime, the Hwang debacle has catapulted many of these issues to centre stage. Global media coverage focused on every aspect of the peer-review process, and this must be a good thing. Given that the case involves breaches of bioethics, data misrepresentation and image manipulation, it is instructive to review what happened and whether the publication process could have prevented any of these offences.

The Hwang group rose to international stardom with the first of the two retracted papers in March 2004, but by May of that year Nature had raised concerns about the source of the eggs used in the study. Although using the eggs of laboratory members was not against Korean law at the time, it clearly infringed international ethical standards, and Hwang initially denied all knowledge of these donations. As the evidence mounted through a tip-off to the Korean media, Hwang admitted that he knew about the egg donations. Meanwhile, the corresponding author, Gerald Schatten, distanced himself from the collaboration citing 'ethical breaches', but stood by the data. Internationally accepted ethical standards, especially in the area of cloning, would have clarified the situation, and the journal could have insisted on disclosure of the donation records. Schatten remains under pressure over his corresponding authorship of the 2005 Science paper, as his contributions appear to have been indirect at best. An author-contribution disclosure requirement would have clarified this from the beginning.

During an investigation by journalists from Munhwa Broadcasting Corp., an unnamed former collaborator also raised issues about the data, which were strengthened in interviews with other former laboratory members, albeit through unethical journalistic practices. Although no details were released by the press at that time, scientists were alerted and began investigating the data in more detail. Postings on a Korean website serving young scientists led to the discovery that images of embryoid bodies in the 2005 paper were duplicated. Worldwide media attention sparked investigations by Seoul National University and by Schatten's institution, the University of Pittsburgh; the former officially declared the data in both papers unfounded, but found that the cloned dog, Snuppy, was genuine, in line with Nature's own investigation. In this case, routine visual inspection by editors is unlikely to have picked up the image manipulations, although automated screening may have done. An insistence on independent DNA fingerprinting and analysis of mitochondrial DNA sequences (which, as noted, could only be envisaged in exceptional cases) would have been more informative. Clearly, the whistle-blowers in Korea did not feel comfortable coming forward, arguing for a science policy and culture better attuned to criticism across the ranks.

Some, including the former editor of a reputable journal, have pointed fingers at deficiencies in Science's editorial process. Having reviewed the manipulations uncovered, we would argue that a standard peer-review process would have been unlikely to spot them. Images are indeed duplicated within the same panel, but the data is deeply embedded in multi-panel figures and the duplications are offset. The DNA-fingerprinting traces are of relatively poor resolution, but show no immediately obvious signs of duplication or splicing; as David Altshuler noted in Science, a referee would have had to monitor an ongoing experiment for this fraud to be detectable. These two Science papers must have been among the most scrutinized of the past few years, given the momentous technical advances they claimed, yet nobody raised the alarm. Only after suspicions were aroused by the egg-donor misrepresentations, and after whistle-blowing by research associates, did some young Korean scientists notice the image manipulations. An editorial image-screening process might have detected the duplications in this case, especially if enhanced by automated image-comparison software.
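To illustrate what such automated screening might involve, here is a minimal sketch in Python of one common first-pass approach: reducing each figure panel to a compact perceptual fingerprint and flagging near-identical pairs. The filenames and threshold are hypothetical, not drawn from any journal's actual screening system.

```python
# Minimal sketch of fingerprint-based duplicate-panel screening.
# Assumes panels have already been cropped from multi-panel figures.
import numpy as np
from PIL import Image

def average_hash(path, size=16):
    """Downscale to a size x size grayscale thumbnail and threshold at the
    mean intensity, yielding a compact binary fingerprint of the panel."""
    img = Image.open(path).convert("L").resize((size, size))
    pixels = np.asarray(img, dtype=np.float32)
    return pixels > pixels.mean()

def hamming_fraction(h1, h2):
    """Fraction of fingerprint bits on which two panels differ."""
    return np.count_nonzero(h1 != h2) / h1.size

def flag_near_duplicates(paths, threshold=0.05):
    """Return pairs of panels whose fingerprints are nearly identical."""
    hashes = {p: average_hash(p) for p in paths}
    suspects = []
    for i, a in enumerate(paths):
        for b in paths[i + 1:]:
            if hamming_fraction(hashes[a], hashes[b]) < threshold:
                suspects.append((a, b))
    return suspects

if __name__ == "__main__":
    # Hypothetical filenames; a real pipeline would extract panels
    # automatically from every submitted figure.
    print(flag_near_duplicates(["fig1a.png", "fig1b.png", "fig2c.png"]))
```

A fingerprint pass of this kind is cheap enough to run on every submission, but it would only catch wholesale panel reuse; the offset duplications seen in the Hwang figures would require a sliding-window comparison, such as normalized cross-correlation of each panel against every other.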

Independent replication would have been of key importance in this case. Why was it not done? There was a significant amount of trust: a number of world leaders in the field had visited Hwang's laboratories and returned impressed by the apparent quality and scale of the operations, as well as by the willingness of the lead researchers to share protocols. On the other hand, it is emerging that few, if any, international laboratories ever received cell lines (let alone patient samples). It is a condition of publication in most journals that materials be shared, and it remains unclear why infringements of this policy were not reported, although distribution of the cell lines alone would not have uncovered the scam. The perceived dominance of the Hwang laboratory in the field appears to have been a contributing factor.

Rare fraud, and not-so-rare data beautification, deserve the attention of everyone in the biosciences. It is unlikely that fundamental changes to the peer-review system would improve the detection of illicit data manipulation. Scientists and editors need to train their eyes to the problem: be watchful, but not distrustful. We will keep you posted as we roll out improved guidelines and add further levels of editorial assessment to ensure that all data in this journal deserves to be there.

Further reading on Connotea [http://www.connotea.org/user/bpulverer/tag/beautification%20fraud]