Nature | Comment

Peer review: Troubled from the start

Pivotal moments in the history of academic refereeing have occurred at times when the public status of science was being renegotiated, explains Alex Csiszar.

Paul D. Stewart/SPL

William Whewell, peer-review pioneer.

Referees are overworked. The problem of bias is intractable. The referee system has broken down and become an obstacle to scientific progress. Traditional refereeing is an antiquated form that might have been good for science in the past but it's high time to put it out of its misery.

What is this familiar litany? It is a list of grievances aired by scientists a century ago.

If complaining about the faults of referee systems is nothing new, such systems are not as old as historical accounts often claim. Investigators of nature communicated their findings without scientific referees for centuries. Deciding whom and what to trust usually depended on personal knowledge among close-knit groups of researchers. (Many might argue it still does.)

The first referee systems that we would recognize as such were set in place by English scientific societies in the early nineteenth century. But these referees were never intended to play the part of supreme scientific gatekeepers. That notion emerged in around 1900 (see 'Past notes'). It was exactly then that some began to wonder whether referee systems might be fundamentally flawed. In this sense, peer review has always been broken.

Today, with the debate about the future of peer review more fraught than ever, it is crucial to understand the youth of this institution. What's more, its workings and its imagined goals have evolved continually, and its current tensions bear the marks of this. The referee system has become a mishmash of practices, functions and values. But one thing stands out: pivotal moments in the history of peer review have occurred when the public status of science was being renegotiated.

Scientific publicists

In 1831, William Whewell, a Cambridge professor and philosopher of science, proposed a scheme to the Royal Society of London. He suggested that it commission reports on all papers sent for publication in the semi-annual Philosophical Transactions. Written by teams of eminent scholars, these reports might, he argued, be “often more interesting than the memoirs themselves” and thus a great source of publicity for science (ref. 1). Besides, authors would be grateful to know that their papers would be read carefully by at least two or three people. The society was just then launching a new journal to be called the Proceedings of the Royal Society, a cheaper monthly periodical to include abstracts of papers presented at the society. It had pages to fill and seemed the ideal place for these new reports.

At the time, editors of scientific journals made publishing decisions by personal fiat, perhaps in consultation with some trusted helpers. For publications that belonged to a scientific academy or society — such as the Philosophical Transactions — the vote of some committee of eminent persons would determine a manuscript's fate. (The temptation to conflate these practices with modern referee systems has led to the stubborn myth that the origins of the scientific referee can be traced back as far as the seventeenth century.)

Timeline: Past notes

How organized academic review has evolved over 300 years.

Jean-Baptiste Colbert Presenting the Members of the Royal Academy of Science to Louis XIV (oil on canvas), Henri Testelin (1616–95)/Bridgeman Images

1665  Henry Oldenburg, secretary of the Royal Society in London, creates the Philosophical Transactions to simplify his correspondence. He uses no referee system.

1699  France’s Royal Academy of Sciences is given power by Louis XIV (pictured centre, with academy members) to report on and approve books for publication and bypass the royal censors.

1752  After vicious satires of the Philosophical Transactions, the Royal Society establishes a committee to vote on what to publish.

1831  Cambridge professor William Whewell convinces the Royal Society to commission public reports on manuscripts. Might referees increase the visibility of science?

1833  By now the reports have become private and anonymous.

1892  A pamphlet ‘On the Organisation of Science’ published in London by ‘A Free Lance’ kick-starts a movement to standardize the selection and distribution of scientific papers. Might referees be guardians of the literature?

1892  A paper surfaces that was rejected by a Royal Society referee in 1845, outlining the kinetic theory of gases more than a decade before James Clerk Maxwell’s famous paper. Might referee systems be fundamentally flawed?

1968  British physicist John Ziman describes the referee as “the lynchpin about which the whole business of Science is pivoted”. Outside the United Kingdom and North America, many editors and scientists remain largely unconvinced.

1973  External refereeing becomes a requirement for publication in Nature (ref. 10).

1991  An e-mail/FTP server at xxx.lanl.gov for freely sharing unreviewed physics preprints goes live. Later relocated to the web at arXiv.org, it becomes a touchstone for discussions about the end of peer-reviewed journals.

2006  PLoS ONE launches as an open-access journal that eschews ‘importance’ as a factor in peer review.

2007–11  EMBO Journal, the Frontiers series and BMJ Open, among other journals, experiment with open peer review, publishing reviewers’ names or notes alongside papers.

Whewell was not much concerned about preventing shoddy papers from being printed; he was not proposing a new mechanism to inform publishing decisions. Instead, he was one of many people campaigning to increase the public visibility of science and give a unified identity to the scientific enterprise in England. (It was he who, a few years later, coined the word 'scientist' to this end.) This movement had begun in 1830 and is now most remembered for Charles Babbage's Reflections on the Decline of Science in England, a screed about the paucity of state funding for, and public recognition of, science. But its more consequential legacy is the referee system.

Whewell was cribbing from a century-old custom at the French Academy of Sciences in Paris of writing reports that evaluated inventions and discoveries in the service of the king. There, researchers who were elected to the academy were paid by the state as a reward for scientific eminence, and politicians seemed to value their opinions. Indeed, to be an expert (a French word not yet common in English) was almost by definition to be a writer of reports. Whewell reckoned that those French académiciens must be doing something right.

The proposal to turn the Royal Society into a corps of expert judges in the style of the French academy was met with enthusiasm. But translating the report-writing practice across the Channel proved more complicated than Whewell expected.

News or views?

Whewell agreed to write the first report. His collaborator was a former student at Cambridge, John William Lubbock, a mathematically inclined astronomer who was also the Royal Society's treasurer. They jointly selected a manuscript submitted by George Airy, another up-and-coming astronomer. The paper, 'On an inequality of Long Period in the Motions of the Earth and Venus', used sophisticated mathematical methods to calculate how the orbits of these planets were influenced by the gravitational force each exerted on the other.

Whewell and Lubbock took turns reading the manuscript — copying technologies at the time left much to be desired. Both instantly knew what they thought of it. And they completely disagreed.

They argued about the paper for months. Both wrote draft reports, which could not have been more different. Whewell's focused on the significance of the problem and on Airy's remarkable conclusions. Lubbock's picked at the inelegant ways in which Airy had constructed his equations. Most fundamentally, they argued about what a reader's report ought to be. Whewell wanted to spread word of the discovery and to place it in the bigger picture (think Nature's News & Views and Science's Perspectives). “I do not think the office of reporters ought to be to criticize particular passages of a paper but to shew its place,” he told Lubbock. If they picked out flaws, he warned, authors would be put off. Lubbock had other priorities: “I do not see how we can pass over grievous errors,” he wrote.

Feeling that they had reached an impasse, Lubbock went to the author himself to deliver his suggestions for improvement. Airy was understandably irritated that his manuscript was being subjected to this strange new procedure. “There the paper is,” he wrote to Whewell, “and I am willing to let my credit rest on it.” He had no intention of changing his text. Lubbock threatened to pull out, but ultimately relented and swallowed his criticisms, acknowledging that this was “the first report which the Council have ever made” and trying to see the bigger picture. He thanked Whewell for putting his “shoulder to the wheel” and signed his name to the report (ref. 2).

With disaster averted, Whewell's version of the report was read publicly at the society on 29 March 1832, and was printed in the Proceedings, while Airy's full paper appeared in the Transactions. Lubbock's critiques never became public.

Not long before, the Astronomical Society of London (now the Royal Astronomical Society) and the Geological Society of London had also begun to experiment with similar reports. It was a geologist, George Greenough, who introduced the term 'referee' in 1817, importing into science a term he knew from his days as a law student (ref. 3). But it was the Royal Society's system of reports that caused the British scientific world to take notice. The practice gradually spread to other societies, including the Royal Society of Edinburgh and the Linnean Society of London. But it was not really until the twentieth century that journals unaffiliated with any society slowly followed suit.

Anonymous judges

The struggle between Whewell and Lubbock represented two distinct visions of what a referee might be. Whewell was the authoritative generalist, glancing down on the landscape of knowledge. He was unconcerned with (and probably not in a position to critique) the details. Such referees were, according to the Royal Society's president, “Elevated by their character and reputation above the influence of personal feelings of rivalry or petty jealousy” (ref. 4). Lubbock was a younger specialist, Airy's equal. This allowed him to take a fine-tooth comb to Airy's arguments; it also put him in the position of reviewing a direct competitor.

Initially, Whewell's vision won out. But the system began to transform even as it lurched into existence. After a couple of years, the reports became shrouded in secrecy. The last Proceedings issue to include one was in mid-1833, and no negative reports were ever published. A letter Whewell wrote in 1836 shows that he himself had changed his view: he describes the referee as a defender of a society's reputation, working behind the scenes to exclude publications that do not belong. Neither the Royal Society's archives — nor the personal papers of those involved — are clear on how this happened, but we should not be surprised that it did. In England, unlike France, there was little precedent for public authorities judging from on high what constituted good or bad science. Signing one's name to explicit criticism of a colleague would have been ungentlemanly.

More familiar was the anonymous critic who purported to speak for the public, epitomized by the anonymous book reviews that dominated English periodicals throughout the period, from the Quarterly Review to the lowly Mechanics' Magazine (the practice survives today in The Economist). Through anonymity, as one uncredited editor argued in 1833, “the individual is merged in the court which he represents, and he speaks not in his own name, but ex cathedra (with full authority)” (ref. 5). Justifications of the anonymity of the scientific referee took a similar view.

It took just a decade for the referee to become an established scientific persona, and not a noble one. An 1845 exposé in a London magazine painted a picture of referees as scheming judges quite possibly “full of envy, hatred, malice, and all uncharitableness”. Hidden away in some secret chamber, this scientific judiciary, the article implied, used the cover of anonymity to advance their personal interests, perhaps through undetectable acts of piracy, at the expense of helpless authors (ref. 6).

It was only near the turn of the twentieth century that the idea began to take hold that editors and referees, taken as one large machinery of judgement, ought to ensure the integrity of the scientific literature as a whole. Amid calls to curtail the “veritable sewage thrown into the pure stream of science” (a suggestion by the physiologist Michael Foster in 1894; ref. 7), English scientific societies debated combining their publishing apparatuses, with a standardized referee system overseeing all of scientific publishing. (The plan was abandoned, in part because it would have meant convincing publishers of independent journals, such as the Philosophical Magazine, to go out of business.)

Nonetheless, the referee was gradually reimagined as a sort of universal gatekeeper with a duty to science. As this idea gained ground, many began to worry that the system itself might be intrinsically flawed, a force that impeded creative science and which ought to be abolished. Such worries culminated in what was surely the first formal inquiry into the workings of referee systems — in 1903, by the Geological Society of London. The inquiry found that opinion was sharply divided on the subject, receiving several vitriolic statements about the injustices and inefficiencies of the systems in use. The 'referee' was in such disrepute that they nearly banned the use of the term in all society business.

But referee systems survived, and were slowly set up by independent journals as well. Outside the Anglophone scientific world, referee systems remained rare. Albert Einstein, for example, was shocked when an American journal sent a paper of his to a referee in 1936. The idea that any legitimate scientific journal ought to implement a formal referee system began to take hold in the decades following the Second World War.

Apotheosis and fall

In the 1960s, refereeing emerged as a symbol of objective judgement and consensus in science. The referee was, in the words of the physicist and science writer John Ziman, “the lynchpin about which the whole business of Science is pivoted” (ref. 8). Just as in 1830s England, the relationship of science to the public was at the foreground of these changes. The scientific community was once again working hard to solidify perceptions of its role in society. The very phrase 'scientific community' dates from this time. Researchers wanted to preserve autonomy while holding on to the massive government funding that had come their way since the Second World War. Allocations for basic research in the United States, for instance, swelled by a factor of 25 in less than a decade (ref. 9).

'Peer review' was a term borrowed from the procedures that government agencies used to decide who would receive financial support for scientific and medical research. When 'referee systems' turned into 'peer review', the process became a mighty public symbol of the claim that these powerful and expensive investigators of the natural world had procedures for regulating themselves and for producing consensus, even though some observers quietly wondered whether scientific referees were up to this grand calling.

Current attempts to reimagine peer review rightly debate the psychology of bias, the problem of objectivity, and the ability to gauge reliability and importance, but they rarely consider the multilayered history of this institution. Peer review did not develop simply out of scientists' need to trust one another's research. It was also a response to political demands for public accountability. To understand that other practices of scientific judgement were once in place ought to be a part of any responsible attempt to chart a future path. The imagined functions of this institution are in flux, but they were never as fixed as many believe.

Journal name: Nature
Volume: 532
Pages: 306–308
DOI: 10.1038/532306a

References

  1. W. Whewell to P. M. Roget, 22 March 1831; Royal Society of London Library [DM/1].

  2. J. W. Lubbock to W. Whewell, 27 January 1832; Trinity College Library, Cambridge [a/216/61].

  3. George Greenough Papers; University College London [Add. 7918/1621].

  4. Proc. R. Soc. Lond. 3, 140–155 (1832).

  5. New Monthly Magazine 39, 26 (1833).

  6. Wade's London Rev. 1, 351–369 (1845).

  7. Nature 49, 563–564 (1894).

  8. Ziman, J. Public Knowledge: An Essay Concerning the Social Dimension of Science (Cambridge Univ. Press, 1968).

  9. Kaiser, D. Nature 505, 153–155 (2014).

  10. Baldwin, M. Making Nature: The History of a Scientific Journal (Univ. Chicago Press, 2015).

Author information

Affiliations

  1. Alex Csiszar is associate professor of the history of science at Harvard University, Cambridge, Massachusetts, USA.

Comments

  1. Drew
    You can eliminate the biases by allowing public comments as others suggest. A simple "like"/"dislike" won't be as effective as it could be. You'll need someone to evaluate the strength of the argument. For example, I am a statistician and chemist. With a few calculations, I can determine the probability a paper will not be reproducible. I see serious violations of statistical theory and practice in most papers I read. I do not tolerate P-value fishing. However, if my opinion on the stats is weighted equally with someone that is not skilled in statistics, you will maintain the current mess scientific articles are in now.
  2. Sneha Kulkarni
    This is an excellent piece. It has been rightly pointed out that the power reviewers wield is enormous. Could this be the reason why the system has been exploited over and over again by malicious authors to gain access to publication? Peer reviewers are defined as ‘experts’, but the question arises: Who defines who an expert is? There have been many suggestions to do away with the secrecy in peer review system and even the system itself, but some other form of review will need to take its place. The number of papers that get published every year is only increasing and some form of credibility is expected by the scholarly community as well as the public. It would be interesting to see which way the peer review system of the future would tilt – towards having ‘referees’ or ‘reviewers.’
  3. John Cisne
    May I suggest the “Facebook System,” similar to arXiv, perhaps, but more informal and free-ranging, with “Likes” or the equivalent to determine which posts make it into “Trending” (~“accepted”). Who better to cull the data than Facebook staff adept in creating sellable consumer statistics? At the risk of being even more impertinent, might I suggest the “$-factor” (or factors) designed to supplement or replace the h-factor and such, not to mention the “I-factor” aimed at identifying those most effective in gaming the system (for instance, commenters evidently concerned at least as much with improving their own citation statistics as with constructive criticism to help the author). However impertinent (or worse) these remarks may appear, the germs of a good idea or two may be found, despite the way in which they are expressed.
  4. Nicholas Robinson
    Excellent piece. Now, if only homeopathy et al. could be peer-reviewed out of existence.
  5. Reinhard Werner
    Hardly. The example shows the problem, though: You will easily find enough homeopaths to do the reviewing. Elsevier has a peer reviewed journal called "Homeopathy". Neither the system of peer review nor the reputation of the publisher help as a criterion of scientific validity.
  6. Ed Rybicki
    We now have new means of publishing that require neither "journals" nor anonymous reviewers: while I hate the names, the arXiv and bioRxiv repositories surely represent the future of scientific publishing, with online publication followed by comment?
