Cataloguers of the Royal Society developed the first record of published scientific research.

In 1830, Charles Babbage had an unusual idea. Exasperated by how little recognition science was getting in England, the computer pioneer and scientific provocateur suggested that quantifying authorship might be a way to identify scientific eminence.

Like many of Babbage's radical ideas, this one persuaded almost nobody, but it eventually proved prophetic. Before the end of the century, listing papers and comparing publication counts had become a popular pursuit among scientific authors and other observers. Within a few decades, academic scientists were coming to fear the creed of 'publish or perish' (see 'Catalogues and counts').

This transformation can inform current debates about the value of algorithms for quantifying scientific credibility and importance. History shows how search technologies and metrics are not neutral tools that simply speed up efforts to locate and evaluate scientific work. Metrics transform the very things that they measure. By changing the reward structure, they alter researchers' behaviour — both how results are communicated and which topics receive the most attention.

But there is a second, more subtle, transformation that we must be alert to. The processes by which scientific merit is judged have long been central to the public perception of scientific authority. As these processes change, we must also consider the ways in which broader cultural beliefs about scientific expertise are transformed.

Broken pieces of fact

Babbage's suggestion to count authors' papers was met with various criticisms. One author did the calculation for each fellow in the Royal Society in London, and showed that this was a terrible guide to scientific eminence. Another pointed out1 that “a far more satisfactory criterion” would have been “the value of those papers”.

Back then, scientific reputations were built not on periodicals but on books and other proofs of genius that demonstrated mastery of a subject. Babbage himself had little respect for most scientific journals, and he limited his proposal to counting papers in the venerable Philosophical Transactions of the Royal Society of London. As late as 1867, the British physiologist Michael Foster, in a retrospective on the life of Karl Ernst von Baer, heaped praise on the embryologist's multivolume masterwork, On the Development of Animals, and dismissed his periodical publications. These, Foster claimed2, were just “specimens of those broken pieces of fact, which every scientific worker throws out to the world, hoping that on them, some time or other, some truth may come to land”.


But things were beginning to change. A young engineer working for the US Coast and Geodetic Survey (now the National Geodetic Survey) had suggested that it would be useful if some catalogue could be devised to keep track of the publications of European scientific societies. Once the idea crossed the Atlantic and percolated up to the Royal Society, its scope grew to become a list of all periodical papers containing original scientific research published since 1800. Some questioned the need to preserve so much insignificant writing. The physicist William Thomson (later Lord Kelvin) warned that the project would lead the society to financial ruin.

The main argument for what would become the Catalogue of Scientific Papers was that periodical publishing was a mess. Although many authors published in the journals of scientific societies, vast quantities of valuable information appeared in popular-science magazines, encyclopaedias and general-interest weeklies. Authors distributed huge numbers of offprints that sometimes did not even make clear what journal they had come from.

When the society's indexers got down to work in 1867, they realized that the situation was worse than they'd imagined. For thousands of papers, they couldn't even figure out who the author was. Many who published in periodicals chose to remain anonymous, or signed only their initials. In other cases, it was hard to tell to what extent the writer of a paper was responsible for its contents, or whether another person ought to be credited. Moreover, vast numbers of papers were published in various forms in different periodicals, and it was no easy matter deciding what should count as the same publication. Today, such publishing habits would probably lead to accusations of misconduct; not very long ago this was business as usual.

The Royal Society's cataloguers did what they could, contacting editors and authors to match names to papers. They turned a significant portion of the society's library into a bibliographic workroom, and made their job simpler by excluding all general-interest periodicals from the search, as well as anything that smacked of reading for non-specialists. They compiled lists of which periodicals ought to be included in the count, and circulated them to other experts and academies for feedback. The decision about whether to index some doubtful titles sometimes made it all the way to the society's council for a vote.

As their work progressed, the directors of the project came to realize that their charge to produce a master list of all 'scientific papers' published since 1800 might actually influence publishing practices in the future. They hoped that authors would be more careful about where they published — or at least sign their contributions3. They probably did not anticipate the full consequences of what they were about to unleash.

Counting what counts

When the first volumes of the Catalogue of Scientific Papers appeared at the end of 1867, reaction across Europe and the United States was swift and wide-ranging. One observer wrote in awe that the catalogue made science look like a coral-island, a majestic edifice that grew imperceptibly larger with the addition of each new fact embodied in each paper. Some were less enthusiastic. One Royal Society fellow complained that the editors had distorted “the progress and history of discovery both in Physical and Natural Science” by excluding so many valuable contributions from “journals not professedly scientific”, accounts of scientific voyages, independently published treatises, encyclopaedia articles (which at the time often included original research), and much more4.

Many observers hurried past the question of how helpful the catalogue would be for finding information and began comparing the productivity of individuals. By quantifying the contributions of each author, the catalogue seemed tailor-made for keeping score. A writer in Nature got down to business5: “Dr. Hooker appears for 58 papers; his late father for 72; and the late W. Hopkins, who did so much in mathematical geology, for 33 .... the indefatigable Isaac Lea, of Philadelphia, for 106, mostly about shells...”. And so forth. In a detailed review in a Viennese newspaper, the mineralogist Wilhelm von Haidinger began by urging prudence, warning that the mere comparison of numbers was no basis on which to make judgements of value6. But even he admitted that the numbers were somehow irresistible. Within two years, von Haidinger had taken his numerical analysis further. He published a study based on the catalogue that included a chart comparing the number of highly productive scientific authors in each region of Europe, lamenting Austria's low position in this ranking7.

Such enthusiasm for counting had practical consequences. Within a decade of those first volumes appearing, the forms submitted by candidates for admission to the Royal Society transformed into long lists of papers. By the early 1870s, obituaries and biographical encyclopaedias were routinely noting the number of papers written by a researcher, and even following the chronology sketched out by those papers as guide-posts to a career. By 1900, even Foster, the physiologist once so sceptical of scientific periodicals, had changed his tune. Original science belonged in periodicals, he explained. Putting new findings in books — as Charles Darwin had famously done — was “out of place and even dangerous”8. To be an expert on scientific subjects meant being an author of scientific papers.

Publish or perish

There is a direct line from these developments to twentieth-century worries about scientific publishing going off the rails. A letter to Nature in 1932 lamented the growing practice of candidates submitting a “list of strictly technical publications” to the Royal Society, with the result that “our journals are filled with masses of unreadable trash” published by ambitious scholars hoping to strengthen their applications9.

Charles Babbage, inventor of the difference engine, was an advocate of counting papers.

This was around the same time that the phrase 'publish or perish' began to circulate in academia. It did so first in the United States, where the spread of research universities was turning science into something resembling a profession. The slogan became shorthand for the corrupting influence of narrow, bureaucratic measures of research performance.

In the 1960s, Eugene Garfield launched a radically different search tool, known as the Science Citation Index. He hoped that it might end the harmful culture of publish or perish by showing that some papers were more cited — and hence more valuable — than others.

Immediately, commentators warned that new measures based on citations would only make things worse, leading to a “highly invidious pecking order” of journals that could distort science10. The journal impact factor made its public debut in 1972, soon after the US Congress called on the National Science Foundation to produce a better account of the benefits wrought by public funding of science. There is no doubt that the citation index changed practices of scientific publishing, just as the catalogue had done a century earlier when it spurred the counting of papers.
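For readers who want the arithmetic behind that 1972 debut, the impact factor is conventionally computed over a two-year citation window. The sketch below states that standard definition; the notation is illustrative and does not appear in the article itself.

\[
\text{impact factor in year } y \;=\; \frac{\text{citations received in year } y \text{ to items published in years } y-1 \text{ and } y-2}{\text{number of citable items published in years } y-1 \text{ and } y-2}
\]

On this definition, a journal that published 200 citable items over the two preceding years and drew 400 citations to them this year would score 2.0. The score says nothing about how those citations are distributed across papers, which is one reason commentators feared it would entrench a pecking order of journals.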

Today, advocates of altmetrics argue that well-made algorithms can mimic and aggregate the everyday acts of judgement that researchers make when they read, cite, link or otherwise engage with published research. Such algorithms, they claim, will prove as good as or better than established processes such as peer review at delimiting what constitutes important and trustworthy research.

Whether or not these claims turn out to be true, they ignore the question of whether we deem the procedures that experts use to evaluate ideas to be intrinsically valuable (that is, valuable independently of the content of those judgements).

Scientific judgement does not happen in a cultural vacuum. The rise of processes such as peer review to organize and evaluate research was never simply about getting scientific judgement right; it was about balancing scientists' expert cultures with public demands for accountability. The Catalogue of Scientific Papers was itself part of a cultural moment in which indexes and card catalogues were celebrated for their potential to set knowledge free and even foster world peace. Interest in altmetrics has grown alongside widespread fascination with the potential of online platforms to make scientific communication both more open and more democratic.

At a time when the public status of the scientific expert is becoming increasingly uncertain, these questions are more important than ever. In a democracy, the procedures by which we decide what constitutes valuable scientific knowledge fundamentally depend on public conceptions of the aims of the scientific enterprise.

The question of whether new metrics might one day replicate the results of peer review (when it is working well) is a red herring. How we choose to judge what constitutes good science is just as important as the end results of those judgements. Even algorithms have politics.