Paid crowdsourcing is coming to biology. This powerful approach will support research, though it could also promote unjust conditions for some workers.
The participation of large groups of amateurs in scholarly projects is nothing new. The Oxford English Dictionary was famously a massive collation of examples of word use amassed by a dispersed group of annotators combing through published texts. The Audubon Christmas Bird Count, in which tens of thousands of volunteers contribute to an annual bird census, has been running for over a century. After World War II, subatomic particles were hunted by highly trained but non-scientist scanners, typically women, who annotated data from bubble chambers.
Present-day crowdsourcing depends heavily on digital information and accesses the ‘crowd’ via the internet. In biology, crowdsourcing has been used to solve protein structures, to reconstruct neuronal connections in electron micrographs, to pick particles for cryo-electron microscopy, to curate a bacterial interactome, and to detect sleep spindles in human electroencephalograms, to give just a few examples.
In most such projects, the crowd consists of volunteers who have an interest in the work at hand. Indeed, in its earliest sense, the term “amateur” derives from amator, or “one who loves.” Even so, it can be difficult to keep the sustained attention of a volunteer group. For time-sensitive or very tedious projects, the myriad attractions of life and the internet may offer volunteer citizen scientists more appealing ways to spend their spare time.
Two other approaches can keep the crowd engaged. In the first, the scientific task is incorporated into an existing online game that already has a large and enthusiastic following. For instance, researchers at the Human Protein Atlas, which aims to map the location of all proteins in cells and tissues, have embedded imaging data into the role-playing game EVE Online, opening a portal to its roughly half-million subscribed gamers. It will be of great interest for the research community to evaluate the accuracy of data obtained in this way.
Alternatively, contributors to crowdsourced projects can be paid for their efforts. A platform like Amazon Mechanical Turk (AMT) allows anyone—a researcher, for example—to post a task for payment. Online workers then decide whether to take on this task given the time involved and the level of compensation.
In this month’s issue of Nature Methods, researchers describe Quanti.us, a tool that enables such transactions for crowdsourced annotation of biological images (p587). The authors conclude that for object discrimination, tracking, and segmentation, the performance of a collective of paid non-expert workers can almost always equal that of experts and exceed that of automated methods. They also show that a deep learning model trained on crowd-annotated images can perform just as well as one trained on expert-annotated ones. Also in this issue, scientists with experience in volunteer-driven crowdsourcing discuss the strengths and weaknesses of the paid and volunteer models (p579).
Quanti.us may be used in a paid or a free mode. In the latter case, a project could be distributed to a community of students, colleagues or other collaborators. In the paid mode, worker engagement depends on both the complexity of the task (for instance, how many annotations are required per image) and the pay. Not surprisingly, a more complex job needs to offer better pay for large numbers of workers to engage. The authors report that they can find conditions that result in much faster image annotation than can be achieved in the typical lab or extrapolated from a volunteer platform.
Fast and accurate crowdsourced image annotation could be a boon for many research projects. It will free scientists from tedium, provide annotations to train powerful machine learning algorithms, and make possible analyses that might otherwise be too onerous to even contemplate. But paid online microwork has a less positive side as well.
Paid crowdsourcing takes place in an unregulated labor market where workers have neither formal collective bargaining power nor basic government protection (although AMT workers do organize online in active communities where they review requesters and tasks). A 2016 Pew Research Center study found that a full quarter of AMT workers, canvassed over a two-week period, reported that they derive their principal income from online microwork. Most workers are in the United States, but several studies report an increasing contribution from people in India. Not surprisingly, average payment rates on AMT are estimated to be well below the US federal minimum wage (which, at $7.25 an hour, is itself well below the minimum wage in several US states). Indeed, at 6 cents per image, the image-analysis tasks described in this issue are likely to pay less than that federal minimum as well.
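To make the comparison concrete, the short sketch below converts a per-image payment into an effective hourly rate. Only the 6 cents per image and the $7.25 federal minimum wage come from the text; the annotation paces in the loop are hypothetical values chosen for illustration, and the function names are our own.

```python
# Illustrative arithmetic only. The pay rate and the federal minimum wage are
# taken from the text; the seconds-per-image paces are hypothetical.

PAY_PER_IMAGE_USD = 0.06      # per-image payment cited in the text
FEDERAL_MIN_WAGE_USD = 7.25   # US federal minimum hourly wage cited in the text


def hourly_rate(seconds_per_image: float,
                pay_per_image: float = PAY_PER_IMAGE_USD) -> float:
    """Effective hourly earnings for a worker annotating at a steady pace."""
    if seconds_per_image <= 0:
        raise ValueError("seconds_per_image must be positive")
    images_per_hour = 3600 / seconds_per_image
    return images_per_hour * pay_per_image


def break_even_seconds(min_wage: float = FEDERAL_MIN_WAGE_USD,
                       pay_per_image: float = PAY_PER_IMAGE_USD) -> float:
    """Longest time per image at which earnings still reach the hourly wage."""
    return 3600 * pay_per_image / min_wage


if __name__ == "__main__":
    print(f"Break-even pace: {break_even_seconds():.1f} s per image")
    for seconds in (15, 30, 60, 120):  # hypothetical annotation paces
        print(f"{seconds:>4} s/image -> ${hourly_rate(seconds):.2f}/hour")
```

Under these assumptions, a worker would need to finish each image in just under 30 seconds, without breaks or time spent searching for tasks, to reach the federal minimum wage.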
The forces that compel people to take underpaid jobs are many; microwork platforms cannot be held responsible for them. What is more, this type of flexible work is likely to be beneficial for some as a source of supplementary income or a stopgap measure. Paid online microwork could be considered part of the now prevalent ‘gig economy’, where workers take on short-term projects as independent contractors rather than assuming a stable arrangement as an employee. Certainly, traditional institutions like universities and businesses are no strangers to the practice of lowering labor costs by hiring temporary faculty or outsourcing. In other words, paid microwork is part of more general economic and societal trends.
Nevertheless, researchers planning crowdsourcing projects should keep in mind that paid microwork could perpetuate inequality and even be exploitative for some workers. An online community of AMT workers has released sensible guidelines that are worth considering. Should the use of paid crowdsourcing in biology become widespread, the scientific community would do well to come to a wider agreement about how to promote its principled use.
Crowdsourcing offers a powerful and exciting new way to transform biological data into knowledge and thus will benefit society in the long run. We should remain vigilant that this does not come at the expense of the economically vulnerable in the shorter term.