CORRESPONDENCE

TL;DR: how well do machines summarize our work?

The SciTLDR software tool (scitldr.apps.allenai.org) uses machine learning to summarize scientific texts (see Nature https://doi.org/ghmnjj; 2020). Although it is impressive how far natural language processing has come, there is a risk it could distort scientific discourse by stripping away important context and over-amplifying results.

SciTLDR tends to extract one or two key statements from the original text and edit them into a cohesive sentence, sometimes removing parenthetical phrases and using synonyms for common words or phrases. Such changes are mostly innocuous, but they can drop qualifiers that the authors deem relevant. When the software replaces “we investigated” with “we identified”, for instance, it changes the meaning by seeming to present results rather than simply setting a research context.

And what happens when these tools are applied to, for example, anti-vaccination research or papers denying climate change? When I submitted abstracts from retracted works to the SciTLDR online demo, the summary statements of the results were often stronger than those in the original papers because they lacked context. Unlike a human writer, the summaries also failed to acknowledge that the papers had been retracted. Given the long-running threats posed by anti-science movements, caution is needed when developing and deploying tools such as SciTLDR.

Nature 590, 36 (2021)
