Prestigious AI meeting takes steps to improve ethics of research

For the first time, the organizers of NeurIPS required speakers to consider the societal impact of their work.

Artificial-intelligence research is coming under increasing ethical scrutiny. Credit: Michael Cohen/Getty

After a year of heavy scrutiny and seemingly endless controversy around artificial-intelligence (AI) technologies, the field’s most prestigious conference has tried to set a good example. For the first time, the Neural Information Processing Systems (NeurIPS) meeting, which took place completely online this month, required presenters to submit a statement on the broader impact their research could have on society, including any possible negative effects.

The organizers also appointed a panel of reviewers to scrutinize papers that raised ethical concerns — a process that could lead to their rejection.

“I think there’s a lot of value even in getting people to think about these things,” says Jack Poulson, founder of the industry watchdog Tech Inquiry in Toronto, Canada. He adds that the policy could help to shift culture in the field.

Researchers who work on machine learning are increasingly aware of the challenges posed by harmful uses of the technology, from the creation of falsified videos, or ‘deepfakes’, to mistakes by police who rely on facial-recognition algorithms in deciding who to arrest.

“There was previously a period of techno optimism,” says Iason Gabriel, an ethicist at the AI powerhouse DeepMind, a sister company of Google based in London. “Clearly, that has changed in recent years.”

Unintended uses

The idea of conference participants writing an impact statement was inspired by the Future of Computing Academy, a group led by Brent Hecht, a specialist in the human impacts of technology at Microsoft and at Northwestern University in Evanston, Illinois. In 2018, Hecht and his collaborators proposed that computer-science publications be required to state the potential side effects and unintended uses of their research. Unlike in other scientific disciplines, most peer review in computer science happens when manuscripts are submitted to conferences, rather than to scholarly journals. As the field’s largest and most prestigious conference, NeurIPS was a natural choice to test this proposal.

This year’s conference attracted 9,467 paper submissions. The reviewers assessed submissions mainly on their scientific value, but papers with potential to be accepted could be flagged for a full review by a separate ethics committee led by Gabriel. Of 290 papers that were flagged, 4 were ultimately rejected by the programme chairs because of ethical considerations, says Marc’Aurelio Ranzato, a computer scientist at Facebook AI Research in New York City who was one of the conference’s programme chairs.

“In general, I would say the ethics process has done well,” says Katherine Heller, a computer scientist at Google in Mountain View, California, who was the conference’s co-chair of diversity and inclusion.

Gabriel says that most problematic issues should have been caught, because any of the three anonymous peer reviewers could flag a paper, as could the subject-area chair. “A signal from any one of them would be enough to engage the review process,” he says. Still, he admits that the process was not infallible. For example, if all the reviewers happened to be men — not unusual in a male-dominated field — they might not be able to adequately assess whether an algorithm could affect women negatively. “I can’t rule out the possibility that there would be blind spots of this kind,” Gabriel says.

In addition, reviewers were not given specific guidance on what constitutes harm to society. For example, says Ranzato, some reviewers flagged papers that made use of databases containing personal information or photographs that were collected without explicit consent. The use of such databases has come under heavy criticism, but the conference organizers did not single out this issue to reviewers or provide a list of problematic databases. Still, Ranzato adds that the review policy is a step in the right direction. “Nothing is perfect, but it’s better than before.”

Policing AI

The last day of the conference featured a special session focused on the broader impact of AI on society. Hecht, Gabriel and other panellists discussed ways to address the industry’s problems. Hanna Wallach, a researcher at Microsoft in New York City, called for researchers to assess and mitigate any potential harm to society from the early stages of research, without assuming that their colleagues who develop and market end products will do that ethical work. Ethical thinking should be built into the machine-learning field rather than simply being outsourced to ethics specialists, she said. Otherwise, “other disciplines could become the police while programmers try to evade them”.

Wallach and others, such as Donald Martin, a technical programme manager at Google in San Francisco, California, are working to redesign the product-development process at their companies so that it incorporates awareness of social context. AI ethics, Martin says, “is not a crisis in the public understanding of science, but a crisis in science’s understanding of the public”.

The revamped review process and the ethics-focused discussions are the latest in a series of efforts by NeurIPS organizers to improve practices in machine learning and AI. In 2018, the conference dropped an acronym that many people found offensive, and began a crackdown on sexist behaviour by participants. And last year’s meeting featured robust discussions of AI ethics and inclusivity.

The mood of this year’s conference was affected by events that occurred on its eve. On 2 December, Timnit Gebru, a leading researcher on racial bias in machine-learning algorithms, said that after a dispute over the publication of a paper she had co-authored, she had been dismissed from Google, where she co-led a team working on ethics in AI. Google has stated that it accepted her resignation; Gebru has said that she never resigned, but had merely threatened to do so at a later date if her conditions on the paper’s internal review were not met.

The situation has led many researchers to publicly question Google’s commitment to ethical AI. Thousands both at Google and elsewhere have signed a letter of solidarity, pointing out that Gebru was “one of very few Black women Research Scientists at the company, which boasts a dismal 1.6% Black women employees overall”.

On 16 December, members of the US Congress wrote to Google chief executive Sundar Pichai, saying that “the incident raises broader questions, the answers to which may meaningfully impact our work as legislators”. Google did not respond to a request for comment from Nature’s news team.

Nature 589, 12-13 (2021)

Updates & Corrections

  • Correction 23 December 2020: This story was updated to clarify that the ethics committee didn't have the power to reject a submission. It was the programme chairs who ultimately rejected submissions because of ethical considerations.
