This month, the US National Science Foundation (NSF) will decide whether it will collect data next year on how many people from sexual and gender minorities (LGBTQ+ people) are in the US scientific workforce. Amazingly, there are still no official statistics on this.

As an LGBTQ+ neuroscientist who studies the mechanisms underlying stereotyping and other forms of bias, I understand personally and scientifically some of the myriad challenges that LGBTQ+ people face in science, technology, engineering and mathematics (STEM). When I was a junior faculty member, a colleague pulled aside a candidate for a postdoctoral position in my lab to tell him that I’m gay, in case it would be a problem. During a single interview for a tenure-track job, 13 people asked me if I had a wife. These experiences are par for the course for LGBTQ+ scientists. We face more career barriers and workplace harassment than do non-LGBTQ+ scientists, even when controlling for other demographic and career-related factors.

The NSF’s annual surveys of the US STEM workforce shape national policies and determine which groups count as being under-represented and are eligible for federal resources, such as diversity fellowships and funding. This census informs not just the NSF, but also the US Congress, the White House, the US National Institutes of Health (NIH) and other agencies. One key NSF survey must be completed by everyone who receives a PhD in the United States.

The NSF currently does not ask about sexual orientation and gender identity (SOGI) in its surveys, although it does ask about race, ethnicity, income and disability.

The scientific community can’t improve a situation that it refuses to measure. The US government needs data on LGBTQ+ scientists that can drive policies and effect change.

For four years, my colleagues and I have advocated this, with the support of major scientific organizations. The NSF has now completed pilot research with 7,800 respondents and 60 interviewees on the viability of SOGI questions in its national surveys of US university graduates, PhD recipients and PhD holders. It must soon inform the US Office of Management and Budget of its decisions about the 2023 survey cycle — including whether it intends to collect SOGI data.

Some say, as the NSF once did, that SOGI questions are too personal or sensitive. Yet the data show otherwise. In the pilot results that the NSF has disclosed, break-off and non-response rates (which indicate discomfort) were extremely low for these questions. Moreover, LGBTQ+ and non-LGBTQ+ respondents alike overwhelmingly reported feeling comfortable providing SOGI demographics to the NSF. Other federal agencies, including the Census Bureau and Department of Labor, have been collecting SOGI data for years. The Department of Education began collecting SOGI data on its national surveys of high-school and college students six years ago. For these agencies, break-off and non-response rates for SOGI questions are actually lower than for questions on commonly tracked demographic data such as income and disability status.

Some LGBTQ+ scientists might fear that the data could be used against us. That is understandable. LGBTQ+ people of colour and transgender individuals, in particular, face regular harassment in many parts of the United States. But the questions would be voluntary, and the NSF has tested opt-out options such as ‘prefer not to say’. Furthermore, NSF survey data are anonymized, and the privacy and confidentiality of any personally identifiable data are strongly protected by federal law.

The question for the NSF, in my mind, isn’t whether to include SOGI questions. It’s what kind of questions will provide the most accurate picture of the LGBTQ+ STEM population, assessed in the most inclusive way possible.

Accuracy is paramount: miscounting LGBTQ+ scientists could affect the amount of taxpayer money allocated to fixing inequities. But inclusion is also crucial, and achieving both can mean striking a balance. The NSF has tested inclusive question designs that entail expanded SOGI options, such as ‘pansexual’ and ‘non-binary’, with options to check all that apply and write in alternatives. To see whether these expanded options cause confusion and inaccuracies for non-LGBTQ+ respondents, it has also tested designs with more restricted options.

Although SOGI questions should be added to next year’s surveys, the NSF could consider ongoing testing and refinement. The phrasing of demographic questions is not meant to be eternal. For example, the NSF’s questions on race and ethnicity reflect federal standards that have been revised several times over the past 45 years and are now being reviewed again. Although the questions have been criticized as lacking nuance, the NSF has never hesitated to include them, because race and ethnicity data are crucial to resolving urgent opportunity gaps. SOGI questions should be no different.

Collecting the data is only the first step. US government STEM agencies must create practices and programmes to address any disparities identified, including for LGBTQ+ people with other marginalized identities. Universities, scientific societies and other institutions must also begin to collect SOGI data, legally and ethically, so that challenges can be resolved throughout the STEM ecosystem.

Fixing these issues is not only a moral imperative. By wasting human talent, LGBTQ+ inequities in STEM impede our global scientific potential.