## Introduction

In 2018, Professor D. Strickland received the Nobel Prize in Physics, the first woman to do so in 55 years. From 1901 to 2018, the Nobel Prize in Physics was awarded 112 times to 209 different laureates; only two other women are among them, namely M. Curie in 1903 and M. Goeppert Mayer in 1963. Women have historically held far fewer positions in academia than men; hence, it is natural to expect more male than female Nobel laureates. However, the share of women in scientific professions has increased in all fields of science over the last decades (Shen, 2013). Despite this fundamental demographic shift, the share of women among Nobel laureates remains low and gives the impression of an increasing gender gap (Modgil et al., 2018). This gap is partially accounted for by the age profile of the Nobel Prizes, as laureates are most often well-established senior researchers (Agarwal, 2018). If seniority is the only important factor, we expect the Nobel awards to follow a binomial distribution with a success probability given by the gender ratio among professors. For instance, if 10% of professors are women, we expect, ceteris paribus, a 10% chance that a given Nobel Prize goes to a woman. So does the gender ratio truly account for the small number of female Nobel laureates?

To investigate this, we compared the gender ratio of Nobel laureates in Physics, Chemistry, Economics, and Physiology or Medicine to the relevant gender ratios among scientists in each field. We used the gender distribution of senior faculty members in the US as a proxy for the worldwide distribution and observed that women are awarded the Nobel Prize far less often than the faculty gender ratios suggest. More specifically, we find that the probability that the distribution of Nobel Prizes does not favour men is less than 4% within all of the investigated fields.

## Methods

We used faculty data resolved by gender and field from the National Science Foundation (NSF, 2018) as a proxy for the global distribution. These data only cover the period from 1973 to 2010; hence, we fitted a logistic function and extrapolated it to obtain the gender ratios, r, for the different fields from 1901 to 2010. We note that the average age of Nobel laureates is 55 years and that the findings worthy of a Nobel Prize are on average made 15 years earlier (Nobel, 2019). As we do not have access to the numbers of female and male faculty members resolved by age, we define a lag time, δ, and presume that the relevant research originates from senior faculty members δ years before the award.
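The extrapolation step can be illustrated with a minimal sketch. It assumes a two-parameter logistic curve, r(t) = 1/(1 + exp(-(a + b t))), fitted by ordinary least squares on the logit scale; this linearised fit is a stand-in chosen for brevity and is not necessarily the fitting procedure used for Fig. 2. The function names and the observations below are hypothetical placeholders, not the NSF figures.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fit_logistic(years, ratios):
    """Least-squares line through logit(ratio) versus year.

    Returns (a, b) such that ratio(t) ~ inv_logit(a + b * t).
    """
    ys = [logit(r) for r in ratios]
    n = len(years)
    mean_t = sum(years) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in years)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(years, ys))
    b = sxy / sxx
    a = mean_y - b * mean_t
    return a, b


def gender_ratio(year, a, b):
    """Fitted (or extrapolated) gender ratio r for a given year."""
    return inv_logit(a + b * year)


# HYPOTHETICAL observations for illustration only (not the NSF data).
obs_years = [1973, 1980, 1990, 2000, 2010]
obs_ratios = [0.02, 0.03, 0.06, 0.10, 0.16]

a, b = fit_logistic(obs_years, obs_ratios)

# Ratio relevant for a prize awarded in a given year, using a lag of delta years.
delta = 10
for award_year in (1911, 1963, 2018):
    print(award_year, round(gender_ratio(award_year - delta, a, b), 4))
```

Fitting on the logit scale keeps every extrapolated ratio strictly between 0 and 1, which matters when extending the curve back to 1901, where the ratios are very small.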

We use a hierarchical Bayesian inference model and Hamiltonian Monte Carlo sampling (Carpenter et al., 2017). We model the number of women laureates, $f_{ij}$, within the scientific field $j$ in year $i$ as a stochastic binomial variable:

$$f_{ij} \,{\sim}\, B( {N_{ij},\theta _{ij}}),$$
(1)

where $B$ is the binomial distribution, and $N_{ij}$ and $\theta_{ij}$ are, respectively, the number of Nobel Prizes awarded and the corresponding success probability, i.e., the probability of a woman being awarded, in year $i$ within the field $j$. In the case of no bias, we expect $\theta_{ij}$ to equal the gender ratio $r_{i-\delta,j}$ some δ years earlier. In order to quantify any bias, we model the success probability $\theta_{ij}$ as

$$\theta_{ij} = {\mathrm{logit}}^{-1}[{\mathrm{logit}}({\mathrm{r}}_{ij}) + {\mathrm{log}}(\alpha_{j})],$$
(2)

where $\alpha_j$ is a positive, time-independent stochastic variable. Equivalently, Eq. (2) multiplies the odds by $\alpha_j$: $\theta_{ij}/(1-\theta_{ij}) = \alpha_j r_{ij}/(1-r_{ij})$. We note that for $r_{ij} \ll 1$ and $\alpha_j r_{ij} \ll 1$, Eq. (2) can be approximated by the simple relation $\theta_{ij} \approx \alpha_j r_{ij}$. Here, $\alpha_j$ is a bias parameter such that when $\alpha_j = 1$ we have $\theta_{ij} = r_{ij}$, i.e., women are awarded the Nobel Prize exactly as often as the gender ratio suggests. We use a hierarchical structure for the variable α: for each scientific field $j$, $\log(\alpha_j)$ is drawn from a normal distribution whose mean and standard deviation are the stochastic (hyper) variables μ and σ. Hence, we assume some similarity between the four different $\alpha_j$. We use

$$\mathrm{log}(\mathrm{\alpha }_{j})\,\sim{\mathrm{N}}\left(\mathrm{\mu },\mathrm{\sigma} \right)$$
(3)
$${\mathrm{\mu \sim N}}\left( {0,1} \right)$$
(4)
$${\mathrm{\sigma \sim N}}\left( {1,0.5} \right),$$
(5)

where $N$ is the normal distribution. We note that for μ = 0 the median of $\alpha_j$ is 1, corresponding to no gender bias. Hence, we choose a weakly informative prior distribution for $\alpha_j$ with a median of 1, see Fig. 3. We further note that the results were found to be robust to the choice of the prior on the hyperparameter μ (Eq. (4)) and to the standard deviation of the normal distribution in Eq. (5).
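As an illustration of Eqs. (2)–(5), the following sketch evaluates the success probability for a given gender ratio and bias, and draws samples of α from the prior. It is not the sampling code behind Fig. 3. Two points are assumptions: the second argument of N is read as a standard deviation, and negative draws of σ are rejected so that σ stays positive. The function names are hypothetical.

```python
import math
import random
import statistics


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def success_probability(r, alpha):
    """Eq. (2): shift the log-odds of the gender ratio r by log(alpha)."""
    return inv_logit(logit(r) + math.log(alpha))


def sample_alpha_prior(rng):
    """One prior draw of alpha following Eqs. (3)-(5)."""
    mu = rng.gauss(0.0, 1.0)
    # ASSUMPTION: sigma is a standard deviation and must be positive,
    # so negative draws from N(1, 0.5) are rejected and redrawn.
    sigma = rng.gauss(1.0, 0.5)
    while sigma <= 0.0:
        sigma = rng.gauss(1.0, 0.5)
    return math.exp(rng.gauss(mu, sigma))


rng = random.Random(0)
draws = [sample_alpha_prior(rng) for _ in range(20000)]
prior_median = statistics.median(draws)
print("prior median of alpha:", round(prior_median, 3))

r = 0.05
# alpha = 1 reproduces the gender ratio (no bias).
print(success_probability(r, 1.0))
# For small r, theta is close to alpha * r.
print(success_probability(r, 0.5), 0.5 * r)
```

Because log(α) is symmetric around zero under this prior, the sampled median of α should lie close to 1, while individual draws range well below and above it.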

## Results

Since the first Nobel Prizes were awarded in 1901, there have been 688 Nobel laureates within the fields of Chemistry, Economics, Physics, and Medicine; among these are only 20 women (21 prizes, as M. Curie received the prize twice), see Fig. 1. Among the Nobel laureates in Economics there is one woman, namely Professor E. Ostrom (2009), which corresponds to 2% of the laureates. In Medicine, 12 women have been awarded over the years, which is 6% of the laureates. These differences clearly reflect, to some extent, the gender ratios within the fields. However, the gender distribution of faculty members evolves, and at any point in time the gender distribution among senior faculty members differs from that among junior faculty members. As the average age of Nobel laureates is 55 years (Nobel, 2019), we assume that the Nobel laureates are sampled from the gender distribution of senior faculty members. Moreover, Nobel laureates made their groundbreaking findings long before the award (on average 15 years earlier (Nobel, 2019)). To account for this, we assume that today’s Nobel laureates are sampled from the senior faculty members of δ years ago.

We examined the fraction of female faculty members relative to all faculty members, which we denote the gender ratio, r. We used the gender ratios of senior faculty members at US university departments as a proxy for the global distribution. The data were retrieved from the National Science Foundation (NSF, 2018) and cover the period from 1973 to 2010. To cover the entire period of Nobel awards from 1901 to 2010, we fitted the data with a logistic function and extrapolated back in time, see Fig. 2. In the data, both Chemistry and Physics are gathered under Physical sciences. Hence, we used this gender ratio for both the Physics and Chemistry Nobel Prizes. Furthermore, for the Nobel Prize in Economics we used the gender ratio of senior faculty members from Social sciences. Most probably, this leads to a slight overestimation of the bias within Economics, since Economics may have a smaller gender ratio than the overall ratio within Social sciences.

We use a hierarchical model to quantify possible gender bias in the awarding of Nobel Prizes using Bayesian inference through Hamiltonian Monte Carlo sampling, see Methods section. The gender bias is described by the parameter α: when α < 1 (α > 1), women are awarded the Nobel Prize less (more) often than the gender ratio suggests. The sampled prior and posterior probability density distributions, p(α|r, δ), are illustrated in Fig. 3 for a lag of δ = 10 years and ratios r. From the prior distribution (grey), we confirm that we chose a weakly informative prior, allowing values of α both well below and well above 1. For all four Nobel Prizes, the posterior distributions show a significant bias against women, with mean values of the posterior probability density 〈α〉 < 1 and a total probability of α being at least unity, $$P\left( {\alpha \ge 1} \right) = 1 - {\int}_0^1 {p\left( {\alpha |r,\,\delta } \right)d\alpha }$$, found to be less than a few percent.

To investigate how sensitive the measured bias is to the choice of δ, we repeated the analysis for lags in the range 0 ≤ δ ≤ 20 years. For all values of δ, the sampled values of α were predominantly smaller than unity. This is summarized in Fig. 4, which shows the probability of α being at least 1, P(α ≥ 1), versus the delay, δ.
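To make the quantity P(α ≥ 1) concrete, the sketch below computes it for a deliberately simplified version of the model: a single field, a fixed normal prior on log(α) instead of the hierarchical prior, and a grid approximation of the posterior instead of Hamiltonian Monte Carlo. The prior width, the grid settings, the function names and the records are all assumptions made for illustration; the numbers are hypothetical and not the laureate data.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    return (math.log(math.comb(n, k))
            + k * math.log(p)
            + (n - k) * math.log(1.0 - p))


def prob_alpha_ge_one(records, prior_sd=1.5, grid_size=2001, half_width=8.0):
    """Grid estimate of P(alpha >= 1 | data) for a single field.

    records: iterable of (women, prizes, lagged_ratio), one entry per year.
    ASSUMPTION: log(alpha) ~ N(0, prior_sd), a fixed prior replacing the
    hierarchical prior of Eqs. (3)-(5).
    """
    step = 2.0 * half_width / (grid_size - 1)
    log_alphas = [-half_width + i * step for i in range(grid_size)]
    log_post = []
    for la in log_alphas:
        lp = -0.5 * (la / prior_sd) ** 2  # log prior up to a constant
        for women, prizes, ratio in records:
            theta = inv_logit(logit(ratio) + la)  # Eq. (2)
            lp += log_binom_pmf(women, prizes, theta)
        log_post.append(lp)
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    # Posterior mass on the part of the grid where alpha >= 1 (log alpha >= 0).
    upper = sum(w for la, w in zip(log_alphas, weights) if la >= 0.0)
    return upper / sum(weights)


# HYPOTHETICAL records: (women awarded, prizes awarded, gender ratio delta years earlier).
records = [(0, 3, 0.03), (0, 2, 0.05), (1, 3, 0.08), (0, 3, 0.12), (0, 3, 0.15)]
print("P(alpha >= 1) =", round(prob_alpha_ge_one(records), 3))
```

Repeating such a calculation with the gender ratios shifted by different lags mirrors, in simplified form, the sensitivity analysis over δ.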