The data do not support the existence of an ‘Old Boy network’ in science. Some critical comments on a study by Massen et al.

Matters Arising to this article was published on 14 August 2020

The Original Article was published on 10 October 2017

arising from: J.J.M. Massen et al.; Scientific Reports (2017).


In “Sharing of science is most likely among male scientists”1 Massen et al. report on an intriguing study. They were interested in the positive response rate of academics whom they asked to share either a paper or a dataset. Controlling for a number of variables, the authors conclude that male scientists are more likely to share but mainly with other male scientists, which they putatively ascribe to an “Old Boy network.” However, upon close inspection, the data do not warrant their conclusions.

To begin with, there is a problem with the interaction term on which the authors base their main claim: in our view, it is both misinterpreted and misspecified. Their claim (i.e., “Sharing of science is most likely among male scientists”) involves the effects of Sex of Requester and Sex of Participant. In regression terms, this translates into a two-way interaction (i.e., Sex of Requester: Sex of Participant). Massen et al., however, base their claim on a three-way interaction of the above with Condition (i.e., paper vs. dataset requests). This is questionable because a model with just a two-way interaction and a control for Condition is not only more parsimonious and conceptually closer to what they set out to show but also yields a superior (i.e., lower) Akaike Information Criterion (AIC) value. AIC is the benchmark used by the authors themselves, and it indicates that the three-way interaction is in fact superfluous. Crucially, in the parsimonious model, the interaction between Sex of Requester and Sex of Participant is not significant (p = 0.061).
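For reference, the AIC comparison invoked in the preceding paragraph rests on the standard definition of the criterion, where k is the number of estimated parameters and L-hat is the maximised likelihood. The subscripts 2 and 3 (for the two-way and the three-way model) are labels introduced here purely for illustration:

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad
\mathrm{AIC}_{2} < \mathrm{AIC}_{3}
\iff \ln\hat{L}_{3} - \ln\hat{L}_{2} < k_{3} - k_{2}
```

In words, the additional parameters of the three-way model are justified under AIC only if they raise the log-likelihood by more than their number.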

Beyond the specification of this crucial interaction term, we take issue with the authors’ modelling procedure. Upon request, they reported that their model does not include lower-order terms (in particular, the aforementioned crucial interaction term Sex of Requester: Sex of Participant). There are two problems with this. First, this approach violates the ‘Principle of marginality’2,3, constituting an “arbitrary imposition on the model.”4 In principle, there can be reasons to deviate from this principle under specific (and very rare) circumstances5, but there should be an independent motivation for doing so. No such motivation was given, and we cannot see a reason ourselves. Second, omitting the crucial lower-order term Sex of Requester: Sex of Participant means that the model performs a different test from the one implied. The three-way interaction without the two-way interaction contrasts male-male paper requests with every other combination of the three predictors. Since this also includes male-male data requests, the statistical meaning of the interaction term is obfuscated. (As an aside: the interaction in the regression pits everything against male-male sharing in the low-cost condition, i.e., paper requests, and not, as implied in the results section, against male-male sharing in the high-cost condition, i.e., data requests).
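The second problem can be made concrete with a small sketch. It assumes a dummy coding in which male requesters, male participants and paper requests are each coded 1 (this coding, and all names in the code, are assumptions for illustration and are not taken from the original analysis). Under that coding, a three-way product term entered without its lower-order terms takes the value 1 in exactly one of the eight design cells:

```python
from itertools import product

# Assumed dummy coding (hypothetical, not confirmed by the source):
# 1 = male / paper request, 0 = female / data request.
LEVELS = {
    "requester": {"male": 1, "female": 0},
    "participant": {"male": 1, "female": 0},
    "condition": {"paper": 1, "data": 0},
}


def three_way_only(requester, participant, condition):
    """Value of a lone three-way product term with no lower-order terms."""
    return (
        LEVELS["requester"][requester]
        * LEVELS["participant"][participant]
        * LEVELS["condition"][condition]
    )


# All eight cells of the 2 x 2 x 2 design.
cells = list(product(["male", "female"], ["male", "female"], ["paper", "data"]))

# Cells the term singles out (value 1) versus cells it lumps together (value 0).
singled_out = [c for c in cells if three_way_only(*c) == 1]
reference_group = [c for c in cells if three_way_only(*c) == 0]

for cell in cells:
    print(cell, three_way_only(*cell))
```

The seven cells with value 0 form a single comparison group, and that group includes male-male data requests, which is why the term cannot be read as a test of male-male sharing in general.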

In short, considering both the misinterpretation and the misspecification of the crucial interaction term, no clear connection between the authors’ analysis and their claims remains, in our view.

Next, we noticed some problems in the regression model in the supplementary materials and failed to replicate the analysis following the same protocol. The authors use a backward model selection procedure based on AIC. Although one could take issue with such a procedure in hypothesis-driven research (as opposed to exploratory research), we did follow the same procedure, and arrived at different conclusions. Starting from a model that contains all effects that the article lists as significant and using an AIC-based backwards selection procedure, we obtain a model which does contain the variables condition, h-index, sex of requester, and sex of participant, as well as the interaction sex of requester: sex of participant, but of these regressors, only one is significant, viz. Condition (paper vs. dataset). This result is also confirmed by a (non-parametric) conditional inference tree analysis (see Supplementary Information). The tree only selects one variable: paper vs. dataset. In sum, we do not see how one would arrive at the model the authors report; if we include lower-order terms (as is standard procedure), none of the explanatory variables that Massen et al. retain in their final model reach statistical significance at p < 0.05. We do not mean to defend the p < 0.05 threshold uncritically, but given the multiple testing inherent in backwards model selection, we believe it to be a fairly liberal maximum.

In order to assess the danger that lies in the authors’ procedure numerically, we ran a simulation that reveals how often the focal interaction term (i.e., sex of the requester by sex of the participant) is to be expected to turn up as significant (i.e., p < 0.05) given random data. In other words, the simulation was designed to estimate the interaction’s actual false positive rate. In each iteration, the simulation generated random data (for the binary predictors, i.e., sex of the requester, sex of the participant, status of the requester, condition and response, values were chosen randomly with p = 0.5; values of h-index were randomly sampled from the original dataset to mimic their distribution). This simulation was run 10,000 times. Results indicate an alarming false positive rate of the focal interaction of close to 30%, which echoes Ioannidis’s and Nuzzo’s concerns that such a research design is liable to type 1 errors.6,7 Considering that this interaction term does not even reach statistical significance in Massen et al.’s study if the model is properly specified (i.e., including lower-order terms), we question the validity of their most central finding. If we were to follow standard procedure (see ref. 8, inter alia), regression analysis attributes the variance in the data to the main effect of Sex of Participant, i.e., remains limited to a higher generosity of male respondents.
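The general logic of such a simulation can be sketched as follows. This is a deliberately reduced illustration, not the simulation reported above: it uses only the two sex predictors and the response, a fixed saturated logistic model (for which the interaction estimate and its Wald standard error can be computed directly from the cell counts), no h-index, status or condition, and no AIC-based backward selection. The sample size of 300, the 1,000 iterations, the seed and all function names are assumptions made for the example.

```python
import math
import random


def interaction_p_value(rows):
    """Two-sided Wald p-value for the requester-by-participant interaction.

    rows holds (requester, participant, response) triples of 0/1 values.
    In a saturated logistic model with two binary predictors, the interaction
    estimate is a signed sum of the four cell log-odds, and its variance is
    the sum of reciprocal cell counts. Returns None if a count is zero.
    """
    counts = {}
    for req, part, resp in rows:
        counts[(req, part, resp)] = counts.get((req, part, resp), 0) + 1
    estimate, variance = 0.0, 0.0
    for req in (0, 1):
        for part in (0, 1):
            yes = counts.get((req, part, 1), 0)
            no = counts.get((req, part, 0), 0)
            if yes == 0 or no == 0:
                return None
            sign = 1 if req == part else -1  # +(1,1) -(1,0) -(0,1) +(0,0)
            estimate += sign * math.log(yes / no)
            variance += 1 / yes + 1 / no
    z = estimate / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(n_rows=300, n_iter=1000, alpha=0.05, seed=1):
    """Share of purely random datasets in which the interaction has p < alpha."""
    rng = random.Random(seed)
    hits = valid = 0
    for _ in range(n_iter):
        rows = [
            (rng.randint(0, 1), rng.randint(0, 1), rng.randint(0, 1))
            for _ in range(n_rows)
        ]
        p = interaction_p_value(rows)
        if p is None:
            continue  # skip degenerate datasets with an empty cell
        valid += 1
        hits += p < alpha
    return hits / valid


rate = false_positive_rate()
print(f"share of random datasets with p < 0.05: {rate:.3f}")
```

Because this sketch tests one pre-specified term in a fixed model, one would expect its rate to stay near the nominal 5% level. The figure of close to 30% reported above arises under the full model selection procedure, which this sketch does not reproduce.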

In sum, taking into account (1) the misinterpretation and misspecification of the crucial interaction term, (2) the failure to reach statistical significance with a valid model formula, and (3) the 30% expected false positive rate following Massen et al.’s model selection procedure in a simulation with random data, we doubt the validity of the authors’ central claim that “Sharing of science is most likely among male scientists.” We acknowledge a trend that is visible in Massen et al.’s Fig. 1, but under multivariate control, this trend breaks down. It is very likely to be a spurious artefact of the procedure applied.

With a difficult topic such as a pro-male gender bias in academia, leading journals, such as Scientific Reports, have a responsibility to maintain high standards in methodology. It has been credibly suggested by Duarte and his colleagues9 and subsequent commentaries by Baumeister, by Funder, and by Pinker that research relating to these issues is often not discussed with the much-needed scientific objectivity (see also10,11). Though there are certainly studies that purport to show that women are at a disadvantage in academia, a number of carefully argued recent studies have shown that the underrepresentation of women in some academic fields is not due to an Old Boy network, where males favor their own in hiring.12,13 The final verdict is not yet in on this hotly debated issue, and much remains unclear. Awaiting new research, we ought not abandon the null hypothesis: by Occam’s razor, the Old Boy network remains a mirage in the dataset at hand.

Code availability


References

1. Massen, J. J. M., Bauer, L., Spurny, B., Bugnyar, T. & Kret, M. E. Sharing of science is most likely among male scientists. Sci. Rep. 7, 12927 (2017).

2. Nelder, J. A. A reformulation of linear models. J. R. Stat. Soc. Ser. A Gener. 140, 48–77 (1977).

3. Jaccard, J. Interaction Effects in Logistic Regression (Sage, London, 2001).

4. Venables, W. N. & Ripley, B. D. Modern Applied Statistics with S-PLUS 184 (Springer, London, 1999).

5. García, M. A., Seco, G. V. & Pol, A. P. The two-way mixed model: a long and winding controversy. Psicothema 25, 130–136 (2013).

6. Ioannidis, J. P. A. Why most published research findings are false. PLoS Med 2(8), e124 (2005).

7. Nuzzo, R. Scientific method: statistical errors. Nature 506(7487), 150–152 (2014).

8. Zuur, A. F., Ieno, E. N., Walker, N. J., Saveliev, A. A. & Smith, G. M. Mixed Effects Models and Extensions in Ecology with R (Springer, London, 2009).

9. Duarte, J. L. et al. Political diversity will improve social psychological science. BBS 38, 1–13 (2015).

10. Winegard, B. M., Winegard, B. M. & Deaner, R. O. Misrepresentations of evolutionary psychology in sex and gender textbooks. Evol. Psychol. 12(3), 474–508 (2014).

11. Von Hippel, W. & Buss, D. Do ideologically driven scientific agendas impede understanding and acceptance of evolutionary principles in social psychology? In The Politics of Social Psychology (eds Crawford, J. T. & Jussim, L.) 7–25 (Psychology Press, London, 2017).

12. Ceci, S. J. & Williams, W. M. Sex differences in math-intensive fields. Curr. Direct. Psychol. Sci. 19(5), 275–279 (2010).

13. Williams, W. M. & Ceci, S. J. National hiring experiments reveal 2:1 faculty preference for women on STEM tenure track. PNAS 112(17), 5360–5365 (2015).


We are grateful to the authors of the original study for making their dataset generously available, which is the only way to advance science. For the analysis we used the open source statistical software package R (version 3.4.1; R Core Team 2017) and the lme4 package (version 1.1-13; Bates, D., Maechler, M., Bolker, B., Walker, S. Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software 67(1), 1–48 (2015)). We also want to thank Stefan Gries for sharing his thoughts on the technical aspects of our re-analysis.

Author information




F.V.d.V. and B.H. analysed the data and wrote the article together.

Corresponding author

Correspondence to Freek Van de Velde.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit

About this article

Cite this article

Van de Velde, F., Heller, B. The data do not support the existence of an ‘Old Boy network’ in science. Some critical comments on a study by Massen et al. Sci Rep 10, 13784 (2020).
