Intellectual synthesis in mentorship determines success in academic careers

As academic careers become more competitive, junior scientists need to understand the value that mentorship brings to their success in academia. Previous research has found that, unsurprisingly, successful mentors tend to train successful students. But what characteristics of this relationship predict success, and how? We analyzed an open-access database of 18,856 researchers who have undergone both graduate and postdoctoral training, compiled across several fields of biomedical science with an emphasis on neuroscience. Our results show that postdoctoral mentors were more instrumental to trainees’ success compared to graduate mentors. Trainees’ success in academia was also predicted by the degree of intellectual synthesis between their graduate and postdoctoral mentors. Researchers were more likely to succeed if they trained under mentors with disparate expertise and integrated that expertise into their own work. This pattern has held up over at least 40 years, despite fluctuations in the number of students and availability of independent research positions.

We have included a discussion of these alternative and more complex temporal models in the text, although we have chosen to keep the simpler model in the main text because it captures the same network and synthesis effects (mark 10 on page 12 and mark 26 on page 8). In addition, we have included a more general discussion of the complexity of this data set and of how larger datasets might address concerns about long-term trends and sampling bias (mark 21 on page 19).
As an aside, we also computed the potential influence of NIH funding after reading your remark. Although econometrics is not our area of research, we followed the procedure used in ref. 6 and were able to recompute the graphs therein. After feeding the total NIH funding in the year of the last postdoc (in constant 2009 dollars, adjusted for inflation in biomedical research using the BRDPI price index) as a variable to the model, we found that it had no significant impact on the odds of continuing in research. See graph below. The only effect of this variable is an interaction with the general temporal variable "Training end date" on long-term proliferation, possibly as a collinear balancing artifact (one variable becomes a significant positive factor while the other becomes a significant negative factor). Our understanding from this analysis is that a complete understanding of temporal effects will require an in-depth analysis incorporating multiple factors, and that, generally, the effects of intellectual synthesis appear robust to the inclusion of extraneous variables. We do not mention this analysis in the manuscript due to its inconclusive nature.

Second, data availability is likely to be grossly inconsistent in terms of coverage of different fields. The data in Figure 7 suggest that other fields are likely poorly covered when compared to neuroscience.
(R1.2) We agree that the strong representation of neuroscience could bias the broader bioscience results.
To test for this possibility, we performed a new analysis on the data. We broke the data into two groups: neuroscience only, and all fields except neuroscience (split based on whether the trainee was listed in neuroscience), and fit the same regression model on these separate datasets. This analysis revealed similar results in both datasets. There are some differences between them, but the main intellectual synthesis effects are preserved. The results are included in a set of new supplementary figures in the Supplementary Information (mark 27 on page 13) and are presented in the main text (mark 13 on page 14).
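The shape of such a field-split robustness check can be sketched as follows. This is an illustrative snippet of ours, not the authors' code: the feature names, the field label, and the data are all synthetic, and a simple logistic regression stands in for the paper's actual model.

```python
# Hypothetical sketch of a field-split robustness check: fit the same
# regression on neuroscience-only and non-neuroscience subsets and compare
# the coefficient of interest. Data and feature names are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
# Columns stand in for e.g. mentor proliferation, intellectual synthesis, ...
X = rng.normal(size=(n, 3))
is_neuro = rng.random(n) < 0.6          # trainee listed in neuroscience?
logit = 0.8 * X[:, 1] - 0.2 * X[:, 0]   # success driven mainly by "synthesis"
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

coefs = {}
for name, mask in [("neuro", is_neuro), ("non-neuro", ~is_neuro)]:
    coefs[name] = LogisticRegression().fit(X[mask], y[mask]).coef_[0]

# A robust effect should keep its sign and rough magnitude in both subsets.
print({k: np.round(v, 2).tolist() for k, v in coefs.items()})
```

If the "synthesis" coefficient (column 1) flipped sign or collapsed in one subset, that would flag exactly the kind of field-driven bias the reviewer raises.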
By the way, the values on the x-axis for "mentor proliferation rate" do not match the values mentioned in the caption.

(R1.3) Fixed, thanks.
Third, there is no indication that triplets were restricted geographically at all. However, it is likely that career opportunities and coverage vary widely with geography and pedigree (institutional affiliation).
(R1.4) While we did not explicitly include institutional affiliation as a factor in the current study, we include many other variables that correlate with its beneficial/antagonistic effects: the mentor's seniority, the mentor's trainee proliferation (a metric correlated with lab size, which previous work has shown to be correlated with awards such as membership in the US National Academy of Sciences), and the number of co-publications (linked to the overall productivity of mentors). Thus, we already control for many of the factors underlying the beneficial/antagonistic aspects associated with prestigious or unknown institutional affiliations.
As for geography, we agree that geographical factors could correlate with the odds of obtaining a permanent position. Indeed, a sizable portion of life science postdocs in the US have been trained in another country at some earlier point in their lives, and we could expect that international mobility has some impact on academic success, for example in terms of obtaining a position in the US or in another country. However, studying this process is fraught with non-trivial technical difficulties, as some of the key variables of interest (nationality and visa situation) are simply out of reach. It is also a very intricate and potentially divisive issue, and we believe that treating it as an add-on to this study cannot possibly be done in a satisfying way.
To address this concern, we chose to be more thorough in our interpretation of our results. We specifically tried to be more explicit about our thought process and the underlying assumptions in the introduction (mark 2 on page 3) and discussion (mark 20 on page 19).
Fourth, the definition of similarity for trainees and mentors is presented but not supported. Why should I trust the metric presented in Figs 1B,C? Why not assign topics to the publications and look at a vector over topics instead of over terms? The projection on terms is likely very noisy.

(R1.5) The dimensionality reduction applied to the semantic space can actually be thought of as a way of measuring topics. We initially analyzed the overlap of PubMed keywords. Upon working with our co-authors who specialize in these approaches, however, we found that the current semantic analysis provided a highly correlated but much finer-grained measure of similarity. This particular method for latent semantic analysis has been validated and is the subject of a publication (see Fig. 4).

I believe that all these issues are easily addressable by the authors. They do not need to get any new data and likely only need to write some more code.

Reviewer #2 (Remarks to the Author):

In this paper, the authors conduct an analysis of a database of about 20,000 researchers in biomedical sciences, focusing on individuals with both a graduate degree and postdoctoral training recorded in the database. (They use the term "triplets" to describe these structures, but the definition, as best as I can tell, appears only in the caption of Figure 2, which is not until after the term has been used multiple times.)

(R2.1) Thanks, fixed (mark 4 on page 4).
They look at a variety of features of the mentorship network in order to determine what increases the success of a protege. (The authors define success as training either a graduate student or postdoctoral fellow with a record in the database under study.) The authors' principal conclusion is that what they term "intellectual synthesis" is, of the factors they studied, the strongest influence on continued academic research. Their conclusion is based on an analysis of abstracts of papers written by the individuals in the database, using a stemmed keyword approach. They find that proteges who publish papers involving keywords in common with their doctoral advisor and their postdoctoral mentor are more successful, provided those two keyword sets have some but not too many commonalities. The authors conclude that in these situations, the protege has synthesized the work going on in two loosely-related disciplines and increased their chances of success. The authors also find some connection between an individual's success and the rate at which their mentors train proteges, and the experience level of the postdoctoral mentor (but not the doctoral mentor).
The authors' work is novel and fits into the existing literature of similar analyses conducted on other data sets. It should promote further discussion and study of the role of academic networks on researchers' career paths, particularly because the authors have reported their results on a wide variety of possible factors that were examined and found to have little or no impact on an individual's success. I am not qualified to comment on the statistical methods used by the authors, but I applaud them for examining a number of factors and reporting the conclusions they were able to draw from them (or that no reliable connection was found). (I will note that the content of Figure 4 disagrees with the caption, with one mentioning an 8.9% gain and the other 9.3% for the same variable.)

(R2.2) Thank you, we have corrected this discrepancy to reflect the actual value of 8.9% in the caption (mark 9 on page 12).

The primary recent work that the authors cite and compare to, particularly with regard to mentor experience, is the work of Malmgren et al., which looked at the impact of doctoral mentor experience in mathematics using data from the Mathematics Genealogy Project. Malmgren et al. analyzed data from the first 60 years of the 20th century, and because of the content of the Mathematics Genealogy Project database, were restricted to studying only the doctoral advising relationship. In contrast, the authors here have a dataset consisting primarily of individuals who completed their academic training since 1990 and only considered individuals who had postdoctoral training when conducting their analysis. With regard to the mentor experience analysis discussed on lines 307-321, it would be interesting to see if a change to consider dyads as in the paper of Malmgren et al. impacts the conclusions in this paper, since the restriction to individuals with postdoctoral training that took place before the analysis may have an influence.

(R2.3) We should note that while PhDs who did not complete a postdoc are the norm in mathematics, at least for the time periods analyzed in Malmgren et al., performing a postdoc is essentially a requirement in the life science careers that we study in our work (a trend that started in the 1970s and is thus aligned with the bulk of the Academic Tree subset studied here, cf. reference 43, the US National Research Council report of 1981). In this context, it is difficult to design a fair comparison between the role of the graduate advisor of math PhDs from 1900-1960, whose early careers were influenced mostly by this advisor, and that of the graduate advisor of more recent life science PhDs, who we know are also influenced by their postdoctoral advisor.
We still sought to address this question by computing the model on the same trainee subset but using only features from their graduate studies. This tests the hypothesis that the absence of an effect of the graduate mentor's age is not a confound of the postdoctoral training variables, based on the same dataset of PhDs who did a postdoc (since that is representative of the norm in the life sciences).
In this case we still found no significant effect of the graduate mentor's age, suggesting a difference between mathematics and the bioscience fields. See figure below. We added this additional analysis and supporting figure at mark 18 on page 19 (main text) and mark 25 on page 4 (supplementary materials).

Also of concern in the authors' approach here is that, even after conducting a significant amount of data inference to correct for missing data, mentor academic age was not available for 49% of triplets. The data used by Malmgren et al. would have had much more complete information. The Mathematics Genealogy Project dataset is certainly different to the one considered by the authors here, since it does not include information on postdoctoral mentorship. However, it is a more robust and curated database than that studied by the authors.
(For example, restricting the math dataset to just doctoral degrees awarded in the 1990s would result in over 42,000 mentor-protege dyads to study.)

(R2.4) We agree; it is frustrating that the crowd-sourced system used in the Academic Family Tree does not require entry of training dates. We chose to be conservative in our method for measuring training dates because we observed that long-term trends have had such a strong influence on trainee success rates. Generally we were able to infer training dates based on publication information. Since publication data were also required for the analysis of intellectual synthesis, we decided to focus on this smaller subset of researchers.
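The inference of training dates from publication records can be illustrated with a small sketch. This is a hypothetical illustration of ours, not the pipeline used in the paper: the record structure and the helper name are invented, and the idea shown is simply bounding a training period by the first and last publications co-authored with a given mentor.

```python
# Hypothetical sketch: conservatively bound a training period by the first
# and last publications co-authored with the mentor. Records are invented.
pubs = [
    {"trainee": "a", "mentor": "m1", "year": 2001},
    {"trainee": "a", "mentor": "m1", "year": 2004},
    {"trainee": "a", "mentor": "m2", "year": 2006},
]

def infer_training_span(pubs, trainee, mentor):
    """Return (first_year, last_year) of co-publication, or None if unknown."""
    years = [p["year"] for p in pubs
             if p["trainee"] == trainee and p["mentor"] == mentor]
    return (min(years), max(years)) if years else None

print(infer_training_span(pubs, "a", "m1"))  # (2001, 2004)
```

A conservative rule like this necessarily leaves spans undefined when no co-publications exist, which is consistent with focusing the analysis on the publication-rich subset of researchers.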
More generally, we agree that a comparison with Malmgren et al. is highly relevant, and we have expanded this part of the discussion to consider differences in data sampling, in addition to the network analysis results already discussed (mark 19 on page 19).
I would also like to see the authors expand the information on the semantic analysis portion of their paper, since that is where their principal conclusion lies. In particular, the author identification used to link researchers to publications was primarily done automatically. The authors cite a 90% agreement rate between the user-curated data and the automated matching, but there is no discussion of what that rate is for the researchers considered in the data set studied in this paper. This is another area where the Mathematics Genealogy Project dataset is more robust than the one under study, since researcher identification with MathSciNet is quite curated. Although the MGP-MathSciNet identification also involves automation, the MathSciNet database of publications is entirely curated, with significant attention paid to author identification by skilled bibliographers. Unfortunately, the lack of postdoctoral mentor data in the Mathematics Genealogy Project dataset means that the authors' intellectual synthesis analysis cannot be replicated.
(R2.5) This is a helpful point. We have revised the text to cite the study that developed and validated the semantic analysis method (mark 24 on page 23). We also now report accuracy of author identification for the subset of researchers included in the paper, which is 93% (mark 23 on page 23).
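For readers unfamiliar with this class of methods, the flavor of latent semantic similarity can be sketched as follows. This toy snippet (TF-IDF, truncated SVD, cosine similarity, on invented abstracts) is ours; the validated method cited in the text is more sophisticated than this sketch.

```python
# Toy latent-semantic-similarity sketch: embed abstracts with TF-IDF,
# reduce with truncated SVD, and compare researchers by cosine similarity.
# Illustrates the flavor of the approach, not the authors' pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

abstracts = [
    "synaptic plasticity hippocampal neurons",      # researcher A
    "hippocampal memory potentiation circuits",     # researcher B
    "protein folding kinetics yeast",               # researcher C
]
X = TfidfVectorizer().fit_transform(abstracts)
Z = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)
sims = cosine_similarity(Z)

# The two neuroscience abstracts should land closer to each other than
# either does to the protein-folding one.
print(sims.round(2))
```

In a metric of this kind, similarity is driven by latent term co-occurrence structure rather than exact keyword overlap, which is what makes it finer-grained than counting shared PubMed keywords.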
This paper should fuel further discussion and study into what conclusions can be drawn about the role of mentors on researchers' academic careers, and perhaps even encourage the creation of more robust datasets (crowdsourced or curated) to enable further studies. However, there are some shortcomings in terms of the restrictions placed upon the dataset studied relative to prior studies and other datasets that could be investigated in a similar manner to at least some of the authors' work here. (In some cases, such as the author identification, this may not be a shortcoming of the dataset but just a lack of detail in what the authors have provided about their subset of the overall dataset.)

Mitchel T. Keller
Managing Director, Mathematics Genealogy Project

(R2.6) Thanks. We agree that developing more complete and accurate datasets will be valuable for further study of these issues. We have included points about this future direction, which we plan to pursue, in the revised Discussion (mark 22 on page 20).

Reviewer #3 (Remarks to the Author):
This paper tackles the question of how mentorship determines the chances of success in scientific careers. The question is addressed by performing a statistical analysis and modelling of a dataset of academic genealogy in the life sciences, where for a number of scientists graduate and postdoctoral mentorship relations are known. The authors use the number of scientific trainees a scientist has during his/her career as a proxy to gauge success.
Then the analysis shows the role of different variables in determining success: some related to career characteristics of trainees and mentors, like the duration of postdoctoral training or the academic age of mentors, others related to network/relational properties, like finding common ancestors between graduate and postdoctoral mentors.

There are a number of interesting findings in the paper, the most novel ones, as pointed out by the authors, being the fact that postdoctoral mentor proliferation (the number of trainees of the postdoctoral mentor) has a greater effect than that of the graduate mentor on the odds of getting a permanent position, and that successful scientists are those that have done research similar to their mentors, while the mentors have dissimilar research.
The core of this paper is a thorough and detailed analysis of the dataset mentioned above. Everything looks technically correct and the language is clear, but the paper reads at times like an applied statistics or econometrics paper, to the point of sounding aseptic. I feel this is more than just a matter of style, as it limits the ability of the reader, especially one from the research policy or science of science community, to go ahead and look into the findings and their implications. This aspect of the paper is also reflected in the references: the authors cite a lot of statistics papers, which I appreciate because we do need sound and robust statistical analysis, but the paper lacks many references to science of science and research policy, and the related discussion. For example, since the paper deals with modelling academic careers, references and connections to the findings of papers about data-driven models of success in scientific careers would have been expected (see for example, Petersen 2017). Also I would have expected at least some comments, if not an analysis, on variables for which there is consensus that they are fundamental for scientific success, like collaboration patterns or teams (for a review of the field, with many relevant references, Fortunato et al. "Science of Science", Science (2018)).

(R3.1) Thank you for pointing us to this interesting literature. These works were helpful to improve the introduction (mark 1 on page 2), and overall to relate our approach focused on academic genealogy to the existing body of bibliometric work.
Also, thanks to your feedback on style, we streamlined our presentation of the results to avoid distracting some potential readers. In particular, we opted to move some of the most technical aspects of the results about model selection (negative binomial vs. Poisson distribution and zero-inflated vs. hurdle model structure) to Supplementary Information, and kept only a brief summarized form in the main text (mark 8 on page 9).
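The motivation behind this model selection can be made concrete with a small diagnostic sketch on synthetic counts (ours, not the paper's data): trainee counts are typically overdispersed and zero-heavy relative to what a Poisson model predicts, which is what favors negative binomial and zero-inflated/hurdle structures.

```python
# Sketch: diagnose overdispersion and excess zeros in synthetic count data,
# the two features that motivate negative binomial and zero-inflated/hurdle
# models over a plain Poisson model.
import numpy as np

rng = np.random.default_rng(1)
counts = rng.negative_binomial(n=2, p=0.3, size=10_000)  # overdispersed counts

mean, var = counts.mean(), counts.var()
dispersion = var / mean                 # ~1 under Poisson; >1 = overdispersion
p_zero_obs = (counts == 0).mean()       # observed share of zeros
p_zero_pois = np.exp(-mean)             # zero share a Poisson fit would imply

print(f"dispersion={dispersion:.2f}  zeros: observed={p_zero_obs:.3f} "
      f"Poisson-implied={p_zero_pois:.4f}")
```

When both diagnostics fire, as they do here by construction, a negative binomial family handles the overdispersion, and a zero-inflated or hurdle layer handles the surplus of researchers with zero recorded trainees.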
I have two major issues regarding the analysis and interpretation of some of the results.

- One issue regards the limited temporal dimension of the analysis, with important implications for the interpretation of the presented results. It is not clear whether there is an overall trend for the scientists that are more successful to go from a graduate mentor to a "better" postdoctoral mentor. If such a trend exists, then the statement/explanation that "postdoctoral mentor has a greater effect than the graduate mentor on the odds of securing a permanent position" should definitely be revisited: it might simply be that those scientists that (randomly?) move to a better postdoctoral mentor (measured with proliferation rates) then have better access to opportunities and are more likely to be successful themselves.
(R3.2) This is a relevant concern that we failed to consider in the original manuscript. To test this possibility, we re-fit the model with a new term, "postdoc vs. graduate mentor proliferation", computed as the ratio of the postdoctoral to the graduate mentor's proliferation rate. This term should identify systematic benefits of moving to a more prolific postdoctoral mentor. This additional term had no significant predictive power, and thus we infer that the proposed pattern of mentorship is not the common mode for successful trainees. We have included this analysis in a new section of the results (mark 14 on page 15).

- The other issue is about the important finding, also summarised in the title, that successful scientists are those that make a synthesis between two different lines of research, represented by the graduate and postdoctoral mentors, with important implications for bridging different areas of knowledge. My issue with this is that again we have no insight about the temporal evolution of the career of the scientist: can it simply be that scientists who move to a more thriving line of research or subfield for their postdoctoral training have more chances to get a permanent position? From this point of view the higher success rate is not due to the intellectual synthesis of the trainee, but to the fact that s/he changed topic and moved to a research area with more opportunities.
(R3.3) We agree that this is a likely model for at least some successful trainees, and agree that it may be difficult to tease apart in the current dataset, as a successful postdoctoral mentor is likely to themselves be in a thriving subfield. Thus the concern does not seem to reflect a potential bias so much as a different interpretation of the results.
We did consider one alternative model with a new interaction term, (trainee-postdoc mentor similarity * postdoc mentor proliferation), contrasted with a second term, (trainee-graduate mentor similarity * graduate mentor proliferation). The first term should be high for trainees who moved into the "better" subfield of their postdoctoral mentor, while the second term contrasts this effect and should be high for trainees who stayed in the "better" subfield of their graduate studies. Neither term was linked to increased (or decreased) odds of finding a permanent position in this dataset. See graph below.
Interestingly, there is a very small effect of the "Postdoc mentor rate × similarity" term on long-term proliferation, an effect that is hard to interpret. It is quite small, and we know that interaction terms of significant variables tend to be significant themselves and may balance the effect of their original variables (in this regression, the effect of the "Postdoc mentor rate" on the number of trainees is indeed larger than in the original regression), so we avoid reading too much into this new effect.
We include a description of these results in the new section on network-semantic interactions (mark 15 on page 15). We have also included a discussion of different models for postdoctoral mentor influence in the discussion (mark 16 on page 17), commenting that the specific benefits of a successful postdoctoral mentor may be variable but that the main effects of the paper (greater influence of postdoctoral mentor and intellectual synthesis) still hold, despite these alternative mechanisms.
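The interaction-term check described in this response can be sketched as a logistic regression on synthetic data with statsmodels. The column names (pd_rate, pd_sim, grad_rate, grad_sim, continued) are illustrative stand-ins, not the variables of the actual dataset, and the simulated outcome contains no true interaction effect:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "pd_rate": rng.normal(size=n),    # postdoc mentor proliferation rate (stand-in)
    "pd_sim": rng.normal(size=n),     # trainee-postdoc mentor similarity (stand-in)
    "grad_rate": rng.normal(size=n),  # graduate mentor proliferation rate (stand-in)
    "grad_sim": rng.normal(size=n),   # trainee-graduate mentor similarity (stand-in)
})
# Simulate outcomes from main effects only (no true interaction).
logit_p = 0.5 * df["pd_rate"] + 0.3 * df["pd_sim"]
df["continued"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# The '*' operator expands to main effects plus the interaction term;
# pd_rate:pd_sim plays the role of "moved into the postdoc mentor's subfield".
res = smf.logit("continued ~ pd_rate*pd_sim + grad_rate*grad_sim", data=df).fit(disp=0)
print(res.params["pd_rate:pd_sim"], res.pvalues["pd_rate:pd_sim"])
```

The significance (or lack thereof) of the interaction coefficient is then read off the fitted model, as in the contrast described above.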
The two hypotheses above can be ruled out with the data at hand, and in general more explanations and intuitive insights should be given on why the authors observe what they observe. In general I do believe that in a study of careers one needs to look into and understand the temporal evolution of variables and understand how these connect to the probability of staying in academia and of achieving success. Some of the consequences/explanations I am suggesting might be possible already with the analysis at hand, but if that is the case, they are too deeply hidden behind the technical details.
In summary, the authors have clearly put a lot of effort into the analysis and have done a great job with the technical details, but the main results, their consequences and the overall narrative are obscured by the technical nature of the presentation. Also I think it is fundamental to do a better job of offering more support and clearer explanations for some of the main observations of the paper. I think these issues can definitely be fixed with a thorough revision of the paper.

Minor points:
- References 33 and 34 are duplicates, and the correct one to be used - to my knowledge - is 33 (not Nunes but Amaral). (R3.5) Fixed.
- lines 66-67: there is a parenthesis missing, and the sentence is somehow messed up.
[Figure caption fragment] Research performed as a graduate student or postdoc may be more or less aligned with the mentor's own […] bars at the bottom of the graph. G: the similarity between co-mentors (blue) is higher than among a randomly picked pair of researchers (red). H: closer common-ancestor distance leads to greater publication similarity, and this effect is cumulative with the higher proximity of researchers that co-mentor the same trainee.

Several variables had a negative Shapley score. The impact of mentorship variables on the odds of trainees obtaining an independent research position and on their long-term training proliferation rates are summarized in Table 1. A change of one unit on the proliferation scale (i.e. one more trainee per decade) increases the odds of the protégé finding a permanent position by 8.9%, and this effect is statistically significant. The long-term effect of this change is also significant, and the protégé's proliferation rate is then increased by 2.6%.
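For readers translating between reported percentages and regression coefficients: a k% change in odds per unit change of a predictor corresponds to a logistic coefficient of ln(1 + k/100), since the odds ratio per unit is exp(beta). A quick check of the 8.9% figure:

```python
import math

# A k% increase in odds per unit change corresponds to a logistic
# coefficient beta = ln(1 + k/100), since exp(beta) is the odds ratio.
beta = math.log(1.089)   # 8.9% higher odds per extra trainee per decade
print(round(beta, 4))    # -> 0.0853

# Conversely, a fitted coefficient maps back to a percent change in odds:
pct = (math.exp(beta) - 1) * 100
print(round(pct, 1))     # -> 8.9
```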

Overall, our study supports the long-standing advice that prospective students should look at the training record of potential mentors to assess their quality 5,26. We show here that besides boosting the odds of securing a permanent research position, highly prolific mentors also tend to have highly prolific trainees, a desirable quality as it is globally linked to academic achievements 8,15.

For computing similarity between the two mentors in a triplet, we included only publications prior to any […] and postdoctoral mentors in the mentorship graph (Fig 1A).

Reviewers' Comments:
Reviewer #1: Remarks to the Author:
I commend the authors on all the work they did to improve the manuscript. I think the additional tests dramatically strengthen confidence in the reported results.
However, I still think that several changes need to be made. First and foremost, I believe that the analyses in the main text should be restricted to neuroscience. The number of triplets for neuroscience is about six-fold the number for all other life science fields.
I think the main text could state that analyses for other life science disciplines are not in conflict with the results for neuroscience but I do believe that the appearance that all results hold for all life science fields is an over-reach.
Second, I believe that the analyses reported in Fig. 3 can be improved. The data in Fig 2A shows […] For the case of Fig. 3D, I would plot the value of the mean and skewness versus mentor group.

Third, I realized that I am not sure what the bars and scale represent in Fig. 4 and similar figures. I think it would be more useful to have the bars identifying the 95% CI for the estimates of the coefficients.
Reviewer #2: Remarks to the Author:
I am satisfied by the authors' revisions and comments in response to the remarks in my report and those of the other two reviewers. While there are limitations to the data set the authors have studied in this work, I believe it is robust enough to support their conclusions. Furthermore, publication of this article will hopefully lead to further development of data sets that can be used to conduct further studies in the future. I support the publication of this article in Nature Communications.
I have only a few minor editorial comments for changes to this version:
Line 156: "each variables" should be "each variable".
Line 190: The hyphen in "Forward-" appears to be in error.
Line 246: It's not clear that the word "in" belongs in this line.
Reviewer #4: Remarks to the Author:
Through a thorough and detailed revision of the manuscript, claims by Lienard and colleagues about success in academia and the role of intellectual synthesis in mentorship now appear to me very solid.
I am satisfied with the pinpoint replies to my two major comments about testing for i) (lack of) trend for successful scientists to go from a graduate mentor to a "better" postdoctoral mentor ii) (no clear) trend for successful scientists who move into a more thriving line of research for the postdoctoral training. The absence of either effect is reported in detail in the main text and in the supplementary material.
I appreciated that the revised version of the paper maintains all the rigour of the first submission, but it is now much easier to read. This applies in particular to Section 2.2 about the model of academic success in life science (mark 8), which is now much more accessible to non-statisticians and suitable for the wide readership of Nature Communications.
I also found convincing the discussion of similarities and differences with the paper by Malmgren et al. raised by comments from another reviewer, as well as evidence for the importance of intellectual synthesis both in neuroscience and in the other life sciences.
Also, references from the research policy and science of science community have been adequately inserted and discussed in the manuscript (please note that reference 43 arXiv:1607.05606 has recently been published in the Journal of Informetrics).
Taken together, the paper is well-written and timely in its findings, shedding new light on the determinants of academic success (a widely debated topic in the literature) in the life sciences, and I would now recommend acceptance in Nature Communications.
I commend the authors on all the work they did to improve the manuscript. I think the additional tests dramatically strengthen confidence in the reported results.
However, I still think that several changes need to be made. First and foremost, I believe that the analyses in the main text should be restricted to neuroscience. The number of triplets for neuroscience is about six-fold the number for all other life science fields.
I think the main text could state that analyses for other life science disciplines are not in conflict with the results for neuroscience but I do believe that the appearance that all results hold for all life science fields is an over-reach.
We agree with this remark, and we made changes to the scope of the paper to more clearly reflect the bias of our dataset toward neuroscience. We altered the text in several places throughout the manuscript, including in the abstract, to be perfectly explicit about it.
In addition, we added more details on our control analysis that compares the neuro and non-neuro parts of the dataset, showing that the pattern of results is not restricted to neuroscience.
Overall, even though we gained additional confidence that our results likely generalize to all of life science, we fundamentally agree with your remark and modified the text to clearly reflect the scope and limits of our analysis.
Second, I believe that the analyses reported in Fig. 3 can be improved. The data in Fig 2A shows […] For the case of Fig. 3D, I would plot the value of the mean and skewness versus mentor group.
To the best of our knowledge, nobody has yet reported the distributions of publication similarity, or their link with continuing in academia. We thus believe that the graphs of Figure 3 are an important part of this study, especially as we believe it fits within the flow of the paper. Showing that these distributions are well-behaved also strengthens the confidence in the modeling effort, as it demonstrates that it is not based on pathologically biased/skewed data.
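Distribution comparisons of this kind (publication similarity for trainees who did vs. did not continue) are typically tested with the two-sample Kolmogorov-Smirnov test. A sketch with scipy; the beta-distributed similarity scores below are purely illustrative, not the study's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical publication-similarity scores for trainees who did or
# did not continue in academia (synthetic, for illustration only).
sim_continued = rng.beta(2.0, 5.0, size=500)
sim_left = rng.beta(2.5, 5.0, size=500)

# Two-sample Kolmogorov-Smirnov test: do the two samples come from
# the same underlying distribution?
stat, pval = stats.ks_2samp(sim_continued, sim_left)
print(stat, pval)
```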
This being said, we understand that the visualization we originally used may look "dry", so we tried to improve it based on the reviewer's suggestion, by showing the mean difference and 95% confidence interval (also reinforcing the conclusion of the two-sample Kolmogorov-Smirnov test). We also investigate specifically the possibility that the beneficial effects of greater/lower publication similarity would only be valid for the high (or low) proliferation mentors. We include this analysis as supplementary information and mention it in the main text (mark 1 on page 6).

Third, I realized that I am not sure what the bars and scale represent in Fig. 4 and similar figures. I think it would be more useful to have the bars identifying the 95% CI for the estimates of the coefficients.
We are afraid that an alteration of this figure, introduced in our previous revision, is responsible for the confusion. Indeed, we did not realize that the scale bar (just below the text "scale: z=1") looked like an error bar, suggesting that the error bars represent z-scores (they are really the 95% CI).
We fixed this in all the graphs, in main text and supplementary materials, and rewrote the caption of the graph in main text to avoid any confusion (mark 2 on page 11).
We appreciate, and are thankful for, your reviews which helped us improve our manuscript.

I have only a few minor editorial comments for changes to this version:
Line 156: "each variables" should be "each variable".
Line 190: The hyphen in "Forward-" appears to be in error.
Line 246: It's not clear that the word "in" belongs in this line.
We corrected the remaining typos. We kindly thank you for your review and your help in making our manuscript better.

Also, references from the research policy and science of science community have been adequately inserted and discussed in the manuscript (please note that reference 43 arXiv:1607.05606 has recently been published in the Journal of Informetrics).
Taken together, the paper is well-written and timely in its findings, shedding new light on the determinants of academic success (a widely debated topic in the literature) in the life sciences, and I would now recommend acceptance in Nature Communications.
We fixed this reference.
The effect of training end date on the odds of continuing in academia was found to be very strong. To control for this possibility, we fit the same models using a temporal subset of the data and without the time-controlling variable of "postdoc end year".

Network variables were broadly found to make a larger contribution to overall model performance than publication variables (Fig. 5). This relatively greater influence is consistent with the Shapley values and the variable ordering in Table 1, and is found across all formulations of the model studied (Table 3). We restricted our main analysis to data that was available before the end of training to avoid any confound associated with continuing versus not continuing in academia. To investigate whether the semantic content of papers published after the end of the postdoc continues to influence career outcomes, we further included them as extra variables in the model. We observed substantial explanatory power for this late-publication similarity in explaining continuation in academia, and especially so for the postdoc advisor-trainee similarity (supplemental Fig. 20). This finding suggests that strong ties formed during training and transitioning into a collaboration with the former advisor have a beneficial impact on the trainee's career. This also reinforces the idea that the postdoctoral advisor has a larger influence on the future career than the graduate advisor, as was found using variables available at the end of the postdoc (Supplemental Figs. 4 and 20).

The composition of the life science dataset is dominated by neuroscience graduates. Indeed, 62% of the triplets (n = 14,953) have a trainee identified as belonging to the field of neuroscience.

Mentor graph distance showed a clear inverse relationship with mentor publication similarity (Fig. 2H), which has large explanatory power in the model (Table 1). Indeed, these two variables have a weak but significant correlation (r = -0.192, p < 0.05; Fig. 6A).
Thus, although mentor graph distance had low Shapley value and low importance according to the forward and backward CSA algorithms (Table 1, Fig 6B), we considered the possibility that it might somehow influence trainee outcomes. Mentor graph distance shows a striking bimodal distribution that suggests a more complex nonlinear relationship with other model variables (Fig. 6C). The distribution of mentor graph distance is broadly similar for trainees who did or did not continue in academia. However, for trainees with very short mentor graph distance (< 4 steps) the probability of continuing in academia appears to be consistently lower. We grouped the data into two […]

Given that trainees benefit from mentors with dissimilar research, one might also expect trainees to benefit from mentors separated by a large distance in the genealogy graph. However, mentor network distance (i.e., distance to a common ancestor) does not predict trainee continuation or proliferation (Table 1, Fig. 6B).

This suggests that mentor graph distance may be too crude a measure of intellectual similarity to provide […]

• graduate mentor age, postdoc mentor age: number of years since the mentor completed their own training ("academic age" in 13).

• mentor publication similarity: publication similarity (cosine distance between average publication vectors) between mentors for papers published before they started training the protégé and excluding any co-authored publications.

• graduate mentor/trainee similarity, postdoc mentor/trainee similarity: publication similarity between trainee and mentor for publications before the end of postdoctoral training and excluding any co-authored publications.

• publications with graduate mentor, publications with postdoc mentor: number of co-authored publications between mentor and trainee prior to the end of postdoctoral training.

To avoid bias in the similarity measure due to co-authored publications (which would have artificially increased publication similarity), we specifically excluded them in the publication similarity computations.

That is, graduate mentor/trainee similarity was computed using publications where they do not appear as co-authors. In practice, the publication corpus of the trainee was thus mostly composed of publications co-authored with the postdoctoral advisor. […] were excluded from modeling to take into account the time needed to train students or postdocs when continuing in academia, resulting in 1,345 triplets that could be used to screen the impacts of all factors.
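The publication-similarity measure described above can be sketched as the cosine similarity between two researchers' average publication vectors. The toy 3-dimensional "topic" vectors below are purely illustrative (in practice each paper would be a high-dimensional semantic embedding), and the exclusion of co-authored papers is assumed to happen before this step:

```python
import numpy as np

def publication_similarity(pubs_a, pubs_b):
    """Cosine similarity between the average publication vectors of two
    researchers. Each row of pubs_a / pubs_b is one paper's semantic vector."""
    mean_a = np.asarray(pubs_a).mean(axis=0)
    mean_b = np.asarray(pubs_b).mean(axis=0)
    return float(mean_a @ mean_b /
                 (np.linalg.norm(mean_a) * np.linalg.norm(mean_b)))

# Toy topic vectors for two mentors' (non-co-authored) papers.
mentor1 = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
mentor2 = [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]]

s_same = publication_similarity(mentor1, mentor1)  # identical corpora: ~1.0
s_diff = publication_similarity(mentor1, mentor2)  # disjoint topics: low
print(s_same, s_diff)
```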

The exclusion of non-significant factors (Table 1 in Results) increased the number of triplets available to fit the final model to 2,109. […] formalism. In this framework, the probability of continuing in research is modeled by a binomial variable and the proliferation of researchers that moved on to a permanent research position is modeled by a count variable. Given a vector of predictor variables, X (Section 5.5), the model simultaneously describes π(X), the probability of continuing in an academic career after postdoctoral training, and f(X), the expected proliferation for those who do continue, as:

logit(π(X)) = β₀ + Σᵢ βᵢ Xᵢ,   log(f(X)) = δ₀ + Σᵢ δᵢ Xᵢ + log(C)

Parameters βᵢ and δᵢ indicate the relative weight of the i-th variable in predicting π and f, respectively.

The career length is introduced as an offset, log(C), because we are ultimately interested in comparing […]

To confirm that our choice of model formulation and predictors was appropriate, we compared its goodness-of-fit against several alternative formulations (hurdle and zero-inflated, with Poisson and Negative Binomial count models) and predictor sets (Table 3). For each configuration, we evaluated the predictive log-likelihood, computed on held-out data that was not used for fitting 21,51,53. This cross-validation framework is useful for comparing models that do not assume normally distributed errors and that differ in their number of free parameters 21. More specifically, we used k-fold cross-validation 51, where the data is split into k equal-sized random folds, y₁, ..., y_k. We define θ₋ⱼ as the model parameterization (β and δ from Eq. […]).

In regression, a notion equivalent to the Shapley values has been developed to quantify the relative importance […] (Table 1).

5.8 Sensitivity and uncertainty analysis.
We computed 95% bootstrapped confidence intervals for descriptive statistics of the dataset and model prediction 18. They are shown as error bars and shaded areas throughout the figures.

5.9 Data availability.
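The held-out predictive log-likelihood procedure described in the cross-validation passage above can be sketched as follows. A logistic model on synthetic data stands in for the real hurdle/count models, but the scoring logic (fit θ₋ⱼ on k-1 folds, score the log-likelihood of the held-out fold, sum over folds) is the same:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 4))
y = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))

# k-fold cross-validation: fit on k-1 folds (theta_{-j}), then score the
# predictive log-likelihood on the held-out fold j, summing over folds.
total_ll = 0.0
for train_idx, test_idx in KFold(n_splits=5, shuffle=True,
                                 random_state=0).split(X):
    model = LogisticRegression().fit(X[train_idx], y[train_idx])
    p = model.predict_proba(X[test_idx])[:, 1]
    # Log-likelihood of the held-out outcomes under the fitted model.
    total_ll += np.sum(y[test_idx] * np.log(p)
                       + (1 - y[test_idx]) * np.log(1 - p))
print(total_ll)  # higher (less negative) = better predictive fit
```

Because the score is computed on held-out data, models with more free parameters are not automatically favored, which is what makes the comparison across hurdle/zero-inflated formulations fair.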
Figure 9: Coefficients of a model focused on graduate mentor features (academic age, training rate, co-publishing and similarity with trainee) and excluding features related to postdoctoral studies.

Figure 10: Alternative model computed with an additional interaction term, "Postdoc ÷ graduate mentor rates", which was designed to be high for trainees that moved to a postdoctoral mentor with a higher proliferation rate ("upward mobility" hypothesis). The lack of significance for this additional term showed that there is no systematic benefit associated with such a strategy.
Figure 11: Alternative model computed with a new interaction term, "Postdoc mentor rate × similarity", contrasted with a second term: "Graduate mentor rate × similarity". The first term should be high for trainees who moved into the "better" subfield of their postdoc mentor, while the second term contrasts this effect and should be high for trainees who stayed in the "better" subfield of their graduate studies. These terms were not linked to increased (or decreased) odds of finding a permanent position in this dataset. Interestingly, there is a slight influence of the "Postdoc mentor rate × similarity" term on long-term proliferation.
One interpretation is that it corresponds to a long-term fatigue effect of disengagement from opportunistic trainees who embraced the research line of their postdoctoral mentor. However, it may also be a spurious effect that parallels the increased long-term effect of the "Postdoc mentor rate" in this regression, compared to the original regression.

10 Time-dependence of regression coefficients

Fig. 12 shows the regression coefficients obtained when training the model on temporal subsets of the data and without the time-controlling variable of "postdoc end year". Except for this omission, the variables included in the model were the ones obtained after the main selection process (cf. Table 1 in main text).

The optimized coefficients from Fig. 12 display much more variability than the regression shown in the main text, due to lower sample sizes and the exclusion of temporal variables, but overall display similar trends as the full model. We also controlled for long-term temporal effects through a series of higher-order terms in the regression model (Fig. 13). This expansion of higher-order terms has the advantage of modeling arbitrary temporal trends, as in Taylor series, at the risk of over-fitting temporal trends 39. Not surprisingly, these additional terms were able to capture some additional variance in the data, but they had no large impact on the other factors of interest in the regression. In particular, the intellectual synthesis effects appeared robust to the inclusion of finer temporal controls.
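The higher-order temporal controls described above amount to adding polynomial terms of the (centered) training end year to the regression. A sketch on synthetic data; "synthesis" is an illustrative stand-in for the intellectual-synthesis variable, not the authors' actual covariate:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 1500
df = pd.DataFrame({
    "end_year": rng.integers(1980, 2015, size=n).astype(float),
    "synthesis": rng.normal(size=n),  # stand-in for intellectual synthesis
})
# Center and scale the year so that its higher powers stay numerically stable.
df["yr"] = (df.end_year - df.end_year.mean()) / 10.0
p = 1 / (1 + np.exp(-(0.4 * df.synthesis - 0.2 * df.yr)))
df["continued"] = rng.binomial(1, p)

# Polynomial year terms model arbitrary smooth temporal trends, as in a
# truncated Taylor series; I() protects the powers inside the formula.
res = smf.logit("continued ~ synthesis + yr + I(yr**2) + I(yr**3)",
                data=df).fit(disp=0)
print(res.params)
```

If the coefficient of interest (here, "synthesis") is stable once the polynomial year terms are added, the effect is robust to finer temporal controls, which is the check being described.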
To visualize the contribution of each variable, we also display the cross-validated predictions of the model without this variable in purple ("partial model"). Lines and shaded areas represent respectively the mean values and their 95% bootstrapped confidence interval.