Changing institutions is an integral part of academic life. Yet little is known about the mobility patterns of scientists at an institutional level and how these career choices affect scientific outcomes. Here, we examine over 420,000 papers to track the affiliation information of individual scientists, allowing us to reconstruct their career trajectories over decades. We find that career movements are not only temporally and spatially localized, but also characterized by a high degree of stratification in institutional ranking. When cross-group movement occurs, we find that while going from elite to lower-rank institutions is on average associated with a modest decrease in scientific performance, transitioning into elite institutions does not result in a subsequent performance gain. These results offer empirical evidence on institutional-level career choices and movements and have potential implications for science policy.
Despite the importance of career moves for education, scientific productivity, reward and hiring procedures, our quantitative understanding of how individuals relocate to new institutions, and of how such moves shape and affect performance, remains limited. Indeed, previous research on migration patterns of scientists1,2 tended to focus on large-scale surveys on country-level movements, revealing long-term cultural and economic priorities3,4,5,6. At a much finer scale, research on human dynamics and mobility has emerged as an active line of enquiry7,8,9,10,11,12,13, owing to new and increasingly available massive datasets providing time-resolved individual trajectories14. While these studies cover a much shorter time scale than a typical career, they uncover a set of regularities and reproducible patterns behind human movements7,10,15. Less is known about patterns behind career moves at an institutional level and how these moves affect individual performance.
Here we take advantage of the fact that scientists publish somewhat regularly along their career16,17 and that, for each publication, the institution in which the work was performed is listed as an affiliation in the paper, documenting career trajectories at a fine scale and in great detail. These digital traces, offering data on not only individual scientific output at each institution but also career moves from one institution to another, can provide insights for science policy, helping us understand how institutions shape knowledge and what the typical moves in individual career development are, and helping us evaluate scientific outcomes associated with professional mobility.
We use the Physical Review dataset to extract mobility information, publication record and citations for individual scientists. The data consists of 237,038 physicists and 425,369 scientific papers, out of which 4,052 different institutions are extracted after the disambiguation process for authors and affiliations (see SM for disambiguation process). To reconstruct the career trajectory of a scientist, we use the affiliation given in each of his/her publications (Fig. 1). For authors with multiple affiliations listed on a paper, we consider the first affiliation as the primary institution. We compute the impact of each paper by counting its cumulative citations collected 5 years after its publication18,19,20,21.
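As an illustration of the impact measure, the 5-year citation count of a paper could be computed along the following lines. This is a minimal sketch and not the code used for the analysis; the function names `add_years` and `five_year_citations` are hypothetical, and the handling of the window boundaries (inclusive, citations dated before publication ignored) is an assumption.

```python
from datetime import date


def add_years(d, years):
    """Return the date `years` later; 29 February falls back to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February in a target year that is not a leap year.
        return d.replace(year=d.year + years, day=28)


def five_year_citations(pub_date, citing_dates, window_years=5):
    """Count citing papers dated within `window_years` of publication.

    Assumption: both ends of the window are inclusive, and citing papers
    dated before the publication date are ignored.
    """
    cutoff = add_years(pub_date, window_years)
    return sum(1 for d in citing_dates if pub_date <= d <= cutoff)


# Toy example with hypothetical dates.
example = five_year_citations(
    date(2000, 1, 15),
    [date(2001, 3, 1), date(2005, 1, 15), date(2005, 1, 16), date(1999, 12, 31)],
)
```

In this toy example, only the first two citing dates fall inside the window.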
Three characteristics are computed for each institution i (Fig. 2): the institution size (Ai), representing the total number of distinct authors that published at least one paper at institution i; the number of papers (Pi) published under affiliation i; the cumulative number of citations (Ci) collected by all papers Pi. We find that P(A) follows a fat tailed distribution, indicating significant population heterogeneity among different institutions (Fig. 2a). While most institutions are small, a few have a large number of scientists, often corresponding to large institutes or universities. We observe similar disparity in P(C) (Fig. 2b): few institutions acquire a large number of citations, while most research labs or universities receive few citations.
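The three institutional quantities can be sketched as a single pass over author-paper records. The record layout `(paper_id, author_id, institution, citations_5y)` and the function name `institution_stats` are assumptions made for illustration only; they do not describe the actual data format.

```python
from collections import defaultdict


def institution_stats(records):
    """Compute A_i, P_i and C_i for each institution.

    `records` is an iterable of (paper_id, author_id, institution, citations_5y)
    tuples, one per author-paper pair (hypothetical layout).
    """
    authors = defaultdict(set)   # institution -> set of distinct authors
    papers = defaultdict(dict)   # institution -> {paper_id: citations}
    for paper_id, author_id, institution, citations in records:
        authors[institution].add(author_id)
        # A paper with several authors at one institution is counted once.
        papers[institution][paper_id] = citations
    return {
        inst: {
            "A": len(authors[inst]),
            "P": len(papers[inst]),
            "C": sum(papers[inst].values()),
        }
        for inst in authors
    }


# Toy records with hypothetical identifiers.
toy_stats = institution_stats([
    ("p1", "a1", "X", 3),
    ("p1", "a2", "X", 3),
    ("p2", "a1", "X", 5),
    ("p3", "a3", "Y", 1),
])
```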
Figures 2c–d show the correlation between the institution size A and both the average publication impact C/P and the average productivity P/A of institutions. The average productivity and impact of an institution are different but complementary measures of scientific performance. We find the institution size has little influence on productivity (R2 = 0.43) (Fig. 2d), yet it positively correlates with the impact of publications (R2 = 0.85), indicating that large institutions offer a more innovative/higher impact environment than smaller ones as captured by citations per paper (Fig. 2c). Also, as larger institutions have more internal collaborations, the number of co-authors in publications from large institutions might be larger and, as a consequence, these publications might attract more citations18.
Many institutions are small with few citations, hence they account for a very small portion of the data. For the rest of the paper, we will focus on the thousand most cited institutions, accounting for more than 99% of papers. They correspond to institutions with at least 698 citations within the APS data over the 120-year period (shaded area in Fig. 2).
Mobility is often important in furthering a professional career4. In science, the best lab for the type of research you are doing is usually not where you are22,23,24. Nowadays changing countries is a rite of passage for many young researchers who follow the resources and facilities3,16. As the patterns and characteristics of these migrations are blurry, we need to systematically study the mobility of scientists. Thanks to the large disambiguated data spanning the last 120 years that we have compiled, a systematic study of scientific mobility is now possible.
The strong correlations between the three quantities (A, P, C) indicate that any of the three could characterise an institution, serving as a proxy of its ranking against others. Here, we choose C (the total number of citations) as our parameter to approximate the ranking by reputation. Other parameters such as the h-index of an institution or the number of papers P could also be used25,26,27, but the results should be insensitive to this choice owing to the good correlations between these quantities and C (R2 = 0.96 and R2 = 0.92 respectively). The top-ranked institutions all correspond to well-known universities or research labs with a long tradition of excellence in physics (Fig. 3), corroborating our hypothesis that C is a reasonable proxy for ranking. The rankings obtained with the other metrics are also similar and stable.
We focus on authors with similar career longevity, restricting our corpus to those who began their career between 1950 and 1980 and published for at least 20 years without any interruption exceeding 5 years. Following these criteria, we arrived at a subset of 2,725 scientists to study the mobility patterns and their impact on their careers. A total of 5,915 career movements are recorded for this corpus.
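The selection criteria above can be expressed as a simple filter on the publication years of each author. The sketch below is illustrative only; the function name `is_eligible` is hypothetical, and measuring career length and interruptions in whole calendar years is an assumption.

```python
def is_eligible(pub_years, start_range=(1950, 1980), min_span=20, max_gap=5):
    """Check the career-longevity criteria for one author.

    Assumptions: the career starts at the first publication year, its length
    is the difference between last and first publication year, and an
    interruption is the difference between consecutive publication years.
    """
    years = sorted(set(pub_years))
    if not years:
        return False
    if not (start_range[0] <= years[0] <= start_range[1]):
        return False
    if years[-1] - years[0] < min_span:
        return False
    # No interruption may exceed `max_gap` years.
    return all(later - earlier <= max_gap
               for earlier, later in zip(years, years[1:]))
```

For example, an author publishing in 1960, 1963, 1968, 1972, 1976 and 1981 satisfies all three conditions, whereas one publishing only in 1960, 1970 and 1985 is excluded because of the ten-year interruption.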
In Figure 4a we select three individuals as exemplary career histories. Each line represents one individual, with circles denoting his/her publications, allowing us to observe his/her location. The size of the circle is proportional to the citations the paper acquires in five years, approximating the impact of the work. By studying the whole corpus, we compute P(m), the probability for a scientist to have visited m different institutions along his/her career (Fig. 4c), finding that career movements are common but infrequent: only 14% of scientists never moved at all (m = 1). Those who move mostly do so once or twice, with P(m) decaying quickly as m increases.
We also compute P(t), the probability to observe a movement at time t, where t = 0 corresponds to the date of the scientist's first publication. We find that most movements occurred in the early stage of the career (Fig. 4b), supporting the hypothesis that changing affiliations is a rite of passage for young researchers4. This likely corresponds to the postdoc period where graduates broaden their horizons through mobility. This may also reflect the increasing cost of relocation and family constraints as families grow3,5.
A third characteristic is the geographical distance of movements, Δd. Existing literature hints at somewhat competing hypotheses about the role geography plays in career movements. Indeed, research on human mobility suggests that regular human movements mostly cover short distances with occasional longer trips, characterized by a power law distance distribution7,8,10,28; in contrast, country-level surveys find increasing cross-country movements mostly due to cultural exposure and life quality concerns, indicating a potential dominance of long-distance moves in career choices compared with typical human travel1,2,3,5,29,30,31. We measure the distance distribution over all moves observed in our dataset, finding that our results support a combination of both hypotheses.
We find the probability to move to further locations decays as a power law32,33, whereas the null model predicts this probability to be flat (Fig. 4d). This observation is consistent with studies on human mobility in that short-distance moves dominate career choices. Yet, when comparing the power law exponents, we find the exponent characterizing career moves (γ = 0.65 ± 0.053) is much smaller than those observed in human travel (γ ≈ 2), corresponding to a higher likelihood of observing long-range movements. This observation might be explained by the influence that scientific collaborations can have on career movements, as similarly low exponents are observed for collaboration networks between cities34.
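Measuring Δd requires a distance between the geocoded origin and destination institutions. The text does not state which distance was used; the sketch below assumes the great-circle (haversine) distance on a sphere of radius 6371 km, and the function name `haversine_km` is hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two (latitude, longitude) points.

    Coordinates are in decimal degrees. Assumption: a spherical Earth with
    mean radius 6371 km.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))
```

Applying such a function to every recorded transition yields the sample of distances whose distribution is shown in Fig. 4d.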
Taken together, the preceding results indicate that career moves mostly happen during the early stage of a career and are more likely to cover short distances. The observed localization in both time and space raises the question of how individuals move as a function of institutional rankings. To this end, denoting with Ti, j the number of transitions from the institution of rank i to the one of rank j, we measure P(i, j), the probability to have a transition from rank i to rank j, as

$$P(i, j) = \frac{T_{i,j}}{\sum_{i',j'} T_{i',j'}}.$$
Interestingly, we find that most movements involve elite institutions (rank is small) and transitions between bottom institutions are rare (Fig. 5a). This is due to the fact that elite institutions are characterised by larger populations, hence translating into more events.
To account for the population-based heterogeneity, we compare the observed P(i, j) with the probability Pnull(i, j) expected in a random model where we randomly shuffle the transitions from institution i to j while preserving the total number of transitions from and to each institution. Formally, in this null model, we have

$$P_{\mathrm{null}}(i, j) = \frac{\left(\sum_{j'} T_{i,j'}\right)\left(\sum_{i'} T_{i',j}\right)}{\left(\sum_{i',j'} T_{i',j'}\right)^{2}},$$
and we compare P(i, j) with the null model by computing the matrix

$$M(i, j) = \frac{P(i, j)}{P_{\mathrm{null}}(i, j)}.$$
M(i, j) is the ratio of the probability P(i, j) to have a transition from rank i to j to the probability Pnull(i, j) obtained when the movements are shuffled, measuring the likelihood for a move to take place while accounting for the size of the institutions. Hence, M(i, j) = 1 indicates that the number of observed movements is about what one would expect if movements were random. Similarly, M(i, j) > 1 indicates that we observe more transitions from i to j than expected, whereas M(i, j) < 1 corresponds to transitions that are underrepresented. We find that career moves are characterized by a high degree of stratification in institutional rankings (Fig. 5b). Indeed, we observe two distinct clubs (red spots in Fig. 5b), indicating that the overrepresented movements are the ones within elite institutions (lower-left corner) or within lower-rank institutions (upper-right corner), and that scientists belonging to one of the two groups tend to move to institutions within the same group. On the other hand, both upper-left and lower-right corners are colored blue, indicating that cross-group movements (transitions from elite to lower-rank institutions and vice versa) are significantly underrepresented. Also, scientists from medium-ranked institutions move to the next institution with a probability that is indistinguishable from the random case. In other words, their movements indicate no bias towards middle, elite or lower-ranked institutions.
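The comparison with the null model can be sketched numerically from a matrix of transition counts between rank groups. The sketch assumes that the shuffled expectation equals the product of the outgoing and incoming transition fractions, which is the expectation when origins and destinations are paired at random with fixed totals; the function name `stratification_matrix` and the toy counts are hypothetical.

```python
import numpy as np


def stratification_matrix(T):
    """Return (P, P_null, M) for a matrix of transition counts T[i, j].

    Assumption: P_null(i, j) is the product of the fraction of transitions
    leaving i and the fraction of transitions arriving at j.
    """
    T = np.asarray(T, dtype=float)
    total = T.sum()
    P = T / total
    out_fraction = T.sum(axis=1) / total  # transitions leaving group i
    in_fraction = T.sum(axis=0) / total   # transitions arriving at group j
    P_null = np.outer(out_fraction, in_fraction)
    # Cells with no expected transitions are left undefined (NaN).
    with np.errstate(divide="ignore", invalid="ignore"):
        M = np.where(P_null > 0, P / P_null, np.nan)
    return P, P_null, M


# Toy example: two groups with mostly within-group moves.
P, P_null, M = stratification_matrix([[8, 2], [2, 8]])
```

In this toy example the diagonal entries of M are 1.6 and the off-diagonal entries are 0.4, the signature of within-group moves being overrepresented.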
The high intensity of stratification in career movements raises an interesting question: how does individual performance in science relate to their moves across different institutional rankings?
To answer this question, we need to quantify the performance change for each individual before and after the move. Imagine that a scientist moves from i to j, having published n papers at location i and m papers at j. The impact of a paper k can be approximated by ck, the number of citations accumulated within 5 years after its publication18,19,20,21. Let $c^{-} = (c_1, \dots, c_n)$ and $c^{+} = (c_{n+1}, \dots, c_{n+m})$ be the lists of the number of citations for papers published before (c−) and after (c+) the transition from i to j (Ti, j). To quantify the change in performance, we introduce

$$\Delta c^{*} = \frac{\langle c^{+} \rangle - \langle c^{-} \rangle}{\sigma_{c}},$$

where $\langle c^{+} \rangle$ and $\langle c^{-} \rangle$ are the averages of c+ and c−, respectively, and σc corresponds to the standard deviation of the concatenation of both c+ and c− while preserving the moment when the movement took place (see SM for more information about σc). Therefore, Δc* captures the statistical difference in the average citations between papers published before and after the movement, normalized by the random expectation when the same author's publications are shuffled. A positive Δc* indicates that papers following the move on average have higher citation impact, hence representing an improvement in scientific performance. A negative value corresponds to a decline in performance.
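One possible reading of this normalization is sketched below. The precise definition of σc is given in the SM and is not reproduced here; the sketch assumes that σc is the standard deviation of the difference in means obtained when the author's citation counts are shuffled while the position of the move (n papers before, m after) is kept fixed. The function name `delta_c_star` and its parameters are hypothetical.

```python
import random
import statistics


def delta_c_star(c_before, c_after, n_shuffles=1000, seed=0):
    """Normalized change in mean citations across a move.

    Assumption: sigma_c is estimated as the standard deviation of the
    (after - before) difference in means over random shuffles of the pooled
    citation list, with the split point kept at len(c_before).
    """
    observed = statistics.fmean(c_after) - statistics.fmean(c_before)
    pooled = list(c_before) + list(c_after)
    n = len(c_before)
    rng = random.Random(seed)
    shuffled_diffs = []
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        shuffled_diffs.append(
            statistics.fmean(pooled[n:]) - statistics.fmean(pooled[:n])
        )
    sigma_c = statistics.pstdev(shuffled_diffs)
    # Undefined when every shuffle gives the same difference.
    return observed / sigma_c if sigma_c > 0 else float("nan")
```

With this convention, an author whose papers after the move collect more citations than those before obtains a positive value, and swapping the two lists flips the sign.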
To quantify the influence of movements on individual performance, we divide all movements into two categories based on the performance change: movements associated with positive and negative Δc* and measure M(i, j|Δc* > 0) and M(i, j|Δc* < 0). We find the observed stratification in career moves is robust against individual performance (Fig. 5c–d). That is, the two clubs emerge for both categories in a similar fashion as in Figure 5b, indicating the pattern of moving within elite or lower-rank institutions is nearly universal for people whose performance is improved or decreased following the move. Comparing Figure 5c and Figure 5d, we find the red spot in lower-left corner is more concentrated in Figure 5d than in Figure 5c, hinting that being more mobile in the space of rankings may lead to variable performance. To test this hypothesis, for each transition Ti, j we calculate the rank difference between the origin and destination (Δrij = i − j).
A positive value of Δrij indicates i > j, hence a movement to a lower-rank institution, whereas Δrij < 0 corresponds to transitions into institutions with a higher rank. In Figure 6 we measure the relation between Δc* and Δr. When scientists move to institutions with a lower rank (Δr > 0), we find that their average change in performance is negative, corresponding to a decline in the impact of their work. Yet, what is particularly interesting lies in the Δr < 0 regime. Indeed, when people move from lower-rank locations to elite institutions, we observe no performance change on average. This is rather unexpected, as transitioning from lower-rank institutions to elite institutions is thought to provide better access to ideas and lab resources, which in turn should fuel scientific productivity. A possible explanation may be that scientists who have the opportunity to make big jumps in the ranking space may have already had an excellent performance in their previous institutions. A move therefore will not affect their impact.
In summary, we extracted affiliation information from the publications of each scientist, allowing us to reconstruct their career moves between different institutions as well as the body of work published at each location. We find career movements are common yet infrequent. Most people move only once or twice and usually in the early stage of their career. Career movements are affected by geography. The distance covered by the move can be approximated with a power law distribution, indicating that most movements are local and moving to faraway locations is less probable. We also observe a high degree of stratification in career movements. People from elite institutions are more likely to move to other elite institutions, whereas people from lower-rank institutions are more likely to move to places with similar ranks. We further confirm that the observed stratification is robust against the change in individual performance before and after the move. When cross-group movement occurs, we find that while going from elite to lower-rank institutions on average results in a modest decrease in scientific impact, transitioning into elite institutions does not result in a gain in impact.
The nature of our dataset restricted our study to a sample of scientists. As a result of this selection process, our results are biased towards physicists from the 1960s to the 1980s with high career longevity. Yet, these limitations also suggest new avenues for further investigations. Indeed, as datasets become more comprehensive and of higher resolution, newly available data sources like Web of Science or Google Scholar can provide new and deeper insights towards generalization of the results across different disciplines, temporal trends and more. The influence of career longevity on scientific mobility also deserves further investigation, as it could reveal important results. Taken together, our results offer the first systematic empirical evidence on how career moves affect scientific performance and impact.
The data provided by the American Physical Society (APS) contains over 450,000 publications, each identified with a unique number, corresponding to all papers published in 9 different journals, namely Physical Review A, B, C, D, E, I, L, ST and Review of Modern Physics, spanning a period of 117 years from 1893 to 2010. For each paper the dataset includes the title, the date of publication (day, month, year), the author names and the affiliations of each of the authors. A separate dataset also provides the list of citations within the APS data only, using unique paper identifiers. About 5% of publications, those with ambiguous author-affiliation links or with massive author lists, were removed from this dataset (see SM for more details).
Author Name Disambiguation
To derive individual information, one has to reconnect papers belonging to a single scientist. Since no unique author identifier is present in the data, author names must be disambiguated. The dataset contains about 1.2 million author-paper pairs. To overcome the ambiguities present in the data, we design a procedure that uses information about the author but also metadata about the paper such as coauthors and citations. By computing similarities between authors, our procedure can successfully detect single authors as well as homonyms (see SM for more details about the disambiguation method). A total of 237,038 distinct scientists are detected by our method.
A major difficulty when dealing with publication data is the inconsistencies and errors associated with affiliation names on papers. A total of 319,829 different affiliation names are identified in the dataset. The disambiguation procedure for affiliations uses geocoded information as well as a similarity measure between affiliation names in order to disambiguate institutions. The disambiguated set of authors also plays a crucial role in the procedure (see SM for more details about the disambiguation method). A total of 4,052 distinct institutions are identified by our algorithm.
Resolving individual career trajectory
Based on the information present in the publications of a scientist, we can reconstruct his/her career trajectory. In order to detect career movements, i.e. changes in a scientist's institution, one has to remove artificial movements induced by short-term stays and by errors and typos in the affiliation names on the papers. To do so, only institutions reported in at least two consecutive papers are considered in a career trajectory.
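The filtering rule can be sketched as follows, given the primary affiliation of each paper of one scientist in chronological order. The function names `career_trajectory` and `count_moves` are hypothetical, and treating a return to a previous institution as a new move is an assumption of this sketch.

```python
def career_trajectory(affiliations):
    """Build the sequence of institutions visited by one scientist.

    `affiliations` lists the primary institution of each paper in
    chronological order. An institution enters the trajectory only if it
    appears on at least two consecutive papers, which discards one-off
    affiliations such as short stays or misspelled names.
    """
    trajectory = []
    for previous, current in zip(affiliations, affiliations[1:]):
        confirmed = previous == current
        if confirmed and (not trajectory or trajectory[-1] != current):
            trajectory.append(current)
    return trajectory


def count_moves(trajectory):
    """Number of career movements implied by a trajectory."""
    return max(len(trajectory) - 1, 0)


# Toy example with hypothetical affiliation strings.
toy_trajectory = career_trajectory(
    ["MIT", "MIT", "Typo U", "MIT", "CERN", "CERN", "CERN", "MIT", "MIT"]
)
```

In the toy example the single "Typo U" paper is ignored and the resulting trajectory is MIT, CERN, MIT, that is, two movements.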
Ranking the institutions
Three variables are considered to rank an institution: (i) the total number of papers, Pi, published with institution i, (ii) the cumulated number of citations, Ci, corresponding to institution i, (iii) the h-index, Hi, of institution i. The variable Ci is defined as

$$C_i = \sum_{k=1}^{P_i} c_k,$$

where ck is the number of citations within the APS data of paper k accumulated within 5 years after its publication. An institution has an h-index H if H of its P papers have at least H citations each and the other (P – H) papers have no more than H citations each. Here too, the citations of a paper are the cumulative number of citations obtained within 5 years after its publication.
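The h-index definition and the ranking by cumulated citations can be illustrated with a short sketch. The function names `h_index` and `rank_by_citations` are hypothetical, and breaking ties between institutions alphabetically is an assumption.

```python
def h_index(citations):
    """Largest H such that H papers have at least H citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, c in enumerate(ranked, start=1):
        if c >= position:
            h = position
        else:
            break
    return h


def rank_by_citations(citations_by_institution):
    """Rank institutions by C_i, the sum of the 5-year citations of their papers.

    Rank 1 is the most cited institution. Assumption: ties are broken
    alphabetically by institution name.
    """
    totals = {inst: sum(cs) for inst, cs in citations_by_institution.items()}
    ordered = sorted(totals, key=lambda inst: (-totals[inst], inst))
    return {inst: rank for rank, inst in enumerate(ordered, start=1)}


# Toy example with hypothetical institutions.
toy_ranks = rank_by_citations({"X": [10, 8, 5], "Y": [30], "Z": [1, 1]})
```

For instance, an institution whose papers received 10, 8, 5, 4 and 3 citations has an h-index of 4.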
Binning the institutions
About 6,000 transitions between 1,000 institutions are detected for our subset of scientists. In order to have a statistically significant number of transitions to derive the values of P(i, j) and M(i, j) (Fig. 5), institutions are binned logarithmically according to their rank (r) into five groups.
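A logarithmic binning of the ranks could look like the sketch below. The exact bin edges are not given in the text; the sketch assumes five bins of equal width in log(r) between rank 1 and rank 1,000, and the function name `log_rank_bin` is hypothetical.

```python
import math


def log_rank_bin(rank, max_rank=1000, n_bins=5):
    """Return the index (0 = most elite) of the logarithmic bin of a rank.

    Assumption: bins have equal width in log(rank) between 1 and `max_rank`,
    so that the edges are max_rank ** (k / n_bins) for k = 0, ..., n_bins.
    """
    if not 1 <= rank <= max_rank:
        raise ValueError("rank must lie between 1 and max_rank")
    k = int(math.log(rank) / math.log(max_rank) * n_bins)
    # The last rank falls exactly on the upper edge and joins the last bin.
    return min(k, n_bins - 1)
```

Under this assumption the five groups cover ranks 1 to 3, 4 to 15, 16 to 63, 64 to 251 and 252 to 1,000, so the elite groups contain few institutions while the last group contains most of them.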
Auriol, L. Labour market characteristics and international mobility of doctorate holders: results for seven countries. OECD STI Working Papers 2, 37–37 (2007).
Auriol, L. Careers of doctorate holders: employment and mobility patterns. OECD STI Working Papers 4, 30–30 (2010).
Van Noorden, R. Global mobility: Science on the move. Nature 490, 326–329 (2012).
Schiermeier, Q. Career choices: The mobility imperative. Nature 470, 563–564 (2011).
Jans, G. et al. Study on mobility patterns and career paths of EU researchers. Tech. Rep., European Commission (2010).
Solimano, A. The International Mobility of Talent: Types, Causes and Development Impact (Oxford University Press, 2008).
Brockmann, D., Hufnagel, L. & Geisel, T. The scaling laws of human travel. Nature 439, 462–465 (2006).
Simini, F., González, M. C., Maritan, A. & Barabási, A.-L. A universal model for mobility and migration patterns. Nature 484, 96–100 (2012).
Pastor-Satorras, R. & Vespignani, A. Epidemic spreading in scale-free networks. Phys. Rev. Lett. 86, 3200–3203 (2001).
González, M., Hidalgo, C. & Barabási, A. Understanding individual human mobility patterns. Nature 453, 779–782 (2008).
Bagrow, J. P., Wang, D. & Barabasi, A.-L. Collective response of human populations to large-scale emergencies. PloS one 6, e17680 (2011).
Lu, X., Bengtsson, L. & Holme, P. Predictability of population displacement after the 2010 haiti earthquake. Proc. Natl. Acad. Sci. U.S.A. 109, 11576–11581 (2012).
Szell, M., Sinatra, R., Petri, G., Thurner, S. & Latora, V. Understanding mobility in a social petri dish. Sci. Rep. 2, 457, 10.1038/srep00457 (2012).
Blondel, V. D. et al. Data for development: the d4d challenge on mobile phone data. arXiv:1210.0137 [cs.CY] (2012).
Song, C., Koren, T., Wang, P. & Barabási, A.-L. Modelling the scaling properties of human mobility. Nature Physics 6, 818–823 (2010).
Petersen, A. M., Riccaboni, M., Stanley, H. E. & Pammolli, F. Persistence and uncertainty in the academic career. Proc. Natl. Acad. Sci. U.S.A. 109, 5213–5218 (2012).
Petersen, A. M. et al. Reputation and impact in academic careers. arXiv:1303.7274 [physics.soc-ph] (2013).
Jones, B. F., Wuchty, S. & Uzzi, B. Multi-university research teams: Shifting impact, geography and stratification in science. Science 322, 1259–1262 (2008).
Radicchi, F., Fortunato, S. & Castellano, C. Universality of citation distributions: Toward an objective measure of scientific impact. Proc. Natl. Acad. Sci. U.S.A. 105, 17268–17272 (2008).
Barabási, A.-L., Song, C. & Wang, D. Publishing: Handful of papers dominates citation. Nature 491, 40–40 (2012).
Wang, D., Song, C. & Barabási, A.-L. Quantifying long-term scientific impact. Science 342, 127–132 (2013).
Zhang, Q., Perra, N., Gonçalves, B., Ciulla, F. & Vespignani, A. Characterizing scientific production and consumption in physics. Sci. Rep. 3 (2013).
Börner, K. & Penumarthy, S. Spatio-temporal information production and consumption of major us research institutions. http://ivl.cns.iu.edu/km/pres/2005-borner-sptiotmp-sweden.pdf, (Date of access:09/06/2013) (2005).
Mazloumian, A., Helbing, D., Lozano, S., Light, R. P. & Börner, K. Global multi-level analysis of the ‘scientific food web’. Sci. Rep. 3, 1167, 10.1038/srep01167 (2013).
Hirsch, J. E. An index to quantify an individual's scientific research output. Proc. Natl. Acad. Sci. U.S.A. 102, 16569 (2005).
Hirsch, J. E. Does the h index have predictive power? Proc. Natl. Acad. Sci. U.S.A. 104, 19193–19198 (2007).
Lehmann, S., Jackson, A. D. & Lautrup, B. E. Measures for measures. Nature 444, 1003–1004 (2006).
Erlander, S. & Stewart, N. F. The Gravity Model in Transportation Analysis: Theory and Extensions, vol. 3 (Vsp, 1990).
Levin, S. G. & Stephan, P. E. Are the foreign born a source of strength for us science? Science 285, 1213–1214 (1999).
Zucker, L. G. & Darby, M. R. Star scientists, innovation and regional and national immigration. Tech. Rep., NBER (2007).
Franzoni, C., Scellato, G. & Stephan, P. Foreign-born scientists: mobility patterns for 16 countries. Nature Biotechnol. 30, 1250–1253 (2012).
Newman, M. E. Power laws, pareto distributions and zipf's law. Contemporary physics 46, 323–351 (2005).
Milojević, S. Power law distributions in information science: Making the case for logarithmic binning. J. Assoc. Inf. Sci. Technol. 61, 2417–2425 (2010).
Pan, R. K., Kaski, K. & Fortunato, S. World citation and collaboration networks: uncovering the role of geography in science. Sci. Rep. 2, 902, 10.1038/srep00902 (2012).
We thank Nicolas Boumal and colleagues from the Center for Complex Network Research (CCNR) for the valuable discussions and comments. D.W., C.S. and A.L.B. are supported by Lockheed Martin Corporation (SRA 11.18.11), the Network Science Collaborative Technology Alliance is sponsored by the U.S. Army Research Laboratory under agreement W911NF-09-2-0053, Defense Advanced Research Projects Agency under agreement 11645021 and the Future and Emerging Technologies Project 317 532 “Multiplex” financed by the European Commission. P.D. is supported by the National Fund for Scientific Research (FNRS) and by the Research Department of the Communauté française de Belgique (Large Graph Concerted Research Action). R.S. acknowledges support from the James S. McDonnell Foundation.
The authors declare no competing financial interests.
Cite this article
Deville, P., Wang, D., Sinatra, R. et al. Career on the Move: Geography, Stratification and Scientific Impact. Sci Rep 4, 4770 (2014). https://doi.org/10.1038/srep04770