Do scientists trace hot topics?

Wei, Tian; Li, Menghui; Wu, Chensheng; Yan, Xiao-Yong; Fan, Ying; Di, Zengru; Wu, Jinshan

doi:10.1038/srep02207

Article
Open access
Published: 16 July 2013

Tian Wei¹, Menghui Li^2,3, Chensheng Wu^1,4, Xiao-Yong Yan^1,5, Ying Fan¹, Zengru Di¹ & Jinshan Wu¹

Scientific Reports volume 3, Article number: 2207 (2013)

Abstract

Do scientists follow hot topics in their scientific investigations? In this paper, by analysing papers published in the American Physical Society (APS) Physical Review journals, we find that new papers are more likely to be attracted to hot fields, where the hotness of a field is measured by the number of papers belonging to the field. This indicates that scientists generally do follow hot topics. However, there are qualitative differences among scientists from various countries, and among research works with different numbers of authors, affiliations and references. These observations could be valuable for policy makers when deciding on research funding and for individual researchers when searching for scientific projects.

Introduction

The phenomenon of “the rich get richer”, which is called preferential attachment in the field of complex networks¹, is common in many fields^2,3 (for example, see the references cited in Table 1). The scientific community, however, is composed of scientists, a special group of people who focus on proposing, investigating and implementing original and creative ideas. It is therefore plausible that the “rich get richer” phenomenon is less pronounced in the fields investigated by scientists than in other areas. Ideally, scientists choose their fields of investigation according to their scientific interest and the scientific value of the questions, not according to the hotness of the fields. In this work, using papers published in the American Physical Society (APS) Physical Review journals from 1976 to 2009, we test whether the subject of a new paper is more likely to be in a hot field than in a relatively unknown field at the time the paper is published. We also compare scientists from different countries. Such comparisons could provide insightful information. In China, modern scientific development is still very young, and many scientists believe that, compared with many other countries, a larger share of Chinese scientists are followers rather than original thinkers. In this work, we offer direct empirical support for this hypothesis. Finally, we determine whether the degree of tracing hot fields differs for papers with different numbers of authors, affiliations and references. Interestingly, we find that scientists who collaborate with more authors or more affiliations tend to follow hotter fields than those who work with only a few collaborators or affiliations. Moreover, papers with a small number of references are, on average, more interest-driven or value-driven, whereas papers with a large number of references are more hotness-driven. These empirical findings, particularly if the analysis is also performed in other fields and over longer periods of time, could provide valuable information for policy makers.

Table 1 A list of some values of α, the degree of preferential attachment, in different networks collected from the literature, together with the number of nodes N (and M for bipartite networks) and the number of links E. The first column gives the name of the investigated database, with the relationship studied given in brackets. Note that APS-PACS, Google Earth and Digg are bipartite networks between individuals and groups, whereas all other networks describe relations among individuals.

Results

Empirically, preferential attachment has been found in many systems. Table 1 lists typical systems together with their exponents α; the positive values indicate that new links are more likely to attach to nodes with larger degree. Later, we compare our results on hotness tracing of newly published papers against the other phenomena listed in this table.

First, we examine preferential attachment of papers in the PR-PACS data set (see Methods for details). In a log-log plot, Fig. 1 (a) displays the cumulative function κ(k) with respect to the size k of the field that a newly published paper belongs to. The positive exponents α indicate that new papers are more likely to focus on hot fields (larger sizes), i.e., new links preferentially attach to nodes (PACS) with larger sizes. In other words, scientists generally do publish more new papers in currently hot fields. We obtain the exponents α by least-squares fitting from k = 1 to k = 300, because the curves deviate from a straight line at large k owing to low statistics. For different years t, all κ(k) follow a power law, κ(k) ~ k^(α+1), with slightly different exponents α (as shown in Fig. 1 (b)). We also plot the distribution of field sizes N(k) as an inset in Fig. 1 (a); it is highly skewed.
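The fit described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function name `fit_alpha`, the dictionary input format and the synthetic curve are hypothetical. The slope of log κ(k) against log k over 1 ≤ k ≤ 300 estimates α + 1, so α is the slope minus one.

```python
import math

def fit_alpha(kappa, k_min=1, k_max=300):
    """Estimate alpha from kappa(k) ~ k^(alpha+1) by ordinary least
    squares on the log-log points with k_min <= k <= k_max."""
    pts = [(math.log(k), math.log(v))
           for k, v in kappa.items()
           if k_min <= k <= k_max and v > 0]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two points in the fitting range")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx          # estimate of alpha + 1
    return slope - 1.0

# Synthetic, noise-free curve with alpha = 0.7 (illustrative value only).
synthetic = {k: k ** 1.7 for k in range(1, 301)}
alpha = fit_alpha(synthetic)
```

On real data the points at large k are noisy, which is why the fitting range is capped; the cap of 300 is the value quoted in the text.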

Compared with the preferential attachment phenomena in other fields listed in Table 1, the exponents α = 0.52–0.87 from our PR-PACS data are near the lower end of the range. Of the other 13 entries in Table 1, only sexual contacts in sexual networks (α = 0.32–0.80)⁴, users joining groups on Digg (α = 0.69)⁵ and friendship relations on LinkedIn (α = 0.6)⁶ show approximately the same level of hotness tracing. It might be easy to “follow” a star member of a social networking website such as Flickr (α = 1.0)⁶; however, it might not be so easy to shift research field, join a user group for a different product or change sexual partners towards hotter choices. Besides the difficulty of changing fields, the low exponents might also arise because scientific work is more independent and more interest-driven or value-driven than activities in other fields.

Second, we test whether the intensity of tracking hot topics in scientific research differs between countries. We classify the papers according to the country of the first author's affiliation and calculate the absolute contribution ratios r^c of several major countries within the PR-PACS data set: USA (33.07%), Germany (9.95%), Japan (6.94%) and China (3.73%). As seen from Fig. 2 (a), in the year 2008 the exponent of China (α = 1.06) is larger than that of the other countries, e.g., USA 0.74, Germany 0.86 and Japan 0.93. Moreover, as shown in the inset of Fig. 2 (a), the exponent α for China is generally larger than that of the other countries in different years. These results indicate that the phenomenon of tracing hotness is more pronounced among Chinese scholars. In addition, in terms of α, the UK (3.80%) and France (5.55%) lie between the USA and Germany, while Russia (1.89%) lies between Germany and China.

To provide a comparative study, we also calculate the relative contribution ratio R^c(k) of papers from country c belonging to fields with size k (see Methods for details). As seen from Fig. 2 (b), the relative contribution ratio R^c(k) of Chinese scholars is smaller than 1 in cold fields (smaller k) but larger than 1 in hot fields (larger k), indicating that Chinese scholars contribute less than their average to cold fields and more than their average to hot fields. The situation of the USA is the opposite. This difference shows, from another angle, that Chinese scientists are keener to follow hot topics than scientists in the United States, in agreement with our previous observations.

Considering that scientific studies in China are still young, it is understandable that a large percentage of them are in hotness-driven fields rather than value-driven fields. The discovered order of the degree of hotness tracing (the USA, Germany, Japan and China) is more or less consistent with our intuition. How these different positions relate to the scientific policies of each country, or even to its culture and values, is definitely worth further investigation but is outside the scope of the current study. We simply want to demonstrate the capability of the methods discussed above in analyzing publication records and to present some basic discoveries made with them.

Next, we measure the influence of the number of authors and affiliations on the degree of tracing hot topics. We classify the papers according to their number of authors and number of affiliations. It is argued in Ref. 7 that research works with many authors or many affiliations typically focus more on hot topics, because it might be precisely the hotness of the subject that made collaboration attractive to the scientists, and because a joint task team is generally more likely to focus on short-term projects than on long-term projects. Here we examine this argument using the PR-PACS data. We can see from Fig. 3 that, overall, α, the degree of tracing hot topics, increases with the number of authors and affiliations. These results provide empirical support for the arguments in Ref. 7. In a sense, global collaboration is not necessarily a good strategy for high-quality, value-driven research topics, as suggested in Ref. 7.

Moreover, it has been reported that team size and the share of multi-university collaborations have grown rapidly in recent decades^8,9,10, and that teams and multi-university collaborations tend to produce more highly cited papers^9,10. It is generally believed that papers in hot fields acquire citations more easily than papers in cold fields, because more papers focus on similar topics. Therefore, the findings of Refs. 8,9,10 do not necessarily imply that multi-university collaborations and larger teams are making greater or more significant progress; the higher citation counts might instead be due to the fact that the fields they work in are bigger. We check this hypothesis here by testing whether papers with more authors and more affiliations are more likely to attach to hot fields. As shown in Fig. 3, this is indeed the case. From this viewpoint, our results are consistent with the observations in Refs. 9,10.

Finally, we investigate the effect of the number of references on tracking hot topics. We classify the papers according to their number of references. The average number of references in papers today is clearly much larger than in earlier times. For earlier times, one can intuitively hypothesize that a pioneering paper or a paper of good quality typically cited fewer references. Today, however, the number of references may or may not relate to how innovative the paper is. Here, we examine this hypothesis. As seen from Fig. 4, the exponent α, the degree of tracing hot topics, increases with the number of references, which could indicate that papers with a larger number of references are more likely to be on hot topics. Notice that the values of α for larger numbers of references are larger in recent years than in earlier years, suggesting that scientists trace hot fields somewhat more strongly in recent years than in earlier times. Therefore, the hypothesis is reasonable overall.

Discussion

We have found that, in the tested fields of science for papers published in the APS Physical Review journals, hot fields attract more newly published papers; however, scientific work is not as hotness-driven as activities in other fields. Among the major countries, this phenomenon of tracing hotness is more pronounced in China than in other countries, which could be related to the fact that China is still a developing country in terms of scientific research. We also found that papers with more authors, more affiliations and more references were on subjects that were more hotness-driven. This observation could potentially be valuable to policy makers who fund scientific projects.

Here only data on physics publications were analyzed. A cross-field comparison would be interesting, particularly if similar studies can be done on mathematics and on social sciences such as economics. Similar studies can also be applied to producing, selling or buying products: that is, one can investigate how the likelihood that a product is manufactured, sold or bought depends on the number of units of that product that have already been manufactured, sold or bought. Such studies could be valuable for marketing research.

It might also be interesting to determine how often outstanding or important works are in hot fields when they are first published. Fields gradually or suddenly become hot after major breakthroughs are presented in a few pioneering papers, which might later be awarded or honoured. Using records of awards such as the Nobel Prize, the Fields Medal and the Turing Award, it would be possible to perform a statistical analysis of papers with major breakthroughs. All the award-winning papers could be collected and the same analysis performed to determine α, which could then be compared with the overall α.

The above investigation can be performed not only at the macro level of the countries of authorship but also at the mesoscopic level of the affiliations of authorship. In this way, one might be able to compare tradition, culture and research quality at various levels of academic units. In principle, one could also collect all published papers of one author, treat them as a subset and study the preferential attachment phenomenon of those papers, provided the collection is large enough. It would then be possible to use this to measure the creativity and originality of a researcher.

Methods

Data description and notations

Our data set is a collection of all papers published in the American Physical Society (APS) Physical Review journals from 1976 to 2009. Each paper is recorded as a data entry, which includes its title, date of publication, classification number according to the AIP Physics and Astronomy Classification Scheme (PACS), author(s), affiliation(s) and reference(s) to other papers within the data set. The entire data set contains more than 320,000 entries of a variety of article types, such as article, brief report, rapid communication, comment, reply, erratum, essay, announcement, editorial and so on. Here, we consider only research papers with PACS numbers, i.e., articles, brief reports and rapid communications, and refer to them as the PR-PACS data set. In the end, we have N = 315,082 papers, each denoted by i, and M = 5,472 PACS numbers, each denoted by λ.

We use PACS, the established hierarchical classification system of physics, to identify fields within physics. The investigated APS-PACS system is a bipartite network with two kinds of nodes, papers and PACS, and one kind of link, the relation between a paper and a PACS, which is assigned by the researchers. Mathematically, this relation is characterized by an N × M adjacency matrix A, with element a_iλ = 1 if paper i belongs to PACS λ (that is, PACS λ appears in paper i) and a_iλ = 0 otherwise. In general, one paper can belong to several fields (carry several PACS); there are in total 900,832 links between papers and PACS, i.e., $E = \sum_{i,\lambda} a_{i\lambda} = 900{,}832$. The size of a field, $k_\lambda = \sum_i a_{i\lambda}$, is defined as the number of papers that belong to it and describes the popularity (hotness) of the field. We can then calculate the number N(k) of fields with size k. Based on the PR-PACS data, our main concern is whether new papers tend to attach to hot fields (with larger size k), i.e., whether new links preferentially attach to nodes with larger degree.
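As a minimal sketch of these definitions, the field sizes and the size distribution N(k) can be computed directly from per-paper code lists rather than from an explicit N × M matrix. The record format, the toy codes "A", "B", "C" and the function names below are assumptions made for illustration; they are not taken from the APS data files.

```python
from collections import Counter

def field_sizes(paper_fields):
    """Return the size of each field: the number of papers carrying its code."""
    sizes = Counter()
    for codes in paper_fields:
        # set() guards against a code listed twice on the same paper
        sizes.update(set(codes))
    return sizes

def size_histogram(sizes):
    """Return N(k): the number of fields that contain exactly k papers."""
    return Counter(sizes.values())

# Toy records: each inner list holds the (hypothetical) codes of one paper.
papers = [["A", "B"], ["A"], ["A", "C"], ["B"]]
k = field_sizes(papers)
total_links = sum(k.values())   # number of paper-field links
n_of_k = size_histogram(k)
```

Storing only the non-zero entries in this way is equivalent to a sparse representation of the adjacency matrix, which is convenient when most papers carry only a handful of codes.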

Measuring preferential attachment

Here, we explain our method for testing preferential attachment on a temporal data set. The basic idea is to investigate whether new links are more likely to attach to nodes with larger degree (size). We calculate the empirical value of the relative probability T(k) that a new paper published within a short period Δt connects to a field that has size k before time t¹¹ as follows. The corresponding time-dependent absolute probability P_k(t) that a new paper is published in a field with size k is proportional to T(k)n_k(t)/N(t), where n_k(t) is the number of fields with size k and N(t) is the total number of fields immediately before time t. T(k) can therefore be estimated by making a histogram of the sizes k of the fields to which each paper is added within the time period Δt, in which each sample is weighted by a factor of N(t)/n_k(t),

$$T(k) = \frac{N(t)}{n_k(t)} \sum_{i:\, t \le t_i < t+\Delta t} \;\; \sum_{\lambda:\, k_\lambda(t) = k} a_{i\lambda},$$

where t_i is the publication time of paper i and k_λ(t) = k means that the field to which a paper published within the period Δt belongs has size k at time t. We now have the empirical curve T(k) from the above statistical analysis. In order to compare T(k) conveniently across different times t, T(k) is normalized, and the normalized rate is denoted T′(k)^12,13.

The preferential attachment hypothesis states that the rate T′(k) with which a node with k links acquires new links is a monotonically increasing function of k¹, namely

$$T'(k) \sim k^{\alpha}.$$

For the BA model, α = 1¹. To obtain a smooth curve from noisy data, we take the cumulative function instead of T′(k):

$$\kappa(k) = \int_0^{k} T'(k')\, \mathrm{d}k'.$$

Thus, κ(k) should be proportional to k^(α+1). We can now fit the empirical curve from the previous statistical analysis and compare it against this hypothesized curve of preferential attachment. This is the general procedure for all the analyses presented in this work. This method has been used extensively to measure preferential attachment in different kinds of networks^{4,5,6,11,12,13,14,15,16,17,18,19}.
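The procedure can be illustrated with a short sketch. It is not the authors' implementation, and several details are assumptions: the record format `(time, codes)`, the function names, the choice to skip fields that do not yet exist at the start of the window, and the normalization to unit sum (the form of normalization used in the paper is not reproduced here). Field sizes are frozen at the start of the window, and each new link to a field of size k is weighted by N(t)/n_k(t).

```python
from collections import Counter, defaultdict

def attachment_rate(records, t_start, t_end):
    """Histogram of new links by field size k, each weighted by N(t)/n_k(t)."""
    sizes = Counter()
    for when, codes in records:
        if when < t_start:
            sizes.update(set(codes))
    n_k = Counter(sizes.values())   # number of fields having each size
    n_fields = len(sizes)           # N(t): fields present before t_start
    rate = defaultdict(float)
    for when, codes in records:
        if t_start <= when < t_end:
            for code in set(codes):
                k = sizes.get(code, 0)
                if k > 0:           # assumption: brand-new fields are skipped
                    rate[k] += n_fields / n_k[k]
    total = sum(rate.values())
    # Assumed normalization: rescale so that the rates sum to one.
    return {k: v / total for k, v in sorted(rate.items())} if total else {}

def cumulative(rate):
    """Discrete analogue of kappa(k): running sum of the rate up to k."""
    kappa, running = {}, 0.0
    for k in sorted(rate):
        running += rate[k]
        kappa[k] = running
    return kappa

# Toy data: (year, codes of one paper); the codes are hypothetical.
records = [(2000, ["A"]), (2000, ["A"]), (2000, ["B"]),
           (2001, ["A"]), (2001, ["A"]), (2001, ["B"])]
rate = attachment_rate(records, 2001, 2002)
kappa = cumulative(rate)
```

In the toy data the field of size 2 receives twice as many new links as the field of size 1, so its normalized rate is twice as large. A log-log fit of the cumulative curve would then give the exponent α + 1.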

To test whether preferential attachment in scientific research differs between countries, we separated the entire data set according to the country of the first author's affiliation and then compared the countries that contribute the most papers (USA, Germany, Japan and China). With this separated data set, we examine preferential attachment counting only the papers from authors in country c:

$$T^{c}(k) = \frac{N(t)}{n_k(t)} \sum_{i:\, t \le t_i < t+\Delta t,\; \mathrm{aff}_i = c} \;\; \sum_{\lambda:\, k_\lambda(t) = k} a_{i\lambda}.$$

Here t_i is the publication time of paper i, and aff_i = c means that the principal affiliation of paper i is in country c. In counting k and n_k(t), we included papers from all countries, meaning that scientists from all countries face the temptation of tracing the same overall hotness in the entire PR-PACS data set. Besides countries, the above calculation can be applied to any other feature of papers, such as the number of authors, affiliations or references.

Measuring relative contribution ratio R^c (k)

The absolute contribution of a country is measured simply by the percentage of published papers from that country out of the total number of published papers,

$$r^{c} = \frac{\sum_k m_k^{c}}{\sum_k m_k},$$

where m_k (m_k^c) is the number of papers (from country c) belonging to fields with size k. Here we present a more detailed breakdown of this absolute contribution: for fields of each size k, we take the percentage of papers from that country out of all papers in those fields and normalize it by the absolute contribution of that country,

$$R^{c}(k) = \frac{m_k^{c} / m_k}{r^{c}}.$$

This is a static measure, so it is easy to compute. In a sense, it also describes how often scientists in that country pursue hot fields.
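The ratio described above, a country's share within fields of a given size divided by its overall share, can be sketched as follows. The input format (one `(country, field_size)` pair per paper-field assignment), the function name and the toy values are hypothetical. One further assumption is that the overall share is computed over assignments, which matches the share of papers only if papers from all countries carry the same average number of codes.

```python
from collections import Counter

def relative_contribution(assignments, country):
    """Return {k: ratio} where ratio = (share of `country` among papers in
    fields of size k) / (overall share of `country`)."""
    all_k = Counter(k for _, k in assignments)
    own_k = Counter(k for c, k in assignments if c == country)
    overall_share = sum(own_k.values()) / sum(all_k.values())
    if overall_share == 0:
        return {}
    return {k: (own_k[k] / all_k[k]) / overall_share for k in sorted(all_k)}

# Toy assignments for two hypothetical countries "X" and "Y".
assignments = [("X", 1), ("Y", 1), ("Y", 1), ("Y", 1),
               ("X", 10), ("X", 10), ("Y", 10), ("Y", 10)]
ratio_x = relative_contribution(assignments, "X")
```

A value above 1 at large k and below 1 at small k, as for "X" in this toy example, corresponds to the pattern the text reports for a country that contributes disproportionately to hot fields.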

References

1. Barabási, A. L. & Albert, R. Emergence of scaling in random networks. Science. 286, 509–512 (1999).
2. Simon, H. A. On a class of skew distribution functions. Biometrika. 42, 425–440 (1955).
3. Price, D. D. S. A general theory of bibliometric and other cumulative advantage processes. J. Am. Soc. Inf. Sci. 27, 292–306 (1976).
4. De Blasio, B. F., Svensson, A. & Liljeros, F. Preferential attachment in sexual networks. Proc. Natl. Acad. Sci. USA. 104, 10762–10767 (2007).
5. Shi, X., Zhu, J., Cai, R. & Zhang, L. User grouping behavior in online forums. KDD '09. 777–786 (2009).
6. Leskovec, J., Backstrom, L., Kumar, R. & Tomkins, A. Microscopic evolution of social networks. KDD '08. 462–470 (2008).
7. Adams, J. Collaborations: The rise of research networks. Nature. 490, 335–336 (2012).
8. Adams, J. D., Black, G. C., Clemmons, J. R. & Stephan, P. E. Scientific teams and institutional collaborations: Evidence from U.S. universities, 1981–1999. Res. Policy. 34, 259–285 (2005).
9. Wuchty, S., Jones, B. F. & Uzzi, B. The increasing dominance of teams in production of knowledge. Science. 316, 1036–1039 (2007).
10. Jones, B. F., Wuchty, S. & Uzzi, B. Multi-university research teams: shifting impact, geography and stratification in science. Science. 322, 1259–1262 (2008).
11. Newman, M. E. J. Clustering and preferential attachment in growing networks. Phys. Rev. E. 64, 025102(R) (2001).
12. Jeong, H., Néda, Z. & Barabási, A. L. Measuring preferential attachment in evolving networks. Europhys. Lett. 61, 567–572 (2003).
13. Li, M., Gao, L., Fan, Y., Wu, J. & Di, Z. Emergence of global preferential attachment from local interaction. New. J. Phys. 12, 043029 (2010).
14. Szell, M. & Thurner, S. Measuring social dynamics in a massive multiplayer online game. Social Networks. 32, 313–329 (2010).
15. Saavedra, S., Reed-Tsochas, F. & Uzzi, B. Asymmetric disassembly and robustness in declining networks. Proc. Natl. Acad. Sci. USA. 105, 16466–16471 (2008).
16. Eom, Y. H., Jeon, C., Jeong, H. & Kahng, B. Evolution of weighted scale-free networks in empirical data. Phys. Rev. E. 77, 056105 (2008).
17. Aiello, L. M., Barrat, A., Cattuto, C., Ruffo, G. & Schifanella, R. Link creation and profile alignment in the aNobii social network. SocialCom '10: Proceedings of the Second IEEE International Conference on Social Computing. 249–256 (2012).
18. Zhao, J. Z., Lui, J. C. S., Towsley, D., Guan, X. H. & Zhou, Y. D. Empirical analysis of the evolution of follower network: a case study on Douban. Computer Communications Workshops (INFOCOM WKSHPS), 2011 IEEE Conference. 924–929 (2011).
19. Hu, H. B. & Wang, X. F. Disassortative mixing in online social networks. EPL. 86, 18003 (2009).

Acknowledgements

This work was supported by NSFC Grants 11205014 and 60974084. The authors thank the APS Physical Review for sharing the data. There have been many very fruitful discussions between one of us (J. Wu) and Prof. Bertrand Roehner.

Author information

Authors and Affiliations

School of Systems Science, Beijing Normal University, Beijing, 100875, P.R. China
Tian Wei, Chensheng Wu, Xiao-Yong Yan, Ying Fan, Zengru Di & Jinshan Wu
Temasek Laboratories, National University of Singapore, 117508, Singapore
Menghui Li
Beijing-Hong Kong-Singapore Joint Centre for Nonlinear & Complex Systems (Singapore), National University of Singapore - Kent Ridge, 119260, Singapore
Menghui Li
Beijing Institute of Science and Technology Intelligence, Beijing, 100044, P.R. China
Chensheng Wu
Centre for Complex Systems Research, Shijiazhuang Tiedao University, Shijiazhuang, 050043, P.R. China
Xiao-Yong Yan

Contributions

J.W., Z.D. and Y.F. designed the research. T.W., M.L., X.Y. and C.W. analyzed the data. T.W. and J.W. participated in the writing of the manuscript.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Rights and permissions

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/

About this article

Cite this article

Wei, T., Li, M., Wu, C. et al. Do scientists trace hot topics?. Sci Rep 3, 2207 (2013). https://doi.org/10.1038/srep02207

Received: 22 March 2013
Accepted: 01 July 2013
Published: 16 July 2013
DOI: https://doi.org/10.1038/srep02207
