Growth rates of modern science: A latent piecewise growth curve approach to model publication numbers from established and new literature databases

Growth of science is a prevalent issue in science of science studies. In recent years, two new bibliographic databases have been introduced which can be used to study growth processes in science from centuries back: Dimensions from Digital Science and Microsoft Academic. In this study, we used publication data from these new databases and added publication data from two established databases (Web of Science from Clarivate Analytics and Scopus from Elsevier) to investigate scientific growth processes from the beginning of the modern science system until today. We estimated regression models that simultaneously included the publication counts from the four databases. The results of the unrestricted growth of science calculations show that the overall growth rate amounts to 4.10% with a doubling time of 17.3 years. As the comparison of various segmented regression models in the current study revealed, the model with five segments fits the publication data best. We demonstrated that these segments with different growth rates can be interpreted very well, since they are related to phases of economic (e.g., industrialization) and/or political developments (e.g., Second World War). In this study, we additionally analyzed scientific growth in two broad fields (Physical and Technical Sciences as well as Life Sciences) and the relationship of scientific and economic growth in the UK. The comparison between the two fields revealed only slight differences. The comparison of the British economic and scientific growth rates showed that the economic growth rate is slightly lower than the scientific growth rate.


Introduction
Growth of science is an ongoing topic in empirical and theoretical studies on science of science. In a recent overview of science of science studies, Fortunato et al. (2018) stated that "early studies discovered an exponential growth in the volume of scientific literature … a trend that continues with an average doubling period of 15 years". The investigation of growth processes leads to results that can be used to characterize science. For example, if the literature doubles every 15 years, science would be characterized by immediacy: "the bulk of knowledge remains always at the cutting edge" (Wang & Barabási, 2021, p. 163). Results on growth processes can also be used to investigate the validity of theories on the development of science: Does science follow a slow, piecemeal process or a process with normal science interrupted by revolutionary periods with an increased level of activity (Kuhn, 1962; Tabah, 1999)? Popular early studies on growth of science have been published by the theoretician of science, Derek John de Solla Price (1951, 1961, 1965), who can be seen as the pioneer in investigating growth of science processes (see de Bellis, 2009). According to Price (1986), the development of science follows the law of exponential growth: "at any time the rate of growth is proportional to the … total magnitude already achieved - the bigger a thing is, the faster it grows" (p. 4). Although empirical and theoretical studies in previous decades have confirmed exponential growth, a precise estimation of the growth rate based on reliable and sound publication data has not been done yet.
In most of the studies on growth of science published hitherto, bibliometric data have been used to measure growth of science (an alternative measure is the number of researchers, for instance). It is an advantage of using bibliometric data (compared to other data) that large-scale, multi-disciplinary databases are available based on worldwide publication productions. Another advantage is the characteristic of most scientific disciplines that publications are the main outcome: "science would not exist, if scientific results are not communicated. Communication is the driving force of science. That is why scientists have to publish their research results in the open, international scientific literature. Thus, publications are essential" (van Raan, 1999, p. 417). According to Merton (1988), "what we mean by the expression 'scientific contribution': an offering that is accepted, however provisionally, into the common fund of knowledge" (p. 620).
In a previous study (Bornmann & Mutz, 2015), two authors of the current study investigated the growth of science based on data from the Web of Science database (Clarivate Analytics; Birkle, Pendlebury, Schnell, & Adams, 2020). Bornmann and Mutz (2015) not only used annual publication numbers but also cited references data (see Marx & Bornmann, 2016, for an overview of the use of cited references data in scientometrics). They argued that Web of Science data (publication counts) are scarcely suitable to investigate early periods of modern science, since early publications are not sufficiently covered. Cited references may have the advantage of covering these early periods and a wider range of document types, including journal articles, books, book contributions or proceedings, which are still not fully included in the databases. However, cited references data can only serve as a less-than-ideal proxy of publication numbers, because non-cited publications are not considered. In recent years, new bibliographic databases have been introduced that can be used to study growth processes in science from centuries back: Dimensions (Herzog, Hook, & Konkiel, 2020; Hook, Porter, & Herzog, 2018) from Digital Science and Microsoft Academic (Wang et al., 2020). Thus, it is the intention of the current study to use both databases for investigating these processes and compare the results with those from Web of Science and Scopus (Elsevier; Baas, Schotten, Plume, Côté, & Karimi, 2020).
With Dimensions, Microsoft Academic, Web of Science, and Scopus, we considered in this study (the most) important multi-disciplinary literature databases currently available. The comparison of the empirical results based on the four databases may point to an assessment of growth processes in science that might be interpreted as valid, since the assessments can be made independently of the use of single data sources. We investigated the growth processes not only for all annual publications in the databases, but also for two broad fields: (1) Physical and Technical Sciences and (2) Life Sciences (including Health Sciences). We selected these broad fields and did not consider further fields such as social sciences and humanities, because only for these two fields can we be sure that publication data can be used as a valid proxy for research activity.
In this study, we additionally undertook a comparative analysis of economic and scientific growth processes. According to Price (1986), the theoretical basis for the study of econometrics is similar to that for the study of scientometrics: both follow the law of exponential growth (differences lie in the parameters). Previous scientometrics research revealed that growth of science is related to economic development (Fernald & Jones, 2014; Salter & Martin, 2001). Although a national science system producing high-quality research is without doubt an important condition for national wealth, we primarily consider money as a necessary input to the science system (and thus, economic growth as the independent variable). In principle, national wealth can be achieved without a modern science system (as has been done for centuries), but (modern) science needs an economy to exist and function.
Microsoft Academic offers a subject classification on different hierarchical levels.
- Life Sciences (including Health Sciences): "Biology" and "Medicine".
… "Publications", "Clinical trials", and "Patents". In the following, by using the term "Dimensions" in the text, we refer only to the Dimensions sub-database "Publications". The indexed publications therein are divided into six different publication types ("article", "chapter", "proceeding", "preprint", "monograph", and "book"). Dimensions offers the second largest coverage of the literature in this study (Visser et al., 2020). Dimensions offers a much larger coverage of books and book chapters than Web of Science or Scopus (Clarivate, 2020; Elsevier, 2020; Taylor, 2020).
- Physical and Technical Sciences: "Mathematical Sciences", "Physical Sciences", "Chemical Sciences", "Earth Sciences", "Environmental Sciences", "Information and Computing Sciences", "Engineering", "Technology", and "Built Environment and Design".

Statistical analyses
Scientific growth processes do not necessarily run homogeneously over time, especially when a long time horizon is chosen, for example, from the beginning of modern science in the 16th/17th century until today. Therefore, modern growth analysis has to simultaneously address three different problems: (1) Science can grow according to different growth functions which provide hypotheses about the nature of growth processes (e.g., unrestricted exponential).
(2) It can be assumed that science grows at different rates in different time periods or segments, i.e., growth rates vary over time.
(3) Growth functions might vary across different databases such as Scopus or Web of Science covering different time horizons. In the following sections, solutions to the three problems are presented which refer to growth functions (unrestricted and restricted exponential growth), segmented regression, and latent growth curve models.

Growth functions
The simplest growth function is that of unrestricted growth in the form of an exponential function, where the growth of science in each year is proportional to the volume of publications available in the previous year. An equal percentage of volume grows every year. For example, if we assume an annual growth rate of 10% and 100 publications in a certain year, then there are 100+0.10*100=110 publications in the following year. One year later, there are 110+0.10*110=121 publications (and so on). Another growth function assumed by Price (1963) is that of restricted growth: Science would run exponentially at the beginning, but with time the growth process approaches an upper capacity limit with constantly decreasing growth rates (s-shaped course). In view of the limited capacities of human and investment capital for research (and other sections of society), the latter thesis by Price (1963) seems to be more plausible than the simplest growth function: Since resources (human resources, capital) are limited, growth cannot be limitless either.
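The two growth functions described above can be illustrated with a minimal Python sketch. The unrestricted case reproduces the arithmetic of the example (100, 110, 121 publications at a 10% annual rate). The restricted case uses a simple discrete logistic update as an illustration only; the capacity of 1,000 publications and the function names are assumptions made for this example and do not come from the study.

```python
def unrestricted_growth(initial, rate, years):
    """Yearly volumes when a constant share of the current volume is added each year."""
    volumes = [initial]
    for _ in range(years):
        volumes.append(volumes[-1] * (1 + rate))
    return volumes


def restricted_growth(initial, rate, capacity, years):
    """Discrete logistic sketch: the increment shrinks as the volume nears the capacity."""
    volumes = [initial]
    for _ in range(years):
        current = volumes[-1]
        volumes.append(current + rate * current * (1 - current / capacity))
    return volumes


# Example from the text: 10% annual growth starting from 100 publications.
free = unrestricted_growth(100, 0.10, 2)

# Hypothetical capacity limit of 1000 publications (illustrative value only).
capped = restricted_growth(100, 0.10, 1000, 200)

print([round(v, 2) for v in free])
print(round(capped[-1], 2))
```

In the restricted case, the early years look almost exponential, while the later years flatten out below the assumed capacity, which is the s-shaped course described in the text.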
These considerations make it necessary to choose a statistical analysis approach that starts from different time segments, in which different growth rates apply and different growth functions are possible as well. The time segments themselves are not known in advance and have to be estimated. Such an opportunity is offered by "segmented regression" or "piecewise regression" analyses, which start from different intervals of an independent variable (in this case: time). These regression analyses apply different functional relationships and simultaneously make it possible to estimate time segments and parameters of the growth functions (Gallant & Fuller, 1973; McGee & Carleton, 1970; Schwarz, 2015; Toms & Lesperance, 2003; Valsamis, Ricketts, Husband, & Rogers, 2019; Wagner, Soumerai, Zhang, & Ross-Degnan, 2002). In this study, we assume a time series in which the total number of publications $y_t$ is available per year, where t denotes the index of the time series, and t=0 the starting year of the time series (e.g., for the year 1665: t=year-1665). We assume two growth functions (see above):

Unrestricted exponential growth
The functional relationship for exponential growth assumes that the derivative of the function is proportional to the function itself:

\[ \frac{dy}{dt} = b_1 y \]

The resolution of this differential equation leads to a functional relationship, which can be represented in the following statistical model:

\[ y_t = e^{b_0 + b_1 t} \, e^{\varepsilon_t} \qquad (1) \]

where $e^{b_0}$ represents the initial volume of publications at the starting point of the time series (t=0), $b_1$ the growth constant, and $\varepsilon_t$ the residual with the variance $\sigma^2$ as well as the correlation matrix of the residuals $\mathrm{CORR}(\varepsilon_t, \varepsilon_{t-1})$. The latter is equated here with the identity matrix I, which means that the residuals do not (auto-)correlate. After the model estimation, we checked whether the residuals of the estimated model are actually auto-correlated or not. In the simplest case of an autoregressive process of first order (AR(1)), the residuals at time t are (auto-)correlated with the residuals at time t-1.
If equation 1 is logarithmically transformed, a simple linear regression function can be obtained:

\[ \ln y_t = b_0 + b_1 t + \varepsilon_t \qquad (2) \]

The doubling time k, i.e., the time the growth process needs to double the population size at a given time point, is:

\[ k = \frac{\ln 2}{\ln(1 + g)} \qquad (3) \]

where k is the doubling time and g is the growth rate. The annual growth rate g as the percentage change between two time points is $e^{b_1} - 1$ for Eq. 1. For $b_1 = 0.05$, for example, g amounts to 0.051 or 5.1%.
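The relationship between the growth constant, the annual growth rate, and the doubling time can be checked numerically. The following sketch assumes that the doubling time is computed as k = ln(2) / ln(1 + g), which is consistent with the reported pair of a 4.10% growth rate and a doubling time of 17.3 years; the function names are illustrative and not taken from the study.

```python
import math


def annual_growth_rate(b1):
    """Annual growth rate g implied by the growth constant b1 (g = e^b1 - 1)."""
    return math.exp(b1) - 1


def doubling_time(g):
    """Years needed to double the volume at a constant annual growth rate g."""
    return math.log(2) / math.log(1 + g)


# Example from the text: b1 = 0.05 gives g of about 0.051 (5.1%).
g_example = annual_growth_rate(0.05)

# Overall growth rate reported in the study: 4.10% per year.
k_overall = doubling_time(0.0410)

print(f"g for b1 = 0.05: {g_example:.3f}")
print(f"doubling time at 4.10%: {k_overall:.1f} years")
```

Because ln(1 + g) equals the growth constant, the doubling time can equivalently be written as ln(2) divided by the growth constant.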

Restricted exponential growth (Verhulst-Pearl)
For restricted exponential growth as a special case of a logistic growth model with a capacity limit C, the derivative of the function is proportional to the following function:

\[ \frac{dy}{dt} = b_1 y \left(1 - \frac{y}{C}\right) \]

The resolution of this differential equation leads to a functional relationship, which can be represented in the following statistical model (Tsoularis & Wallace, 2002, p. 28f.):

\[ y_t = \frac{e^{K}}{1 + \left(e^{K - b_0} - 1\right) e^{-b_1 t}} \, e^{\varepsilon_t} \qquad (4) \]

It can be seen from equation 4 that if $t \to \infty$, the exponential expression in the denominator, $e^{-b_1 t}$, goes towards zero and the function approaches the capacity limit $C = e^{K}$. At time t=0, the start of the time series, the exponential expression in the denominator, $e^{-b_1 t}$, is equal to 1 and the function corresponds to the initial volume $e^{b_0}$ multiplied by the error term $e^{\varepsilon_t}$. Limited growth is assumed only for the first segment in order to test whether the de Solla Price hypothesis of growth of science can be ruled out. The combination of s-shaped segments over time seems to be implausible in light of the empirical results on the growth of science by Bornmann and Mutz (2015).
If equation 4 is logarithmically transformed, the following regression function results:

\[ \ln y_t = K - \ln\left(1 + \left(e^{K - b_0} - 1\right) e^{-b_1 t}\right) + \varepsilon_t \qquad (5) \]

In the following, we call the "restricted exponential growth model (Verhulst-Pearl)" the "logistic growth model".
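The two limiting cases described for the logistic growth model (initial volume at t=0, capacity limit for large t) can be verified with a short sketch. The parametrization used here, with initial volume e^{b0} and capacity e^{K}, is an assumption chosen to match those two limits; the parameter values are purely illustrative and are not estimates from the study.

```python
import math


def logistic_volume(t, b0, b1, K):
    """Expected volume at time t under logistic growth (error term omitted).

    Assumed parametrization: initial volume exp(b0), capacity limit exp(K).
    """
    return math.exp(K) / (1 + (math.exp(K - b0) - 1) * math.exp(-b1 * t))


# Illustrative values: 100 publications at the start, capacity of 1,000,000.
b0 = math.log(100)
K = math.log(1_000_000)
b1 = 0.05

start_volume = logistic_volume(0, b0, b1, K)      # equals exp(b0)
late_volume = logistic_volume(1000, b0, b1, K)    # approaches exp(K)

print(round(start_volume, 2), round(late_volume, 2))
```

As long as the volume is far below the capacity, the curve is close to the unrestricted exponential function with the same growth constant.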

Segmented regression
Following classic theories of economic development, we consider the process of development in science and economy as a sequence of historical stages (Dang & Sui Pheng, 2015). In addition to the functional model, therefore, a statistical framework model is required. We used segmented regression which defines the regression models for different time segments and can be represented in the form of nested IF-THEN clauses for each segment j. In the case of unrestricted growth in all segments j, the following overall model applies with year $t_0$ as the starting year of the time series (e.g., 1665): where $a_j$ denotes the year at which the j-th time segment ends, and where $a_0 = t_0$, the starting year of the time series. In addition to the parameters of the growth model, the year parameters $a_1$ to $a_{j-1}$ are estimated. The same distribution of residuals is assumed for each segment.
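The idea of nested IF-THEN clauses for the segments can be sketched as a piecewise linear function for the logarithm of the publication volume. This is a minimal illustration, not the model specification of the study: it assumes that the segments join continuously at the breakpoints, and the breakpoint, slopes, and intercept below are hypothetical values.

```python
import math


def segmented_log_volume(year, start_year, intercept, slopes, breakpoints):
    """Piecewise linear ln(volume): slopes[j] applies within segment j.

    breakpoints holds the end years of all segments except the last one.
    Continuity at the breakpoints is an assumption of this sketch.
    """
    if len(slopes) != len(breakpoints) + 1:
        raise ValueError("need exactly one more slope than breakpoints")
    log_y = intercept
    lower = start_year
    for slope, upper in zip(slopes, list(breakpoints) + [math.inf]):
        if year <= upper:
            # The year falls into this segment: add growth since its start.
            return log_y + slope * (year - lower)
        # Otherwise accumulate the growth of the whole segment and move on.
        log_y += slope * (upper - lower)
        lower = upper
    raise RuntimeError("unreachable: the last segment is unbounded")


# Hypothetical two-segment example: slope 0.03 until 1945, 0.05 afterwards.
at_break = segmented_log_volume(1945, 1665, 0.0, [0.03, 0.05], [1945])
after_break = segmented_log_volume(1946, 1665, 0.0, [0.03, 0.05], [1945])

print(round(at_break, 2), round(after_break, 2))
```

In an estimation setting, the breakpoints would be free parameters alongside the intercept and slopes, which is what makes the segment boundaries part of the result rather than an input.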
The number of publications is a count variable, i.e., it takes non-negative integer values. This implies that the values are distributed, for example, according to a Poisson distribution (Hilbe, 2014, p. 2). In this study, however, a logarithmic transformation (base e) of the publication data was favored over a Poisson model for the following reasons: (1) With regard to growth rates of science, unrestricted growth can be assumed, in which the logarithmic transformation leads to a simple linear regression function. The parameters of the function can be interpreted in terms of the original non-transformed growth function (Panik, 2014, p. 33). (2) If it can be demonstrated that the observed values are well explained by the function (because of low residual variance), then neither the distribution function nor the transformation plays a major role. (3) Due to the smaller scale of the values resulting from the log transformation, there is a greater chance that complex statistical models converge in the estimation process.
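How the logarithmic transformation turns exponential growth into a simple linear regression, and how the parameters map back to the original scale, can be sketched with ordinary least squares on synthetic data. The data below are generated for illustration (1,000 publications growing by 5% per year) and are not taken from any of the databases; the function name is hypothetical.

```python
import math


def fit_log_linear(years, counts, start_year):
    """Ordinary least squares fit of ln(count) = b0 + b1 * (year - start_year)."""
    t = [year - start_year for year in years]
    log_y = [math.log(count) for count in counts]  # requires counts > 0
    n = len(t)
    mean_t = sum(t) / n
    mean_log_y = sum(log_y) / n
    sxx = sum((ti - mean_t) ** 2 for ti in t)
    sxy = sum((ti - mean_t) * (li - mean_log_y) for ti, li in zip(t, log_y))
    b1 = sxy / sxx
    b0 = mean_log_y - b1 * mean_t
    return b0, b1


# Synthetic series: exact 5% annual growth from 1000 publications in 1900.
years = list(range(1900, 1911))
counts = [1000 * 1.05 ** (year - 1900) for year in years]

b0, b1 = fit_log_linear(years, counts, 1900)
initial_volume = math.exp(b0)     # back-transformed intercept
growth_rate = math.exp(b1) - 1    # back-transformed slope

print(round(initial_volume, 2), round(growth_rate, 4))
```

The sketch also shows one practical limit of the transformation: years with zero publications cannot be log-transformed and would need separate handling.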

Piecewise latent growth curve model with missing imputation
In this study, we used data from several bibliographic databases. We therefore needed to find an answer to the question of how the various datasets reflecting the same information (scientific output) should be analyzed statistically. One option was to conduct the analyses for each database separately. This approach would accord with the analyses by Bornmann and Mutz (2015). Analyses for each database separately, however, run the risk of obtaining four different results that might reflect specific aspects of a database. Another option was to analyze the data from the different databases within one statistical model. This approach would still require solutions to the following problems: (1) The time intervals at which publication data are available vary from database to database. The largest time interval (from 1665 to 2018) is available from Dimensions. To analyze only the time interval for which all databases provide complete data would significantly limit the period of investigation of the development of science. (2) The publication data vary greatly in volume between the databases. Dimensions, for example, has the highest volume of publications when the entire time series is considered, whereas Web of Science has the comparatively lowest volume. Here, the question arises whether some form of data weighting according to volume is necessary.
The solution for these problems that we favored in this study was the application of a piecewise latent growth curve model with imputation of missing values.
The main challenge in the analyses was to obtain convergence of the estimation algorithm across all models and all imputations. Especially for models with many segments, convergence problems occurred due to different scaling of the variance components (high variability in the intercept and decreasing variability in the slopes with increasing number of segments). Therefore, random effects were partly scaled (e.g., multiplied by 100 or 0.01) to establish convergence.
The statistical analyses in this study were done with the statistical software package SAS and the procedures PROC NLMIXED, PROC NLIN, PROC MI, and PROC MIANALYZE (SAS Institute Inc., 2015).

Results
In this section, the results of the model estimations are presented. The first five years of each time series were discarded for the estimations because they seemed to reflect only a pseudo segment or artifact without any empirical meaning. Therefore, the actual starting years were 1670 for Dimensions, 1805 for Microsoft Academic, 1905 for Web of Science, and 1866 for Scopus. Each time series ran until the year 2018.

Model comparison
Statistical model comparisons make it possible to rule out unrealistic models with poor model fit in order to get the model with the relatively best fit to the data. The model formulation is associated with certain assumptions about scientific growth (see Table 1). Model M1 "Exponential growth" (see Table 1), for example, includes three parameters: intercept, slope, and residual variance. If intercepts and slopes are allowed to vary across the four databases, two variance components are additionally estimated, giving five parameters overall. In M3, the covariance of intercept and slope only for the first segment was added as a further parameter.
Instead of statistical significance testing, model comparison is undertaken in this study based on Schwarz's Bayesian information criterion (BIC). The smaller the BIC, the better the model fits the data (see Table 1). Models represent overall hypotheses about the nature of growth (e.g., exponential). The BIC is corrected for the number of parameters. A selection of models (e.g., number of segments) was made that were still estimable given the number of parameters and that still showed model improvement in terms of BIC.
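The way the BIC trades off model fit against the number of parameters can be sketched with its standard definition, BIC = -2 ln(L) + p ln(n), together with a common least-squares variant based on the mean square error. Whether the study used exactly these forms is an assumption; the log-likelihoods, parameter counts, and sample size below are hypothetical values chosen only to show the penalty at work.

```python
import math


def bic_from_loglik(log_likelihood, n_params, n_obs):
    """Standard BIC: smaller values indicate a better penalized fit."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def bic_from_mse(mse, n_params, n_obs):
    """Least-squares variant for Gaussian errors, dropping constants
    that are identical for all models fitted to the same data."""
    return n_obs * math.log(mse) + n_params * math.log(n_obs)


# Hypothetical comparison on 300 observations: the larger model improves
# the log-likelihood only slightly, so its extra parameters are not worth it.
simple_model = bic_from_loglik(-120.0, 3, 300)
larger_model = bic_from_loglik(-118.0, 9, 300)

print(round(simple_model, 2), round(larger_model, 2))
```

In this illustration the simpler model has the smaller BIC, so it would be preferred even though the larger model fits the data marginally better.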
Comparing model 1 and model 2, it becomes clear that a simple fixed-effects model (M1) does not fit the data well. The differences between the growth curves based on the various databases are too large, so that a mixed-effects model (M2) can be assumed which results in a significantly smaller BIC. The hypothesis of logistic growth can be rejected as well since the exponential model fits better. Among the models in Table 1 … Science publications. Since the explained variance, measured in terms of the coefficient of determination (R²), exceeds .99, any autocorrelation among residuals or possible heterogeneity of residual variances can be neglected (equations 2 to 5). The covariance matrix of the residuals, $\mathrm{CORR}(\varepsilon_t, \varepsilon_{t-1})$, is assumed to be an identity matrix I. Table 1 demonstrates that the assumption of constant scientific growth over time is not realistic; hence, we can start with the premise that periods with different growth rates exist. This premise seems reasonable since, for example, the history of the 20th century is characterized by two World Wars with drastic consequences for the science system worldwide. As the results by Bornmann and Mutz (2015) … Table 2). For publication counts, a model with eight segments shows the best fit (see Table 2). We additionally compared the models using the mean square error (MSE) and the BIC derived from the MSE to select certain models (Kim & Kim, 2016) (see Table S2 in the Supplementary Information for the estimated parameters of the model).
Notes: BIC = Schwarz's Bayesian information criterion, MSE = Mean square error, optimal models are grey shaded. *No convergence of the iterations in the estimation process.

Growth rates of science (all publications)
In our analyses of growth processes in science using publication data, we follow typical assumptions such as those formulated by Long and Fox (1995): "while research productivity is not strictly equivalent to publication productivity, publication is generally taken as an indication of research" (p. 51). In the statistical analyses of Microsoft Academic data, we considered all publications with known document types except patents, i.e., we excluded publications with unknown document type. Among publications without document type but with DOI, we identified book chapters and journal publications as well as conference papers and technical reports. We also found summaries and reports about conferences. Since not all publications can be seen as equal contributions to scientific progress, we analyzed the influence of the document type on our results by including documents with known and unknown document type without patents (see the results in the Supplementary Information, Figure S6). The differences between the results including all documents and only those documents with known document types are small. For all documents, a further segment could be identified, which represents the period of the Second World War (1940-1945).

Growth rates of science for Life Sciences and Physical and Technical Sciences
In addition to the analyses including all publications, we have also conducted analyses for two broad fields: Life Sciences and Physical and Technical Sciences. The estimated parameters of the models are reported in Table S1 in the Supplementary Information. The results are visualized in Figure 2 and Figure 3. With the comparison of two broad fields, we wanted to find out whether different fields are characterized by similar or different growth rates in their historical developments. As the results in Figure 2a show, the overall annual average growth rate for the Life Sciences amounts to 5.07% with a doubling time of 14.0 years. The results for the Physical and Technical Sciences are similar, with a growth rate of 5.51% (see Figure 3a) and a doubling time of 12.9 years.
In agreement with the results for all publications in Figure 1b … In the segment reflecting the period after 1945, with an annual growth rate of 5.51% and a doubling time of 12.9 years, the growth in the Physical and Technical Sciences is higher than the growth rate in the Life Sciences. In the Life Sciences the growth rate is 4.79% with a doubling time of 14.8 years. The growth rate in the Physical and Technical Sciences is also (slightly) higher than the growth rate that we calculated based on all publications in this segment (see Figure 1b): 5.08%.

Comparative analysis of growth rates of science and of gross domestic product in the UK
For a comparative analysis of economic and scientific growth (using similar statistical methods), we used data from the UK as explained in section 1. We analyzed log-transformed GDP and log-transformed cumulative publication data to estimate the different segments of growth rates and the growth rates themselves. Both rates are percentages and can be directly compared. The publication counts were obtained from the Dimensions database. The average annual growth rate of science in the UK since 1780 is 4.97% (see Figure 4a). This corresponds to a doubling time of 14.64 years. This annual growth rate is slightly higher than the average worldwide growth rate of 4.10% (see Figure 1a). The statistical analysis revealed eight segments with different growth rates (see Figure 4b). The growth is, therefore, more differentiated than the overall growth with five segments (see … Comparable to worldwide results (see Figure 1b), a significant slowdown in scientific growth with a growth rate of 2.62% is apparent around the Second World War (between 1940 and 1948). While the overall analysis shows an unrestricted exponential growth after 1945 (see Figure 4b), the growth of science in the UK took place in three stages: a strong growth of 6.80% until 1959, which intensified between 1959 and 1983 (8.65%), and slowed down to 6.42% in the years after 1983. The growth rates in these three segments are even higher than the worldwide growth rate of 5.28% in the corresponding time segment (between 1945 and 2018 … Halpenny, Burke, McNeill, Snow, & Torreggiani, 2010; Hart & Sommerfeld, 1998; Ntuli, Inglesi-Lotz, Chang, & Pouris, 2015). The results in Figure 5a reveal an annual GDP growth rate of 3.05% and a doubling time of 23.5 years, which is lower than the growth rate based on publication counts of 4.97% (see Figure 4a).
At first glance, economic growth and scientific growth do not necessarily seem to be linked. A more detailed view shows, however, that both kinds of growth are related at certain points over time (see Figure 4b and Figure 5b). While the slowdown in the economy did not begin until after 1987, science began to grow at a rate of only 6.42% as early as 1983.

Discussion
Modern science is based on knowledge-producing institutions and processes (Gieryn, 1982). Current research is a method of "systematically exploring the unknown to acquire knowledge and understanding. Efficient research requires awareness of all prior research and technology that could impact the research topic of interest, and builds upon these past advances to create discovery and new advances" (Kostoff & Shlesinger, 2005, p. 199).
Society expects a steady increase in scientific growth since only considerable growth processes would lead to growth in other sectors of society such as economics and health.
Since (public) investments in science are frequently justified on the basis of growth of science and the contribution of science to national economic growth (Wagner, Park, & Leydesdorff, 2015), measurements of scientific growth processes are ongoing topics. These measurements are usually based on numbers of publications, since the results of research mostly appear in publications: "in academic institutions, publications constitute in all scientific-scholarly subject fields an important form of academic output" (Moed, 2017, p. 63). The results of Digital Science (2016) show that especially the journal article is becoming increasingly popular as a medium for presenting scientific results. The popularity of journal articles could also be the consequence of the higher than average growth in disciplines using journal articles.
The motivation by researchers for publishing their results (in journal articles) is especially fostered by the specificity of the scientific reward system: "Publications have another function as well [besides the open availability of research results]: The principal way for a scholar to be rewarded for his contribution to the advancement of knowledge is through recognition by peers. In order to receive such an award, scholars publish their findings openly, so that these can be used and acknowledged by their colleagues" (Moed, 2017, p. 62).
Although the publication of findings is so basic in science, researchers also process their findings in other forms of output (e.g., patents or presentations). An overview of indicators for measuring productivity based on these other forms can be found in Godin (2009). The problem of most of these indicators for measuring productivity or scientific growth, however, is that annual and historical data without missing values are scarcely available.
In this study, we used publication data from four literature databases to investigate scientific growth processes from the beginning of the modern science system until today. In accordance with the law of exponential growth, the results of the unrestricted growth show that the overall growth rate amounts to 4.10% with a doubling time of 17.3 years. This annual growth rate (over the various databases) is different from the Web of Science growth rate of 2.96% reported in Bornmann and Mutz (2015), since we considered in the current study a significantly longer time period than Bornmann and Mutz (2015): from 1900 until 2018 in this study (119 years) versus from 1980 until 2012 (33 years) in Bornmann and Mutz (2015).
As the comparison of various segmented regression models in the current study revealed, the model with five segments fits the data best. We demonstrated that these segments with different growth rates can be interpreted very well since they are related to phases of economic (e.g., industrialization) and/or political developments (e.g., World Wars). Obviously, the war efforts (allocation of funds) led to a visible decline in research (by output measure of publication) but research went on nevertheless, possibly with even more vigor. However, that research was not being made available openly for security reasons (and researchers were pulled in for the sake of war efforts from physics to languages, material science to mathematics/emerging computer science), and arguably the results of war-time research triggered post-war discoveries, too.

We additionally undertook two further analyses focusing on (1) growth in two broad fields (Life Sciences and Physical and Technical Sciences) as well as (2) the relationship between scientific and economic growth.
(1) The comparison between the two broad fields revealed that although slight differences are observable, these differences are not so great that they can be denoted as fundamental. For example, whereas the overall annual average growth rate for the Life Sciences is 5.07% with a doubling time of 14.0 years, the overall growth rate for the Physical and Technical Sciences is 5.51% with a doubling time of 12.9 years.
(2) In the investigation of the relationship of scientific and economic growth, we focused on the UK, one of the few countries with corresponding available (historical) data. The results showed that the scientific growth rate of the UK's number of publications (4.97%) is slightly higher than the average worldwide growth rate (4.10%). Furthermore, the results demonstrated that the growth of the UK's number of publications is more differentiated (with eight segments) than the worldwide growth (with five segments). The comparison of the British economic and scientific growth rates revealed that the GDP growth rate is lower than the scientific growth rate (3.05% versus 4.97%). Since GDP is not corrected for inflation in this study, results on the comparison of growth rates of science and economy should be interpreted with great care.
In the interpretation of the scientific growth rates that were mostly increasing in the historical development, two interpretations are possible: Either researchers were able to publish more publications in the same time or the increased publication counts can be traced back to an increase in the number of researchers. The study by Fanelli and Larivière (2016) targeted this question. Their results pointed to the second interpretation being more plausible. Fanelli and Larivière (2016) analyzed "individual publication profiles of over 40,000 scientists whose first recorded paper appeared in the Web of Science database between the years 1900 and 1998, and who published two or more papers within the first fifteen years of activity -an 'early-career' phase in which pressures to publish are believed to be high. As expected, the total number of papers published by scientists has increased, particularly in recent decades. However, the average number of collaborators has also increased, and this factor should be taken into account when estimating publication rates. Adjusted for coauthorship, the publication rate of scientists in all disciplines has not increased overall, and has actually mostly declined" (Fanelli & Larivière, 2016).
Two limitations mentioned by Bornmann and Mutz (2015) are still valid for the current study and should be considered in the interpretation of the results: The first limitation refers to the use of publication counts to measure growth processes. According to Tabah (1999), there are advantages and disadvantages in using these numbers: "although counting publications is simple and relatively straightforward, interpretation of the data can create difficulties that have in the past led to severe criticisms of bibliometric methodology … The main problems concern the least publishable unit (LPU), disciplinary variance, variance in quality of work, and variance in journal quality" (p. 264).
The second limitation concerns the interpretation of "growth" as an "increase in numbers". According to Bornmann and Mutz (2015), "it is not clear whether an 'increase in numbers' is directly related to an 'increase of actionable knowledge', for example for reducing needs, extending our knowledge about nature in some lasting way or some other 'higher purposes'" (p. 2221).
Both limitations might be targeted in future studies on growth processes of science.
The results of our study show that exponential growth explains the data quite well and that the speed of growth differs between epochs. However, our study does not target the questions of why the growth processes are different and why exponential growth is present. For example, we show that the five segments of the regression have different growth rates. However, we do not empirically investigate these differences: how can we explain, e.g., that between 1660 and 1793 the growth rate is 3.29%, while between 1793 and 1810 it is 22.78% (Technical Sciences)? Therefore, future studies should try to explore empirically the reasons for different growth processes over time.
Figure S6. Plots for a) unrestricted growth (M1) and b) segmented unrestricted growth (M9) based on the number of publications from four bibliographic databases with all documents (known and unknown document types)