Introduction

Memories of past significant events, such as natural disasters and wars, that are shared by members of social groups such as countries and families are called “collective memory”. This concept was proposed in 1925 by the sociologist Halbwachs1. Later, Assmann classified collective memory into communicative memory and cultural memory depending on how the memory is passed down to future generations2. Communicative memory is maintained by everyday communication such as conversations with close people. By contrast, cultural memory is maintained by cultural formation (texts, rites, monuments) and institutional communication (recitation, practice, observance)2,3. Although these were initially treated as sociological concepts, they have recently begun to attract attention as targets for empirical research. How much people remember World War II has been investigated by country4 and age5 through self-reported surveys. Roediger et al. examined how much people have forgotten the U.S. presidents through interviews with students6 and found two different functions that characterize forgetting.

Thanks to recent advances in the digitization of public-domain knowledge and online user behavior, there has been a growing effort to study collective memory quantitatively using large-scale empirical data. For example, Michel et al.7 investigated collective memory using word frequencies in digitized books. Au Yeung et al.8 measured the extent to which collective memories were retained in different countries using large data sets of news articles. Page views and edit histories of Wikipedia articles about significant events, such as natural and man-made disasters, aviation accidents, and terrorist attacks, have been frequently used as indicators of collective memory9,10,11,12. Some studies used Wikipedia not only to measure the level of collective memory but also to understand the collective nature of people in a more general sense, such as revealing the relationship between page views and turnout in elections13, building a model of people’s browsing behavior considering external factors14, and predicting the popularity of movies from page views15. Singer et al.16 showed that mass media and current events (30% and 13% of respondents, respectively) dominated the motivation for people to access Wikipedia pages. Wikipedia page view activity was also found to correlate strongly with Google search activity17,18. These earlier studies warrant the use of Wikipedia page views as a quantitative metric of general user behavior on the Internet.

Mathematical models have been proposed to describe collective memory decay and validated with various empirical data, including Wikipedia page views. Candia et al. revealed the universal nature of decay patterns on a yearly time scale3. Using yearly data on the number of citations of papers and patents as well as online attention to songs, movies, and biographies (including Wikipedia page views), they showed that the decay of collective memory can be modeled by a biexponential function \(C_{1}\textrm{e}^{-\alpha t}+C_{2} \textrm{e}^{-\beta t}\).

Collective memory decay has also been investigated on a daily time scale. Kim et al.19 proposed a stretched exponential function \(\textrm{e}^{-(t/\alpha )^{\gamma t^{\delta }}}\) to describe daily page views of online academic articles. Using this function, they successfully captured dynamics that decay quickly at the beginning and slowly later on. West et al.20 studied the daily collective user behavior on Twitter and news sites in response to news about the deaths of celebrities between 2009 and 2014. They showed that the mention frequency can be modeled by a shifted power-law function \(C_{1} t^{-\alpha }+C_{2}\) with exponents \(\alpha =1.34\) and \(\alpha = 1.54\) for news and Twitter, respectively. García-Gavilanes et al.21 analyzed daily Wikipedia page view dynamics for articles on aviation accidents and found that collective memory decays exponentially after it reaches a maximal value. They proposed a segmented model that assumes separate regimes behind the dynamics.

The earlier studies on daily collective memory decay dynamics typically considered only one dynamical process applied to data obtained in just one specific event category. Whereas a universal model 3 was proposed for annual collective memory decay using the data for multiple event categories, such year-by-year dynamics are only relevant at a slow, historical time scale, which would not be applicable to day-to-day dynamics. There is hence a need for a universal model of collective memory decay for a faster, daily time scale.

Here we propose a two-phase decay model for collective memory of various types of significant events and evaluate its validity using Wikipedia page view data. We compare the proposed decay model to several other existing decay models developed using data from Wikipedia22, blogs23, Twitter24,25, YouTube26, news sites27,28, book sales 29, and the number of articles read19. These earlier studies modeled collective memory decay in various mathematical forms, such as power-law, exponential, and stretched exponential, to which the proposed model is compared for performance evaluation.

Data and methodology

Data

Table 1 Summary of Wikipedia page data.

In this study we analyzed collective memory decay using English Wikipedia page view data. We selected the following five categories of significant events for analysis: earthquakes, deaths of notable persons, aviation accidents, mass murder incidents, and terrorist attacks. These events were also used in previous collective memory studies10,11,20,21,30. For the events in these categories, the date and location of the event are precise, which allows for the collection of unambiguous time series data. We obtained the Wikipedia pages of events listed in the summary article of each category in the English version of Wikipedia. The target period of event occurrence is from July 1st, 2015, to June 30th, 2020. Table 1 shows a summary of the dataset we obtained from Wikipedia.

Figure 1

Examples of daily Wikipedia page view decay for 400 days in log-log scale. The panels show page views decay after an event occurrence in 2016: Kumamoto earthquake (left) and the death of Alan Rickman (right). In Alan Rickman’s case, there is a spontaneous increase of page views around 365 days.

Figure 1 shows two examples of Wikipedia page view decay from the event occurrence date (one for the 2016 earthquakes in Kumamoto, Japan, and the other for the death of Alan Rickman). The two examples commonly show that the Wikipedia page views peaked around the date of the event and gradually decayed over time. In addition, the peak height of the page views (i.e., how much attention an event receives) and the decay rate (i.e., how quickly people forget it) varied greatly from event to event.

For each of the collected Wikipedia pages, we took the event occurrence date from the infobox of the page and obtained daily page view counts for 300 days from that date using the Wikimedia REST API (https://wikimedia.org/api/rest_v1/). The length of the data collection period was set to 300 days, shorter than one year, in order to avoid a possible “anniversary” page view increase toward the end of the 365-day cycle (such an increase is seen in Fig. 1, right). If the page view peak was less than 1000 or occurred 5 or more days after the event date, the data were excluded from the analysis, since we considered that such events did not trigger significant collective memory responses. With these criteria, we acquired valid page view data for 34 earthquakes, 8684 deaths of notable persons, 43 aviation accidents, 37 mass murder incidents, and 123 terrorist attacks.
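The collection and exclusion steps described above can be sketched as follows. This is only an illustration in Python, not the code used in the study: the function names, the choice of the `all-access` and `user` options, and the per-article endpoint path are assumptions that should be checked against the Wikimedia API documentation. The sketch builds the request URL without sending it and applies the two exclusion criteria (peak below 1000, or peak 5 or more days after the event date).

```python
from datetime import date, timedelta
from urllib.parse import quote

# Assumed path of the per-article pageviews endpoint under the REST API base URL.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(article, event_date, days=300, project="en.wikipedia"):
    """Build (but do not send) a request URL for `days` daily counts from event_date."""
    end = event_date + timedelta(days=days - 1)
    title = quote(article.replace(" ", "_"), safe="")
    # "all-access" and "user" are assumed options, not taken from the paper.
    return (f"{BASE}/{project}/all-access/user/{title}/daily/"
            f"{event_date:%Y%m%d}/{end:%Y%m%d}")


def is_valid_event(views, min_peak=1000, max_peak_day=5):
    """Keep a series only if its peak is >= min_peak and occurs before day max_peak_day.

    `views[0]` is the count on the event date, so the index of the peak is the
    number of days between the event and the peak.
    """
    if not views:
        return False
    peak = max(views)
    peak_day = views.index(peak)
    return peak >= min_peak and peak_day < max_peak_day


print(pageviews_url("Alan Rickman", date(2016, 1, 14)))
print(is_valid_event([200, 5000, 1200, 300]))
```

Keeping the URL construction separate from the filter makes it possible to test the exclusion rule on stored series without network access.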

Model

In this study, we propose a unique two-phase mathematical model of collective memory decay that combines exponential and power-law phases. Our model does not assume a regime shift in the decay of collective memory but rather a change within the population that forms collective memory. First, we define the normalized daily page views t days after the peak \(t_c\) for each event as \(S(t)={S^{ raw }(t)}/{S^{ raw }(0)}\), where \(S^{ raw }(t)\) is the raw number of daily page views t days after the peak \(t_c\) (and therefore \(S^{ raw }(0)\) is the number of page views at the peak \(t_c\)). Next, we assume that there are two types of users: the first type is “temporary interest users”, whose page views decay rapidly as an exponential function of time with no interaction, and the second type is “long-term interest users”, whose page views decay following a power-law function of time, which implies non-trivial interactions among those users. We made this assumption based on Ebbinghaus’ forgetting curve, according to which independent individual memory decays exponentially31. The total number of page views in our model is the sum of the contributions of these two types of users (Fig. 2). This model can capture the shift from “fast decay” to “slow decay”. The model is expressed mathematically as follows:

$$\begin{aligned} S(t) =C_{1}\textrm{e}^{-\beta t} + C_{2}t^{-\alpha } \end{aligned}$$
(1)

\(C_{1}\) and \(C_{2}\) are constant parameters representing the amplitudes of the two decay components. \(\beta\) is the rate of the initial exponential decay, and \(\alpha\) is the exponent of the mid- to long-term power-law decay. The proposed model differs from the models of previous research, and the idea that users can be divided into two distinct groups according to their basic properties is also unique to our research.
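As a minimal sketch of the normalization and of Eq. (1), the following Python snippet evaluates the two components separately. The function names and the values of \(C_{1}\) and \(C_{2}\) are illustrative assumptions; \(\beta = 0.4\) and \(\alpha = 0.3\) are chosen only because they are close to the typical values reported in the Results. The power-law term is evaluated for \(t \ge 1\) because \(t^{-\alpha}\) diverges at \(t=0\).

```python
import math


def normalize(raw_views):
    """Return S(t) = S_raw(t) / S_raw(0), where t = 0 is the day of the peak."""
    peak = raw_views.index(max(raw_views))
    return [v / raw_views[peak] for v in raw_views[peak:]]


def two_phase(t, c1, beta, c2, alpha):
    """Eq. (1): C1 * exp(-beta * t) + C2 * t**(-alpha), defined for t > 0."""
    return c1 * math.exp(-beta * t) + c2 * t ** (-alpha)


# Illustrative (hypothetical) amplitudes; beta and alpha near the reported medians.
C1, BETA, C2, ALPHA = 1.0, 0.4, 0.05, 0.3
for t in (1, 10, 100):
    exp_part = C1 * math.exp(-BETA * t)
    pow_part = C2 * t ** (-ALPHA)
    print(t, exp_part, pow_part, two_phase(t, C1, BETA, C2, ALPHA))
```

Printing the two terms side by side shows how the exponential term dominates at small t and becomes negligible at large t, which is the behavior the two-phase model is meant to capture.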

To evaluate the validity of this model, we quantitatively compared the accuracy of the proposed model with that of other models from previous research, including bi-exponential3, stretched exponential19, and shifted power-law20, by measuring the coefficient of determination \(R^{2}\) and the Akaike Information Criterion (AIC).
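The two evaluation measures can be sketched as follows. The paper does not state which form of the AIC was used; the snippet assumes the common least-squares form \(n\ln (\textrm{RSS}/n)+2k\), which holds for Gaussian residuals up to an additive constant. The function names are illustrative.

```python
import math


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def aic_least_squares(y, y_hat, n_params):
    """AIC for a least-squares fit (assumed form): n * ln(RSS / n) + 2 * k."""
    n = len(y)
    rss = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    return n * math.log(rss / n) + 2 * n_params


# Toy observed and fitted values (made up for illustration).
y_obs = [1.0, 2.0, 3.0, 4.0]
y_fit = [1.1, 1.9, 3.2, 3.8]
print(r_squared(y_obs, y_fit), aic_least_squares(y_obs, y_fit, n_params=4))
```

Under this form, a lower AIC indicates a better trade-off between goodness of fit and the number of parameters, so models with different parameter counts can be compared on the same series.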

Figure 2

Proposed two-phase model of collective memory decay. The model consists of exponential decay, which corresponds to temporary interest users without interactions, and power-law decay, which corresponds to long-term interest users with interactions.

Results

Model fitting

Figure 3

Fits of the proposed two-phase model to the examples shown in Fig. 1 over 300 days. The blue dashed line and the red dash-dotted line correspond to simple exponential decay and power-law decay, respectively. The yellow solid line shows the proposed two-phase model, which combines the exponential and power-law functions.

We fitted each normalized page view time series S(t) with the following four nonlinear models that do not assume a regime shift: bi-exponential \(C_{1}\textrm{e}^{-\alpha t}+C_{2} \textrm{e}^{-\beta t}\) 3, stretched exponential \(\textrm{e}^{-(t/\alpha )^{\gamma t^{\delta }}}\) 19, shifted power-law \(C_{1} t^{-\alpha }+C_{2}\) 20, and the proposed model \(C_{1}\textrm{e}^{-\beta t} + C_{2}t^{-\alpha }\). Before fitting, we added a constant value \(\epsilon\) to each individual time series, where \(\epsilon\) was the minimum nonzero value across all individual time series. Then, we took base-10 logarithms of the empirical data and fitted the parameters of each model formula to the data in log-log space using a nonlinear least-squares method, following the method of West et al. 20. Figure 3 shows examples of model fitting. Compared to the purely exponential (blue, dashed) and purely power-law (red, dash-dotted) decays, our proposed model (yellow, solid) can capture both the initial exponential decay and the mid- to long-term power-law decay simultaneously.
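The fitting step for the proposed model can be sketched with SciPy's `curve_fit` as below. This is an assumption-laden illustration rather than the procedure actually used: the starting guess, the parameter bounds, the helper names, and the choice to minimize squared residuals of \(\log_{10} S(t)\) over integer days \(t \ge 1\) are ours. The data are synthetic and noise-free, generated from known parameters so that the recovered values can be checked.

```python
import numpy as np
from scipy.optimize import curve_fit


def log10_two_phase(t, c1, beta, c2, alpha):
    """log10 of the proposed model C1 * exp(-beta * t) + C2 * t**(-alpha)."""
    return np.log10(c1 * np.exp(-beta * t) + c2 * t ** (-alpha))


def fit_two_phase(t, s, eps):
    """Fit the model to log10(s + eps) by nonlinear least squares.

    Returns the estimates (C1, beta, C2, alpha).
    """
    t = np.asarray(t, dtype=float)
    y = np.log10(np.asarray(s, dtype=float) + eps)
    p0 = [1.0, 0.5, 0.1, 0.5]  # assumed starting guess
    # Assumed bounds: keep amplitudes positive so the logarithm is defined.
    bounds = ([1e-12, 0.0, 1e-12, 0.0], [np.inf, 10.0, np.inf, 5.0])
    popt, _ = curve_fit(log10_two_phase, t, y, p0=p0, bounds=bounds)
    return popt


# Synthetic series from known parameters (C1=1, beta=0.4, C2=0.1, alpha=0.3).
t = np.arange(1, 301, dtype=float)
s = 1.0 * np.exp(-0.4 * t) + 0.1 * t ** (-0.3)
print(fit_two_phase(t, s, eps=0.0))
```

For real data, `eps` would be the minimum nonzero value described above, and the same pattern can be repeated with the other three model formulas to obtain comparable residuals.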

To compare model performance, we computed the median \(R^{2}\) and AIC of each model for each event category. Tables 2 and 3 show the results. The proposed model showed the best performance for earthquakes, aviation accidents, and terrorist attacks. For deaths of notable persons and mass murder incidents, the shifted power-law model 20 performed slightly better, but the differences between its \(R^2\) and AIC values and those of our model were small. In fact, when we compared the models on each sample individually, our model performed better in more than half of the cases in all categories: 82%, 59%, 59%, 55%, and 58% for earthquakes, deaths of notable persons, aviation accidents, mass murder incidents, and terrorist attacks, respectively. Interestingly, the shifted power-law model, which was proposed to describe obituaries, also performed well on our data targeting negative events.

Table 2 Median of \({R^2}\) for four decay models.
Table 3 Median of AIC for four decay models.

Decay parameters

The initial fast exponential decay is characterized by \(\beta\) and the late slow power-law decay by \(\alpha\). Figures 4 and 5 show probability density distributions of the parameter values of \(\beta\) and \(\alpha\). Note that 26 (0.3%) outliers (\(\beta > 2\)) are not shown in the distribution for deaths of notable persons (\(N=8,684\)). The distributions are clearly unimodal, with a distinct characteristic value for each parameter; the medians are shown in Table 4. These results suggest that there is a common pattern of collective memory decay: a fast exponential decay immediately after the event with a rate \(\beta\) of around 0.4, followed by a slow power-law decay with an exponent \(\alpha\) of around 0.3.

An interesting observation is that the value of \(\alpha\) may be loosely related to the lasting societal impact of the events. Earthquakes tend to cause massive damage to society, and the characteristic value of \(\alpha\) for this category was large (0.48), implying that meaningful long-term collective memory decay continued over a long period of time. Meanwhile, deaths of notable persons would have minimal impact on society, and their characteristic value of \(\alpha\) was small (0.22), implying that the long-term behavior was closer to a flat line (\(\alpha =0\)) and more likely dominated by constant random page views. Events in other categories would have societal impacts at intermediate levels, which may be reflected in their intermediate characteristic \(\alpha\) values as well. This observation remains largely speculative and would need further systematic investigation.

Figure 4

Probability density distributions of parameter values of \(\alpha\) obtained using the proposed model in five categories. Most of the samples fall within a certain range (\(0< \alpha < 1\)) and show similar values (\(\alpha \sim 0.3\)) independent of the category.

Figure 5

Probability density distributions of parameter values of \(\beta\) obtained using the proposed model in five categories. Most of the samples fall within a certain range (\(0< \beta < 2\)) and show similar values (\(\beta \sim 0.4\)) independent of the category.

Table 4 Median of \(\beta\) and \(\alpha\) for five categories for the proposed model.

Switching point of collective memory decay dynamics

Figure 6

Overview of the detection of the collective memory switching point \(t^*\). (Inset) Because our model consists of power-law and exponential functions, we set \(t^*\) as the point at which one of the functions covers more than half of the total page views. In our case, the exponential function dominates first and is then replaced by the power-law function for all samples.

Our proposed model allows for detection of the “switching point” of collective memory decay where the dominant component in the model formula \(S(t) =C_{1}\textrm{e}^{-\beta t} + C_{2}t^{-\alpha }\) switches from exponential to power-law. Such a switching point \(t^{*}\) is defined as the first time point at which \(C_{2}t^{-\alpha } > C_{1}\textrm{e}^{-\beta t}\) in the fitted model (Fig. 6).
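Given fitted parameters, the definition above translates directly into a scan over days. The sketch below assumes that \(t^{*}\) is sought on integer days starting from \(t=1\) and within the 300-day window; the function name and the parameter values in the example are hypothetical.

```python
import math


def switching_point(c1, beta, c2, alpha, t_max=300):
    """Return the first integer day t >= 1 at which the power-law term
    C2 * t**(-alpha) exceeds the exponential term C1 * exp(-beta * t),
    or None if this never happens within t_max days."""
    for t in range(1, t_max + 1):
        if c2 * t ** (-alpha) > c1 * math.exp(-beta * t):
            return t
    return None


# Hypothetical fitted parameters, for illustration only.
print(switching_point(c1=1.0, beta=0.4, c2=0.05, alpha=0.3))
```

Returning `None` rather than a sentinel day makes it explicit when a fitted series has no crossover inside the observation window.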

Figure 7 shows the probability density distributions of the switching points detected for the five categories. The median values for all categories were quite similar (earthquakes: \(t^{*}=10\); deaths of notable persons: \(t^{*}=11\); aviation accidents: \(t^{*}=10\); mass murder incidents: \(t^{*}=11\); and terrorist attacks: \(t^{*}=11\)), indicating that the shift in collective memory decay dynamics occurs at about the same time (around 10 to 11 days after the peak), regardless of the event category.

Figure 7

Probability density distributions of switching point \(t^*\) detected for five categories. Most of the samples fall within \(0< t^*< 40\) days, which is relatively short compared to the maximum length of the time series (300 days). The median values of \(t^*\) are 10 to 11 days for all categories.

Discussions

In this study, we collected daily English Wikipedia page view counts for five event categories and modeled their decay processes using a new two-phase model that combines initial exponential decay and mid- to long-term power-law decay in a single mathematical formula. To the best of our knowledge, this study is the first attempt to develop a universal model of collective memory decay applicable to multiple event categories at daily time scales. We found that our proposed model showed consistently high accuracy across multiple event categories, closely matching or exceeding the best performance among the previously proposed decay models.

Our model also allowed for the detection of a “switching point” in collective memory decay at which the dominant decay dynamics switch from exponential to power-law. We found that the decay phase switches about 10 to 11 days after the peak, irrespective of the event category. This number is similar to that reported by García-Gavilanes et al.21, who found that the first break point of the segmentation was 3-10 days for both English and Spanish Wikipedia page views of aviation accidents. This is a unique, non-trivial finding because it indicates a universal property of our society’s “collective attention span”, that is, the period of immediate attention to the news.

There are some limitations in our study. First, we validated our model only with English Wikipedia page views of events that had a negative impact on society. The generality of the obtained results therefore still needs to be examined using data for events with a positive impact on society, such as a scientist winning a Nobel Prize or an actor winning an Academy Award. We expect that the decay patterns of positive events will be similar to those of negative ones, because our previous study showed that the word frequencies of names related to obituaries and Nobel Prizes exhibited a similar decay pattern in Japanese blog data23. Similarly, applying the model to data other than Wikipedia page views, such as Twitter mentions and citation counts, is left for future work. The assumptions of the model should also be noted: we focused on the aggregated behavior of the user population and did not consider the behavioral changes of individual users.

Future directions of research therefore include considering more detailed information about specific events, such as more detailed event types, the popularity of the event, and the size of the societal impact the event created, and modeling its influence on collective memory decay. Such systematic analysis will help understand the nature of collective memory in greater depth, possibly revealing the quantitative relationship between an event’s impact and the value of \(\alpha\) as indicated above. We also found spontaneous increases in page views around 365 days after the event, which could be attributed to year-to-year recall. Investigating such spontaneous increases is another interesting future direction.