The dynamics of meaningful social interactions and the emergence of collective knowledge

Collective knowledge as a social value may arise in cooperation among actors whose individual expertise is limited. The process of knowledge creation requires meaningful, logically coordinated interactions, which represents a challenging problem for physics and social dynamics modeling. By combining a two-scale dynamics model with empirical data analysis from the well-known Questions & Answers system Mathematics, we show that this process occurs as a collective phenomenon in an enlarged network (of actors and their artifacts) where the cognitive recognition interactions are properly encoded. The emergent behavior is quantified by the information divergence and the innovation-driven advance of knowledge over time, and by the signatures of self-organization and knowledge-sharing communities. These measures elucidate the impact of each cognitive element and of the individual actor's expertise in the collective dynamics. The results are relevant to stochastic processes involving smart components and to collaborative social endeavors, for instance, crowdsourcing scientific knowledge production with online games.

In modern statistical mechanics 1, it has been recognized that collective phenomena arise from interactions among the elementary units via a spontaneous transition to an organized state, which can be identified at a larger scale 2,3. Recently, this unifying principle has been gaining importance in other natural sciences, for instance for elucidating organization in living systems [4][5][6][7][8], the emergence of coherent activity in neuronal cultures 9, and the development of computational social science 10. In social systems, interactions and cooperation among actors can lead to recognizable collective behavior, for instance, the development of collective knowledge 11, or the appearance of common norms 12 or language 13. The quantitative study of the stochastic processes underlying these social phenomena utilizes the methods of statistical physics, supported by the analysis of the plethora of online empirical data. Some illustrative examples are the appearance of good and bad conduct in online games 14 and groupings induced by the exchange of emotional messages on social sites [15][16][17][18]. However, a deeper understanding of the mechanisms of collaborative social endeavors 11,19,20 remains a serious challenge for physics and social dynamics modeling.
The building of collective knowledge via social interactions is a subtle phenomenon that requires both cognitive elements and an organized effort to solve a particular query. In this stochastic process, the social system that enables the transfer of knowledge and the cognitive subsystem are dynamically interlinked and influence each other at a microscopic scale 21. In relational epistemology, the exchange of values is an essential factor that permits the emergence of a collective value via interaction and cooperation among equal individuals 22. In this concept, the collective knowledge is neither an entity above individuals nor their sum; rather, it is a property of the particular relations among the interacting actors. It reflects the actions of each individual as meaningful and adjusted to the actions of others by means of a new operation; its reciprocity and the acceptance of the confirmed values lead to a cooperation "that has a logical structure isomorphic to logical thought" 22. On the practical side, modern information communication

Results
Fine-grained dynamics and cooperation. All our examples are based on the analysis of data from the Q&A system known as Mathematics, which has become a universal clearinghouse for Q&A in the field 25. In the data, the cognitive element of each artifact (question, answer or comment) has been systematically tagged according to the standard mathematics classification scheme. In addition, the fact that a unique identity is known for each actor (user) and each artifact, together with the high temporal resolution of the data, enables a detailed analysis of the underlying stochastic process. Assuming that the observed events are cognition-driven, we determine a set of tags as the expertise of each user in the considered dataset. The dataset and the procedure are described in Methods. In the model (Supplementary information, SI), the actors (agents) have a defined range of expertise. Minimal matching of the expertise of an answering agent with the tags of the answered question is strictly obeyed. The considered agents have activity patterns statistically similar to the patterns of users in the empirical data, while their expertise is varied.
In the process, which is schematically depicted in Fig. 1a, an actor (U) posts a question (Q), which may receive answers or comments (A) by other actors over time. Subsequently, new Q and the already present Q&A are subject to further answers, and so on. Representing each action by a directed link, this process co-evolves a bipartite network, where actors are one partition and Q&A form another partition. An example of a single-question network from the empirical data is shown in Fig. 1b. The cognitive content of each question is marked by up to 5 different tags, which thus specify the required expertise of the answering actors. Matching by at least one tag is required. The actor's expertise is transferred to its answer. The excess expertise of the involved actors leads to the innovation [26][27][28] and an accumulation of expertise around a particular question. At the same time, it extends the sample space of matching events, thus accelerating the process in a self-organized manner.
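The matching rule and the accumulation of expertise around a question can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Question`, `can_answer` and `answer` are hypothetical, and it assumes that later answers may match any tag already accumulated around the question, which is one reading of how the sample space of matching events grows.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    """A question together with the tags accumulated around it (hypothetical)."""
    tags: set                                   # up to 5 tags set by the asker
    accumulated: set = field(default_factory=set)

    def __post_init__(self):
        if len(self.tags) > 5:
            raise ValueError("a question carries at most 5 tags")
        # Before any answer arrives, only the question's own tags are present.
        self.accumulated = set(self.tags)


def can_answer(expertise: set, question: Question) -> bool:
    """Minimal matching: at least one tag shared with the accumulated tags.

    Matching against `accumulated` rather than only `tags` is an assumption.
    """
    return len(expertise & question.accumulated) >= 1


def answer(expertise: set, question: Question) -> set:
    """Attach an answer and return the tags that are new to this question."""
    if not can_answer(expertise, question):
        raise ValueError("no matching tag")
    new_tags = expertise - question.accumulated   # excess expertise (innovation)
    question.accumulated |= expertise             # expertise moves to the answer
    return new_tags
```

In this sketch an actor whose expertise did not match the original question can match it after another actor has contributed excess expertise, which mirrors the self-accelerating effect described above.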
The quantitative measures displayed in Fig. 1(c-f) signify a highly cooperative process with the cognitive elements encoded by tags in the empirical dataset. Specifically, the entropy in Fig. 1f shows a distinctly non-random pattern in the appearance of each tag. In accordance with the entropy, the use of different contents shows temporal correlations. The distribution of time intervals between consecutive events with a particular tag ranges over five decades, Fig. 1d, suggesting a variety of roles that different cognitive elements play in the process. The dynamics of tags closely reflects the heterogeneity of the users' activity profiles and their expertise. Figure 1d also shows the broad distribution of the interactivity time of a particular user; the presence of a daily cycle is characteristic of online social dynamics 15,17. The long delays between actions of some users, contrasted with the frequent activity of others, yield the power-law distribution of the number of activities N_i per user (Fig. 3a in SI). Further, the role of each user in the process can be distinguished. For instance, in Fig. 1c, the probability of posting questions g_i decays with the number of the user's actions N_i. Essential for the cognitive process, however, is the broad range of the users' expertise. As discussed in Methods, it is measured by the entropy distribution shown in Fig. 1e. While the expertise of the majority includes between one and four tags, a few individuals have an activity record for a large number of topics. Consequently, the appearance of a particular combination of cognitive elements shows a complex pattern. All distinct combinations of tags found in the dataset obey Zipf's law, see Fig. 2. It is a marked feature of scale invariance in the collective dynamics 28,29. The ranking distribution of individual tags is also broad, Fig. 2 in SI.
Furthermore, by directly inspecting the related time series, Figs 4 and 5, we find that an actively self-organized social process underlies the observed dynamics of cognitive elements.
[Figure caption, partial] (1), and two-tag combination, R100 (2), all for the distribution of expertise ExpS, and the answers containing tag no. 12, in the case of Exp1. Lines are shifted vertically for better display. On each line, the scaling region is indicated by a straight line, whose slope gives the displayed value of the exponent H within error bars ±0.009.
Information divergence. To examine the influence of a particular cognitive element (tag) in the process, we define a set of conditional probability measures and compute the discrete Kullback-Leibler information divergence from the sequence of question-answer events in which that tag is present, Fig. 3. The empirical data are divided into a series of one-day time windows. In what follows, we use the time window index K, which runs in our examples as K = 1, 2, ..., 1498. As the activity on a particular question or answer typically extends over many time windows, for K ≥ 2 the space of events Q_K for questions in the Kth window also includes Q&A which were active in the (K−1)th window, while only new answers in the Kth window make up the sample space A_K for answers. By focusing on the timeline of the tag κ, which annotates a particular cognitive content, four conditional probabilities, which are defined in Methods, are determined in every time window K. The information divergence [30][31][32], defined in Eq. (1) as I(P(κ|A_K, Q_K), P(κ|Q_K)), determines the information gain about the κ-tag that is present in questions Q_K if the answers A_K are known. Using the chain relation P(κ|A_K, Q_K)P(A_K|Q_K) = P(A_K|κ, Q_K)P(κ|Q_K), it can be expressed as in Eq. (2), where O_κ(K) is a measure of the likelihood that the presence of the κ-tag in questions triggers answers within the time window K. We compute O_κ(K) for the four most frequent tags, Fig. 3a, in the sequence of time windows K.
A significant difference among tags is apparent; for instance, "real analysis" triggers more activity than "linear algebra", but still less than "calculus" and "homework" tags.
In view of Eq. (2), the information divergence is expressed (apart from a multiplicative factor smaller than one) as the negative of a relative entropy, which measures the information loss when the probability of answers to questions containing a given cognitive content κ is approximated by the probability of answers to all questions. This probability is expected to increase with the accumulation of expertise around each question over time. Consequently, the information divergence tends to zero for a sufficiently large time. I_κ(K), computed for the 30 leading tags in the empirical data, Fig. 3b, levels off to zero for the majority of tags at large K. However, in the case of the four tags for which the likelihood of new activity increases, Fig. 3a, the information divergence still decreases within the entire time interval of the empirical data, see the four marked curves in Fig. 3b. Note that these topics of broad interest often combine with new tags, i.e. via the expertise of new arrivals. In this way, triggered answers that match these new tags expand the sample space A_K, which keeps the information divergence finite. This feature is compatible with the innovation growth reported in Fig. 2. By accounting for the contribution of each particular tag to the knowledge creation, the results for the information divergence complement the statistical measures in Fig. 1 and support the occurrence of Zipf's law.
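As a hedged sketch of the algebra behind this subsection, assume that Eq. (1) has the standard discrete Kullback-Leibler form; the range of the sums over events in window K is an assumption here. Substituting the chain relation then gives an expression that involves only the probabilities of answers. The exact form of Eq. (2) and the definition of O_κ(K) used in the paper are not reproduced by this sketch.

```latex
\begin{align}
I_\kappa(K) &= \sum P(\kappa|A_K,Q_K)\,
               \log\frac{P(\kappa|A_K,Q_K)}{P(\kappa|Q_K)}, \\
\frac{P(\kappa|A_K,Q_K)}{P(\kappa|Q_K)}
            &= \frac{P(A_K|\kappa,Q_K)}{P(A_K|Q_K)}
               \quad\text{(chain relation)}, \\
I_\kappa(K) &= -\sum \frac{P(\kappa|Q_K)}{P(A_K|Q_K)}\,
               P(A_K|\kappa,Q_K)\,
               \log\frac{P(A_K|Q_K)}{P(A_K|\kappa,Q_K)} .
\end{align}
```

In the last line the logarithm compares the probability of answers to questions containing κ with the probability of answers to all questions, and the ratio P(κ|Q_K)/P(A_K|Q_K) appears as a prefactor.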
Signatures of self-organization in the social process. The constraints of cognitive recognition at the level of tags affect the social process between actors as well as the structure of the co-evolving network. Time-series analysis is used to uncover prominent features of the coherent fluctuations in this process. We determine the fractal characteristics (see Methods) of the activity time series. In particular, we consider the time series of the number of all answers to the existing questions per time step, as well as the time series of those events that contain a particular cognitive element. The results, Fig. 4, reveal that the clustering of events (avalanches) occurs as a distinguishing feature of self-organized processes. In addition, a high persistence is observed in the temporal fluctuations, both in the empirical data and in simulations for a varied range of the agents' expertise, Fig. 5. Measured by the Hurst exponent (H > 0.5), a similar persistence was found in the processes of thematic discussions 15, while somewhat lower Hurst exponents characterize the fluctuations in prototypal online social interactions 17 and market dynamics 33.
Several sequences of clustered events, determined (see Methods) from the corresponding time series, are reported in Fig. 4a. Considering a particular sequence, the avalanche size differences (returns) d_λ = s_{λ+1} − s_λ, λ = 1, 2, ..., λ_max, are found to exhibit non-Gaussian fluctuations. Fig. 4b shows the universal plot of the distributions for the appropriately scaled returns. It turns out that the q-Gaussian expression f(x) = a[1 − (1 − q)(x/b)^2]^{1/(1−q)}, which was observed in a variety of complex dynamical systems [34][35][36][37], approximates these distributions well (see Fig. 4 in SI). Interestingly, the values 36 of the nonextensivity parameter q obtained in these cognition-driven processes are higher than the corresponding parameter in emotion-driven social dynamics 15.
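The quantities used in this paragraph can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the avalanche size is taken as the summed excess over the baseline (the size definition is not given here), and only clusters enclosed between two returns to the baseline are counted.

```python
import math


def avalanche_sizes(series, baseline=0.0):
    """Sizes of clusters enclosed between consecutive drops to the baseline."""
    sizes, current = [], 0.0
    inside = False          # currently above the baseline within a cluster
    seen_baseline = False   # a cluster must be preceded by a baseline sample
    for h in series:
        if h <= baseline:
            if inside:
                sizes.append(current)
            current, inside, seen_baseline = 0.0, False, True
        elif seen_baseline:
            current += h - baseline   # assumed size: area above the baseline
            inside = True
    return sizes                      # a trailing open cluster is discarded


def avalanche_returns(sizes):
    """Returns d_lambda = s_(lambda+1) - s_lambda."""
    return [b - a for a, b in zip(sizes, sizes[1:])]


def q_gaussian(x, a, b, q):
    """f(x) = a * [1 - (1 - q) * (x / b)**2] ** (1 / (1 - q))."""
    if math.isclose(q, 1.0):
        return a * math.exp(-(x / b) ** 2)   # Gaussian limit q -> 1
    base = 1.0 - (1.0 - q) * (x / b) ** 2
    if base <= 0.0:
        return 0.0                           # outside the support when q < 1
    return a * base ** (1.0 / (1.0 - q))
```

Fitting a, b and q to a histogram of scaled returns would require a nonlinear least-squares routine, which is left out of this sketch.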
The considered time series and the results of their fractal analysis for the empirical data and simulations are reported in Fig. 5a-d. Note that the rate of new arrivals in the empirical data, p(t), is also used as the creation rate of new agents in the simulation (see Methods). It exhibits distinct temporal correlations, which carry over from the users' real life. In this case, p(t) also shows an increasing trend that eventually yields an increase in the entire activity over time, both in the empirical and in the simulated data, Fig. 5a,c. Hence, the detrended fractal analysis is performed, as described in Methods. As shown in Fig. 5b, the fluctuations in the number of answers containing all tags in the empirical data are characterized by the scaling exponent H = 0.85 ± 0.07. Similarly, persistent fluctuations with exponents in the range H ∈ [0.62, 0.68] are found in the series of selected events that contain a particular tag. The results of an analogous analysis of the simulated data are shown in Fig. 5d. The time series of the number of answers with all tags and the series containing a particular tag have scaling exponents that are slightly higher, implying a stronger persistence, compared with the corresponding series of the empirical data. Here, we also consider the temporal activity of three identified combinations of tags (three bottom curves), which exhibit a similar scaling behavior. These results show that enhanced self-organization among actors emerges in the interactions with tag recognition, which is mandatory in the model and, to a large extent, applies to the empirical data.
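To make the scaling analysis concrete, the following Python sketch estimates a Hurst-type exponent with plain first-order detrended fluctuation analysis. It is a simplification and an assumption: it uses non-overlapping segments with a linear local trend, not the overlapping-intervals detrending that the paper refers to, and the function names are hypothetical.

```python
import math


def _fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def fluctuation(series, n):
    """Mean standard deviation of the profile around linear local trends."""
    mean_h = sum(series) / len(series)
    profile, acc = [], 0.0
    for h in series:                  # profile: cumulative sum of deviations
        acc += h - mean_h
        profile.append(acc)
    n_seg = len(profile) // n         # non-overlapping segments of length n
    xs = list(range(n))
    total = 0.0
    for mu in range(n_seg):
        seg = profile[mu * n:(mu + 1) * n]
        slope, icpt = _fit_line(xs, seg)
        var = sum((y - (slope * x + icpt)) ** 2 for x, y in zip(xs, seg)) / n
        total += math.sqrt(var)
    return total / n_seg


def hurst_exponent(series, scales):
    """Slope of log F(n) versus log n; requires F(n) > 0 at every scale."""
    log_n = [math.log(n) for n in scales]
    log_f = [math.log(fluctuation(series, n)) for n in scales]
    slope, _ = _fit_line(log_n, log_f)
    return slope
```

In practice the scaling region has to be selected by inspecting the log-log plot, as is done for the straight-line fits in Fig. 5.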
Knowledge-sharing communities. The coevolving bipartite networks, Fig. 6, emerge in various scenarios in the simulations and empirical data. Note that these networks are different from the single-question graph in Fig. 1b. In this case, each actor is a separate node, while the compressed information on a particular question, including all answers related to that question, represents a single node of the question partition. The structure of communities detected in these networks clearly stresses the importance of the actors' expertise. In particular, in the case Exp1, the communities containing a specified single expertise grow as independent clusters, Fig. 6b. The situations when the agents have more than one expertise permit the formation of larger communities of agents and questions. For a broad range of the agents' expertise, compact communities grow that resemble the ones in the empirical data (see also Fig. 5 in SI). It is interesting to note that a dominant node representing a very active knowledgeable actor appears in each community. On the contrary, the pattern of communities is entirely different when the cognitive recognition does not drive the linking, Fig. 6d.
Scientific Reports | 5:12197 | DOI: 10.1038/srep12197

Conclusions and outlook
Knowledge building via social interactions is studied as a collective phenomenon in an extended space, a network of actors and their artifacts, where cognitive recognition interactions are active. We have considered an abundant empirical dataset with cognitive elements as mathematical tags and a two-scale dynamics model close to the data, which enabled a quantitative analysis of the process from the microscopic to the global scale. Our approach makes it possible to reveal the importance of each cognitive element, as well as of the expertise of each actor and its activity pattern, in the creation of the collective knowledge. Specifically, when the interacting actors possess a diversity of expertise, the process based on meaningful (cognitive recognition) interactions leads to innovation and to the advance of knowledge in the emerging communities. When a broad spectrum of expertise is present in the population of actors, as in the empirical system, the process is quite efficient in creating the enlarged space where innovation can occur. In this case, the formation of coherent communities that share the knowledge is associated with the presence of several actors possessing a broad range of expertise. Notably less developed communities and a slower advance of knowledge characterize the population with a narrow distribution of expertise; entirely isolated communities and the vanishing of innovation are found in the limiting case of a single expertise per individual. In contrast to the meaningful interactions, the case with ad hoc social linking leads to an entirely different outcome, even though the individual actors possess a broad distribution of expertise. The advance of innovation measured at the system level appears fragmented across a variety of emerging communities, each of which shares a limited amount of randomly accumulated knowledge.
The dynamics of social and cognitive elements, interwoven at the elementary scale, induces a type of self-organized process in which several quantitative characteristics appear to be different from those of prototypal social dynamics. Besides the theoretical implications of our results for the study of cognition-driven processes on networks, the presented approach can be directly applied in the analysis of other empirical systems that entail social collaborative efforts 19,20. Examples include, but are not limited to, social computing, crowdsourcing scientific knowledge production or scientific discovery games, and other emerging areas of increasing importance in modern science and society [38][39][40]. The presented theoretical concept may also prove useful in modeling physical systems at the nanoscale 41, for instance, the assembly of smart nanostructured materials with biological recognition.

Methods
Data structure. As a platform for scientific collaboration 23, Mathematics is part of the StackExchange network ("expert answers to your questions"). For this work, the dataset was downloaded on May 5, 2014 from https://archive.org/details/stackexchange. It contains all user-contributed content on Mathematics from the establishment of the site in July 2010 until the end of April 2014. Specifically, the considered dataset contains 77895 users, 269819 questions, 400511 answers and 1265445 comments. Detailed information is given about the user id, the user's activity (posting, answering, commenting), the time stamp, the list of tags for questions, and the id of the corresponding question or answer to which a given answer or comment refers. The set of tags of an answer or comment is inherited from the related question.
Network mapping and topology analysis. Actions of users in the Q&A dynamics are mapped onto a directed bipartite graph, where users, as one partition, interact indirectly via artifacts (questions, answers or comments), as another partition. At a user node i, an incoming link is inserted to indicate that the user reads the corresponding artifact, while an outgoing link stands for the user's posting of a new artifact. The path of directed links from a question to a user to an answer accurately describes the relationship of the answer to the original question, as it is included in the empirical data and strictly observed in the model. We also introduce a compressed bipartite network, where each question node includes a question with all answers and comments related to that question; typically, they contain a larger number of tags, thus expanding the original question's attributes. The graph layouts are produced with Gephi; the community structure is detected by the maximum modularity algorithm 42.
The user's activity and estimation of expertise. Assuming that a particular expertise of a user i is necessary to answer a given question (which is marked by a set of tags), we consider the number of the user's actions related to a particular tag, κ. Each tag that appears in the data is considered, in total 1040 tags. Hence, we compute the fraction p_i^κ of the user's actions N_i that is spent on the κ-tag. For those tags where p_i^κ exceeds the average probability for that user, we set unity, indicating that the user i is an expert in these categories; thus, the expertise list of user i is formed, containing in total n_i^Exp tags that received the unity mark. The rest of the tags receive zeros for that user. The entropy measure for each user quantifies the heterogeneity of the user's expertise, both in answering and in posting questions, Fig. 1e and Fig. 3b in SI, respectively. In the model, the agent's expertise is specified from the list of 32 tags. Different populations of experts correspond to the situations where each agent gets a fixed number n_i^Exp of tags. In particular, one-tag expertise (Exp1), two-tag expertise (Exp2), four-tag expertise (Exp4), etc., correspond to the agent's expertise list with one, two, four, etc. randomly selected tags. The case marked as ExpS is close to the empirical data, i.e., each agent gets a list of 2^S ≤ 32 tags, where the random number S is taken from the empirical distribution in Fig. 1e.
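A minimal Python sketch of this estimation procedure follows. Two points are assumptions rather than statements from the text: the average probability is taken over the whole tag catalogue (`n_tags_total`, 1040 in the dataset), and the user's entropy is taken as the Shannon entropy in bits of the normalized tag fractions, which would be consistent with the 2^S rule but is not confirmed here. The function names are hypothetical.

```python
import math
from collections import Counter


def expertise_profile(action_tags, n_tags_total):
    """Tag fractions and the expertise list of a single user.

    action_tags  : one list of tags per action of the user
    n_tags_total : size of the tag catalogue used for the average (assumed)
    """
    n_actions = len(action_tags)
    counts = Counter(tag for tags in action_tags for tag in set(tags))
    frac = {tag: c / n_actions for tag, c in counts.items()}
    mean_frac = sum(frac.values()) / n_tags_total
    # Tags whose fraction exceeds the user's average receive the unity mark.
    experts = {tag for tag, p in frac.items() if p > mean_frac}
    return frac, experts


def entropy_bits(frac):
    """Shannon entropy (bits) of the normalized fractions; assumed definition."""
    total = sum(frac.values())
    probs = [p / total for p in frac.values() if p > 0]
    return -sum(p * math.log2(p) for p in probs)
```

With this convention a user spreading activity evenly over four tags has an entropy of 2 bits, i.e. 2^S = 4 tags.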
Tag-related entropy. Following 28, we define T_j as the time interval between the first occurrence of a tag j and the last activity in the dataset. Counting the total number of times m that the tag j occurred, we divide T_j into m equal subintervals. Then, for each i = 1, ..., m, we count the number of events f_ji(m) related to the tag j in the i-th subinterval and compute the entropy S_j(m) of the sequence of tag j as
$$S_j(m) = -\sum_{i=1}^{m} \frac{f_{ji}(m)}{m}\,\log\frac{f_{ji}(m)}{m}.$$
For each tag in the dataset, the tag's entropy normalized by the corresponding factor log(m) is represented by a point in Fig. 1f.
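The procedure can be sketched in Python as below. The entropy is taken as the Shannon entropy of the fractions f_ji(m)/m, which is the form suggested by the log(m) normalization; the function name and the conventions for degenerate cases (fewer than two events, zero-length interval) are assumptions.

```python
import math


def tag_entropy(event_times, t_end):
    """Normalized entropy S_j(m) / log(m) of one tag's occurrence times.

    event_times : times at which the tag occurred (m events)
    t_end       : time of the last activity in the dataset
    """
    m = len(event_times)
    t0 = min(event_times) if m else 0.0
    width = (t_end - t0) / m if m else 0.0
    if m < 2 or width <= 0:
        return 0.0                      # convention assumed for this sketch
    counts = [0] * m
    for t in event_times:
        # Subinterval index; an event exactly at t_end goes to the last bin.
        i = min(int((t - t0) / width), m - 1)
        counts[i] += 1
    s = -sum((f / m) * math.log(f / m) for f in counts if f > 0)
    return s / math.log(m)
```

Evenly spread occurrences give a value of 1, whereas occurrences concentrated in a single subinterval give 0.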

Conditional probabilities of tag-related events. The four conditional probabilities appearing in Eqs (1) and (2) are defined and computed as follows: P(κ|Q_K), the probability that the κ-tag is present given the presence of questions Q_K, is computed as the frequency of the κ-tag in all questions; P(A_K|κ, Q_K), the probability that answers A_K exist given the questions Q_K with the κ-tag, is given by the fraction of users whose expertise includes the κ-tag among all active users in the Kth window; P(A_K|Q_K), the probability that answers A_K exist given the questions Q_K (independently of the presence of the κ-tag), is obtained as the ratio of the number of matching tags of all active users in the Kth window to the number of all tags in the present questions; P(κ|A_K, Q_K), the probability to find the tag κ given the questions and answers in the Kth window, is determined from the above probabilities via the chain relation.
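As an illustration of the last step, the Python sketch below obtains P(κ|A_K, Q_K) from the other three probabilities via the chain relation and evaluates a single Kullback-Leibler term. The function and parameter names are hypothetical, and the way such terms are summed within a window is not specified here, so the second function is only a building block under that assumption.

```python
import math


def posterior_tag_probability(p_tag_given_q, p_ans_given_tag_q, p_ans_given_q):
    """P(k|A,Q) from the chain relation P(k|A,Q) P(A|Q) = P(A|k,Q) P(k|Q)."""
    if p_ans_given_q <= 0:
        raise ValueError("P(A|Q) must be positive")
    return p_ans_given_tag_q * p_tag_given_q / p_ans_given_q


def divergence_term(p_tag_given_q, p_ans_given_tag_q, p_ans_given_q):
    """One term P(k|A,Q) * log[P(k|A,Q) / P(k|Q)] of a discrete KL sum."""
    post = posterior_tag_probability(p_tag_given_q, p_ans_given_tag_q,
                                     p_ans_given_q)
    if post == 0 or p_tag_given_q == 0:
        return 0.0                      # vanishing terms contribute nothing
    return post * math.log(post / p_tag_given_q)
```

Because the three input probabilities are estimated from different frequencies, the resulting value is not guaranteed to stay below one, so a check on the estimates may be advisable in practice.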

Definition of temporally clustered events. A cluster (or avalanche) represents a set of events enclosed between two consecutive drops of the time series to the baseline (noise level) 43-45.
Detrended time series analysis. To remove the local trend (an increasing activity and a weak 4-month cycle) appearing in the time series in Fig. 5, we apply the method of overlapping intervals 17,46. Then, for each time series h(k), k = 1, 2, ..., T_h, the profile
$$Y(i) = \sum_{k=1}^{i}\left(h(k) - \langle h\rangle\right)$$
is divided into N_s segments of length n, and the standard deviation around the local trend y_μ(i) is computed at each segment μ = 1, 2, ..., N_s, i.e.,
$$F_2(\mu, n) = \left[\frac{1}{n}\sum_{i=1}^{n}\left(Y((\mu-1)n+i) - y_\mu(i)\right)^2\right]^{1/2}.$$
Model rules of interacting agents with expertise. Assuming that the new arrivals in the system boost the activity 47, the agents are introduced at a rate of p(t) agents per time step, where p(t) is the empirical time series of new users, shown in Fig. 5a. Each new agent receives a unique id and a fixed profile. The agents' profiles statistically match the profiles of users in the data. Specifically, the agent's activity level is set by the number of actions N_i ∈ P(N_i), where P(N_i) is the distribution of the user's activity averaged over all users in the data (see SI: Fig. 3a). Subsequently, the agent's probability g_i to post a question, or otherwise to answer other questions with probability 1 − g_i, is selected according to the interdependence of g_i and N_i shown in Fig. 1c. Furthermore, the agent's expertise is fixed by first setting the number of tags n_i^Exp according to the considered situation, i.e. Exp1, Exp2, Exp4, or ExpS, and then making the list of the agent's expertise of n_i^Exp tags by random selection from the common list of 32 tags. The interactivity time of a new agent is set to ΔT = 0, which implies its immediate action. After each completed action, a new delay ΔT ∈ P(ΔT) is taken, where P(ΔT) is the empirical distribution for users, Fig. 1d.
Note that both p(t) and P(ΔT) have the same temporal resolution, one bin representing 10 minutes in the original data. All agents are systematically updated, and the agents with an expiring delay time are placed in the list of active agents. Each active agent, with its probability g_i, posts a new question. Otherwise, it attempts to answer a question from the updated list of interesting questions. The list is created by considering all questions of next-neighbor agents on which an activity occurred within the previous T_0 = 10 steps. With a given probability, the item can instead be searched elsewhere. In both cases, the agent's action is subject to expertise matching. In the case of the μ-process, with the probability μ = 0.5 an agent connects to a random question and posts an answer; the matching of tags with the agent's expertise is then not required, but it can occur by chance (see the illustration in Fig. 1 in SI, and the Algorithm in SI).
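The action rule of a single active agent can be sketched in Python as follows. This is a simplified illustration, not the algorithm given in the SI: the names `Agent` and `choose_action` are hypothetical, the list of interesting questions is assumed to be already restricted to recent activity, and the optional search elsewhere as well as the drawing of delays are omitted.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    expertise: frozenset   # tags drawn from the common list of 32 tags
    g: float               # probability to post a question rather than answer


def choose_action(agent, interesting, rng, mu=0.0):
    """Return ("ask", None), ("answer", index) or ("idle", None).

    interesting : list of tag sets, one per question in the agent's list
    rng         : a random.Random instance
    mu          : probability of the mu-process (linking without matching)
    """
    if rng.random() < agent.g:
        return ("ask", None)
    if not interesting:
        return ("idle", None)
    if rng.random() < mu:
        # mu-process: a random question is answered; matching is not required
        return ("answer", rng.randrange(len(interesting)))
    matching = [k for k, tags in enumerate(interesting)
                if agent.expertise & tags]     # at least one shared tag
    if not matching:
        return ("idle", None)
    return ("answer", rng.choice(matching))
```

Setting `mu=0.5` corresponds to the value quoted for the μ-process, while `mu=0.0` enforces strict tag matching.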