Cooperation patterns of members in networks during co-creation

Cooperation (i.e., co-creation) has become the principal way of carrying out creative activities in modern society. In co-creation, different participants can play two completely different roles based on two different behaviours: some participants are the originators who generate initial contents, while others are the revisors who provide revisions or coordination. In this study, we investigated different participants’ roles (i.e., the originator vs. the revisor) in co-creation and how these roles affected the final cooperation-group outcome. By using cooperation networks to represent cooperative relationships among participants, we found that peripheral members (i.e., those in the periphery of the cooperation networks) and core members (i.e., those in the centre of the cooperation networks) played the roles of originators and revisors, respectively, mainly affecting the quantity versus the quality of their creative outcomes. These results were robust across the three different datasets and the three different indicators defining core and peripheral members. Previous studies have considered cooperation behaviours to be homogeneous, ignoring that different participants may play different roles in co-creation. This study discusses patterns of cooperation among participants based on a model in which different roles in co-creation are considered. Thus, this research advances the understanding of how co-creation occurs in networks.


S1. Statistical information about the regression models of the quality of content
Table S1 shows the statistics of all variables in the three regression models for the quality of content (Tables 1-3 in the manuscript). The distributions of each variable and the correlations between them are shown in Figs. S1-S3.

S2. Statistical information about the regression model of revision behaviours in SCP-Wiki
The statistical information for each variable in the regression model of revision behaviours in SCP-Wiki (Table 4 in the manuscript) is shown in Table S2. The distributions of each variable and the correlations between them are shown in Fig. S4.

S3. Topology features of the three cooperation networks
The degree distributions and rich-club coefficients of the three cooperation networks are shown in Fig. S5. The panels in the first column of Fig. S5 show the degree distributions of the three networks. We found that the node degrees in the three networks followed a power-law distribution; that is, the probability $P(k)$ that a node has degree $k$ is related to $k$ by a power law, $P(k) \sim k^{-\alpha}$. [1] This distribution indicates that most nodes in the three networks had very small degrees, whereas a small number of nodes had very large degrees. We then estimated the value of $\alpha$ by OLS regression to examine whether the three networks were scale-free. [1] We found that $\alpha$ was larger than one but smaller than two in all three networks ($\alpha = 1.26$ in the SCP-Wiki data; $\alpha = 1.64$ in the GitHub data; $\alpha = 1.21$ in the Idea Storm data). These results indicate that the three networks are not scale-free. This could be attributed to the fact that all members in the three communities were given equal opportunities to cooperate: both newcomers and experienced members had a chance to cooperate with other members; thus, the frequency of nodes with a given degree does not decrease sharply as the degree increases.
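The OLS estimation described above can be illustrated with a minimal Python sketch. It assumes the exponent (called `alpha` here) is obtained by regressing log P(k) on log k over the observed degrees; the exact binning and fitting choices behind Fig. S5 are not stated in this document, and the function names are hypothetical.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Return {k: P(k)} for the positive degrees in `degrees`."""
    counts = Counter(d for d in degrees if d > 0)
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}


def fit_power_law_ols(degrees):
    """Estimate alpha in P(k) ~ k^(-alpha) by OLS on log P(k) vs. log k.

    Assumption: a plain least-squares line through the log-log points,
    with no binning and no lower cut-off on k.
    """
    dist = degree_distribution(degrees)
    xs = [math.log(k) for k in dist]
    ys = [math.log(p) for p in dist.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope  # the power-law exponent is the negated slope
```

Under this sketch, an estimate between one and two corresponds to the range reported for the three networks.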
The panels in the second column of Fig. S5 show the rich-club coefficients of the three networks, which reflect the extent to which well-connected nodes (i.e., nodes with high degrees) connect to each other. The rich-club coefficient is computed as follows: [2]
$$\phi(k) = \frac{2 E_{\geq k}}{N_{\geq k}\,(N_{\geq k} - 1)}$$
where $E_{\geq k}$ is the number of edges between the nodes with degrees greater than or equal to $k$, and $N_{\geq k}$ is the number of nodes with degrees greater than or equal to $k$. This indicator measures the number of connections among nodes with degrees of at least $k$, normalised by the maximum number of connections that could exist between these nodes. Fig. S5 shows that in the SCP-Wiki and GitHub data, when $k$ exceeded a certain value (the logarithm of $k$ was larger than 6 in the SCP-Wiki data and larger than 8 in the GitHub data), the corresponding rich-club coefficients became zero; in the Idea Storm data, however, the rich-club coefficients kept increasing regardless of the value of $k$. These results imply that there exist several subgroups in the cooperation networks of the SCP-Wiki and GitHub communities. Each subgroup has a centre node with a very large degree, surrounded by member nodes with small degrees. As a result, the nodes with large degrees (i.e., the centre nodes) are connected only to nodes with smaller degrees (i.e., the member nodes). In contrast, these subgroups did not exist in the cooperation network of the Idea Storm community. As a result, all nodes with large degrees are connected to each other.
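As a hedged illustration of the rich-club coefficient defined above, the following Python sketch counts the edges among nodes with degree at least k and divides by the maximum possible number of such edges. It assumes a simple undirected graph given as a list of unique edges without self-loops; the function name and the convention of returning 0.0 when fewer than two nodes qualify are assumptions of this sketch.

```python
def rich_club_coefficient(edges, k):
    """Rich-club coefficient over nodes with degree >= k.

    Computes 2 * E / (N * (N - 1)), where N is the number of nodes with
    degree >= k and E is the number of edges among them.
    Assumption: returns 0.0 when fewer than two nodes qualify.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    rich = {node for node, d in degree.items() if d >= k}
    n = len(rich)
    if n < 2:
        return 0.0
    e = sum(1 for u, v in edges if u in rich and v in rich)
    return 2 * e / (n * (n - 1))
```

For a hub-and-spoke subgroup, such as a star with one centre and four members, the coefficient is zero at any k above the members' degree, which matches the pattern described for the SCP-Wiki and GitHub networks.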

S4. Results based on the betweenness centrality
As explained in the manuscript, in addition to degree, k-core, and eigenvector centrality, we also employed the betweenness centrality to measure the core-periphery positions of nodes in the network. In this section, we first explain the definition and computation of betweenness centrality. We then report the results based on the betweenness centrality in detail.
Simply speaking, the betweenness centrality measures the number of shortest paths in the network that pass through the focal node. [3,4] The shortest path between two nodes is defined as the path that connects them while passing through the fewest nodes. The betweenness centrality of node $v$ is computed as follows:
$$C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$
where $\sigma_{st}$ is the total number of shortest paths from node $s$ to node $t$, and $\sigma_{st}(v)$ is the number of those paths that pass through node $v$. Based on this computation, a node with a large betweenness centrality bridges a large number of nodes through the shortest paths among them. Therefore, the focal node can be considered an information hub, [3] namely, a core member, in the network. Note that we computed the betweenness centrality using an approximation algorithm with a cut-off value of 3. In other words, when computing node $v$'s betweenness centrality, we only considered the shortest paths between nodes whose distance to $v$ was equal to or less than 3. We used this approximation because the computation of betweenness centrality is far more time-consuming than that of the other three core-periphery metrics in the manuscript. [3,4] After obtaining each node's betweenness centrality, we normalised its value over time to ensure that the value indicated the same core-periphery position across different time points. In the normalisation, we divided the values of betweenness centrality by a factor determined by $N$, the number of nodes whose distance to $v$ was equal to or less than 3 at time point $t$. In this manner, regardless of the time point, a large value of betweenness centrality indicates that the focal node is closer to the core of the network.
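To make the cut-off idea concrete, here is a minimal Python sketch of betweenness centrality restricted to short paths. It is an assumption-laden illustration, not the procedure used for the reported results: it uses a Brandes-style accumulation on an undirected, unweighted graph and ignores every shortest path longer than `cutoff` edges, which is one common way to apply such a cut-off. The function name and the adjacency-dictionary input are hypothetical, and no normalisation is applied.

```python
from collections import deque


def betweenness_cutoff(adj, cutoff=3):
    """Betweenness over shortest paths of length <= cutoff.

    `adj` maps each node to an iterable of its neighbours (undirected,
    unweighted). Paths longer than `cutoff` edges are ignored.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}
        sigma = {s: 1}      # number of shortest paths from s
        preds = {s: []}     # predecessors on shortest paths from s
        order = []          # nodes in non-decreasing distance from s
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            if dist[v] == cutoff:
                continue    # do not extend paths beyond the cut-off
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    sigma[w] = 0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes.
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair (s, t) was counted once from each endpoint.
    return {v: b / 2.0 for v, b in score.items()}
```

On a five-node path a-b-c-d-e, a cut-off of 3 drops only the single shortest path of length 4 (from a to e), so the middle node's score falls from 4 to 3.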
Based on the normalised values of the betweenness centrality, we replicated the analyses that the manuscript reports for the other three metrics. Many previous studies have pointed out that although betweenness centrality is a very important metric for detecting subgroups in a network (i.e., community detection), [4] it is not as effective as eigenvector centrality [5-7] and k-core. [5] The results show that even after controlling for the time factor, the participants with smaller values of betweenness centrality (i.e., peripheral members) still had a significantly higher probability of submitting initial contents at the next time point than those with larger values of betweenness centrality (i.e., core members).
The results shown in Fig. S7 and Tables S3-5 are also consistent with the results in Fig. 3 and Tables 1-3 in the manuscript. The statistical information of the variables in these regressions and the correlations among them are shown in Table S1 and Figs. S8-10. Finally, the results in Table S6 are consistent with the results in Table 4 in the manuscript; the statistical information of the variables in this regression is shown in Table S2.
In summary, the results based on betweenness centrality supported the conclusion in the manuscript: in co-creation, the peripheral members generated most of the initial content submissions. Then, based on these initial contents, core members provided their revisions and integrations, which improved the quality of the final co-created outcomes.
One asterisk refers to a p-value smaller than 0.1, two asterisks refer to a p-value smaller than 0.05, and three asterisks refer to a p-value smaller than 0.01.
The grey area shows the 95% confidence intervals of the blue lines, generated by two-tailed t-tests (note that the confidence intervals for the GitHub data are too narrow to be seen).
The insets show the predicted probabilities of initial content submissions by participants who had different values of betweenness centrality but had spent the same number of days in the communities (equal to the average number of days that all participants spent in the communities). The predicted probabilities were generated by a logistic model with the likelihood of initial content submission as the dependent variable, the value of the betweenness centrality as the independent variable, and the number of days that a participant spent in the communities as the control variable. The grey area shows the 95% confidence intervals of the average value, generated by two-tailed t-tests (note that the confidence intervals in the GitHub data are too narrow to be seen). Note that, in the Idea Storm data, the proportion of valuable ideas fluctuated. This is because most participants submitted only a small number of ideas (e.g., one or two). In many cases, only one idea was evaluated as valuable, which inevitably generated a large fluctuation (e.g., from 0% to 50%) in the proportion of valuable ideas.
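The logistic model behind the insets can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the estimation code behind the figures: both predictors are standardized, the coefficients are fitted by plain gradient ascent on the mean log-likelihood, and the prediction holds the number of days at its average (a standardized value of zero). All function names, the learning rate, and the iteration count are hypothetical.

```python
import math


def standardize(column):
    """Return z-scores together with the mean and population std."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column], mean, std


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(centrality, days, submitted, lr=1.0, n_iter=2000):
    """Fit logit P(submit) = b0 + b1 * z(centrality) + b2 * z(days)."""
    zc, mean_c, std_c = standardize(centrality)
    zd, _, _ = standardize(days)
    b = [0.0, 0.0, 0.0]
    n = len(submitted)
    for _ in range(n_iter):
        grad = [0.0, 0.0, 0.0]
        for c, d, y in zip(zc, zd, submitted):
            err = y - sigmoid(b[0] + b[1] * c + b[2] * d)
            grad[0] += err
            grad[1] += err * c
            grad[2] += err * d
        # Gradient ascent step on the mean log-likelihood.
        b = [bj + lr * gj / n for bj, gj in zip(b, grad)]
    return b, (mean_c, std_c)


def predict_at_mean_days(b, scale, centrality_value):
    """Predicted probability with days held at its average (z = 0)."""
    mean_c, std_c = scale
    z = (centrality_value - mean_c) / std_c
    return sigmoid(b[0] + b[1] * z)
```

A negative coefficient `b[1]` in this sketch corresponds to peripheral members (low centrality) being more likely to submit initial contents than core members with the same number of days in the community.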
Note: * p < 0.1; ** p < 0.05; *** p < 0.01; standard errors are shown in parentheses; because all coefficients in the table were estimated on standardized variables, the sizes of the coefficients are comparable.
Table S5. Supplementary results of the regression of content quality for the Idea Storm data; the coefficient of the core-periphery metric (i.e., the independent variable) is shown in the fourth row.

Dependent variable: Idea is valuable or not in Idea Storm
Core-periphery metric: Betweenness centrality
Observations: 837
Note: * p < 0.1; ** p < 0.05; *** p < 0.01; standard errors are shown in parentheses; because all coefficients in the table were estimated on standardized variables, the sizes of the coefficients are comparable.