Introduction

The users of Dark Web forums have a strong need to communicate effectively with other members while protecting their privacy. Theoretically, there is a tradeoff between online self-disclosure and privacy protection. Following the seminal research on social penetration theory (Altman and Taylor, 1973) and communication privacy management theory (Petronio, 2002), we expect that as information exchanges increase, there is a higher possibility that users’ real-life identities could be revealed. Social interactions also play a competing role in privacy calculus (Jiang et al. 2013; Krasnova et al. 2010; Laufer and Wolfe, 1977). For example, Lewis et al. (2008) show that people’s taste for privacy is shaped by both social influence and personal motives. When the cumulative risk is larger than the future benefit, the user has to leave the community. The question is how users sense the danger. We argue that there is a theoretical puzzle in explaining anonymous users’ withdrawal behaviors: which of the two factors, information intensity or network connectivity, exposes anonymous users? By tracking how users leave the forums on the Dark Web, we can take a close look at the mechanisms of social penetration and privacy management in a fully anonymous setting.

Prior research on the Dark Web has left notable gaps that this study aims to fill. First, the bulk of empirical investigations into the Dark Web has predominantly centered on its economic dimensions, particularly focusing on operations within Dark Web markets, as demonstrated by a range of studies (Aldridge and Decary-Hetu, 2016; Barratt and Aldridge, 2016; ElBahrawy et al. 2020; Hardy and Norgaard, 2016; Martin, 2014b; Paquet-Clouston et al. 2018). Nevertheless, there is a scarcity of studies that have approached the Dark Web from a communication standpoint. This fully anonymous environment offers a communication context markedly distinct from traditional online communities on the Internet, characterized by unique patterns of online interaction. Second, while qualitative and critical studies have explored and discussed the potential and social implications of the Dark Web technology (Bancroft and Reid, 2017; Maddox et al. 2015; Martin, 2014a; Munksgaard and Demant, 2016; Pace, 2016; Sotirakopoulos, 2018; Tzanetakis et al. 2016), very few studies have ventured into investigating online behavior with behavioral logs. Third, prevailing theories have predominantly centered on the social penetration of close relationships, but the intricacies of how individuals manage their privacy in fully anonymous social settings warrant more extensive exploration. The dynamics of how users on the Dark Web interact while remaining hidden in the shadows remain under-examined in the existing literature, offering potential theoretical contributions to the domains of anonymous communication and group socialization. Fourth, while existing theories have indeed identified various factors contributing to social penetration, such as culture, gender, motive, context, and the risk-benefit ratio, their relative importance has not been subjected to comprehensive comparative examination (Altman and Taylor, 1973; Petronio, 2002). 
We aim to fill these gaps and provide insights into the Dark Web’s communication dynamics, user interactions, and the factors influencing them.

This study aims to bridge these research gaps by delving into the communication dynamics of the Dark Web and exploring online behavior through behavioral logs, offering a fresh perspective on this unique digital realm. First, by conceptualizing the cryptomarket forums as anonymous online communities, the study examines the sustainability of user activities within this context. Second, through investigating patterns and mechanisms within this unique online communication environment, we aim to enhance our understanding of user online behavior and the dynamics of group socialization in anonymous online communities. Third, utilizing online behavioral data from cryptomarket forums, the study investigates how language usage and online social networks influence the long-term usage of these forums. The overall findings of this study have substantial implications for understanding group dynamics and anonymous online communities.

In the following sections, we will begin by introducing the Tor network, the Dark Web, and the cryptomarket, which serve as the research contexts for this study. Subsequently, we will provide a comprehensive review of the theoretical perspectives on technical anonymity and anonymous communication in online communities. Furthermore, we will formulate two research questions pertaining to the potential factors influencing users’ engagement in anonymous online communities. Using data collected from the digital traces of online user behavior, we will assess talkativeness, linguistic diversity, and online leadership within the online social network. Survival analysis will be employed to examine the relationships between these potential factors and user engagement in Dark Web marketplaces. Lastly, we will discuss the implications of the research findings.

Tor, dark web, and cryptomarket forums

The Tor (The Onion Router) network is a special type of network that is only accessible through the Tor browser. As the name “The Onion Router” suggests, the structure of the Tor network is analogous to an onion: it consists of a group of relays operated by volunteer individual users or non-profit organizations. The relays work as a series of virtual tunnels connecting the information provider (sender) and the user (receiver). Traffic on the Tor network is therefore hidden and encrypted in multiple layers and routed between different relays multiple times. With this careful design, user identities are fully anonymized and untraceable on the Tor network. In summary, the Tor anonymity network provides a convenient way to be extremely anonymous in some special application contexts, and to some extent it has become an important alternative ecosystem on the Internet, usually known as the Dark Web. This study focuses on user behavior within Dark Web forums, specifically the forums associated with cryptomarkets.

Many anonymous services and websites hosted on the Dark Web contain unethical content (e.g., pedophilia and child pornography) and online criminal activity (e.g., drug markets, financial fraud, illegal weapons, and espionage) (Biryukov et al. 2014; Biryukov et al. 2013; Guitton, 2013; Soska and Christin, 2015). Among them, the marketplaces, formally known as cryptomarkets (Martin, 2014b), have been highly prosperous since 2011. A cryptomarket is a special type of online market hidden on the “Dark Web”. The technical basis of the Tor anonymity network has made anonymity a fundamental feature of how cryptomarkets are constructed and function as markets and as communities (Bancroft and Reid, 2017). In cryptomarkets, users are highly anonymized and well protected by the onion-routing technology behind the Tor network, which creates a very different space for communicating and trading with each other in many aspects of the communication process. Moreover, most cryptomarkets not only present and sell various products but also provide a forum for users to discuss valuable topics during the whole purchase process, such as the usage of the website, delivery of the products, and vendors’ ratings. The market-affiliated forums have provided great opportunities for researchers to analyze user behavior in anonymous online communities. With the advanced privacy-enhancing technologies employed to protect user identities, cryptomarkets have shown great potential in changing the landscape of global drug distribution networks, leading to worldwide opportunities for drug consumers and challenges for law enforcement (Nugent, 2019).

Starting from 2011, there have been several prosperous cryptomarkets, created and operated widely around the whole world through the Tor anonymity network. As one of the most popular types of services on the Tor network, in most cases, the cryptomarkets will provide affiliated forums for users to discuss the purchase process, the usage of the website, delivery of the products, vendors’ ratings, etc. Topics discussed in the forums typically center around sense-making, technical difficulties, purchase processes, logistic issues, vendor and product reviews, and cyber security issues (Hazel Kwon and Shao, 2020; Porter, 2018). As the technical design of the Tor network has made anonymity a fundamental feature of the cryptomarket forums, users are highly anonymized and well protected by the onion routing technology behind the Tor network, which creates a very different space for communicating and trading with each other in many aspects of the communication process (Bancroft and Reid, 2017). Therefore, the cryptomarket-affiliated forums have provided great opportunities for researchers to analyze users’ online behavior in a fully anonymous setting. In this study, Dark Web forums are examined as anonymous online communities.

Technical anonymity versus social anonymity in online interactions

Theoretical works in anonymous communication and organizational communication argue that there are several types and dimensions of anonymity, e.g., technical anonymity versus social anonymity, physical anonymity versus discursive anonymity, and self-anonymity versus other-anonymity (Rains and Scott, 2007; Scott, 1998; Scott et al. 2011). In the context of the Dark Web, it is especially important to distinguish between the technical anonymity and the social anonymity provided by privacy-enhancing technologies. As previously introduced, technical anonymity is well ensured by the design of the onion routing network. However, this only partly protects user identity. The Tor network cannot solve all anonymity problems, because it only protects the data transmission procedure. Risks can still arise from inappropriate online behavior by Tor users. Although the technical protection leaves little chance for third-party tracking and surveillance, every user must remain vigilant and careful about self-disclosure when using the Tor network to ensure a high level of anonymity.

Technically, it is always possible to infer user identity from their social interactions and group socialization process with other people (Al Jawaheri et al. 2020; Biryukov et al. 2013). On regular websites on the World Wide Web, such as social networking sites, the group socialization process occurs naturally and continuously. Assisted by rapidly developing Web technologies, people can now share various types of content (e.g., text, photos, voice, and video) with others in a short time from almost any place. The enormous amount of social interaction data exposed on social media already provides enough information for user identification and personality prediction, even without directly tracking users’ IP addresses (Kosinski et al. 2013; Stachl et al. 2020). This could have a huge impact on Dark Web users. Any casual leak of personal information may lead to serious consequences, especially for controversial or illegal activities. For example, the founder of Silk Road, Ross Ulbricht, got caught and arrested by law enforcement only because he had once accidentally exposed his personal email address on the Dark Web. Therefore, Dark Web users constantly face a social dilemma between self-disclosure (social interactions) and privacy protection (social anonymity) throughout the group socialization process in anonymous online communities.

Much like conventional counterparts such as Reddit, which have attracted the attention of researchers across various disciplines in recent years (Medvedev et al. 2019; Parmentier and Cohen, 2019; Seering et al. 2019), the cryptomarket forums primarily serve as platforms for discussion and social interaction, enabling individuals with shared interests or opinions to engage over the Internet, all without face-to-face contact. Notably, Dark Web forums exhibit significant differences from mainstream online platforms like Reddit. Firstly, Dark Web users typically have limited real-life or offline social connections, leading to natural disconnection and separation among its users, unlike popular social networking sites such as Facebook and Twitter, which heavily rely on offline social relationships. Moreover, Dark Web access requires sophisticated technology and carries a high entry threshold, challenging ordinary users. Consequently, Dark Web users tend to be more motivated and active compared to users on mainstream social media platforms. The elevated entry requirements attract individuals with specific needs for a high degree of anonymity, often driven by economic incentives, which may be illegal or unethical. Lastly, although the Dark Web is perceived as an environment conducive to anonymous online interactions, achieving technical anonymity does not ensure complete social and perceived anonymity, as cautioned by the theory of anonymous communication. Users must exercise care in their online self-disclosure to prevent inadvertent disclosure or inference of private information during online social interactions.

In summary, the Dark Web forums differ from common online forums in terms of social connections, user motivation, and the need for anonymity. Understanding these distinctions is crucial for comprehending the dynamics of Dark Web communities and the unique challenges associated with online interactions in this context. As in all communities, the first step for every user of an anonymous online community is to join the group. New members then need to learn the group norms through social interactions and communication with each other. Theoretically, the whole group socialization process can be categorized into five phases (Levine and Moreland, 1994; Moreland and Levine, 1982), namely investigation (entry), socialization, maintenance (acceptance), resocialization (divergence), and remembrance (exit). A successful socialization process leads to acceptance of the newcomers into the group, while a failed socialization process results in divergence from the group and, in the worst case, the member’s exit from the group. In this sophisticated process, the most important outcome is exit behavior, which completely ends all social interactions between the departing member and the rest of the group.

User engagement in anonymous online communities

From the perspective of media use, one of the most important research questions is how and why users adopt and continue to use online communities as a media platform to communicate with others. As anonymity is an important affordance of computer-mediated communication, which could lead to substantial implications and consequences (Bancroft and Reid, 2017; Scott and Rains, 2020; Scott et al. 2011; Woo, 2006), this study pays special attention to the use and sustained use of online communities in a fully anonymous setting, i.e., anonymous online communities. Therefore, exit behavior (or, conversely, user sustainability) on the Tor network is a focal variable in this study.

There has been a large body of research on continuance behavior in information systems. Bhattacherjee (2001) claims that users’ continuance intention is determined by their satisfaction with the IS (information system) use and the perceived usefulness of continued IS use. Limayem and Cheung (2008) expanded Bhattacherjee’s IS continuance model by adding a moderating effect (IS habit) to IS continuance intention and IS continued usage. Bhaskar et al. (2019) decomposed the perceived usefulness into the perceived quality and perceived usability. In the context of open-source software (OSS) development, Wu et al. (2007) find that developers’ feelings of satisfaction and their intentions to continue with OSS development were influenced by both helping behavior and economic incentives. Jin et al. (2010) find that users’ continuance intention to participate in an online community is determined by both satisfaction and affective commitment.

On the Dark Web, Christin (2013) analyzed the survivability of both users and products on the Silk Road 1 market and found that about half of the sellers leave the site within 100 days of their initial appearance, and around a fifth of the sellers stay for less than three weeks. Most of the sellers disappear within roughly three months of their arrival. Only a core of 112 sellers was online throughout the whole observation interval. Similar findings were replicated on another 16 cryptomarkets from 2013 to 2015, including Silk Road, Agora, and Evolution (Soska and Christin, 2015), and also on OpenBazaar (Arps and Christin, 2020). All the empirical observations indicate that vendors usually hold low stocks and operate primarily in the retail space, with small product quantities, low sales volumes, and a short lifetime in the cryptomarkets. These previous works have already applied survival analysis to cryptomarket ecosystems but are largely limited to the analysis of products and vendors. Also, most of the work stops at describing the estimated Kaplan-Meier curve (the K-M curve), which only depicts the decay of user and vendor lifetimes without exploring potential explanations.

As most of the data collected on the cryptomarket forums are user behavior logs in the form of text on the webpage, it is necessary to develop various methods and strategies to characterize users’ behavior according to the content they produced (Bagozzi and Dholakia, 2002; Chen and Liu, 2021). Previous studies have claimed that both language use and online social networks on the Dark Web demonstrate different patterns from the normal platforms on the Internet. For example, Choshen et al. (2019) find that the text related to legal and illegal activities on the Dark Web is different from each other in terms of not only word use but also shallow syntactic structure (represented through POS tags). Further, Zamani et al. (2019) examined the differences in the structure and dynamics of networks in both dark and public web forums. They find that the degree distribution in public forums is much more homogeneous than in Dark Web forums and even “semi-dark forums” (e.g., 8chan). Therefore, both language use and online social network characteristics should be considered when trying to understand user behavior in anonymous online communities.

Linguistic diversity and talkativeness

Language use can reflect many characteristics of users, from psychological processes, e.g., affective processes and perceptual processes, to social processes, e.g., social concern and social support (Tausczik and Pennebaker, 2010). Among the psychological meanings of language use, Batikas and Kretschmer (2018) found that negative emotion is highly correlated with exit behavior in the Agora market. In other words, entrepreneurs with a high share of negative feedback are much more likely to exit the market. This finding is consistent with another work by Bhaskar et al. (2019), who also find that sellers with poor ratings are significantly more likely to exit. The finding is further confirmed across several different cryptomarkets, including Silk Road 2.0, Agora, Evolution, and Nucleus. All the previous findings suggest that online illegal markets function in a similar way to legal markets: economic performance is significantly harmed by negative ratings. Since negative ratings and comments under the vendor profiles in anonymous online communities play essential roles in the group socialization process, we pay special attention to user sentiment when exploring the textual features that explain user survivability. Therefore, we propose the following research question:

RQ1: What is the relationship between the user’s language use and exit behavior in anonymous online communities?

Online leadership in social network

Social networks play important roles in shaping markets, either online or offline (White, 1981). Previous studies on cryptomarkets show that buyers tend to exit the market rather than retaliate against sellers after negative experiences in social interactions with others (Norbutas et al. 2020). In many different types of online communities, social rewards such as enjoyment and reputation are important predictors of continuous engagement (Nov et al. 2010; Zhang et al. 2020). Also, social support (e.g., emotional support and informational support) gained from the online social network in an online community could further motivate users to stay in the online group, especially in health-related communities (Wang et al. 2017; Wang et al. 2012). From the perspectives of online influence and leadership, the more central a position a user occupies in the network, the more likely he or she is to become a leader and gain influence in the community, which could further enhance commitment to the group (Huffaker, 2010; Johnson et al. 2015). Based on these considerations, we propose the second research question:

RQ2: What is the relationship between the user’s position in online social networks and exit behavior in anonymous online communities?

Methods

Data Collection

We analyze data from the forums affiliated with three popular cryptomarkets: Silk Road 1, Silk Road 2, and Agora. Silk Road 1 was the first online black market on the Tor network (Ormsby, 2014). It was established in February 2011 and shut down by the Federal Bureau of Investigation (FBI) in October 2013 for large-scale illegal trade in drugs. After that, Silk Road 2 was established as a substitute for the closed market. At the same time, Agora was launched, and it lasted for another two years until August 2015. Agora even survived Operation Onymous in November 2014, when Silk Road 2 was seized in a joint law enforcement operation between the FBI and Europol, the European Union’s law enforcement agency. After Evolution closed in an exit scam in March 2015, Agora became the largest cryptomarket on the Dark Web.

All the data are collected from the publicly available Darknet Market Archives (Branwen et al. 2015). For the Silk Road 1 forum, the data range from the beginning of the forum (June 2011) to November 2013, including 39514 usernames and 81408 threads with 872961 posts. In the Silk Road 2 forum, there are 39862 usernames and 29002 threads with 390086 posts from October 2012 to April 2014. Data collected on the Agora forum have a shorter time range, from December 2013 to April 2014, containing 13590 users and 10666 threads with 84918 posts. Users are quite active and contribute substantially to the forums. For example, more than 3500 users (9%) on the Silk Road 1 forum wrote at least 50 posts during their lifetime. In the Silk Road 2 and Agora forums, the percentages are 12% and 7%, respectively.

User engagement

Since all three cryptomarkets were taken down by exogenous forces (law enforcement), there is a censoring problem in the collected data, i.e., it is hard to identify whether a user left intentionally or stopped posting because the website closed. Therefore, we use a time window to define the survival status of a user: 180 days for the Silk Road 1 forum and 30 days for the Silk Road 2 and Agora forums, because the periods covered by their data are much shorter. If the number of days from a user’s last post to the end date of data collection, denoted as T_{last inactive}, is larger than the window (e.g., 180 days for the Silk Road 1 forum), the user is identified as having exited. Otherwise, the user is labeled as right-censored. For the Silk Road 1 forum, this can be written as the following function:

$$\mathrm{Status} = \begin{cases} \mathrm{Exit}, & T_{\mathrm{last\,inactive}} > 180\ \mathrm{days} \\ \mathrm{Censored}, & T_{\mathrm{last\,inactive}} \le 180\ \mathrm{days} \end{cases}$$
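The labeling rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `label_status`, the usernames, and the dates are hypothetical, and the original analysis code may differ.

```python
from datetime import date


def label_status(last_post: date, collection_end: date, window_days: int = 180) -> str:
    """Label a user as 'exit' or 'censored' from the gap after their last post.

    window_days is 180 for Silk Road 1 and 30 for Silk Road 2 and Agora,
    following the text; everything else here is an illustrative assumption.
    """
    inactive_days = (collection_end - last_post).days  # T_{last inactive}
    return "exit" if inactive_days > window_days else "censored"


# Hypothetical users and an assumed end date of data collection.
collection_end = date(2013, 11, 30)
last_posts = {
    "user_a": date(2013, 1, 15),   # silent for 319 days
    "user_b": date(2013, 10, 20),  # silent for 41 days
}
statuses = {user: label_status(day, collection_end) for user, day in last_posts.items()}
print(statuses)
```

With the 180-day window, `user_a` is labeled as an exit and `user_b` as right-censored; with a 30-day window, `user_b` would also count as an exit.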

Let T be a random variable representing the user lifetime, i.e., the length of time before a user exits from the forum. The survival function S(t) = P(T > t) is the probability that a subject survives longer than time t, or equivalently, the probability that the exit event has not yet occurred at time t. S(t) is theoretically represented by a continuous and smooth curve. Typically, it is approximated with the K-M curve using the Kaplan-Meier estimator, which calculates the proportion of users surviving for a specific duration before the event takes place. Mathematically, it is defined as:

$$\widehat{S}(t) = \prod_{t_i \le t} \left( 1 - \frac{d_i}{n_i} \right)$$

where t_i is a time when at least one event happened, d_i is the number of events (e.g., exits) at time t_i, and n_i is the number of individuals known to have survived (i.e., who have not yet had an event or been censored) up to time t_i.
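As a minimal sketch of the estimator, the product over event times can be computed directly from per-user durations and event indicators. The function name `kaplan_meier` and the toy data are assumptions for illustration; users censored at a given time are treated as still at risk for events at that same time, which is the usual convention.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t_i, S_hat(t_i))] for each distinct time with at least one event.

    durations: observed lifetime of each user (e.g., in days)
    events:    1 if the user exited at that time, 0 if right-censored
    """
    exits = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)      # users leaving the risk set at each time
    at_risk = len(durations)          # n_i before the first time point
    survival, curve = 1.0, []
    for t in sorted(leaving):
        d = exits.get(t, 0)           # d_i
        if d:
            survival *= 1 - d / at_risk   # factor (1 - d_i / n_i)
            curve.append((t, survival))
        at_risk -= leaving[t]         # drop exited and censored users observed at t
    return curve


# Toy data: six hypothetical users, two of them right-censored.
durations = [5, 10, 10, 20, 30, 30]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(durations, events)
for t, s in curve:
    print(t, round(s, 4))
```

For the toy data, the estimate steps down to 5/6 at day 5, 2/3 at day 10, 4/9 at day 20, and 2/9 at day 30.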

Linguistic diversity and talkativeness

For the message-level characteristics, we measure two variables. The first is Linguistic Diversity, which is measured as the proportion of unique words in a message, calculated by dividing the number of distinct words by the total number of words. This is also referred to as the type/token ratio (Huffaker, 2010) or richness (Spitters et al. 2015). The second is Talkativeness, which is the average length of the messages contributed by each participant, i.e., the total number of words across a participant’s messages divided by the total number of messages.
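Both measures reduce to simple token counts. The sketch below assumes a naive lowercase regular-expression tokenizer, which is not specified in the text, and the function names are hypothetical. It computes the type/token ratio per message and leaves open how per-message values are aggregated per user.

```python
import re


def tokenize(text):
    """Naive tokenizer (an assumption): lowercase alphanumeric runs and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def linguistic_diversity(message):
    """Type/token ratio: distinct words divided by total words in one message."""
    tokens = tokenize(message)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def talkativeness(messages):
    """Average message length in words across one participant's messages."""
    if not messages:
        return 0.0
    return sum(len(tokenize(m)) for m in messages) / len(messages)


# Hypothetical posts by one user.
posts = ["the vendor shipped the package fast", "great vendor"]
print(linguistic_diversity(posts[0]))  # 5 distinct words out of 6
print(talkativeness(posts))            # (6 + 2) / 2 words per message
```

Note that the type/token ratio tends to fall as messages get longer, so the two measures are not independent of each other.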

Sentiment (positive/negative emotion)

The valence of emotion in a piece of text is measured by the percentage of positive and negative words according to LIWC (Tausczik and Pennebaker, 2010). The emotional content of each user’s posts is measured by calculating the average of both positive and negative emotions expressed in all of their messages.
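The dictionary-based computation can be sketched as follows. LIWC is a proprietary lexicon, so the two tiny word sets here are placeholders and not LIWC categories; the tokenizer and function names are likewise assumptions. LIWC also matches word stems, which this sketch does not attempt.

```python
import re

# Placeholder lexicons for illustration only; these are NOT the LIWC dictionaries.
POSITIVE = {"good", "great", "love", "thanks"}
NEGATIVE = {"bad", "scam", "hate", "lost"}


def emotion_shares(message):
    """Percentage of positive and of negative words in one message."""
    tokens = re.findall(r"[a-z']+", message.lower())
    if not tokens:
        return 0.0, 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return 100 * pos / len(tokens), 100 * neg / len(tokens)


def user_emotion(messages):
    """Average the per-message positive and negative percentages for one user."""
    if not messages:
        return 0.0, 0.0
    shares = [emotion_shares(m) for m in messages]
    n = len(shares)
    return sum(p for p, _ in shares) / n, sum(q for _, q in shares) / n


# Hypothetical posts by one user.
posts = ["great vendor thanks", "package lost total scam"]
positive, negative = user_emotion(posts)
print(round(positive, 2), round(negative, 2))
```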

Online leadership in social network

In addition to the measures of language use, we also derive measures from the online social network formed by user interactions. Within each thread, a directed edge is created from the user who replies to the user being replied to. For example, if user A (source) replies to user B (target), a directed edge from A to B is added to the network. In this network, various node-level structural measures can be calculated. Specifically:

Expansiveness is measured by out-degree centrality. This measure represents the number of outgoing links for each author (i.e., the number of users that this user has replied to), which is one of the fundamental factors that shape the formation of lasting identification as a member of a virtual community (Kozinets, 1999).

Reply Trigger is measured by in-degree centrality in the reply network, i.e., the total number of incoming links (i.e., the number of users that have replied to this user). This measure indicates the ability of a user to inspire and motivate other users to reply to a message.

Brokering in the reply network identifies which authors lie on the shortest paths, or geodesics, between other users in the network. It indicates the extent to which a user bridges “structural holes” (Burt, 2009) in the reply network.

Reciprocity is measured as the frequency of user interactions in a mutual dyad, i.e., both actors reply to one another, regardless of the order of the replies (Wasserman and Faust, 1994). When doing this, the networks were converted into symmetric relations, and the sum of reciprocal links was calculated for both authors.
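The four node-level measures above can be sketched on a toy reply network using only the standard library. Several choices here are assumptions rather than the authors’ stated procedure: brokering is computed as unnormalized betweenness centrality on the directed graph (via Brandes’ algorithm), reciprocity is simplified to the number of partners with replies in both directions, self-replies are ignored, and all function names are hypothetical.

```python
from collections import defaultdict, deque


def build_reply_graph(replies):
    """replies: iterable of (source, target) pairs, one per reply."""
    out_nbrs, in_nbrs, nodes = defaultdict(set), defaultdict(set), set()
    for src, dst in replies:
        nodes.update((src, dst))
        if src != dst:                      # ignore self-replies (assumption)
            out_nbrs[src].add(dst)
            in_nbrs[dst].add(src)
    return nodes, out_nbrs, in_nbrs


def betweenness(nodes, out_nbrs):
    """Unnormalized betweenness on a directed, unweighted graph (Brandes)."""
    score = dict.fromkeys(nodes, 0.0)
    for s in nodes:
        stack, preds = [], {v: [] for v in nodes}
        sigma = dict.fromkeys(nodes, 0)     # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:                        # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in out_nbrs.get(v, ()):
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v precedes w on a shortest path
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(nodes, 0.0)
        while stack:                        # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


def reply_network_measures(replies):
    nodes, out_nbrs, in_nbrs = build_reply_graph(replies)
    return {
        "expansiveness": {v: len(out_nbrs[v]) for v in nodes},   # out-degree
        "reply_trigger": {v: len(in_nbrs[v]) for v in nodes},    # in-degree
        "brokering": betweenness(nodes, out_nbrs),
        "reciprocity": {v: len(out_nbrs[v] & in_nbrs[v]) for v in nodes},
    }


# Toy network: a and b reply to each other, b replies to c, c replies to d.
measures = reply_network_measures(
    [("a", "b"), ("b", "a"), ("b", "c"), ("c", "d"), ("a", "b")]
)
for name, values in measures.items():
    print(name, dict(sorted(values.items())))
```

In the toy network, b and c each sit on two shortest paths between other users, so they receive the highest brokering scores, while only a and b have a reciprocal partner.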

User activity

Two variables concerning user activity were measured as control variables. The first one is the number of posts, measured by the total number of posts that a user contributes to the forum. In addition to that, we also measure the number of threads that are started by the user. Both measures indicate the commitment to the anonymous online community, which is expected to correlate to the final survival status (active or not) on the platform.

Analytical approach

Survival analysis is a widely used family of statistical methods for analyzing the duration of time until one or more events happen, such as death in biological organisms and failure in mechanical systems. Usually, the event in survival analysis refers to death, disease occurrence, disease recurrence, recovery, etc. In this study, exit from the cryptomarket forums is considered the event of interest for survival analysis.

The basic Cox proportional hazard model takes the log hazard ratio of an individual as a linear function of a group of time-independent covariates, combined with a population-level baseline hazard that changes over time. Let X = [X_1, …, X_p] denote the covariates of an individual. The hazard at time or duration t can then be modeled by:

$$h(t \mid X) = h_0(t)\exp(\beta X)$$

where h_0(t) is the baseline hazard, i.e., the hazard of exit at time t for a user with X = 0, which captures how the hazard changes over time. e^{β_i} represents the relative hazard, a proportionate increase or reduction in hazard, associated with X_i. With a one-unit increase in the covariate X_i (with all other covariates held constant), the hazard of exiting the forum at time t is multiplied by e^{β_i}.
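To see why the exponentiated coefficient acts as a multiplicative factor, compare the hazards of two users who differ by one unit in a single covariate X_i and are identical otherwise. Writing βX as the sum β_1 X_1 + … + β_p X_p, the baseline hazard and all other terms cancel in the ratio:

$$\frac{h(t \mid X_1, \ldots, X_i + 1, \ldots, X_p)}{h(t \mid X_1, \ldots, X_i, \ldots, X_p)} = \frac{h_0(t)\exp\left(\beta_1 X_1 + \cdots + \beta_i (X_i + 1) + \cdots + \beta_p X_p\right)}{h_0(t)\exp\left(\beta_1 X_1 + \cdots + \beta_i X_i + \cdots + \beta_p X_p\right)} = e^{\beta_i}$$

The ratio does not depend on t, which is the proportional hazards assumption: a positive coefficient raises the hazard of exit by a constant factor at every duration, and a negative coefficient lowers it.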

Results

The growth curves for both user and post activity in the three cryptomarket forums are depicted in Fig. 1. The Silk Road 1 forum experienced the most rapid growth right from its beginning, signifying its widespread popularity within the ‘Dark Web’ community. The Silk Road 2 forum exhibited a growth pattern similar to Silk Road 1, with its rapid expansion following the closure of Silk Road 1 in October 2013. This abrupt shift in growth pattern underlines a strong substitution relationship between the Silk Road 1 and Silk Road 2 platforms. The Agora forum, affiliated with another prominent cryptomarket that emerged after the shutdown of Silk Road 1, also displayed a growth pattern similar to that of the Silk Road 1 forum, despite its relatively short observation period of approximately four months since its inception. All three communities have undergone growth stages. Towards the end of the observed period, the growth of these communities plateaued, leading to a flattened curve. This indicates that the critical mass of users was reached within the observed timeframe. These findings align well with the expected S-curve described in the Diffusion of Innovations theory (Rogers, 2003).

Fig. 1: S-curve of community growth on three cryptomarket forums.

The three cryptomarket forums include the Silk Road 1 forum (left), the Silk Road 2 forum (middle), and the Agora forum (right). The blue line represents the number of users and the orange line represents the number of posts.

Figure 2 illustrates the survival curves of user lifetimes in the three forums. Users from all three platforms exhibit similar survival patterns. Roughly half of the users exit the platform within 30 days of joining, indicating a generally short user lifetime. Conversely, the number of users with longer lifetimes in anonymous online communities declines at a much slower rate. As user lifetimes increase, the survival curve becomes flatter. Among the three platforms, Silk Road 1 has the longest lifespan, with the longest-lasting user remaining active on the platform for over 800 days. Following law enforcement actions in 2013, variations in patterns became more apparent in the Silk Road 2 and Agora forums.

Fig. 2: Survival curves on three cryptomarket forums.

The three cryptomarket forums include the Silk Road 1 forum (left), the Silk Road 2 forum (middle), and the Agora forum (right).

Further, we investigate the relationship between overall lifetime linguistic diversity and survival status for each user across the three platforms, using the Cox proportional hazard regressions described in the Methods section. Both textual characteristics and user characteristics in the online social network are used as control variables. Table 1 presents the number of cases, means, and standard deviations for each variable calculated from the three platforms. The three platforms exhibit similar means and standard deviations, enabling examinations and comparisons across the platforms.

Table 1 Descriptive statistics on three cryptomarket forums.

The regression results are shown in Tables 2, 3, and 4 for the three forums, respectively. In the regressions, communication activities, i.e., the number of posts and the number of threads started by the user, are both negative predictors of exit behavior. This result is expected because survival status is defined by posting activity: we treat users who have not posted for a certain period as dead (leaving) users. Therefore, a user who has more posts or starts more threads is less likely to be a short-lived user in the community. In addition, positive emotion (B = −0.304, 95% C.I. = [−0.585, −0.023], p < 0.05) is a negative predictor of exit behavior on the Silk Road 1 forum. However, this relationship is found in neither Silk Road 2 nor the Agora forum.
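The survival status described above can be sketched as a simple labeling rule. This is an illustration under stated assumptions, not the study's procedure: lifetime is taken as the span between a user's first and last post, the helper `label_user` is hypothetical, and the inactivity window is left as a required argument because the text specifies only "a certain period".

```python
from datetime import datetime, timedelta


def label_user(post_times, observation_end, inactivity_window):
    """Return (lifetime_days, exited) for one user.

    post_times:        list of datetime objects, one per post.
    observation_end:   datetime at which data collection stopped.
    inactivity_window: timedelta of silence after which a user counts
                       as having left (value not given in the source).
    """
    first_post, last_post = min(post_times), max(post_times)
    lifetime_days = (last_post - first_post).days
    # A user is labeled as exited only if the silence since the last post
    # exceeds the window; otherwise the record is right-censored.
    exited = (observation_end - last_post) > inactivity_window
    return lifetime_days, exited


# Illustrative values only: the 60-day window and the dates are hypothetical.
window = timedelta(days=60)
end = datetime(2013, 10, 1)
left_user = label_user([datetime(2013, 1, 1), datetime(2013, 1, 20)], end, window)
active_user = label_user([datetime(2013, 3, 1), datetime(2013, 9, 25)], end, window)
print(left_user, active_user)
```

Because both the lifetime and the exit flag are derived from posts alone, any rule of this form ties the outcome to posting volume, which is consistent with the negative coefficients on posts and threads.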

Table 2 Cox proportional hazard model on Silk Road 1 forum.
Table 3 Cox proportional hazard model on Silk Road 2 forum.
Table 4 Cox proportional hazard model on Agora forum.

Linguistic diversity and talkativeness turn out to be critical factors in explaining survival behavior in anonymous online communities. In the Silk Road 1 forum, linguistic diversity (B = 2.523, 95% C.I. = [2.446, 2.600], p < 0.001) and talkativeness (B = 0.154, 95% C.I. = [0.139, 0.168], p < 0.001) are both significant predictors that are positively associated with the proportional hazard, i.e., the risk of exiting the platform, when we control only for activities. The relationship persists after textual and network characteristics are controlled. The findings are well replicated on both the Silk Road 2 and Agora forums. In the Silk Road 2 forum, linguistic diversity (B = 3.311, 95% C.I. = [3.196, 3.425], p < 0.001) and talkativeness (B = 0.921, 95% C.I. = [0.851, 0.991], p < 0.001) both significantly predict the proportional hazard, and the relationship persists after controlling for textual and network characteristics. In the Agora forum, linguistic diversity (B = 2.634, 95% C.I. = [2.114, 3.154], p < 0.001) and talkativeness (B = 1.698, 95% C.I. = [1.130, 2.267], p < 0.001) are both significant predictors of the proportional hazard, controlling for textual and network characteristics.
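To make the magnitude of these coefficients easier to read, the sketch below converts a coefficient and its confidence bounds into a hazard ratio. It assumes that the reported B values are raw log-hazard coefficients from a standard Cox model, so that exp(B) is the multiplicative change in hazard per one-unit increase in the covariate as scaled in the model; the helper name `hazard_ratio` is illustrative.

```python
import math


def hazard_ratio(coef, ci_low, ci_high):
    """Exponentiate a Cox coefficient and its confidence bounds."""
    return math.exp(coef), math.exp(ci_low), math.exp(ci_high)


# Linguistic diversity on the Silk Road 1 forum, as reported in the text.
hr, hr_low, hr_high = hazard_ratio(2.523, 2.446, 2.600)
print(f"hazard ratio {hr:.2f}, 95% C.I. [{hr_low:.2f}, {hr_high:.2f}]")
```

Under that assumption, the Silk Road 1 coefficient for linguistic diversity corresponds to a hazard ratio of roughly 12.5. How large a one-unit change is in practice depends on how the variable was measured and scaled, which should be checked against the Methods section before interpreting the ratio.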

Discussion and conclusion

The analytical results in this paper yield several noteworthy findings. First, the analysis demonstrates that linguistic diversity and talkativeness, rather than sentiment and online leadership, are significant predictors of user exit behavior. In the regression analyses conducted on the three cryptomarket forums, positive emotion is significant only on the Silk Road 1 forum, while neither positive nor negative emotion is a significant predictor of user exit behavior on the other two forums. Furthermore, the centrality-based network measures (in-degree centrality, out-degree centrality, and brokering) are all non-significant in predicting user exit behavior. This result is intriguing because network centrality is commonly considered an important indicator of social position and leadership in social networks. In summary, these findings indicate that the exposure risk in anonymous communities stems primarily from the intensity of information rather than from social connections.

Users who engage in more extensive and varied conversations are more likely to leave Dark Web forums. One possible explanation is that linguistic diversity and talkativeness collectively reflect the volume of information contained in the messages that users post. It is important to consider the nature of cryptomarket forums in the analysis. The majority of discussions on these forums revolve around trading and beginner-oriented topics, such as learning how to successfully conduct purchases (Van Hout and Bingham, 2013). For instance, users need to familiarize themselves with using Bitcoin for transactions and with the escrow services that ensure product delivery (Lacson and Jones, 2016). In other words, every new user entering the cryptomarkets undergoes a knowledge accumulation process because these markets operate differently from conventional online markets (Chen et al. 2022). The slight increase from the 10% to the 20% user life stage on the Silk Road 1 forum (as depicted in the figure) also suggests the presence of this learning process.

What motivates users to join and leave anonymous online communities? In a nutshell, social interactions are not the primary motivator for users in anonymous online communities; instead, they are driven by distinct and well-defined purposes, primarily centered around transactions between buyers and vendors in the cryptomarket forums (Tzanetakis et al. 2016). This differs from common online communities. With the progression of Web 2.0 technology, webpages have become more interactive, and social interactions have gained prominence across various online platforms (Bagozzi and Dholakia, 2002). Users tend to develop stronger attachments to online platforms, seeking information, social support, and engaging in conversations (Wang et al. 2017). Today, it is often assumed that people enter the online world to enhance their offline social relationships and daily lives (Kosinski et al. 2013). Nevertheless, the analysis of anonymous online communities presents a contrasting narrative and reveals alternative possibilities for technological design in human-computer interactions.

To provide a more comprehensive insight into user dynamics on the Dark Web, we propose the Dark Web Privacy Dilemma, which builds on existing concepts such as the Dark Web Dilemma (Jardine, 2015) and the transparency paradox (Tzanetakis et al. 2016). The Dark Web poses a compelling privacy dilemma for its forum users, who strive to engage in effective communication while safeguarding their identities. In this realm, there exists an inherent tension between the desire for online self-disclosure and the imperative to protect one’s privacy. As information exchanges intensify on Dark Web forums, the risk of revealing real-life identities through online conversations increases (Jardine, 2019). This raises concerns about potential exposure and the consequences that may follow. When the cumulative risk outweighs the anticipated future benefits, users are compelled to make the difficult decision of leaving the community altogether.

Exploring the factors that prompt users to withdraw from Dark Web forums opens a window into the intricate interplay of privacy concerns and social interaction. It allows us to delve into the motivations, risk assessments, and decision-making processes that users undertake to safeguard their identities (Martin, 2014b). Understanding the privacy dilemma faced by Dark Web users not only contributes to theoretical advancements in the field but also has practical implications. The insights gleaned from this exploration can inform the design of privacy-enhancing technologies, policies, and educational initiatives that better align with users’ needs for effective communication while preserving their privacy. One example of such an initiative is the creation of privacy discussion rooms (Doiciar and Cretan, 2021).

Overall, the contribution of this study is three-fold. First, previous analyses mainly focus on vendors and products (Christin, 2013; Christin, 2017; Soska and Christin, 2015). In this study, the unit of analysis is the individual user. This is the first work to examine the online social interactions between cryptomarket users with a focus on sustainable user behavior. Second, this study applies the Cox model to examine user continuity on the cryptomarkets from a language and network perspective. Most previous works merely plot the estimated Kaplan-Meier curve to demonstrate the lifetime of either users or products. Third, the findings of the analysis imply highly purpose-driven behavior in anonymous online communities, which is quite different from the driving forces commonly seen in ordinary online communities, where most users intend to interact socially with each other.

The main shortcomings of this study come from data collection. Unlike the approaches commonly used in the existing literature to examine the sustainable use of media technology, such as surveys and experiments, all the measurements in this study are calculated from online behavior (posts), which makes it hard to collect individual-level demographics and perceptions. This limitation stems largely from the special design of the cryptomarkets, whose technical foundations prevent the collection of user identities. It is particularly hard to collect individual-level information using traditional surveys because of potentially strong social desirability bias, as most of the behavior on cryptomarkets is illegal and unethical. Another limitation of this study is the measurement of information disclosure in online conversations. This study suggests that linguistic diversity and talkativeness together imply the volume of information delivered in the messages. The findings could be further confirmed by quantifying information through more refined methods.

Subsequent research endeavors may explore various promising directions. Firstly, acquiring additional data can enhance the robustness of findings and support more comprehensive cross-validation. Future studies might combine behavioral log analysis of cryptomarket users with survey data, further fortifying the empirical foundation. Secondly, there is an opportunity to advance methodologies for assessing information disclosure, such as implementing measures like the information redundancy measure on social media platforms (Liang and Fu, 2016). This approach has the potential to notably enhance research findings. Thirdly, advancements in measurement techniques can contribute to a more precise operationalization of theoretical frameworks. For example, future investigations may employ advanced methodologies to calculate individualized risk-benefit ratios based on users’ behavioral traces, providing more direct insight into factors influencing online forum participation.

In conclusion, this study investigates the digital traces of user actions on the Dark Web’s three most popular cryptomarket forums: Silk Road 1, Silk Road 2, and Agora. The findings reveal that users who actively engage in discussions and use a diverse vocabulary are more inclined to discontinue their forum participation. Moreover, network characteristics do not significantly predict user exit behavior. These results imply that the risk of exposure within anonymous communities arises primarily from the intensity of information rather than from social networking. The research sheds light on the privacy predicament inherent in this hidden online realm and offers valuable insights into the dynamics of user behavior surrounding anonymity-enabling technologies on the Internet.