Phrase-level pairwise topic modeling to uncover helpful peer responses to online suicidal crises

Suicide is a serious public health problem; however, suicides are preventable with timely, evidence-based interventions. Social media platforms serve users who are experiencing real-time suicidal crises and hope to receive peer support. To better understand the helpfulness of peer support occurring online, this study characterizes the content of both users' posts and the corresponding peer comments on a social media platform and presents an empirical example for comparison. It introduces an approach that uses pairwise topic models to transform large corpora of discussion into associated topics of user and peer posts. The key advantages of this approach include: (i) modeling both the generative process of each type of corpus (i.e., user posts and peer comments) and the associations between them, and (ii) using phrases, which are more informative and less ambiguous than words, in addition to words, to represent social media posts and topics. The study evaluated the method using data from Reddit r/SuicideWatch. It examined how the topics of user and peer posts were associated and how this information influenced the peer perceived helpfulness of the peer support. This study then applied structural topic modeling to data collected from individuals with a history of suicidal crisis to validate the findings. The observations suggest that effective modeling of the association between the two lines of topics can uncover helpful peer responses to online suicidal crises, notably the suggestion of professional help. The proposed technology can be applied to "paired" corpora in many applications, such as technical support forums, question-answering sites, and online medical services.


Introduction
As suicide rates continue to rise across the United States (Centers for Disease Control and Prevention, 2018), an increasing number of individuals are also experiencing non-lethal suicidal thoughts and behaviors (i.e., suicidal ideation, suicide attempts). To support individuals through suicidal crises, and ultimately prevent deaths by suicide, national crisis resources have become increasingly important (i.e., the National Suicide Prevention Lifeline). However, due to concerns that call volumes exceed available resources (Service, 2018), resulting in delayed assistance, and fears of unwanted consequences of using such services (e.g., forced hospitalization (Drum et al., 2009)), many individuals in crisis are instead turning to informal crisis resources, such as peer support.
A leading avenue for peer support of stigmatized topics, such as suicidality (Berry et al., 2017; Birnbaum et al., 2017), is social media platforms. Reddit is one such widely used platform, with more than 330 million active users. Like many other platforms, Reddit depends almost solely on crowd-sourced interactions, which occur on topic-based conversation "threads", with minimal involvement from moderators (Reddit, 2018). Within Reddit, there is a growing thread called r/SuicideWatch ((n.d.), 2019), where users who are experiencing real-time suicidal crises post messages with hopes of receiving peer support. Theories of suicidal behavior may aid in understanding the potential impact of online peer support via platforms like r/SuicideWatch. Despite the existence of several suicide theories (i.e., Interpersonal Psychological Theory (Cowgell, 1977); Three-Step Theory (Simon et al., 2001)), the Fluid Vulnerability Theory was one of the first to articulate the temporal dynamics of suicidality (Rudd, 2006), a process that is particularly relevant for understanding the process, and importance, of timely online peer support (American Foundation for Suicide Prevention, 2019). More specifically, this theory postulates two phases, or dimensions, of suicidality. The first is baseline risk: one's chronic or stable level of suicide risk, which may be influenced by factors such as genetics or demographics. The second is acute risk, which represents factors that fluctuate in response to one's environment (i.e., interpersonal stress, job loss) or internal experiences (i.e., hopelessness, depressed mood). The interaction between the two can then lead to a person-specific experience of "suicide mode", a dynamic, time-dependent process (Rudd, 2006) in which the potential to engage in suicidal behavior is high. It is during this acute experience of suicide mode that an individual experiences elevated distress, and thus may seek external support (i.e., online peer support).
Indeed, crisis resources, formal and informal, are centered around this idea of a dynamic suicide mode that can be intervened upon (Woodward and Wyllie, 2016).
For many individuals, online peer support is an acceptable and often effective avenue for coping with mental health symptomatology (Ali et al., 2015; Horgan and Sweeney, 2010; Naslund et al., 2019); this may be particularly salient for those struggling with sensitive mental health conditions (Gowen et al., 2012; Haker et al., 2005; Kummervold et al., 2002), like suicidality. More specifically, individuals with severe mental illness may be more likely to seek online peer support (Ciarrochi et al., 2003; Gowen et al., 2012; Hope et al., 2005). Further underscoring their importance, these interactions may serve as a catalyst to seek formal care for mental health difficulties (Lawlor and Kirakowski, 2014). A specific platform for online peer support among those experiencing acute suicidal crises is r/SuicideWatch. Despite the fact that r/SuicideWatch may serve as a primary suicide crisis intervention for some, there has been limited research aimed at understanding both the content and the quality of support received by individuals via r/SuicideWatch. Research that has been conducted in this area has worked to improve automatic suicide risk determinations of posting users as a way to triage resources (Althoff et al., 2016; Coppersmith et al., 2018; De Choudhury et al., 2016; DeMasi et al., 2019; Milne et al., 2016, 2019; Morris, 2015; Shing et al., 2018) or has provided descriptive information on the elements of a user's posts that may elicit peer responses (De Choudhury and De, 2014; Huang and Bashir, 2016), rather than evaluating the quality (i.e., helpfulness) of the peer support actually received. This avenue of research is important given evidence that appropriate support can mitigate crisis states (Turner et al., 1983), and may even ultimately prevent the transition from suicidal crisis to suicide behavior (Woodward and Wyllie, 2016). As such, the current study had two primary aims.
First, we aimed to characterize the content of both a user's post and the corresponding peer comments as a way to better understand the peer support interactions occurring on r/SuicideWatch. Furthermore, we examined how the topics of user and peer posts were associated and how this information influenced how helpful peers perceived the given peer support, quantified by the number of "upvotes," or likes, a specific peer post received (i.e., "Study 1"). This metric of peer perceived helpfulness is particularly important on Reddit, where the visibility of a peer comment is based on upvotes: peer comments with the greatest number of upvotes appear at the top of the thread. Second, we aimed to validate the findings related to peer perceived helpfulness by presenting empirical data from individuals with a personal history of suicidal crisis (i.e., "Study 2"). Such findings may highlight ways that peer support on r/SuicideWatch can be leveraged, or improved, to best support individuals through suicidal crises.
As prior research on the use of r/SuicideWatch for the resolution of suicidal crises may be limited by available methodologies, our key methodological contributions are as follows: (i) We introduce PairwiseLDA, a new topic model that allows for analyzing the topics of users' posts, the topics of peers' comments, and their paired relation. The pairwise topic model discovers one set of topics for posts from users experiencing suicidal crises and another set of topics for the associated peer comments. (ii) Documents are represented as bags of words and phrases; PairwiseLDA discovers representative words and phrases, giving a more comprehensive description of each topic. (iii) Once topics are captured, we compare the volume (defined by the number of comments) and the peer perceived helpfulness (defined by the number of "upvotes") of peer comment topics by user post topics. The findings also offer advances in both the practical and theoretical understanding of online peer support for suicidal crises. Indeed, the present findings contribute to the theoretical understanding of the proximal risk factors for suicidal crises and have the potential to identify aspects of peer support that may serve to reduce suicidal crisis intensity, with implications for educational efforts aimed at improving informal crisis resources. Furthermore, beyond the topic of suicide, our technology can be applied to "paired" corpora in many applications, such as tech support forums, question-answering sites, and online medical services, wherever such paired data are available.
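As an illustration of contribution (ii), phrase tokens can be mined from a corpus before topic modeling by scoring adjacent word pairs. The sketch below is a generic PMI-based collocation miner in Python, not the exact extraction procedure used by PairwiseLDA; the minimum count and threshold are illustrative assumptions:

```python
from collections import Counter
from math import log

def mine_phrases(docs, min_count=2, threshold=0.0):
    """Score adjacent word pairs by pointwise mutual information (PMI)
    and merge high-scoring pairs into single phrase tokens, yielding a
    bag of words AND phrases for each document."""
    unigrams, bigrams = Counter(), Counter()
    total = 0
    for doc in docs:
        unigrams.update(doc)
        total += len(doc)
        bigrams.update(zip(doc, doc[1:]))
    phrases = set()
    for (a, b), n_ab in bigrams.items():
        if n_ab < min_count:
            continue
        # PMI compares the pair's frequency against chance co-occurrence
        pmi = log(n_ab * total / (unigrams[a] * unigrams[b]))
        if pmi > threshold:
            phrases.add((a, b))
    merged = []
    for doc in docs:
        out, i = [], 0
        while i < len(doc):
            if i + 1 < len(doc) and (doc[i], doc[i + 1]) in phrases:
                out.append(doc[i] + "_" + doc[i + 1])  # e.g. "want_to"
                i += 2
            else:
                out.append(doc[i])
                i += 1
        merged.append(out)
    return merged
```

The merged tokens can then be fed to any bag-of-words topic model, so that phrases such as "panic_attack" compete with single words for topic probability mass.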

Study 1: Results
Data collection. Data were scraped using the Reddit API from the r/SuicideWatch channel across a period of 18 months, from December 1, 2013 to May 31, 2015. All procedures were approved by the authors' institutional review board for research ethics (Protocol #19-06-5390). We extracted original users' posts (i.e., from individuals experiencing a suicidal crisis), subsequent peer comments (i.e., received peer support responses), and the number of upvotes for peer comments (i.e., peer perceived helpfulness of responses). Of note, upvotes are given by any user of the r/SuicideWatch forum to comments that are perceived to be particularly helpful or useful in response to the original user's post, and they have a direct impact on the visibility of that peer's comment. There were a total of 21,430 original posts and 129,008 peer comments; each original post received an average of 6.02 peer comments. The number of original users' posts and the number of peer comments by month are detailed in Fig. 1. The number of original posts increased across the time span as the forum gained users.
We calculated the response time (RT) for peer comments, defined as the length of time until the first peer comment in response to the original post. Figure 2 presents the number and percentage of posts vs. RT. The curve remained at a very low level before 5 min and then rose steadily thereafter: only 0.4% of posts received a first response within 1 min, and fewer than 8% of posts received a response within 5 min. Figure 3 presents the distribution of the number of upvotes per response on a log-log scale. The high R² value shows the distribution follows a power law (Adamic, 2000) (y = ax^(−k)) with exponent k = 2.101. Most responses had no or very few upvotes: only 18.0% of responses had at least one upvote and only 6.1% had at least two.
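The power-law form y = ax^(−k) reported for Fig. 3 can be estimated by ordinary least squares on log-log transformed data, which also yields the R² of the fit. A self-contained Python sketch; the paper's exact binning of upvote counts is not specified, so the input here is illustrative:

```python
import math

def fit_power_law(xs, ys):
    """Least-squares line through (log x, log y); xs and ys must be
    positive. Returns (a, k, r2) for the model y = a * x**(-k)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    sxx = sum((x - mx) ** 2 for x in lx)
    syy = sum((y - my) ** 2 for y in ly)
    slope = sxy / sxx                  # slope of the log-log line = -k
    a = math.exp(my - slope * mx)      # intercept back-transformed
    r2 = sxy * sxy / (sxx * syy)       # R^2 of the log-log fit
    return a, -slope, r2
```

On data that follow an exact power law, the recovered exponent matches and R² is 1; on real upvote counts, a high R² (as in Fig. 3) indicates the power-law model fits well.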
Topics of original user's posts. Our method identified five topics among original users' posts. Each post document has a topic distribution and is assigned to the topic of the highest probability. Then for each topic, the volume (i.e., number and percentage of posts) was determined. Table 1 presents the descriptive information for each topic, representative words defining each topic, and phrases found within each topic. Table 2 gives an example post with the greatest percentage of words from each topic. Based on this information, we provided a label for each identified topic. The largest topic (i.e., of the highest volume) was labeled psychological pain. As can be noted by the representative words (such as curse words) and phrases (i.e., "I do not want to live", "I want to die"), these posts represent significant psychological pain closely associated with expressions of suicidal crises, which were often not event specific but instead broader emotional states. The second largest topic was labeled relationship stress, which included representative words of "relationship", "friend", and "love", in addition to phrases of "talk to me" and "talk to someone". This topic predominantly focused on recent disruptions in friendly or romantic relationships. The third largest topic was labeled psychiatric disorder and included representative words of "depression", "anxiety" and "panic", and phrases of "intrusive thoughts", "panic attack", and "suicide attempt." In contrast to broader psychological pain represented by the first topic, this topic appeared to be focused on specific psychological problems or diagnoses that were contributing to the user's current state. The fourth topic was labeled academic difficulty. 
Posts representative of this topic used words such as "college", "class", and "grade", and phrases such as "go to college" and "first year", to discuss difficulties experienced while enrolled in, or at the prospect of enrolling in, high school or college (CBS News, 2018). The last topic, which comprised the smallest proportion of users' posts, was labeled financial stress (U.S. News, 2015). The representative words (i.e., "money", "afford", "debt") and phrases (i.e., "enough money", "lost my job") highlight acute and chronic financial difficulties a user was discussing as part of his/her current crisis state.
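The per-topic volumes reported in Table 1 follow from assigning each document to its highest-probability topic and counting. A minimal Python sketch, where the document-topic distributions would come from the fitted model (the toy input below is illustrative):

```python
from collections import Counter

def topic_volumes(doc_topic_dists):
    """Assign each document to its highest-probability topic and
    report (count, percentage of documents) per topic."""
    assignments = [max(range(len(d)), key=d.__getitem__)
                   for d in doc_topic_dists]
    counts = Counter(assignments)
    n = len(assignments)
    return {t: (c, round(100.0 * c / n, 1)) for t, c in counts.items()}
```

For example, four documents whose distributions peak on topics 0, 1, 0, and 2 yield volumes of 50%, 25%, and 25% respectively.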
Topics of peer responses. Our method identified ten topics among peer comments. Table 3 presents the descriptive information for each peer comment topic, key words defining each topic, and phrases found within each topic. Table 4 gives an example comment with the greatest percentage of words from each topic. The largest topic was labeled asking questions: peer comments queried for additional information about what was presented in original users' posts (representative phrases: "do you think", "do you mind"), which often seemed to inform suggested problem-solving strategies. The second largest topic was labeled communication support, which was best characterized by peers suggesting or offering time to talk (representative phrases: "you need someone to talk", "talk to me"). The third topic was labeled academic encouragement; peer comments representative of this topic provided specific accounts or encouragement related to high school and/or college experiences (representative phrases: "high school", "community college"). The next three topics (i.e., interest development, life meaning, and distraction and entertainment) each represented <10% of peer comments, with the remaining four topics (i.e., professional help suggestion, relationship/loss support, thanks and appreciation, and treatment and medication) each representing <5% of peer comments.

HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | (2020) 7:36 | https://doi.org/10.1057/s41599-020-0513-5
Topic associations on volume between original user's posts and peer responses. Figure 4 presents a heatmap of the distribution of volume of specific peer comment topics by specific user post topics. The cell values are percentages; for example, the cell "32.38" means that among all the peer comments on the user post topic of psychological pain, 32.38% were assigned to the topic asking questions. For each peer response topic, the green font highlights the highest volume among the original post topics.

Topic associations with peer perceived helpfulness between original posts and peer responses. We examined two potential factors in generating a response perceived as helpful by peers (i.e., receiving a greater number of upvotes). The first factor is comment length. The average number of words per original user's post was 272.2, with a median of 183; the average number of words per peer comment was 75.2, with a median of 44. This is because users need more words to describe their situations in original posts than peers need to respond with support, and writing a peer comment depends on many factors such as the respondent's background, personality, and available time. The average number of upvotes per comment was 0.53. The correlation between comment length and upvotes was not significant (r = 0.036); writing a long comment was therefore not more likely to receive a greater number of upvotes. The second factor is topic associations. Figure 5 presents a heatmap of the distribution of upvotes per peer comment topic by original user's post topic. Given a pair of a specific original post topic and a specific peer comment topic, the cell value is the average number of upvotes for the peer comment topic. For each row, the highest value is colored green and the lowest red. The figure has an additional row/column giving the average upvotes of each user post/peer comment topic.
We make the following observations:

• In response to the user post topic of relationship stress, the comment topic of relationship/loss support received a low average number of upvotes (i.e., 0.41), despite being one of the most frequently used topics. On the other hand, the comment topic of professional help suggestion had the highest average number of upvotes in response to relationship stress (i.e., 0.79).

• The original user's post topic of psychiatric disorder mainly (i.e., >36%) received peer comments of asking questions; however, the average number of upvotes was 0.41, below the overall average of 0.53. The highest average was 0.82, for the peer response topic of relationship/loss support (2.7% of the peer comments to psychiatric disorder).

• The original user's post topic of academic difficulty largely received peer comments of academic encouragement (average number of upvotes 0.52); however, the peer response topic with the most upvotes was professional help suggestion, with an average of 1.17 upvotes. Of note, this was only 4.2% of the peer comments in response to academic difficulty.

• The peer comment topic of life meaning received its highest average number of upvotes (0.53) in response to the user topic of psychological pain; however, it made up only 9.9% of those peer comments.

• According to Table 3, the three most frequent peer comment topics were asking questions, communication support, and academic encouragement. Figure 5 shows, however, that the three most helpful peer comment topics were professional help suggestion, life meaning, and relationship/loss support. The three most frequent peer comment topics make up 63.6% of all peer comments, whereas the three most helpful peer comment topics make up only 15.8%.
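The heatmap cells discussed above (e.g., the 32.38% cell of Fig. 4) are row-normalized conditional statistics over (user post topic, peer comment topic) pairs. A Python sketch of the volume version, where each pair records one peer comment; the Fig. 5 variant would average upvotes per cell instead of counting:

```python
from collections import Counter

def volume_heatmap(pairs, n_user_topics, n_comment_topics):
    """pairs: one (user_post_topic, peer_comment_topic) tuple per peer
    comment. Returns a percentage matrix: cell [u][c] is the share of
    comments on user topic u that fall under comment topic c."""
    counts = Counter(pairs)
    matrix = []
    for u in range(n_user_topics):
        row_total = sum(counts[(u, c)] for c in range(n_comment_topics))
        matrix.append([
            100.0 * counts[(u, c)] / row_total if row_total else 0.0
            for c in range(n_comment_topics)
        ])
    return matrix
```

Each row sums to 100% (for user topics that received any comments), which is what makes cells comparable across user post topics of very different sizes.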

Study 2: Results
Data collection. Data were collected from an independent online sample of individuals from the community via Amazon Mechanical Turk (mTurk). On mTurk, potential participants can browse available tasks (i.e., focus groups, marketing surveys, research studies) and select those they are interested in completing. Participants were recruited as part of a larger study advertised as "A Research Study on Emotion, Health, and Suicidality"; participants were not selected based on history of suicidality. For the current analysis, one question regarding lifetime history of suicidal ideation (which was used as an indicator of personal history of a suicidal crisis) and one open-ended question were utilized. Participants were asked, "If you had a friend that was going through a difficult time, what advice would you give?", followed by, "How would your advice differ if they had thoughts of suicide?" The current study utilized responses only to the latter question (pertaining to suicide), to increase the equivalency of responses to peers responding on the r/SuicideWatch forum. All procedures were approved by the authors' institutional review board for research ethics (Protocol #18-12-5050).
The final sample consisted of 491 participants, of whom 42.16% (n = 207) reported a lifetime history of suicidal crisis (i.e., suicidal ideation). For analysis of participant responses, stopwords and punctuation were removed, and only words used more than five times were included.
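The preprocessing described above (punctuation and stopwords removed, vocabulary restricted to words used more than five times) can be sketched as follows; the stopword list here is a small illustrative subset, not the list used in the study:

```python
import re
from collections import Counter

# Illustrative subset only; a real analysis would use a full stopword list.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "i", "it", "is"}

def preprocess(responses, min_uses=6):
    """Lowercase, strip punctuation, drop stopwords, and keep only
    words appearing at least `min_uses` times across the corpus
    (i.e., used more than five times)."""
    tokenized = [
        [w for w in re.findall(r"[a-z']+", r.lower()) if w not in STOPWORDS]
        for r in responses
    ]
    freq = Counter(w for doc in tokenized for w in doc)
    return [[w for w in doc if freq[w] >= min_uses] for doc in tokenized]
```

The frequency cutoff is applied corpus-wide, so rare words are removed from every response, shrinking the vocabulary the topic model must explain.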
Topics associated with suicidal ideation. We used the structural topic model (Roberts et al., 2013), with history of suicidal crisis as a covariate to explain topic prevalence. Analyses are based on the assumption that an individual's response is indicative of advice they view as helpful; that is, if an individual offers this advice to a friend struggling with thoughts of suicide, it is likely they believe their response would be helpful or supportive. Further, through the structural topic model, results demonstrate whether topics differ based on individual history of suicidal crisis, highlighting any similarities or discrepancies with peer perceived helpfulness (i.e., upvotes) on the Reddit r/SuicideWatch forum.
Based on the correlated topic model, we tested 2-15 topics, using held-out likelihood (on a random subset of 100 observations), exclusivity, and semantic coherence to select the final number of topics. This analysis identified 12 topics, eight of which were significantly associated with suicidal crisis history (see Table 5). In interpreting these coefficients, positive values indicate the topic was more likely to be used among those with a history of suicidal crisis. As it is difficult to discern topic content solely from the most representative words, we also present the most representative responses for each of the significant topics in Table 6.
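Of the three selection criteria, semantic coherence is the simplest to make concrete: it rewards topics whose top words co-occur in many documents. A pure-Python sketch of the commonly used UMass formulation (the +1 smoothing is standard); averaging this score over topics for each candidate number of topics is one ingredient of the selection above:

```python
import math
from collections import Counter

def umass_coherence(top_words, docs):
    """UMass semantic coherence of one topic's top words: sums
    log((D(wi, wj) + 1) / D(wj)) over word pairs, where D counts
    documents containing the word(s). Higher (closer to 0) is better."""
    df = Counter()   # document frequency of each top word
    co = Counter()   # co-document frequency of top-word pairs
    for doc in docs:
        s = set(doc)
        for w in top_words:
            if w in s:
                df[w] += 1
        for i, wi in enumerate(top_words):
            for wj in top_words[i + 1:]:
                if wi in s and wj in s:
                    co[(wi, wj)] += 1
    score = 0.0
    for i, wi in enumerate(top_words):
        for wj in top_words[i + 1:]:
            if df[wj]:
                score += math.log((co[(wi, wj)] + 1) / df[wj])
    return score
```

Held-out likelihood and exclusivity pull in the opposite direction (toward more, finer topics), which is why the criteria are examined jointly rather than optimizing coherence alone.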
Eight topics were significantly associated with suicidal crisis history; we focus our topic interpretations on these. Four topics were identified as less likely to be used among those with a history of suicidal crisis: topics 1, 3, 8, and 12. These topics convey the notion of a need to seek professional help (topics 1 and 3) and offer suggestions of reasons for living, including seeing one's own worth or having faith in a higher power (topic 8), in addition to the pain one may cause others by dying (topic 12). Conversely, four topics were identified as more likely to be used among those with a history of suicidal crisis. While some of these topics also centered on professional help, they appear to suggest or encourage professional help, and perhaps, more specifically, therapy (topic 5). Further, topics provided a potential context or understanding for why someone might feel suicidal (topic 6) and offered more direct or immediate support for the individual (topic 9). A final topic suggested that individuals with a history of suicidal crisis would not differ in the advice they would give to a friend having a hard time more generally vs. thinking about suicide (topic 10).

Discussion
The current study introduced a new methodological framework that allowed us to conduct a nuanced investigation of peer support interactions in times of suicidal crisis occurring on Reddit r/SuicideWatch, in addition to responses from participants in an independent sample. Utilizing Reddit r/SuicideWatch data, we aimed to (1) characterize both the users' posts (individuals experiencing a suicidal crisis) and the peer comments in response to users' posts; (2) examine the association between topics discussed in users' posts and peer comments; and (3) investigate how this may inform the peer perceived helpfulness (as indicated by upvotes) of peer comments. Empirical findings from an independent sample of individuals with a personal history of a suicidal crisis were also examined to further inform the helpfulness of the peer comment topics provided on the Reddit r/SuicideWatch forum. Methodologically, the findings from this study support the use of the newly developed PairwiseLDA method and highlight its use in advancing our understanding of online peer support during suicidal crises. Substantively, results provide further evidence of topics, or triggers and acute states, that may be directly contributing to suicidal crises. The current study also highlights the ways in which Reddit r/SuicideWatch is supporting individuals through suicidal crises, while providing potential opportunities for improvement to better resolve suicidal crises among individuals electing to use informal crisis resources. Findings demonstrated that five different topics were discussed by original users (i.e., individuals experiencing suicidal crises). These topics support the Fluid Vulnerability Theory (Rudd, 2006) of suicide. More specifically, the current findings lend insight into the acute factors that may be interacting with individuals' baseline risk, leading to a heightened experience of suicidal crisis (i.e., suicide mode).
While the findings may be specific to those individuals who opt to seek online peer support, they are consistent with acute factors proposed in theory. Indeed, there is a direct relation between each of the five topics and factors theorized to contribute to suicide mode, including triggers (i.e., relationship stress, academic difficulties, and financial stress), cognitions (i.e., psychological pain ["blame"]), and emotions (i.e., psychiatric disorders ["depression", "anxiety"]; psychological pain ["hate"]). Beyond theoretical support, these findings also highlight factors that may be particularly relevant for individuals who seek online peer support but have received relatively limited attention in the literature. For example, academic difficulty was a predominant topic in original users' posts, with 11% of users discussing this trigger. Echoing this, findings are consistent with the empirical literature demonstrating broad risk factor categories that contribute to suicide outcomes (i.e., suicide attempts (Franklin et al., 2017)). Similarly, these findings may help elucidate the specific experiences within the broad risk categories that are important. For example, social factors have been identified as one of the risk categories with the greatest weighted hazard ratios in predicting suicide attempts (Franklin et al., 2017), and the current findings may help identify relationship stress as a particularly relevant social factor.
Peer comments, on the other hand, displayed a variety of topics, from content-specific topics (i.e., academic encouragement) to broader support (i.e., emotional support). On average, peers were more varied in the topics they discussed than original users, with four of the topics each consisting of <5% of the overall peer comments. There are a few potential explanations for this. First, it is possible that this reflects discomfort and uncertainty among peers regarding the most appropriate response to a suicidal crisis (underscored by asking questions being the most common peer comment topic). This is also indirectly supported by a long-held finding that talking about suicide increases psychological and physiological distress among nonsuicidal individuals (Cowgell, 1977). However, it is unclear whether peers providing support on r/SuicideWatch have a personal history of suicidal crises. As such, it is also possible that peers reflected on their own experiences of times in (suicidal or non-suicidal) distress, identified the personalized support during crises that had been most meaningful, and attempted to provide idiographic responses on the r/SuicideWatch thread. This notion is supported by the fact that peer comments focused on academic encouragement were one of the most common responses to user posts discussing academic difficulties. Findings from our independent empirical example, however, are not congruent with the peer post topics on r/SuicideWatch. For example, in the empirical example, individuals with a history of suicidal crisis presented topics that largely reflect broader support (i.e., encouragement, support, listening) rather than personalized support. While it is possible this is due to the prompt given in the empirical example (requesting general advice), it may also suggest a potential discrepancy between support provided online and the support desired by the individual in a suicidal crisis.
Given the significance of this, it is worthy of further investigation.
Table 6 Most representative responses from each of the significant topics.

Topic 1: "I would recommend that they seek professional help"
Topic 3: "I would tell them to contact a professional to help them or, if necessary, I would contact someone on their behalf"
Topic 5: "I would encourage them to seek professional help...therapy"
Topic 6: "It would depend entirely on the situation. If the thoughts of suicide were related to a terminal illness, for example, it doesn't seem appropriate to interfere. In other plausible scenarios, I couldn't conceive of another situation where I would be approached by another person for this sort of serious advice, and if I were I would demure and avoid interacting to the extent possible by the situation"
Topic 8: "I think I would try to help them see how worthy they are of life. People make small differences in other people's lives, sometimes they don't even realize that they've made an impact. Helping someone see that someone else's life is better just because of some encounter that they have shared with that person"
Topic 9: "I'd tell them they have to call me every hour or that I will call 911 immediately, and that I love them, and it would really suck if they killed themselves, and to call their therapist immediately"
Topic 10: "The advice wouldn't differ at all. I would give the same advice"
Topic 12: "I would tell them they are loved and too many people will miss them if they kill themselves. I would also tell them that there is nothing in life that is worth not living"

In examining the associations between post and comment topics and peer perceived helpfulness, there was a disconnect between the topics peers most frequently discussed and those rated as most helpful. For example, the three most frequent peer comment topics were asking questions, communication or emotional support, and academic encouragement (63.6% of all peer comments); however, the peer comment topics perceived as most helpful by peers were professional help suggestions, life meaning, and relationship/loss support (15.8% of all peer comments). Part of the reason for this disagreement may be that peer perceived helpfulness appeared to depend on both the user's post topic and the peer's response topic, as noted above. For example, peer responses of relationship/loss support were rated as most helpful by peers in response to user posts expressing financial stress or psychological difficulties. On the other hand, peer responses focused on the meaning of life were rated as most helpful in response to user posts of psychological pain.
Such discrepancies highlight that, despite the prevalence of peer responses being similar across original post topics, peers are attuned to the topic of the original post and attempt to tailor their response accordingly.
Our empirical example highlights areas of both consistency and potential discrepancy between peer perceived helpfulness and the perceptions of individuals with a personal history of a suicidal crisis. One of the most notable consistencies across samples was the promotion of professional help suggestions. Indeed, this was one of the peer comment topics with the highest number of upvotes, and it was more likely to be used by those with a history of suicidal crisis in the independent sample. This underscores that suggesting professional help, as opposed to being more directive regarding treatment (i.e., needing to get professional help), may be a supportive response for those experiencing a suicidal crisis. It will be beneficial for future educational and prevention efforts to target the use of such support in an empathetic manner. There were also discrepancies: some topics perceived as helpful by peers (i.e., relationship/loss support) were not used by those with a history of suicidal crisis in the independent sample. Furthermore, results demonstrate there was a topic perceived as helpful by peers that was less likely to be used by those with a history of suicidal crisis: reasons for living/meaning in life. This suggests that while peers, or those without a personal history of suicidal crises, may think it helpful to remind others of reasons to keep living, it may not be perceived that way by those experiencing a suicidal crisis. If replicated, these findings have important implications for peer education programs for suicide prevention.
A final point of discussion with regard to this study's findings is the timeliness of peer responses. Only 7.8% of users received a peer response within 5 min of their post and only 64.90% received a response within one hour. This aspect of online peer support is particularly important to consider not only because timeliness is a cited concern for the use of more formal suicidal crisis resources, but because, consistent with the acute risk phase of the Fluid Vulnerability Theory, 25-40% of individuals who attempt suicide make the decision to act on their suicidal thoughts within 5 min and 70% make the decision within an hour (Simon et al., 2001; Williams et al., 1980), leaving a limited window for intervention. Finally, despite the perceived helpfulness of professional help suggestions by both peers and individuals with a history of suicidal crisis, this topic was rarely used in peer comments. As noted above, this suggests the need for educational efforts for peers providing online support. The fact that social media platforms are being used for their intended purpose of (only) peer support, and not necessarily for the purpose of providing referral resources, may be an important element of this support avenue; however, it also appears important to improve peers' understanding that many individuals experiencing suicidal crises may have a treatable mental illness (Nock et al., 2010), and that warmly suggesting or encouraging professional help may actually be perceived as helpful by those in a suicidal crisis.
The current study's findings should be viewed in the context of its limitations. Foremost, we relied on a combination of upvotes by peers and responses from an independent sample of those with a history of suicidal crisis as metrics of helpfulness. Given that neither metric directly assesses the original user's perception of the peer support received online, there is a need for further research in this area. Similarly, we do not have outcome data for users posting about their suicidal crisis to determine whether the peer support that was received aided in reducing the intensity of their suicidal crisis. It will also be important for future research to compare findings from the current study, which utilized one specific social media platform, to other avenues for online peer support for suicidal crises. Demographic information to characterize the original users and peers utilizing r/SuicideWatch is not available. While prior research examining online peer support for mental illness more broadly has successfully included individuals across developmental lifespans (Naslund et al., 2019), it is unclear how the current findings will generalize across all ages, or whether Reddit users and those from our independent sample were similar. Finally, the current study is also limited by the inability to ensure the validity of peer responses provided online. That is, it is possible that some of the peer comments included in the current analysis were not "real" responses, as has been suggested in recent media reports (Bergstrom, 2011), which may have had an impact on current findings.
Results provide valuable insights into the process of receiving online peer support during suicidal crises. Topics discussed by users experiencing a suicidal crisis are consistent with theoretical models of suicide (i.e., Fluid Vulnerability Theory), which serves to inform current suicide risk assessment procedures. Moreover, current findings should be utilized to inform educational efforts for peers that promote the use of helpful responses, such as the suggestion of professional help rather than reasons for living, as a way to improve online support for suicidal crises. Such educational efforts may help peers provide a greater proportion of comments that may be perceived as helpful, thus increasing the overall quality of support received during a crucial time of distress. Similarly, it may be beneficial for efforts to increase not only the quality of peer support, but also the number of individuals available to provide support, as a way to improve the timeliness of support received.

Methods
Automated phrase mining. Representing text with quality phrases instead of words or n-grams can improve computational models for applications such as information retrieval, taxonomy construction, and sentiment analysis (El-Kishky et al., 2014; Li et al., 2016). We use the state-of-the-art automated phrase mining tool AutoPhrase (Shang et al., 2018), which is domain-independent and requires no additional human effort on the target text data, to find phrases in our text corpora of original user posts and peer responses. Figure 6 illustrates the workflow of the tool. The first step is to use frequent n-gram mining (based on frequent sequential pattern mining (Pei et al., 2001; Yan et al., 2003)) to find phrase candidates in the Reddit corpus; the support (i.e., frequency in the corpus) of each candidate must be no smaller than a given minimum support threshold. Phrase candidates such as "great at" and "to talk" may have a large support but poor quality. Therefore, the following step is to classify the phrase candidates into two categories, quality phrases and non-quality phrases. AutoPhrase adopts a random forest-based classifier and has innovative designs for labeling and feature selection:
• Distant positive-only labeling and robust negative label sampling: knowledge bases (KBs) such as Wikipedia and Freebase have been widely used for distant supervision (Mintz et al., 2009). Positive labels can be assigned to the phrase candidates that can be linked to the KBs, such as "community college", "social anxiety", and "health insurance". Random negative sampling has been applied to generate negative labels for classification, under the assumption that the majority of unlabeled samples are negative. However, noise in the negative pool can harm the classifier's performance, so a robust negative sampling method based on ensembled subsampling (Chen et al., 2013) was developed in AutoPhrase to reduce this noise.
• Statistical features of phrase quality: If the phrase quality is defined as the probability of a word sequence being a complete semantic unit, then a quality measurement should include the following criteria: (1) popularity: quality phrases should have sufficient frequency in the given text corpus; (2) concordance: the collocation of tokens should have significantly higher probability than expected due to chance; (3) informativeness: it should have tokens that indicate specific topics, events, or concepts; (4) completeness: a phrase is deemed complete when it can be interpreted as a complete semantic unit in the context. Statistical features such as probabilities and Part-of-Speech tag counts are used to generate the set of features.
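Two of these statistical features can be illustrated with a minimal sketch of our own (not AutoPhrase's implementation): popularity as raw corpus frequency, and concordance as pointwise mutual information for a two-token candidate over a toy tokenized corpus. The `phrase_features` helper and the toy documents are ours for illustration only.

```python
import math
from collections import Counter

def phrase_features(docs, phrase):
    """Compute popularity and concordance features for a two-token
    phrase candidate over a tokenized corpus (illustrative only).
    Bigrams are counted over the flattened token stream, a
    simplification that ignores document boundaries."""
    tokens = [t for doc in docs for t in doc]
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = len(tokens)
    w1, w2 = phrase
    # Popularity: raw frequency of the candidate in the corpus.
    freq = bigrams[(w1, w2)]
    # Concordance: pointwise mutual information -- how much more often
    # the tokens co-occur than expected under independence.
    p_xy = freq / max(n - 1, 1)
    p_x = unigrams[w1] / n
    p_y = unigrams[w2] / n
    pmi = math.log(p_xy / (p_x * p_y)) if freq else float("-inf")
    return freq, pmi

docs = [
    ["seek", "professional", "help", "today"],
    ["professional", "help", "is", "available"],
    ["help", "me", "please"],
]
freq, pmi = phrase_features(docs, ("professional", "help"))
```

A positive PMI indicates the collocation occurs more often than chance, which is the intuition behind the concordance criterion.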
We applied AutoPhrase to our dataset. It assigns scores between 0 and 1 to every phrase candidate. A higher score indicates a better phrase. We determine the set of quality phrases with a threshold score.
Parameter settings: we set the minimum support threshold to 2 and the phrase quality threshold to 0.75.
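Under these settings, candidate generation and filtering can be sketched as follows. The `frequent_ngrams` and `quality_filter` helpers are ours for illustration, and the quality scores are hypothetical stand-ins for the output of AutoPhrase's classifier.

```python
from collections import Counter

def frequent_ngrams(docs, max_n=3, min_support=2):
    """Collect n-gram candidates whose corpus frequency meets the
    minimum support threshold (2 in our settings)."""
    counts = Counter()
    for doc in docs:
        for n in range(1, max_n + 1):
            for i in range(len(doc) - n + 1):
                counts[tuple(doc[i:i + n])] += 1
    return {g: c for g, c in counts.items() if c >= min_support}

def quality_filter(candidates, scores, threshold=0.75):
    """Keep candidates whose classifier-assigned quality score passes
    the threshold (0.75 in our settings)."""
    return [g for g in candidates if scores.get(g, 0.0) >= threshold]

docs = [
    ["seek", "professional", "help"],
    ["get", "professional", "help", "soon"],
]
cands = frequent_ngrams(docs, min_support=2)
# Hypothetical quality scores standing in for AutoPhrase's classifier.
scores = {("professional", "help"): 0.93, ("help",): 0.31}
kept = quality_filter(cands, scores)
```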
AutoPhrase has been recognized as a useful, publicly available tool for phrase mining in the NLP community (Kshirsagar et al., 2017; Nguyen et al., 2019). The tool was published in 2018 and had been cited 78 times on Google Scholar as of December 17, 2019.
Pairwise latent Dirichlet allocation (LDA). LDA is a generative probabilistic model of text data. The basic idea is that documents are represented as random mixtures over latent topics, where each topic is characterized by a distribution over words (Blei et al., 2003). In our study, we have two text corpora, the original user post corpus and the peer response corpus. The topics of the two corpora are different but associated. We therefore develop two LDA models to discover the topics of each corpus separately, while using the post-comment pairs in the dataset to supervise the models with the association information. Figure 7 presents the proposed pairwise LDA model. Specifically, the models have different numbers of topics, K(P) and K(R) for original user posts and peer responses, respectively. The observable post-response pairs can be generated by pairs of topic distributions of post documents and comment documents. Moreover, our model generates documents with both words (i.e., single words) and multi-word phrases. We use the term "phrases" because words can be considered single-word phrases, and one of the unique points of the model is to learn topics at the phrase level.
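The per-corpus generative process can be sketched as below, with toy dimensions of our own choosing. The sketch follows standard LDA for each corpus and omits the pairing supervision term of the full pairwise model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (ours, not from the paper): K_P post topics, K_R
# response topics, and a small vocabulary of phrases per corpus.
K_P, K_R, V = 3, 6, 8
alpha, beta = 0.5, 0.5

# Topic-phrase distributions phi for each corpus.
phi_P = rng.dirichlet([beta] * V, size=K_P)
phi_R = rng.dirichlet([beta] * V, size=K_R)

def generate_doc(phi, K, length):
    """Generate one document: draw a topic mixture theta, then for each
    token draw a topic z and a phrase index w, as in standard LDA."""
    theta = rng.dirichlet([alpha] * K)
    z = rng.choice(K, size=length, p=theta)
    w = np.array([rng.choice(V, p=phi[k]) for k in z])
    return theta, z, w

# A post-response pair: the pairing of (theta_post, theta_resp) is the
# observable association the pairwise model is trained to explain.
theta_post, _, post_words = generate_doc(phi_P, K_P, length=20)
theta_resp, _, resp_words = generate_doc(phi_R, K_R, length=30)
```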
Formally, ϕk denotes the phrase distribution of topic k, θd the topic distribution of document d, zd,n the topic assignment of the n-th phrase in document d, wd,n the observed phrase, rd(P),d(R) the observed pairing between post d(P) and response d(R), and Σ the association between post and response topics. We make inference with the proposed probabilistic graphical model as follows. The inference problem is to compute the output parameters ϕk, θd, zd,n, and Σ of the pairwise supervised topic model (for each corpus) given the observed data wd,n and rd(P),d(R). However, exact computation is intractable in topic models (Blei et al., 2003), so we use approximate inference of the model parameters. Collapsed Gibbs sampling is a straightforward, easy-to-implement, and unbiased approach that converges rapidly (Mcauliffe and Blei, 2008; Porteous et al., 2008) and is typically preferred over other approaches in large-scale applications of topic models (Rehurek and Sojka, 2010).
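As a reference point, a minimal collapsed Gibbs sampler for plain LDA (without our pairwise supervision term, which is omitted here) looks as follows; the counts and the full conditional match the standard derivation.

```python
import numpy as np

def collapsed_gibbs_lda(docs, K, V, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for plain LDA: integrate out theta and
    phi and resample each token's topic from its full conditional."""
    rng = np.random.default_rng(seed)
    z = [rng.integers(K, size=len(d)) for d in docs]
    ndk = np.zeros((len(docs), K))  # doc-topic counts
    nkw = np.zeros((K, V))          # topic-word counts
    nk = np.zeros(K)                # topic totals
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # Full conditional:
                # p(z = k | rest) propto (ndk + alpha) * (nkw + beta) / (nk + V*beta)
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    # Posterior mean estimates of the topic-word distributions.
    phi = (nkw + beta) / (nk[:, None] + V * beta)
    return z, phi

# Tiny corpus of token-id documents (toy data, for illustration).
docs = [[0, 0, 1, 1], [2, 2, 3, 3], [0, 1, 2, 3]]
z, phi = collapsed_gibbs_lda(docs, K=2, V=4)
```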
Topic number selection. We use a heuristic approach to search for a proper number of topics (Zhao et al., 2015). Based on the following expectations, we set the search range for each corpus as K(P) ∈ {1, …, 10} and K(R) ∈ {1, …, 20}. First, the number of topics should not be too large; otherwise, the topics would be too sparse and the topic associations would have high complexity, i.e., O(K(P)K(R)). Second, the number of response topics K(R) is expected to be larger than the number of post topics K(P): there are likely only a few types of needs expressed when people have suicidal thoughts, but the replies can vary to a large extent, particularly given that there are more responses than posts. We therefore chose 10 and 20 as the maximum numbers of topics to search.
We use two metrics to evaluate the results of topic modeling with respect to different topic numbers. The first metric is the well-accepted and widely used topic coherence (Röder et al., 2015; Stevens et al., 2012), defined as the average coherence over all topics. For a single topic, the coherence is the sum of pairwise distributional similarity scores over a set of topic phrases:

coherence(V) = \sum_{(v_i, v_j) \in V,\, i < j} \mathrm{score}(v_i, v_j, \epsilon),

where V is a set of phrases describing the topic and ϵ is a smoothing factor, which guarantees that the score returns real numbers. Here, we use ϵ = 1 as the original authors did. The similarity score is based on document co-occurrence, known as the UMass metric (Stevens et al., 2012):

\mathrm{score}(v_i, v_j, \epsilon) = \log\frac{D(v_i, v_j) + \epsilon}{D(v_j)},

where D(x, y) counts the number of documents containing both phrases x and y, and D(x) counts the number of documents containing x. Importantly, this metric computes these counts over the original corpus used to train the topic models; it attempts to confirm that the models learned the data known to be in the corpus. Note that the topic coherence score is negative, and a higher coherence score means a better result of topic modeling.

The second metric is new in this study. Given the topic distribution of a document, we assign the document to the topic with the highest probability. The "volume" of a single topic is the number of documents assigned to it. As we expect volume balance among topics while remaining aware of the variety of topic popularity, we measure the degree of volume imbalance with a metric we call the Maximum Relative Volume Gap (MRVG). We sort the volumes (in percentage) from high to low, p_1 ≥ p_2 ≥ ⋯ ≥ p_K, where K is the number of topics, and define

\mathrm{MRVG} = \max_{1 \le k < K} \frac{p_k}{p_{k+1}}.

A close-to-1 MRVG indicates a more balanced volume distribution.
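Both metrics are simple to compute from topic assignments. The sketch below uses toy data, and the MRVG formula reflects our reading of the definition in the text; the helper names are ours.

```python
import math
from itertools import combinations

def umass_coherence(topic_phrases, docs, eps=1.0):
    """UMass coherence of one topic: sum over ordered phrase pairs of
    log((D(x, y) + eps) / D(y)), with document co-occurrence counts
    taken from the training corpus itself."""
    def D(*phrases):
        return sum(all(p in doc for p in phrases) for doc in docs)
    return sum(
        math.log((D(x, y) + eps) / D(y))
        for x, y in combinations(topic_phrases, 2)
    )

def mrvg(volumes):
    """Maximum Relative Volume Gap over topic volumes sorted from high
    to low; values near 1 indicate balanced topic volumes."""
    p = sorted(volumes, reverse=True)
    return max(p[k] / p[k + 1] for k in range(len(p) - 1))

# Toy corpus: each document is a set of its phrases.
docs = [
    {"professional_help", "therapy"},
    {"professional_help", "therapy", "crisis"},
    {"crisis", "hotline"},
]
c = umass_coherence(["professional_help", "therapy"], docs)
balance = mrvg([0.3, 0.25, 0.25, 0.2])
```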
Here, we list the set of expectations for a good result of topic modeling:
1. K(R) > K(P): the number of response topics K(R) should be larger than the number of original post topics K(P).
2. A higher coherence score means a better result.
3. A close-to-1 MRVG indicates a better result.
We set K(R) = 2K(P) and apply the proposed pairwise LDA model with K(P) ∈ {3, 4, 5, 6, 7, 8, 9, 10}. Figures 8, 9, 10, and 11 present the coherence score and MRVG for each setting (Figs. 10 and 11 show the coherence score and the MRVG, respectively, when K(R) = 10 for the peer response text). First, when K(P) = 3, 4, 5, the coherence score is high; it drops significantly from K(P) = 6. Similar observations hold for the response text: when K(R) = 6, 8, 10, the coherence score is high, and it drops to the bottom from K(R) = 12. Second, the MRVG scores are very close to 1 when K(P) = 5 and K(R) = 10, respectively. Therefore, we choose K(P) = 5 and K(R) = 10 as the topic numbers in this study.

a. Assessing associations between post topics and comment topics: as we have assigned every document to a topic in both the post and comment data, we can project associations between post and comment documents (in the original data) into associations between post and comment topics. We can therefore calculate the number of document pairs associated with every topic pair, as presented in Fig. 4. As the document associations come with the number of upvotes, we can also calculate the average number of upvotes for every topic pair, as presented in Fig. 5.

b. Topic comparison analysis 1: pairwise LDA vs. traditional LDA. We ran the traditional LDA model (Blei et al., 2003) separately on the post and comment corpora in the Reddit data. Table 7 presents the representative words and phrases (discovered by AutoPhrase). We had difficulty finding proper topic names for the topics of either the original posts or the peer responses, whereas our pairwise LDA model was able to generate the meaningful topics in Tables 1 and 3.

c. Topic comparison analysis 2: pairwise LDA vs. cross-lingual LDA. A recent work on topic modeling proposed to connect cross-lingual topics (Yang et al., 2019). We implemented this model under the analogy that original posts are in one "language" and peer responses are in the other; the model can then be considered a competitive baseline for our proposed pairwise topic model. The assumption of the cross-lingual topic model is that the topics across the two languages follow the same distributions and thus can be aligned. This holds in the cross-lingual setting because we can usually find word pairs of the same semantics between languages. However, as we have observed, the topics in the original posts and the topics in the peer responses are significantly different; they cannot simply be aligned with a one-to-the-other transformation regularization. Table 8 presents the representative words and phrases. Though they make more sense than those of the traditional LDA, we still cannot easily find a proper name for each topic. Our pairwise topic model instead assumes that if the topic distributions of an original post and a peer response are predictive of the pairing of the two documents (paired or not), then the topics will be effectively, jointly learned.
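The aggregation in step (a) reduces to counting document pairs and averaging upvotes per topic pair. The sketch below uses hypothetical records, not the data behind Figs. 4 and 5; the `topic_pair_stats` helper is ours.

```python
from collections import defaultdict

def topic_pair_stats(pairs):
    """Aggregate post-comment document pairs into topic-pair statistics:
    the number of document pairs and the average upvotes per topic pair.
    Each input record is (post_topic, comment_topic, upvotes)."""
    counts = defaultdict(int)
    upvote_sums = defaultdict(int)
    for post_topic, comment_topic, upvotes in pairs:
        key = (post_topic, comment_topic)
        counts[key] += 1
        upvote_sums[key] += upvotes
    avg_upvotes = {k: upvote_sums[k] / counts[k] for k in counts}
    return dict(counts), avg_upvotes

# Hypothetical (post_topic, comment_topic, upvotes) records.
pairs = [
    (0, 1, 5), (0, 1, 3), (0, 2, 10), (1, 1, 2),
]
counts, avg = topic_pair_stats(pairs)
```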

Conclusions
In this study, we characterized the content of both users' posts and the corresponding peer comments occurring on a social media platform and presented an empirical example for comparison. We introduced an approach that uses pairwise topic models to transform large corpora of discussion into associated topics of user and peer posts. Our approach modeled both the generative process of each type of corpus (i.e., user posts and peer comments) and the associations between them. It used phrases, which are more informative and less ambiguous than words, in addition to words, to represent social media posts and topics. We evaluated the method using data from Reddit r/SuicideWatch. We examined how the topics of user and peer posts were associated, and how this information influenced the peer-perceived helpfulness of the peer support. Then, we applied structural topic modeling to data collected from individuals with a history of suicidal crisis as a means to validate the findings. Our observations suggested that effective modeling of the association between the two lines of topics can uncover helpful peer responses to online suicidal crises, notably the suggestion of professional help.

Data availability
The datasets analyzed during the current study, along with code, are available in the Dataverse repository: https://doi.org/10.7910/DVN/BDZ4LI. These datasets were derived from a public domain resource: https://archive.org/download/2015_reddit_comments_corpus/reddit_data/.