Brief Communication | Open | Published:

Expressions of loss predict aggressive comments on Twitter among gang-involved youth in Chicago


Recent studies suggest social media shapes the transmission of firearm violence in high-poverty, urban neighborhoods. However, the exact pathways by which content on social media becomes threatening has not been studied. We consider a dataset of tweets by gang-involved Chicago youth that are coded for expressions of aggression and/or loss. Using a permutation test and mixed-effects log linear regression, we find that aggression and loss tweets do not occur randomly, and furthermore that in a 2-day window after loss expressions we find an increase in aggressive tweets. We discuss implications for intervention.


Increasing evidence suggests that social media plays a role in the transmission of firearm violence in high-poverty, urban neighborhoods.1,2 However, the exact pathways by which content on social media becomes threatening remains unclear. In a qualitative study of Twitter communication among gang-involved youth in Chicago, we identified two main themes in the labeled dataset, Loss and Aggression, and observed loss expression generally preceding aggressive posts.3 We support our earlier qualitative observations by testing quantitatively whether aggression and loss are randomly distributed in the dataset, and whether there is a temporal relationship between the two categories of tweets.


Loss and aggression are not randomly tweeted by users, and low p-values in the permutation test are evidence of a sequential relationship between tweet labels. We find strong evidence (Table 1) that loss and aggression tweets cluster since the p-values are lower than .05.

Table 1 Permutation test results

Loss tweets are significantly predictive of aggressive tweets. The model states that loss tweets are followed by a 13% increase in aggressive tweets the next day. A 95% confidence interval is 3.5 to 23%. Two days after a loss tweet, we see a 21% increase in the expected number of aggressive tweets. A 95% confidence interval is 12–30% (see Fig. 1). Loss tweets do not appear predictive of aggressive tweets 3 or more days out.

Fig. 1

Confidence intervals of coefficient estimates


This brief highlights the value of leveraging social media to understand loss and aggression in Twitter posts from youth and those who identify as gang-involved in Chicago and offers two meaningful findings: (1) loss and aggression tweets are not random and (2) posting a loss tweet increases the likelihood of an aggressive tweet in the next 2 days. These findings confirm prior qualitative observations and theory that loss and aggressive behavior on Twitter are connected.1,4,5 The findings underscore the idea that online behavior is rooted in offline events.5 The multiple day opening provides an opportunity to provide resources to young people before their grieving turns to aggression. The time between loss and aggressive tweets could be used to prevent offline violence by providing grief counseling and other mental health services as well as traditional clinical practices.


We used a corpus of publicly available Twitter communications from a self-identified gang member and users in her Twitter network.3 We increased the size of the original data by including the last 200 tweets in a chosen time window from the Twitter network of the 14 users who were most communicative with the original users. Our dataset consists of 2256 tweets from 235 unique users. The tweets were coded as “loss”, “aggression”, or “other.” Loss was defined as expressions of grief, trauma, or sadness or mentions of death or incarceration. Aggression was defined as insults, threats, mentions of physical violence, or of wanting retribution. Posts identified as “other” did not contain features involved in aggression or loss. The number of tweets per account ranged from 1 to 200, with an average of 10. One hundred and eighty five users tweeted at least one aggressive or loss tweet. Twenty five of the users made a transition from loss to aggression at least once.

First, we evaluated how users transition between loss and aggression in sequential tweets. We considered the null hypothesis that aggressive and loss content is tweeted by users at random, and p-values reflecting the weight of the evidence supporting this hypothesis were approximated using a permutation test. Specifically, we constructed a 3 × 3 transition matrix for each user that contained the observed frequency of the two labels of two consecutive tweets from the same user within a 24-h window. The test statistic is the mean matrix, obtained by averaging the transition frequencies across users.

We randomly permuted the order of the tweets and reconstructed the mean matrix 1000 times.6 The number of permutations that yield larger mean transitions than those observed divided by 1000 yielded a 3 × 3 matrix of p-values (see Table 1).

Second, we established that the number of loss tweets is predictive of the number of aggressive tweets within a subsequent time window. We fit a mixed-effects log linear model to predict the daily number of aggressive tweets.7 The fixed effects or predictors are the daily number of loss tweets in the previous 1, 2, 3, and 4 days. A random user intercept accounts for the baseline number of aggressive tweets per user. The regression coefficients were estimated using maximum likelihood. The estimates and their standard errors are presented in Fig. 1. There are four regression coefficients, representing the increase in the number of aggressive tweets expected for each of the four time intervals.

Data availability

Our social media data will be placed in a publicly available repository housed at Columbia University.


  1. 1.

    Patton, D. U., Eschmann, R. D. & Butler, D. A. Internet banging: new trends in social media, gang violence, masculinity and hip hop. Comput. Human. Behav. 29, A54–A59 (2013).

  2. 2.

    Lane, J. The digital street: an ethnographic study of networked street life in Harlem. Am. Behav. Sci. 60, 43–58 (2016).

  3. 3.

    Blevins, T. et al. Automatically processing tweets from gang-involved youth: towards detecting loss and aggression. Proc. Int. Conf. Comput. Ling. (2016).

  4. 4.

    Patton, D. U., Lane, J., Leonard, P., Macbeth, J. & Smith Lee, J. R. Gang violence on the digital street: Case study of a South Side Chicago gang member’s Twitter communication. New Media Soc. 19, 1–19 (2016).

  5. 5.

    Patton, D. U. et al. What’s a threat on social media? How Black and Latino Chicago young men navigate threats online. Youth Soc. (2017).

  6. 6.

    Miller, R. G. Beyond ANOVA: Basics of Applied Statistics (Chapman & Hall, New York, NY, 1997).

  7. 7.

    Agresti, A. Categorical Data Analysis, Third Edition (Wiley-Interscience, Hoboken, NJ, 2013).

Download references


This research is supported by a ROADS grant (Title: Natural Language Processing Tools for Gang Violence Prevention, Principal Investigator: Patton, #RG33, 2016–2018), sponsored by the Provost’s Office and the Data Science Institute at Columbia University.

Author information

D.U.P. conceptualized the study, conducted the initial qualitative study that informed our approach, wrote the introduction, and revised the entire manuscript. O.R. conceptualized the methodological strategy, supported the analysis for the study, and revised the manuscript. J.A. supported the analysis for the study and revised the Methods and Results sections. K.L. conducted the analysis for the study and wrote the Methods and Results sections. W.F. wrote the Discussion/conclusion section.

Correspondence to Desmond Upton Patton.

Ethics declarations

Competing interests

D.U.P. has work funded by DARPA. He has consulted for companies including Story and Structure and Social Sentinel. He has also received compensation as a member of the scientific advisory board for Cure Violence. The other authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Further reading