Introduction

Since 2017, the overdose crisis in the United States has continued to soar, with opioid use disorder (OUD) cited as the leading cause of overdose-related deaths1. OUD is highly prevalent in the United States and is estimated to affect between 6.7 and 7.6 million persons2. Financially, OUD undertreatment and associated complications impose great strains on both the individual and society3 with estimated annual US opioid-related costs totaling $1.5 trillion in 20204. Further complicating matters, OUD is difficult to treat effectively as it often presents with other serious mental and physical comorbidities5 and is a risk factor for trauma6, suicide7, and infectious disease8. While effective treatment for OUD can reduce financial burden and associated adverse health outcomes, the majority (65%) of those with OUD are not receiving substance use treatment9. Moreover, OUD prevalence and overdose fatality rates continue to increase despite the presence of medication as well as psychotherapeutic treatments for OUD. The reasons for this are manifold and stem in part from stigma, insufficient clinician training, and shortage of addiction specialists10, ultimately contributing to an appreciable and especially detrimental treatment gap.

Among the many comorbidities of OUD, anxiety and depression are especially prominent11. While lifetime prevalence for generalized anxiety disorder (GAD) and major depressive disorder (MDD) is 4 and 17%, respectively, 11% of individuals with OUD are also diagnosed with GAD and an even greater 46% with MDD12. The associations of these disorders with OUD are multifaceted and bidirectional13,14, with research suggesting heightened risk of MDD in those with OUD15, as well as an increased risk of OUD among MDD patients16. Similar risk dynamics have also been reported when looking at anxiety disorders and non-medical opioid use17. Research supports that anxiety and depression may have a perpetuating and causal role in OUD. Opioid nonfatal overdose and relapse, for instance, have been positively associated with depression18,19, and opioid cravings have been shown to be associated with symptoms of depression and anxiety20. Further, research supports that induced anxious or depressed mood states may increase cravings for opioids21,22. Transdiagnostic psychological factors, such as distress intolerance17 and anxiety sensitivity14, may underlie vulnerability to both negative affective states (e.g., anxiety) and substance use disorders. These findings highlight the importance of implementing OUD interventions that are sensitive to changes, fluctuations, and shifting contexts within the GAD and MDD symptom milieu.

Despite the clear evidence for treatments that address both substance use and mental health disorders, fewer than 1 in 5 addiction treatment programs in the US offer treatment for co-occurring mental disorders23, and only one quarter of those with co-occurring OUD and a mental disorder receive adequate treatment for both9. The rarity of such dual diagnosis treatment programs and undertreatment of co-occurring mental disorders, in combination with commonly cited issues of general mental health service accessibility due to insurance and finances, stymies efforts to provide optimal OUD care. In total, bridging the OUD treatment gap must entail not only the expansion of resource availability in a way that mitigates financial burden, but also the incorporation of measures and procedures that are attuned to the variable presence and manifestation of mental health co-occurrence. To manage the opioid crisis, OUD intervention strategies must become farther reaching as well as more contextually sensitive and personalized.

In this regard, one promising way forward is through the leverage of ubiquitous and highly integrated mobile devices (i.e., smartphones) which have the capacity to improve mental health and substance use disorder care access within many traditionally underserved and low-income populations24. As of 2021, an estimated 85% of Americans own smartphones, with rural populations seeing comparable adoption rates of around 80%25. The reach of mobile devices is demonstrable; thus, digital approaches to OUD treatment inequity built for mobile deployment may possess a level of scalability and accessibility commensurate with need. Additionally, smartphones allow for the longitudinal and dense collection of contextually relevant behavior and mental health information. Through a combination of unobtrusive passive sensing and ecological momentary assessment (EMA), smartphone-powered data collection and interaction can address and complement the typically fragmented nature of addiction treatment, providing continuous patient monitoring and support. This continuity of support is important as OUD patients are likely to encounter opioid use triggers, heightening their risk for relapse26. In practice, smartphone-based implementations can be integrated within OUD intervention strategies to build treatment regimens that are continuously sensitive to the dynamic behavioral signals of OUD and comorbid mental health symptomatology. Taken together, modern smartphone-based implementations for OUD treatment, support, and intervention have the capacity to combat the traditional treatment challenges of availability, continuity, and contextual depth to ultimately serve as a complementary and powerful addition to the clinical toolkit.

Alongside EMA, ecological momentary interventions (EMI), i.e., tailored interventions delivered in real time to people in their natural environment, have been explored in a variety of mental health populations, including substance use27,28,29,30, mood31,32,33, and anxiety disorders34,35,36,37,38. Such EMI studies among persons with depression and anxiety have generally yielded positive results, though with variable effect sizes39. Substance-related EMI studies to date have preferentially targeted alcohol use disorder, usually without exploration of co-occurring psychiatric disorders34. Nonetheless, smartphone application development for OUD has seen a steady increase over the last decade40, with numerous options available for download on mobile app stores41. Such mobile applications have shown feasibility and acceptability in pilot studies42,43,44, and randomized controlled trial results45 and large real-world observational studies46 have supported the clinical effectiveness of digital therapeutics as adjunctive treatments aimed at core OUD symptomatology. However, no studies to date have assessed the feasibility, acceptability, or effectiveness of interventions aimed at addressing mood and anxiety disorders co-occurring with OUD.

To this end, the primary aim of the current work was to test the feasibility and acceptability of a smartphone app-based digital intervention designed to treat anxiety and depressive symptoms among persons receiving medication treatments for OUD (MOUD). This randomized controlled pilot study used targeted online advertisements on the Reddit platform to enroll participants with comorbid OUD and a mood or anxiety disorder who were currently receiving MOUD; the intervention consisted of a series of 4–6 videos each week delivered through a custom module within the publicly available Mood Triggers smartphone application47 and rooted in cognitive-behavioral therapy (CBT) principles48. The videos were brief (120 s on average; Fig. 1 displays the basic app layout, and the interested reader may view an example video of the intervention in the Supplementary Materials) and designed to target anxiety and depression symptomatology, built on CBT principles dealing with (i) actions, (ii) mental hygiene, (iii) exposure, (iv) acceptance, (v) cognitions, (vi) regulation and positive feelings, or (vii) preparation. Each video introduced participants to an exercise to help manage symptoms in their everyday routine, including short guides to progressive muscle relaxation, problem-solving strategies, and relapse prevention.

Fig. 1: The layout of the application, “Mood Triggers,” used for the guided intervention.

A Login screen for participants. B The organization of the video modules. C The screen during an active video module. Note that a transcript is provided for accessibility. D Trends in user answers to daily prompts on anxiety and depression.

We hypothesized that the digital intervention would show feasibility, based on the ability to successfully recruit participants (n = 30 randomized to intervention, n = 30 randomized to control) and demonstrate acceptability with at least 50% of intervention sessions completed by those assigned to the intervention. Using a star-based rating system, we provided the functionality for participants to rate both their overall satisfaction with the app and the app’s ability to help reduce anxiety and depressive symptoms.

Secondarily, this work sought to test the effectiveness of this digital intervention in treating anxiety, depressive and OUD symptoms among persons with co-occurring disorders. Measured through self-report and weekly urine drug screens (UDS), we hypothesized that those assigned to the digital intervention would experience greater reductions in depressive and anxiety symptoms alongside reductions in opioid cravings and use. Given the previously discussed relationship between anxiety/depression and OUD, we hypothesized that our intervention would have indirect effects on opioid use outcomes. Validated self-report measures included the 9-item Patient Health Questionnaire (PHQ-9)49 for assessing depression, the Generalized Anxiety Disorder Questionnaire (GAD-Q-IV)50, the Rapid Opioid Dependence Screen (RODS)51, and the Opioid Craving Scale (OCS)52.

Lastly, we explored the feasibility of EMA and passive sensing add-ons to the digital intervention, which have the potential to serve as useful metrics of symptom change and allow for participant self-monitoring. Ambient light, GPS location, call and text logs, and screen time were collected via passive sensing monitoring on the participants’ personal mobile devices. Such passive data streams have potential to serve as proxies for important mental health outcomes, such as activity level and social connectedness53. Further, as an objective and longitudinal data source, passive sensing has the potential to supplement traditional self-reported measures. We specified primary and secondary outcomes in a ClinicalTrials.gov preregistration (ID: NCT05047627), as well as the guiding aims of the work, which were to assess the feasibility, acceptability and preliminary effectiveness of the mobile digital intervention.

Results

Participant flow through study

During the study recruitment window, September 2021 to October 2022, 462 unique persons completed the screening survey. Of these, 172 individuals initially qualified and completed a secondary screening, which functioned as a baseline measure for those who were eventually enrolled. Concurrently, participants were required to provide physical address verification and undergo IP address and geolocation validation measures. A total of 63 participants passed the secondary screening and were randomized to either the experimental condition (n = 32) or waitlist control (n = 31). Those in the intervention group received a telephone onboarding session with a trained research assistant. Those in the control group were offered a telephone call but were not required to take part. After 4 weeks of the study, all participants were asked to complete a post-intervention symptom survey (which 46 completed). Biweekly drug screens were collected remotely from all participants for the remaining 4 weeks of the study (weeks 4–8). After 8 weeks, at the completion of the study, all participants were asked to complete one final exit survey (which 47 completed). See Fig. 2 for a summary of the recruitment, onboarding, and study collection protocol. A trained research assistant manually coded UDS images shared with research staff by participants. To ensure coding fidelity, a second rater independently coded a majority subset (67.04%) of the images. Cohen’s kappa was calculated to determine interrater agreement (κ = 0.78). Disagreements were manually reviewed and reconciled.
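As an illustration of the interrater agreement statistic reported above, the following minimal Python sketch computes Cohen's kappa for two raters' codings. The function name, labels, and codings are hypothetical toy values rather than study data, and the sketch is not the analysis code used in this study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items coded identically by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Toy codings of ten hypothetical UDS images ("pos" / "neg"); not study data.
first_rater = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "neg", "pos"]
second_rater = ["pos", "pos", "neg", "pos", "pos", "neg", "pos", "neg", "neg", "neg"]
kappa = cohens_kappa(first_rater, second_rater)
```

In this toy case the raters agree on 8 of 10 images while chance agreement is 0.5, so kappa falls well below the raw agreement rate; this correction for chance is the reason kappa is reported rather than percent agreement.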

Fig. 2: Study flow diagram.

Summary of study which spans recruitment, randomization, intervention, and timing of survey administration.

Five participants dropped out of the study, three from the experimental group and two from the waitlist group. Reasons for dropout included frustration with repeat UDS, frustration with the mobile application, and feeling that symptoms were too severe to participate. The trial was stopped after the predetermined recruitment goal was reached and all remaining participants had completed the trial.

Baseline demographics

The groups had no statistically significant baseline differences, with the exception of race, annual household income, and geographic region (p < 0.05). The intervention group contained a lower proportion of White participants than the waitlist group. See Table 1 for more information.

Table 1 Baseline characteristics of cohort

Primary outcomes

Our results indicated a large effect size for depression (measured by PHQ-9) from baseline to post-intervention (4 weeks) and a very large effect size from baseline to follow-up (8 weeks) in the intervention group. We found a medium effect size from baseline to 4 weeks and a small effect size from baseline to 8 weeks in the control group. Further, we found a small–medium between-group effect size from baseline to post-intervention and a large between-group effect size from baseline to follow-up. These effect sizes were, however, not statistically significant. These results are summarized in Table 2 and displayed in graphical form in Fig. 3A.
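For readers less familiar with the effect size metric, the sketch below shows one common way to compute Cohen's d for a between-group comparison of change scores, using a pooled standard deviation. This formulation is an assumption made for illustration: the exact computation used in the analysis is not specified in this section, and the values are toy numbers, not study data.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_1, group_2):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n1, n2 = len(group_1), len(group_2)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    # statistics.variance returns the sample variance (n - 1 denominator).
    pooled_var = ((n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; d is undefined")
    return (mean(group_1) - mean(group_2)) / sqrt(pooled_var)


# Toy PHQ-9 change scores (follow-up minus baseline); negative values mean improvement.
intervention_change = [-6, -4, -5, -7, -3]
control_change = [-2, -1, -3, 0, -4]
d_between = cohens_d(intervention_change, control_change)
```

By the conventional benchmarks, absolute values of d near 0.2, 0.5, and 0.8 are read as small, medium, and large, respectively.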

Table 2 Statistical results of within and between group differences
Fig. 3: Within and between-group comparisons of key intervention outcomes through time.

Cohen’s d values are shown for both within and between group comparisons. Pre to Post represents a 4-week period while Pre to Follow-up represents an 8-week period. A Depression, B anxiety, and C OUD. SS simple slopes; * significant Cohen’s d value. Note that if negative signs were present for between-group effect size, they are dropped in the figure.

Our results indicated a large effect size for generalized anxiety (as measured by the GAD-Q-IV) from baseline to post-intervention and a very large effect size from baseline to follow-up in the intervention group. We found medium effect sizes from baseline to 4 weeks and from baseline to 8 weeks in the control group. Furthermore, we found a small–medium between-group effect size from baseline to post-intervention and a medium between-group effect size from baseline to follow-up. These effect sizes were not statistically significant. These results are summarized in Table 2 and displayed in graphical form in Fig. 3B.

Within- and between-group differences in methadone (MTD) and buprenorphine (BUP) positive screens on UDS across 4 weeks (baseline to post-intervention) and 8 weeks (baseline to follow-up) were analyzed to determine MOUD adherence. UDS results revealed similar between-group adherence to buprenorphine, as indicated by positive buprenorphine screens, from baseline to post-intervention and from baseline to follow-up, with a small decrease in both the intervention and control groups over each interval.

Regarding methadone adherence, the intervention group showed a small increase in adherence from baseline to 4 weeks and baseline to 8 weeks. The control group showed a negligible difference from baseline to 4 weeks and a very small increase from baseline to 8 weeks. There was a small between-group difference from baseline to 4 weeks and a negligible between-group difference from baseline to 8 weeks.

From baseline to 4 weeks, the results suggested a small between-group effect size in opioid (MOP) use such that the waitlist group had a small increase in opioid use, while the intervention group experienced a small decrease. Likewise, the results showed that these trends continued from baseline to follow-up with a small to medium between-group effect size, with the intervention group and the waitlist group decreasing and increasing their opioid use, respectively. It should be noted that neither between-group nor across-time changes in substance use were statistically significant. Across all study participants and all 5 weeks, 70.16% of drug screens were completed. All within-group and between-group statistical results for UDS are summarized in Table 2.

Secondary outcomes

We found a very large, statistically significant within-group effect size for self-reported OUD severity (as measured by the RODS) from baseline to post-intervention and from baseline to follow-up in both the intervention and control groups. However, we found only a very small between-group effect size at both 4 weeks and 8 weeks, which was not statistically significant. These results are summarized in Table 2 and are displayed in graphical form in Fig. 3C.

We found a small increase in self-reported opioid cravings (as measured by the OCS) in the intervention group from baseline to post-intervention and a negligible change from baseline to follow-up. In the control group only, the results demonstrated a large effect size for reduced opioid cravings from baseline to post-intervention and a medium-large effect size from baseline to follow-up. We found a very large between-group effect size from baseline to 4 weeks and a large between-group effect size from baseline to 8 weeks. These effect sizes were not statistically significant. The associated statistical results are summarized in Table 2.

Fifteen features were collected via smartphone passive sensors and spanned four behavioral domains: movement (three features), light exposure (two features), general phone use (two features) and social interaction (eight features). There were significant differences in baseline and/or individual trajectories for most features; however, except for a subset of social interaction features (Fig. 4), these features were not significantly associated with the fixed effect of time since the beginning of the intervention within persons. The results demonstrate that both overall volume and diversity of phone-based social interactions increased through time across intervention participants. A complete report of findings for each feature is available in Supplementary File 1.

Fig. 4: Intervention group trajectories of significant phone-based social interaction features.

Values represent per diem counts of A outgoing calls, B incoming calls, C unique outgoing call contacts, D unique incoming call contacts, and E unique incoming text contacts.
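To make these feature definitions concrete, the following sketch derives per diem counts of calls and unique call contacts from a call log. The record layout (timestamp, direction, contact identifier) is a hypothetical assumption and not the schema of the Mood Triggers app, and the entries are toy data.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical call-log records: (ISO timestamp, direction, anonymized contact id).
call_log = [
    ("2022-03-01T09:15:00", "outgoing", "c1"),
    ("2022-03-01T12:40:00", "outgoing", "c1"),
    ("2022-03-01T18:05:00", "incoming", "c2"),
    ("2022-03-02T08:30:00", "outgoing", "c3"),
    ("2022-03-02T20:10:00", "outgoing", "c1"),
]


def daily_call_features(records, direction):
    """Return {date: (call count, unique contact count)} for one call direction."""
    calls = defaultdict(int)
    contacts = defaultdict(set)
    for timestamp, call_direction, contact_id in records:
        if call_direction != direction:
            continue
        day = datetime.fromisoformat(timestamp).date()
        calls[day] += 1
        contacts[day].add(contact_id)
    # Days without any matching calls are simply absent from the result.
    return {day: (calls[day], len(contacts[day])) for day in sorted(calls)}


outgoing = daily_call_features(call_log, "outgoing")
incoming = daily_call_features(call_log, "incoming")
```

Counting unique contacts separately from call volume distinguishes a participant who calls one person repeatedly from one who reaches several different people, which corresponds to the distinction between volume and diversity of social interaction drawn in the text.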

App acceptability and usership

The average satisfaction with the app was 3.86 stars (median = 4, SD = 1.08) and the average rating regarding the app’s ability to help reduce depressive and anxiety symptoms was 3.25 stars (median = 3, SD = 1.11). The participant rating distributions are displayed in Fig. 5.

Fig. 5: Participant evaluation and feedback of the intervention app.

A Distribution of users’ overall satisfaction with the app. B Distribution of user ratings based specifically on the app’s ability (including CBT content videos) to help reduce anxiety and depression symptoms. We have provided averages and standard deviations (std) for both (A) and (B).

EMAs were an optional component of the app, which drove participant self-monitoring functionality. We present a supplementary descriptive figure (see Supplementary Fig. 1), highlighting usership of the EMA feature across time.

Discussion

The present work is the first to investigate the feasibility, acceptability, and preliminary effectiveness of an app-delivered digital intervention for use in treating depression and anxiety among persons receiving MOUD. This work utilized multiple streams of data, including subjective self-report to evaluate intervention response, UDS for objective substantiation, and passively collected smartphone data to assess behavioral changes. As hypothesized, evidence supported both feasibility and acceptability. Regarding acceptability, we found on average a favorable intervention rating (average rating ≥ 3.25), commensurate with existing highly-rated mental health mobile apps on the iOS and Google Play app stores54. Despite the limitations imposed by our sample size, we observed encouraging trends toward symptomatic improvements in both self-reported depression and anxiety among those assigned to the intervention, compared with randomized controls (Fig. 3). Furthermore, these reductions were maintained and observed after cessation of the intervention at 4 weeks follow-up. OUD self-report severity decreased and was sustained at follow-up, although this effect was seen in both the treatment and waitlist control groups. While these preliminary findings suggest a potentially beneficial impact of our digital intervention, it is important to interpret these effects with due caution, as they do not reach the threshold for statistical significance. We believe that the lack of statistical significance was due to the small sample size, which we discuss further in limitations and future directions. Associated with these promising outcomes, social connectedness, as measured through analysis of passively collected call and text logs, saw significant increases from baseline within the intervention group (Fig. 4). However, contrary to our hypothesis, decreases in cravings were only observed among control participants, whereas the intervention group saw slight increases. 
Taken together, this pilot study has demonstrated the potential clinical utility of digitally delivered CBT in those with OUD, specifically highlighting both the promise of using this approach to target depression and anxiety comorbid symptomatology to good effect, as well as the ability for this modality to facilitate the dense and longitudinal behavioral contextualization of OUD through complementary passive sensing monitoring.

The observed trends toward improvement in depression and anxiety symptoms are consistent with a robust literature supporting cognitive-behavioral approaches to these disorders55,56, approaches that are also found to be efficacious when delivered via a scalable and automated digital platform57. Unlike traditional CBT, however, this work suggests that very short (i.e., 90 s on average), self-paced, and targeted interventions may lead to favorable outcomes alongside a mitigation of patient burden. While echoing past successes, the current study’s results are unique and important in demonstrating an ability to effectively target comorbid symptomatology in individuals undergoing treatment for OUD.

In line with self-reported improvements in depression and anxiety, those in the intervention group also showed increased smartphone communication patterns (i.e., daily incoming/outgoing calls and unique contacts) over time, perhaps signifying increased social connectedness. However, it is crucial to acknowledge that insights derived from mobile communication records provide a limited perspective on the multifaceted nature of social interactions and might not generalize to other forms of social engagement, such as in-person interactions. Our findings suggestive of increased social connectedness may be contextualized in recent studies, which suggest that the mental health benefits of social connectedness can be derived virtually58 and that communicative smartphone use correlates with greater friendship satisfaction and reduced anxiety59. We believe that the intervention may have led to more social connection through intervention video exercises that either directly or indirectly addressed social connection. Examples of such exercises included (1) “activity scheduling,” which encouraged participants to schedule activities, providing the example of talking to a good friend, (2) “interpersonal effectiveness,” (3) “grief processing,” which included encouragement to share feelings with trusted contacts, and (4) “exposure treatment,” which emphasized the importance of approaching anxiety-provoking situations, including social ones. Further, we posit that overall improvements in anxiety and depression may have led to increased social drive. Bolstering social connectedness may also help to maintain improvements in depression and anxiety, as social connectedness has been shown to be a consistent predictor of mental health60 and may serve as a protective factor against depressive symptoms61.
Given the importance of social connectedness in managing both anxiety and depression, as well as the phenomenological ties of anxiety and depression to OUD, it stands to reason that social connectedness may be an important consideration in populations receiving OUD treatment.

Despite this appreciable difference in depression and anxiety severity change, as well as social connectedness, between those assigned to the intervention and controls, both groups experienced similar trajectories of OUD symptom decrease, had similar adherence to MOUD, and showed comparable levels of illicit substance use at 4 and 8 weeks. (There was a modest between-group effect size in opioid use from pre-intervention to 4 weeks: the control group had a small increase in opioid use, while the intervention group experienced a small decrease; see the Results section for additional information on this finding.) There are two potential explanations for these findings. First, the inclusion criteria did not require a specified duration of treatment with MOUD prior to joining the study. Given that the recruitment strategy was exclusively via Reddit advertisements posted in groups aimed at topics relevant to MOUD, there may have been a potential selection bias toward persons who were proximal to MOUD treatment initiation. As such, reductions in OUD symptoms due to the confounding effect of recently initiated MOUD are expected, given the robust evidence for the effectiveness of MOUD itself62,63. Second, both groups were required to complete biweekly UDS over the course of the 8-week study. Limited evidence suggests that such monitoring may increase adherence to opioid therapy and decrease illicit drug use64,65. However, this explanation is weakened by the fact that the results of the UDS had no punitive consequences for participants in the current study.

Given that depression and anxiety have been shown to be bidirectional risk factors for opioid use, their improvement was anticipated to coincide with a reduction in cravings. Counter to the study hypotheses, self-reported cravings showed a marginal increase in the treatment group and decreased in the control group. We suggest two potential reasons for this finding. First, the anxiety exposure-based exercises included in our intervention may have led to short-term increases in craving as stress-mediated responses to the exposure. Specifically, effective exposure therapy for anxiety disorders necessarily increases short-term distress via voluntary exposure to feared stimuli66. In turn, increased distress, including increased stress or negative affect, has been positively associated with increased opioid cravings67. Thus, short-term distress induced by exposure exercises could lead to heightened cravings. Second, we posit that factors unique to those participants in our intervention group (e.g., more time spent weekly engaged in a study focused on opioid use) may have served as a behavioral cue, increasing the likelihood of cravings68. It is also possible that cravings are lower in the control group because participants are not abstaining from drug-related activity. The question of craving exacerbation may be more definitively understood through research that extends our methods to include longer follow-up periods and a larger sample size.

The results of this work carry important preliminary implications for future clinical and research efforts aimed at addressing concurrent anxiety and depression among MOUD users. Most importantly, findings support the feasibility, acceptability, and preliminary effectiveness of a brief, digitally delivered CBT-based intervention, and showcase the potential insights afforded through complementary passive sensing data collection that is native and thus seamlessly integrated into the delivery modality. While we do not see app-based interventions as a replacement for traditional treatments, effective digital interventions may serve to augment and scale traditional approaches, which often overlook commonly co-occurring mental health disorders (e.g., depression and anxiety). Further, such digital interventions could also provide more immediate support for persons awaiting treatment for co-occurring psychiatric disorders, or for whom traditional treatment is inaccessible. Further, for clinical research especially, the convenience of leveraging smartphones to collect passive sensing data alongside intervention administration may serve to augment and better contextualize subjective self-report measures as well as other more objective outcomes.

While this study has numerous strengths, it is important to acknowledge its inherent limitations which may serve as an important guide for future research efforts. First, the data gathered constitutes the initial results of a pilot study, thus sample size was limited and may have constrained the capacity to detect small, statistically significant differences between the intervention group and controls. Second, the recruitment methodology, which relied on targeted Reddit advertisements, may have inadvertently introduced selection bias and impacted generalizability. Relatedly, participation required the Android platform, potentially impacting the generalizability of our findings, given differences between users of distinct mobile platforms69. Third, information on the duration of MOUD treatment was not collected prior to beginning the study, precluding an ability to account for the impact of newly initiated MOUD on response trajectories. Fourth, the cohort was characterized by a narrow racial diversity, predominantly composed of individuals identifying as White, further impacting generalizability. Additionally, given the potential challenges associated with app installation, phone meetings were required for the intervention group, while optional for the waitlist group. While these meetings were short and intentionally focused, there is the possibility of differing duration of staff interaction affecting treatment outcomes. Further, while we did collect app-level feedback on participants’ experiences, we did not assess their compliance with individual video interventions; we recognize that this is an important consideration for future research. Lastly, the follow-up period for the study was limited to 4 weeks, potentially limiting the ability to identify and characterize long-term responses to the intervention.

Thus, while showing early promise for the effectiveness of digitally delivered mobile interventions, the results of our work primarily serve to support the feasibility and acceptability of such interventions among persons with co-occurring disorders; based on our results and limitations, we provide recommendations for future research. First, we suggest inclusion of larger, more diverse samples adequately powered to detect between-group differences; second, we suggest extended follow-up in order to assess the durability of the digital treatment effects; third, given our success in utilizing passive sensor data (e.g., number of phone calls) as an outcome measure for treatment response, we emphasize the future importance of such data in assessing treatment effects, which we believe will be especially useful in blended care interventions. As the passive sensing data collection was a feature of the app itself, we only collected these data for the intervention group; we thus emphasize the importance in future study design to separate passive assessment from digital intervention such that passive outcomes can be assessed for both treatment and control groups.

In summary, there is considerable need for cost-effective, scalable solutions aimed at effectively treating comorbid anxiety and depression among those receiving MOUD. This pilot study is the first to lay the groundwork for testing the feasibility, acceptability, and preliminary effectiveness of a brief, digitally delivered CBT-based intervention for those undergoing treatment for OUD. Through creative and resourceful leverage of technology, treatment paradigms within substance misuse and mental health can effectively coalesce and begin to overcome the traditional barriers of accessibility that have plagued patients and clinicians alike. This study is one early foray into the rapidly burgeoning space of digital health with promising implications for the hitherto unexplored realm of OUD research and treatment.

Methods

Trial registration and institutional review

The clinical trial was preregistered with clinicaltrials.gov (NCT05047627). Primary and secondary outcomes were specified in the preregistration, as were the guiding aims of the work, i.e., to assess the feasibility, acceptability, and preliminary efficacy of the mobile digital intervention. The study described herein was approved by the Dartmouth College Institutional Review Board (STUDY00032008, approved 07/27/2020). It was most recently re-reviewed and approved on 06/12/2023. All collaborators who met criteria for authorship have been included as co-authors on the manuscript. The study did not involve international collaborators.

Recruitment

Building on previous work that has found Reddit to be a useful and valid tool for social science research70,71,72, we used the Reddit platform to make targeted recruitment posts on seven subreddit pages pertaining to OUD treatments (r/suboxone; r/methadone) and several opioid drugs (r/heroin, r/heroinsheroines, r/opiates, r/opiatesrecovery, r/fentanyl). For each page used for recruiting, study personnel contacted the moderators to confirm that the recruitment ad followed all page guidelines. We used the RODS51 for OUD diagnosis, the PHQ-949 for depression diagnosis, and the Generalized Anxiety Disorder Questionnaire-IV (GAD-Q-IV)50 for anxiety diagnosis. Inclusion criteria were (i) adult age (≥ 18 years), (ii) fluency in English, (iii) ability to provide informed consent, (iv) use of an Android smartphone device running version 9 or greater, (v) diagnosis of OUD as determined by a total RODS score ≥ 3, (vi) current methadone, buprenorphine, or naltrexone treatment for OUD, and (vii) a positive screen for MDD or GAD as determined by a PHQ-9 score ≥ 10 or a GAD-Q-IV positive screen as defined by Newman et al.50. Preliminary screening was conducted via the Qualtrics platform73. Prospective participants were required to provide proof of primary residence via a photograph of official mail, and their IP address was checked against their self-disclosed physical address.
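The inclusion criteria above amount to a conjunction of simple checks, which can be sketched in code. The following is a minimal illustration only, not the study's actual screening logic (which was run through Qualtrics); the `Screening` class and all of its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Screening:
    """Hypothetical container for one respondent's screening answers."""
    age: int
    fluent_english: bool
    can_consent: bool
    android_version: int   # major Android version on the respondent's phone
    rods_score: int        # RODS sum score
    on_moud: bool          # current methadone, buprenorphine, or naltrexone
    phq9_total: int        # PHQ-9 sum score (0-27)
    gadq_positive: bool    # GAD-Q-IV positive screen (scoring not shown here)


def is_eligible(s: Screening) -> bool:
    """Apply inclusion criteria (i)-(vii) as listed in the text."""
    # Criterion (vii): a positive screen on either instrument is sufficient.
    mood_or_anxiety = s.phq9_total >= 10 or s.gadq_positive
    return (
        s.age >= 18
        and s.fluent_english
        and s.can_consent
        and s.android_version >= 9
        and s.rods_score >= 3
        and s.on_moud
        and mood_or_anxiety
    )
```

Note that criterion (vii) is a disjunction: a respondent with a PHQ-9 score below 10 still qualifies if the GAD-Q-IV screen is positive.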

Study period

Participants completed a baseline screening questionnaire, which included the GAD-Q-IV, PHQ-9, RODS, and OCS52. Participants were randomized to either the app-intervention group or the control group. Participants randomized to the app-intervention group were scheduled for an onboarding phone call with a trained research assistant. Participants in the waitlist control group were also offered a phone meeting with a trained research assistant. Participants were assisted in downloading and logging in to the Mood Triggers application during this session. Participants in the intervention group were asked to use the intervention (see the intervention subsection) four to six times per week for 4 weeks. This consisted of completing 20 digital video sessions in total. Participants in the intervention group were provided instructions on which interventions to use weekly. All study participants completed a questionnaire battery at week 4 and week 8 of the study. Both batteries contained the PHQ-9, GAD-Q-IV, RODS, OCS, and the Difficulties in Emotion Regulation Scale (DERS)74. All study participants were asked to complete a 12-panel drug screen five times over the duration of the study.

Randomization

We utilized a simple random number generator in order to randomly assign participants to either the intervention or waitlist control group. Randomization was performed by trained research assistants.
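Simple randomization of this kind can be sketched as follows. This is an illustration only: the text does not specify the generator or software used, and the function name, arm labels, and seed below are assumptions.

```python
import random


def simple_randomize(participant_ids, seed=None):
    """Assign each participant independently, with probability 0.5 per arm."""
    rng = random.Random(seed)  # seeding makes the allocation reproducible
    arms = ("intervention", "waitlist")
    return {pid: rng.choice(arms) for pid in participant_ids}


# Example: allocate 60 hypothetical participant IDs.
allocation = simple_randomize(range(1, 61), seed=2020)
```

Because each assignment is independent, simple randomization does not guarantee equal group sizes; in a small sample the two arms can differ in size by chance.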

Power analysis

To assess the appropriateness of a sample size of 60 participants, divided into two groups of 30 each, we performed a simulation study. We used linear mixed models with observations nested within individuals across two distinct time points: baseline and post-study. Data for the simulation were generated from a multivariate normal distribution to account for participant random effects at both time points. The main focus was to evaluate the effectiveness of the intervention, targeting a small-to-moderate effect size of 0.3 for the difference between the waitlist control and intervention groups. A missingness rate of 10% was also factored in for the post-study data points. Across 100 simulation runs, the targeted effect size of 0.3 was detected in 90% of runs, corresponding to an estimated power of 90%. This result suggests that a total sample size of 60 participants is adequate for identifying significant differences in the outcome variable between the waitlist control and intervention groups.
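The logic of a simulation-based power analysis can be illustrated with a simplified sketch. It is not the authors' simulation: it replaces the linear mixed model with a normal-approximation test on change scores, and the baseline/post correlation `rho` is an assumed value that the text does not report. Only the group size (30), effect size (0.3), and missingness (10%) are taken from the description above.

```python
import math
import random
import statistics


def simulate_trial(n_per_group, effect, rho, missing, rng):
    """Simulate one two-arm trial; return True if the arms differ at z > 1.96.

    Outcomes are standardized (SD = 1). `rho` is the assumed correlation
    between baseline and post-study scores.
    """
    changes = {"control": [], "intervention": []}
    for arm in changes:
        shift = -effect if arm == "intervention" else 0.0
        for _ in range(n_per_group):
            baseline = rng.gauss(0.0, 1.0)
            noise = rng.gauss(0.0, 1.0)
            # Post score correlated with baseline at level rho.
            post = rho * baseline + math.sqrt(1.0 - rho ** 2) * noise + shift
            if rng.random() < missing:
                continue  # post-study measurement is missing
            changes[arm].append(post - baseline)
    a, b = changes["control"], changes["intervention"]
    if len(a) < 2 or len(b) < 2:
        return False
    # Two-sample test on change scores with unpooled variances.
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return abs(z) > 1.96


def estimate_power(n_per_group=30, effect=0.3, rho=0.9, missing=0.10,
                   n_sims=1000, seed=0):
    """Fraction of simulated trials in which the effect is detected."""
    rng = random.Random(seed)
    hits = sum(simulate_trial(n_per_group, effect, rho, missing, rng)
               for _ in range(n_sims))
    return hits / n_sims
```

Under this simplified design the estimated power depends strongly on `rho`, so the sketch should not be expected to reproduce the 90% figure. The number of simulation runs also matters: with 100 runs and a true power near 0.90, the Monte Carlo standard error of the power estimate is about 0.03.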

The intervention

Mood Triggers is an application designed by our research team, which led participants through a suite of evidence-based video interventions, each belonging to one of the following categories: (i) actions, (ii) good mental hygiene, (iii) exposure, (iv) acceptance, (v) thoughts, (vi) regulation and positive feelings, or (vii) preparation. Each intervention was written, designed, filmed, and edited by our research team, utilizing evidence-based CBT skills. The topics included in the video suite were curated in the context of prior research supporting a linkage between anxiety and/or depression and substance use disorders13,14,18,19. Cravings, for example, have been shown to be inducible by anxious or depressed mood states21,22. The full suite of video interventions with associated weeks of suggested viewing is available in Supplementary Table 1. In addition, the application included a user interface for self-tracking across time, driven by daily prompts on mood and anxiety. Key elements of the application interface, as well as EMA prompts delivered daily to participants, are shown in Fig. 1 and Supplementary Table 2. The EMA prompts were delivered in order to drive the application’s self-monitoring functionality and to test the feasibility of using daily EMA within a digital app-based intervention in a participant sample with co-occurring OUD. The EMA prompts were adapted from the Positive and Negative Affect Schedule-Expanded Form (PANAS-X) and earlier work by our research team75,76. EMAs were optional, and participants did not receive additional compensation for completion.

The application was designed to be engaging and interactive. First, the video interventions themselves generally followed the pattern of explaining an evidence-based psychotherapeutic skill before asking the participant to attempt an in-vivo exercise using the skill. For example, the “exposure” intervention explains the concept of graded exposure77 and then prompts the user to make their own list of anxiety-provoking scenarios prior to attempting exposure. Second, the application prompts the user daily with questions regarding mood and anxiety, and encourages the user to track their reported symptoms over time. Participants were also sent weekly email reminders suggesting video interventions, detailed in Supplementary Table 1. The suggested order was such that concepts built on one another, though participants did not need to view the videos in that order, nor did they need to complete one set of videos before they could move on to the next. To enhance the flexibility of the intervention, participants could view videos anytime, anywhere, at their discretion.

Compensation

Participants were compensated according to their individual completion of five UDS and three questionnaires. Compensation occurred at three separate time points: (i) at study qualification, (ii) after post-measure completion (week 4), and (iii) after follow-up completion (week 8). Participants were compensated $15 for each UDS and each questionnaire, for a total possible compensation of $120. Compensation was paid via online Amazon gift cards, which were delivered via email.

Measures

The severity of opioid use was measured using the validated 8-item self-report RODS51. The first item prompts participants to report whether they have used any opioid over their lifetime. If the participant responds affirmatively, they are directed to questions 2–8, which assess cognitive, behavioral, and physiological features associated with DSM-IV-defined opioid dependence. A sum score of 3 or higher on items 2–8 is used as the threshold for opioid dependence. The RODS has demonstrated a sensitivity and specificity of 0.97 and 0.76, respectively, and a positive predictive value and negative predictive value of 0.69 and 0.98, respectively51.
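Unlike sensitivity and specificity, the predictive values reported above depend on the prevalence of opioid dependence in the validation sample. The relationship can be sketched as follows; the prevalence of 0.355 used in the example is an illustrative assumption back-calculated from the four reported figures, not a value reported in the text.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity, specificity, prevalence):
    """Negative predictive value via Bayes' rule."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Assumed prevalence (hypothetical): roughly consistent with the reported
# sensitivity (0.97), specificity (0.76), PPV (0.69), and NPV (0.98).
assumed_prevalence = 0.355
rods_ppv = ppv(0.97, 0.76, assumed_prevalence)
rods_npv = npv(0.97, 0.76, assumed_prevalence)
```

In a sample where opioid dependence is more common, as would be expected among respondents recruited from opioid-related forums, the same sensitivity and specificity imply a higher PPV and a lower NPV.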

Anxiety severity was measured using the 14-item GAD-Q-IV measure78, a self-report instrument for GAD based on the DSM-IV criteria. Items 1–5 on the GAD-Q-IV assess the frequency, intensity, and perceived control of excessive worry, as well as topics of worry. Items 7–12 assess the DSM-IV-defined associated features of GAD over the last 6 months. Items 13–14 assess the degree of impairment and dysfunction associated with the symptoms. The GAD-Q-IV has a demonstrated sensitivity and specificity of 0.83 and 0.89, respectively, and exhibits robust convergent and discriminant validity, as well as test-retest reliability78. Kappa agreement with a structured interview diagnosis of GAD78 was 0.67.

Opioid cravings were measured using the 3-item OCS, a modified version of the robustly validated Cocaine Craving Scale79. The OCS includes three items on a 0–10 scale pertaining to the intensity of opioid cravings and the subjective likelihood of use in high-risk contexts. The OCS has demonstrated strong predictive validity, with 17% higher odds of using opioids for each 1-point increase in cravings80. Further, a generalized version of the cravings scale showed strong internal consistency (McDonald’s omega = 0.80) for opioids, as well as factor loadings >0.60 for all three items. The generalized craving scale also exhibits strong concurrent and discriminant validity52.
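To make the reported odds figure concrete, the sketch below converts a per-point odds ratio into the implied change over several points. It assumes that the 17% increase applies multiplicatively for each additional point, as in a logistic model with a linear craving term; the function names and the example starting probability are hypothetical.

```python
def odds_multiplier(points, or_per_point=1.17):
    """Odds ratio implied by a `points` increase in craving score."""
    return or_per_point ** points


def shifted_probability(p0, points, or_per_point=1.17):
    """Probability of use after shifting the odds from a starting value p0."""
    odds = p0 / (1.0 - p0) * odds_multiplier(points, or_per_point)
    return odds / (1.0 + odds)


# A 3-point increase multiplies the odds by 1.17 ** 3, about 1.60.
three_point_or = odds_multiplier(3)
```

An increase in odds is not the same as an increase in probability: the change in probability depends on the starting probability `p0`, which is why the second function takes it as an argument.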

The severity of depression was evaluated using the Patient Health Questionnaire for Depression (PHQ-9)49, a well-validated 9-item self-report instrument. The PHQ-9 prompts participants to assess their depressive symptoms over the preceding 2 weeks, using a Likert scale that ranges from 0 to 3, where 0 denotes “Not at all” and 3 denotes “Nearly every day”. Consequently, the total score ranges from 0 to 27, with a threshold of 10 often used to indicate MDD. The PHQ-9 shows robust construct and criterion validity; a summative score of 10 or higher has a sensitivity and specificity of 0.88 each when compared with an interview conducted by a mental health professional49.
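The PHQ-9 scoring rule described above can be written as a short sketch. The function names are illustrative; the item range (0 to 3), the number of items (9), and the threshold of 10 are taken from the text.

```python
def phq9_total(items):
    """Sum nine PHQ-9 item responses, each scored 0-3."""
    if len(items) != 9:
        raise ValueError("PHQ-9 requires exactly nine item responses")
    if any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("each PHQ-9 item must be scored 0, 1, 2, or 3")
    return sum(items)


def phq9_positive(items, threshold=10):
    """Return True if the total meets the screening threshold."""
    return phq9_total(items) >= threshold
```

For example, a respondent answering 1 on every item totals 9 and screens negative, whereas raising any single item to 2 brings the total to 10 and yields a positive screen.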

All study participants were asked to complete either a 12-panel drug screen [testing for tetrahydrocannabinol (THC), cocaine (COC), oxycodone (OXY), methylenedioxymethamphetamine (MDMA), buprenorphine (BUP), opioids (MOP), amphetamines (AMP), barbiturates (BAR), benzodiazepines (BZO), methamphetamine (MET), methadone (MTD), and phencyclidine (PCP)] or, if their state prohibited full-panel testing, a 6-panel drug screen (testing for AMP, BZO, COC, MOP, OXY, and THC). The MOP assay had sensitivity to morphine, codeine, hydrocodone, hydromorphone, morphine 3-β-D-glucuronide, 6-monoacetylmorphine, normorphone, oxycodone, oxymorphone, and thebaine. Participants were mailed a box containing five drug screens at baseline and asked to complete a total of five tests over the 8-week study period. These occurred at baseline and weeks 2, 4, 6, and 8. Participants were reminded of each drug screen and instructed to take a photograph of their UDS cup and upload it via a secure Qualtrics survey. UDS results were interpreted and coded by trained research assistants under the supervision of a medical doctor. A subset (67%) of drug screens was checked by two independent raters for quality assurance. Across the study population, within- and between-group differences in MTD- and BUP-positive UDS results over 4 weeks (baseline to post-study) and 8 weeks (baseline to follow-up) were analyzed to determine MOUD adherence.

Using a star-based rating system, scaled from 1 to 5 (1 = the worst, 5 = the best), we allowed participants to provide feedback on both their overall satisfaction with the app and their belief that the app, including the video interventions, could help them with their depressive and anxiety symptoms.

Statistical analyses

The dataset was first separated into two groups: the waitlist (control) group and the experiment (treatment) group. This separation allowed us to handle each group separately during the imputation process, ensuring that the imputation was conducted within the context of each group and affording a greater ability to capture the inherent variability within the respective groups. Missing data poses a challenge in statistical analyses, as it can lead to biased estimates if not addressed appropriately. In order to account for missing data, the multivariate imputation by chained equations (MICE) method via the mice package in R (v4.10) was employed. Broadly, this technique allows for the replacement of missing values with multiple sets of plausible values, reflecting the uncertainty around the true value. Accordingly, 30 estimated datasets were generated through MICE and were subsequently modeled in parallel.

Following the imputation process, we fit 30 robust linear mixed-effects models to the data for each outcome using the robustlmm package in R. Robust linear mixed-effects models offer several advantages over traditional linear mixed-effects models. Primarily, they are less sensitive to the presence of outliers or contamination in the data. This robustness helps our estimates remain reliable even in the face of extreme observations, which could otherwise skew the results. The fixed-effects part of our model included the baseline measure of the outcome variable (Baseline) as a control to look specifically at deviations from where participants started, the interaction between PrePost (before or after intervention) and Condition (control or treatment), as well as the interaction between PreFollowup (before or after follow-up) and Condition. These interactions allowed for the examination of the differential effects of the intervention over time, both between and within the two groups. The random effects accounted for participant (ID) variation in trajectories across time points. This random-effects structure made it possible to capture the inherent variability between subjects in their response to the intervention over time. Note that the intercept was also removed (−1) from both the fixed and random effects prior to modeling. To pool the results of these models across imputed datasets, the lmer_pool() function in the miceadds R package was applied after custom translation for use in robust linear mixed models. As described above, each robust linear mixed-effects model was of the following form:

$$outcome=-1+Baseline+PrePost\ast Condition+PreFollowup\ast Condition+\,(-1+Time|ID)$$
(1)
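The pooling step described above combines one estimate per imputed dataset into a single estimate and standard error. The standard approach for multiply imputed data is Rubin's rules, sketched below for a single coefficient. That the pooling function used in this study applies exactly these formulas is an assumption, and the function name `pool_rubin` is illustrative.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: the coefficient estimate from each imputed dataset
    variances: the squared standard error of that estimate in each dataset
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)       # average sampling variance
    between = statistics.variance(estimates)   # variance across imputations
    total = within + (1.0 + 1.0 / m) * between
    return pooled, math.sqrt(total)
```

The between-imputation term is what carries the uncertainty due to missing data: if all 30 imputed datasets gave the same estimate, the pooled standard error would reduce to the average within-dataset standard error.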

For all passive sensing data (15 features; see Supplementary File 1), we fit 30 generalized additive mixed models for each outcome of interest using the mgcv package in R. Each participant (ID) and their time since the study started (StudyTime) were modeled as fixed and random effects. The default thin plate regression spline (bs = “tp”) was used for StudyTime, while a factor smooth (bs = “fs”) was used for the interaction of StudyTime and ID. The ID variable was also treated as a random effect (bs = “re”). To pool results across imputed datasets, the lmer_pool() function in the miceadds R package was applied. The structure of each generalized additive mixed model took the following form:

$$outcome \sim s(StudyTime,bs= {\mbox{``}} tp{\mbox{''}})+s(ID,bs= {\mbox{``}} re{\mbox{''}})+s(StudyTime,ID,bs= {\mbox{``}} fs{\mbox{''}})$$
(2)

After modeling, Cohen’s d effect sizes for the contrasts of interest, including between-group differences at Pre–Post and Pre–Followup, as well as within-group differences at Pre–Post and Pre–Followup, were calculated following Judd et al.81. These effect sizes provided a standardized measure of the magnitude of the intervention’s effect on the outcome variable, enabling us to assess the clinical or practical significance of the results.
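For orientation, the classic form of Cohen's d for two independent groups is sketched below. This is the textbook pooled-standard-deviation version, not the mixed-model formulation of Judd et al., whose exact computation is not reproduced in the text; the function name and example values are illustrative.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical change scores for two small groups.
d = cohens_d([2, 4, 6], [1, 3, 5])
```

In a mixed-model setting the denominator is typically built from the model's estimated variance components rather than from raw group variances, so values from this sketch are not directly comparable to those reported in the study.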