## Introduction

The number of people with diabetes worldwide is nearly 500 million and is projected to grow dramatically in the coming decades1. Most of these individuals have type 2 diabetes, a chronic metabolic disease characterized by insulin resistance and hyperglycemia2.

Chronic inflammation is a critical facet of type 2 diabetes3,4. The NLRP3 inflammasome, a multimeric protein complex, is implicated as a key driver of type 2 diabetes5,6. Activation of this inflammasome occurs in response to multiple danger-associated molecular patterns including several molecules involved in the pathogenesis of type 2 diabetes: glucose, islet amyloid polypeptide, free fatty acids, and mitochondrial reactive oxygen species7,8, and leads to production of the mature forms of the proinflammatory cytokines IL-1β and IL-189. In animal models, inflammasome inhibition protects against insulin resistance6,10,11,12,13. Importantly, inflammasome activation is observed in circulating cells and adipose tissue of patients with insulin resistance6,14,15, and plasma concentrations of IL-1β and IL-18 are elevated in patients with type 2 diabetes and predict development of this disease16,17.

One of the activators of this inflammasome is RNA derived from Alu mobile genetic elements18. Alu RNAs have recently been implicated in Alzheimer’s disease19, macular degeneration18,20,21, and systemic lupus erythematosus22. Elevated levels of Alu RNA and inflammasome activation in macular degeneration result from reduced levels of the enzyme DICER1, one of whose metabolic functions is to catabolize Alu RNAs20,23,24. Interestingly, DICER1 levels are decreased in circulating cells in patients with type 2 diabetes25,26, and deletion of DICER1 in adipose or pancreatic islet beta cells triggers insulin resistance and diabetes in mice27,28,29,30. Given the expanding array of human disorders in which mobile genetic elements are recognized to play a pathogenic role31, and because DICER1 and inflammasome activation are implicated in diabetes, we explore whether Alu RNAs might be dysregulated in this disease as well.

Alu mobile elements propagate themselves by hijacking the endogenous reverse-transcriptase LINE-131,32. Notably, nucleoside reverse-transcriptase inhibitors (NRTIs), drugs that are used to treat HIV-1 and hepatitis B infections, inhibit not only viral reverse-transcriptases but also LINE-1 reverse-transcriptase activity33,34. NRTIs also block inflammasome activation by Alu RNAs and other stimuli, independent of their ability to block reverse-transcriptase35. Therefore, we seek to determine whether, among patients with HIV-1 or hepatitis B, there is a relation between exposure to NRTIs and development of type 2 diabetes.

Here we analyze health insurance claims databases to examine the link between NRTI use and incident type 2 diabetes. We also investigate the effect of the NRTI lamivudine on insulin resistance in diabetic human adipocytes and myocytes and in a mouse model of type 2 diabetes to obtain experimental evidence in support of the clinical observations. We find that NRTI exposure is associated with reduced development of type 2 diabetes in people and that lamivudine inhibits inflammasome activation and improves insulin sensitivity in experimental systems. These data suggest the possibility of either repurposing this approved class of drugs or exploring less toxic modified NRTIs for treating prediabetes or diabetes.

## Results

### NRTIs associated with reduced hazard of developing diabetes

We examined associations between exposure to NRTIs (a list of specific medications is in Supplementary Table 1) and subsequent development of type 2 diabetes in the Veterans Health Administration, the largest integrated healthcare system in the United States, over a 17-year period. To confirm these main findings, we then studied four other health insurance databases comprising diverse populations. In each of the five cohorts, which together included 128,861 patients (Supplementary Fig. 1), we determined the association between NRTI use and the hazard of developing type 2 diabetes after adjustment for sociodemographic factors, overall health, comorbidities (a list of specific disease codes is in Supplementary Table 2), and use of medications that are known to alter risk of diabetes development.

In our main analysis of the Veterans Health Administration database, which comprises predominantly men, 79,744 patients with confirmed diagnoses of HIV or hepatitis B and without a prior diagnosis of type 2 diabetes were identified (Supplementary Table 3). Of this group (baseline characteristics in Supplementary Table 4), 12,311 patients developed incident type 2 diabetes. After adjustment for potential confounders, users of NRTIs had 34% reduced hazard of developing type 2 diabetes (hazard ratio, 0.665; 95% CI, 0.625 to 0.708; P < 0.0001) (Fig. 1, Supplementary Table 5).

In the Truven database, which comprises employer-based health insurance claims, 23,634 patients with confirmed diagnoses of HIV or hepatitis B and without a prior diagnosis of type 2 diabetes were identified (Supplementary Table 3). Of this group (baseline characteristics in Supplementary Table 6), 1630 patients developed incident type 2 diabetes. After adjustment for potential confounders, users of NRTIs had 39% reduced hazard of developing type 2 diabetes (hazard ratio, 0.614; 95% CI, 0.524–0.718; P < 0.0001) (Fig. 1, Supplementary Table 7).

In the PearlDiver database, which comprises predominantly private health insurance claims, 16,045 patients with confirmed diagnoses of HIV or hepatitis B and without a prior diagnosis of type 2 diabetes were identified (Supplementary Table 3). Of this group (baseline characteristics in Supplementary Table 8), 1068 patients developed incident type 2 diabetes. After adjustment for potential confounders, users of NRTIs had 26% reduced hazard of developing type 2 diabetes (hazard ratio, 0.738; 95% CI, 0.600–0.908; P = 0.004) (Fig. 1, Supplementary Table 9).

In the Medicare 20% sample (primarily men and women 65 years or older), 3,097 patients with confirmed diagnoses of HIV or hepatitis B and without a prior diagnosis of type 2 diabetes were identified (Supplementary Table 3). Of this group (baseline characteristics in Supplementary Table 10), 707 patients developed incident type 2 diabetes. After adjustment for potential confounders, users of NRTIs had a 17% reduced hazard of developing type 2 diabetes (hazard ratio, 0.828; 95% CI 0.646–1.062; P = 0.137) (Fig. 1, Supplementary Table 11).

In the Clinformatics dataset, which comprises predominantly commercial health insurance claims, we identified 6341 patients with confirmed diagnoses of HIV or hepatitis B and without a prior diagnosis of type 2 diabetes (Supplementary Table 3). Of this group (baseline characteristics in Supplementary Table 12), 1067 patients developed incident type 2 diabetes. After adjustment for potential confounders, users of NRTIs had 27% reduced hazard of developing type 2 diabetes (hazard ratio, 0.727; 95% CI, 0.572–0.924; P = 0.009) (Fig. 1, Supplementary Table 13).

Given the low proportion of the observed variance among the five studies that could be attributed to heterogeneity (I² = 0.0012%; 95% CI, 0.0–93.6%; P = 0.26, test of heterogeneity), summary risks were calculated using both fixed-effect and random-effects models, which yielded identical estimates and confidence intervals because of the extremely low estimate of heterogeneity (τ² < 0.0001). Collectively, among 128,861 patients with HIV-1 or hepatitis B, users of NRTIs had 33% reduced hazard of developing type 2 diabetes (adjusted hazard ratio, 0.673; 95% CI, 0.638–0.710; P < 0.0001, test of overall effect; 95% prediction interval, 0.618–0.734) (Fig. 1).
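As an illustration of how such a summary estimate can be formed, the following minimal sketch pools the five database-specific hazard ratios reported above with generic inverse-variance (fixed-effect) weighting on the log scale. The standard errors are back-calculated from the published 95% confidence intervals, which is an assumption made here for illustration; the function and variable names are hypothetical, and this is not necessarily the software or estimator used in the study.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal distribution

# (hazard ratio, lower 95% CI, upper 95% CI) as reported in the text above
STUDIES = {
    "Veterans": (0.665, 0.625, 0.708),
    "Truven": (0.614, 0.524, 0.718),
    "PearlDiver": (0.738, 0.600, 0.908),
    "Medicare": (0.828, 0.646, 1.062),
    "Clinformatics": (0.727, 0.572, 0.924),
}


def fixed_effect_pool(studies):
    """Inverse-variance pooling of log hazard ratios.

    Assumption: each study's standard error is recovered from the width
    of its reported 95% CI on the log scale.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for hr, lo, hi in studies.values():
        se = (math.log(hi) - math.log(lo)) / (2 * Z95)
        weight = 1.0 / se**2
        weighted_sum += weight * math.log(hr)
        total_weight += weight
    mu = weighted_sum / total_weight          # pooled log hazard ratio
    se_mu = math.sqrt(1.0 / total_weight)     # standard error of the pooled estimate
    return (
        math.exp(mu),
        math.exp(mu - Z95 * se_mu),
        math.exp(mu + Z95 * se_mu),
    )


pooled_hr, pooled_lo, pooled_hi = fixed_effect_pool(STUDIES)
print(f"pooled HR {pooled_hr:.3f} (95% CI {pooled_lo:.3f}-{pooled_hi:.3f})")
```

Under these assumptions the sketch gives a pooled hazard ratio close to the reported 0.673 (0.638–0.710), with the Veterans cohort carrying most of the weight because of its narrow confidence interval.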

### Bayesian meta-analysis

To complement this classical “frequentist” approach to meta-analysis, we performed a Bayesian meta-analysis using a random-effects normal-normal hierarchical model, which accounts for uncertainty in the estimation of the between-study variance. We used a weakly informative half-Cauchy prior distribution for between-study variability (τ) with the assumption that it was unlikely for the between-study hazard ratios to vary by more than 3-fold (scale = 0.280). In this model, collectively, among the patients with HIV-1 or hepatitis B in the five databases, users of NRTIs had a 32% median reduced hazard of developing type 2 diabetes (adjusted hazard ratio, 0.685; 95% credible interval, 0.610 to 0.794; P(HR > 1) = 0.0008, posterior probability of a non-beneficial effect) (Fig. 2). We performed a sensitivity analysis of these results to the choice of the prior by assuming that it was unlikely for the between-study hazard ratios to vary by more than 10-fold (scale = 0.587). The posterior distribution was quite robust to changes in the scale as the summary effect was remarkably insensitive to the choice of the prior: applying this model, users of NRTIs had a 31% median reduced hazard of developing type 2 diabetes (adjusted hazard ratio, 0.686; 95% credible interval, 0.604 to 0.809; P(HR > 1) = 0.0017, posterior probability of a non-beneficial effect) (Fig. 2). In both models, the estimate of heterogeneity (τ²) was low (0.005–0.006). These Bayesian meta-analyses yielded estimates that were qualitatively similar and directionally identical to the frequentist meta-analyses.
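The two prior scales quoted above are numerically consistent with one simple reading, sketched below: if study-specific log hazard ratios spread over a 95% range of width 2 × 1.96 × τ, then a k-fold range of hazard ratios corresponds to τ = ln(k) / 3.92. This interpretation is an assumption inferred from the reported values rather than a derivation stated in the text, and the function name is hypothetical.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def tau_scale_for_fold(fold):
    """Between-study SD (log-HR scale) whose 95% range spans `fold`-fold.

    Assumption: "vary by more than k-fold" is read as the ratio between
    the 97.5th and 2.5th percentiles of study-specific hazard ratios.
    """
    return math.log(fold) / (2 * Z95)


print(round(tau_scale_for_fold(3), 3))   # scale for the 3-fold prior
print(round(tau_scale_for_fold(10), 3))  # scale for the 10-fold prior
```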

### Sensitivity analyses

Next we performed three types of sensitivity analyses in the main Veterans cohort. First, a single hazard ratio averaged over the entire study duration may not necessarily reflect a robust measure of the exposure effect because the hazard ratio may change over time36. Therefore, we computed the average hazard ratios after 1, 2, 5, and 10 years of follow-up. After adjustment for potential confounders, users of NRTIs had a reduced hazard of developing type 2 diabetes over each of these time periods (Supplementary Table 14). In addition, we plotted survival curves adjusted for baseline confounders (Supplementary Fig. 2). None of the period-specific hazard ratios crossed 1.0 nor did the adjusted survival curves for the NRTI-exposed and NRTI-unexposed groups cross one another, suggesting that the protective association of NRTI use against the development of type 2 diabetes was maintained throughout the study period.

Second, in chronic diseases such as type 2 diabetes, the competing risk of death can preclude the diagnosis of diabetes. Therefore, we performed a competing risk regression analysis, which was possible in the Veterans cohort as it contains comprehensive mortality data37, but not in the other four databases we studied as they do not provide mortality information. The follow-up duration and mortality rates were comparable between NRTI users and NRTI non-users in the Veterans cohort (Supplementary Table 15). Among these 79,744 patients (who account for the majority of the patients in all 5 databases), using a competing risk of mortality analysis and using the same list of covariates as the primary analysis, NRTI use was associated with a 27% reduced risk of incident type 2 diabetes (adjusted subdistribution hazard ratio, 0.727; 95% CI, 0.683–0.775; P < 0.0001) (Supplementary Table 16). These data, which are similar to risk reduction observed in the primary analysis, suggest that the differential mortality rates are not responsible for the observed risk reduction of incident type 2 diabetes among NRTI users in the Veterans cohort.

Third, as the NRTI exposure prevalence was markedly different between HIV-positive and hepatitis B-positive persons (Supplementary Tables 17–21), we performed another sensitivity analysis by analyzing these populations separately. The proportions of HIV-positive patients who did not have a record of NRTI exposure (VA – 34%; Truven – 1%; PearlDiver – 17%; Medicare – 19%; Clinformatics – 24%) were similar to previously reported rates38,39. Because this subgroup analysis markedly reduces the sample sizes, we again studied the largest Veterans cohort. Among HIV-positive hepatitis B-negative individuals, NRTI use was associated with a 38% reduced risk of incident type 2 diabetes (adjusted hazard ratio, 0.621; 95% CI, 0.562–0.685; P < 0.0001). Among hepatitis B-positive HIV-negative individuals, NRTI use was associated with a 28% reduced risk of incident type 2 diabetes (adjusted hazard ratio, 0.717; 95% CI, 0.656–0.783; P < 0.0001). Collectively, these analyses (Supplementary Tables 22 and 23) suggest that NRTI exposure is beneficial in reducing incident type 2 diabetes risk among HIV-positive as well as hepatitis B-positive individuals in the Veterans cohort.

### Continuous exposure modeling

Next, we studied NRTI exposure as a continuous rather than categorical covariate by estimating the hazard of developing type 2 diabetes per year of NRTI exposure. This approach addresses, in part, the issue of allocation bias. By focusing inferences on how the outcome of incident type 2 diabetes depends on cumulative exposure to NRTIs, this approach also provides insight into potential dose-response effects. In each of the five databases, there was a reduced hazard of developing type 2 diabetes with each increasing year of NRTI exposure (Supplementary Tables 24–28). Summary risks were calculated by performing meta-analyses using both fixed-effect and random-effects models. Collectively, among the patients with HIV-1 or hepatitis B in the five databases, users of NRTIs had 3–8% reduced hazard of developing type 2 diabetes with each additional year of use (fixed-effect: adjusted hazard ratio per year of NRTI exposure, 0.974; 95% CI, 0.967 to 0.980; P < 0.0001, test of overall effect; random-effects: adjusted hazard ratio, 0.922; 95% CI, 0.872–0.976; P = 0.005, test of overall effect) (Supplementary Fig. 3).

In contrast, we did not observe any consistent association across the five databases between incident development of type 2 diabetes and exposure to three other drug classes used to treat persons with HIV-1 infection: non-nucleoside reverse transcriptase inhibitors, protease inhibitors, or integrase inhibitors (Supplementary Tables 24–28).
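To give a sense of scale for the per-year hazard ratios reported for NRTIs above, the sketch below compounds a per-year estimate over several years. It assumes the log-linear dose-response form implied by entering years of exposure as a single continuous covariate in a Cox model, so that the hazard ratio for n years is the per-year ratio raised to the power n. The function name is hypothetical, and the multi-year figures are illustrative extrapolations rather than results reported in this study.

```python
def cumulative_hr(per_year_hr, years):
    """Hazard ratio implied for `years` of exposure.

    Assumption: the effect is log-linear in exposure time, so per-year
    hazard ratios multiply across years.
    """
    return per_year_hr ** years


for label, hr in (("fixed-effect", 0.974), ("random-effects", 0.922)):
    five_year = cumulative_hr(hr, 5)
    print(f"{label}: 5-year HR {five_year:.3f} "
          f"({(1 - five_year) * 100:.0f}% reduced hazard)")
```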

### Falsification testing

To test for residual confounding, we conducted falsification tests using the two outcomes of appendicitis and hernia, which were not anticipated to be associated with NRTI exposure, among patients with confirmed diagnoses of HIV or hepatitis B and without a prior diagnosis of these outcomes. NRTI use was not associated with reduced hazard of developing incident appendicitis (Supplementary Fig. 4) or hernia (Supplementary Fig. 5) in any of the five databases individually or in pooled fixed-effect or random-effects model meta-analyses.

### Propensity score matching analysis

As assignment to NRTI treatment was not randomized, differences in incident diabetes might result from different characteristics of the treatment groups rather than NRTI usage itself. Therefore, we used propensity-score matching to assemble cohorts of patients with similar baseline characteristics and thereby reduced possible bias in estimating treatment effects. Because this procedure markedly reduces the original patient sample size, we confined these analyses to the three largest databases: Veterans Health Administration, Truven Marketscan, and PearlDiver databases. In the Veterans database, 9057 patients who had NRTI exposure were matched with 9057 patients who did not have NRTI exposure. In the Truven database, 4343 patients who had NRTI exposure were matched with 4343 patients who did not have NRTI exposure. In the PearlDiver database, 2153 patients who had NRTI exposure were matched with 2153 patients who did not have NRTI exposure. To further control for any residual covariate imbalance, we adjusted for all of the sociodemographic factors, overall health, comorbidities, and use of medications known to alter risk of diabetes development that were employed for the original unmatched group analyses. In all three databases, after adjustment for potential confounders, users of NRTIs had a reduced hazard of developing type 2 diabetes (Veterans: hazard ratio, 0.711; 95% CI, 0.649–0.778; P < 0.0001; Truven: hazard ratio, 0.645; 95% CI, 0.536–0.776; P < 0.0001; PearlDiver: hazard ratio, 0.747; 95% CI, 0.609–0.918; P = 0.005) (Fig. 1). We also estimated hazard ratios as a function of per year of NRTI exposure in the propensity-score matched groups. 
In all three databases, after adjustment for potential confounders, users of NRTIs had a reduced hazard of developing type 2 diabetes with each additional year of use (Veterans: hazard ratio, 0.979; 95% CI, 0.959–0.999; P = 0.042; Truven: hazard ratio, 0.926; 95% CI, 0.857–0.999; P = 0.049; PearlDiver: hazard ratio, 0.830; 95% CI, 0.719–0.958; P = 0.01) (Supplementary Tables 29–31 and Supplementary Figs. 6–11). The small differences in the hazard estimates between the unmatched and propensity-score-matched analyses suggest that the residual bias in the unmatched analyses is likely to be small.
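The matching step described above can be pictured with the following minimal sketch of 1:1 greedy nearest-neighbor matching without replacement on a precomputed propensity score. The matching algorithm, the caliper value, and all names here are assumptions chosen for illustration; the text above does not specify which matching procedure was used.

```python
def greedy_match(treated, controls, caliper=0.05):
    """Pair each treated patient with the closest unused control.

    treated, controls: dicts mapping patient id -> propensity score.
    caliper: maximum allowed score difference (hypothetical value).
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # controls not yet used in a pair
    pairs = []
    # Process treated patients in order of increasing propensity score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # matching without replacement
    return pairs


example_treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
example_controls = {"c1": 0.28, "c2": 0.60, "c3": 0.33, "c4": 0.10}
print(greedy_match(example_treated, example_controls))
```

In this toy example the third treated patient has no control within the caliper and is left unmatched, which is the mechanism by which matching shrinks the analyzed sample.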

### NRTI reduces insulin resistance and inflammasome activation

Next we investigated one potential activator of the NLRP3 inflammasome: RNA derived from Alu mobile genetic elements18, which have been implicated in other human diseases such as Alzheimer’s disease19, macular degeneration18,20,21, and systemic lupus erythematosus22. In macular degeneration, elevated levels of Alu RNA and inflammasome activation result from reduced levels of the enzyme DICER1, one of whose metabolic functions is to catabolize Alu RNAs20,23,24. DICER1 levels are decreased in circulating cells in patients with type 2 diabetes25,26, and deletion of DICER1 in adipose or pancreatic islet beta cells triggers insulin resistance and diabetes in mice27,28,29,30. Given that many human disorders are associated with pathogenic mobile genetic elements31, and because DICER1 and inflammasome activation are implicated in diabetes, we explored whether Alu RNAs might be dysregulated in this disease as well.

Primary cells isolated from the adipose or skeletal muscle tissues of type 2 diabetes patients expressed lower levels of DICER1 protein (Fig. 3a, b) and higher levels of Alu RNA (Fig. 3c, d) compared with the cells of nondiabetic individuals. Insulin-induced glucose uptake into cells was impaired in diabetic adipocytes and myocytes; this resistance to insulin was reversed by lamivudine treatment (Fig. 4a, b). Insulin resistance was induced in nondiabetic adipocytes and myocytes by either TNF exposure or treatment with high glucose and high insulin; lamivudine prevented insulin resistance induced in both these models (Fig. 4a, b). Phosphorylation of the protein kinase AKT in response to insulin stimulation, a key signaling event in insulin-dependent glucose uptake40, was impaired in diabetic adipocytes and myocytes; this resistance to insulin was reversed by lamivudine treatment (Supplementary Fig. 12). Lamivudine also restored AKT phosphorylation in nondiabetic adipocytes and myocytes rendered insulin resistant by TNF treatment (Supplementary Fig. 12). These data suggest that lamivudine might ameliorate insulin resistance in part via AKT-dependent pathways.

Since NRTIs as a class exhibit inflammasome-inhibitory effects35, we explored whether other members of this drug class also exerted similar effects. We found that, similar to lamivudine, both azidothymidine and stavudine exerted beneficial effects on insulin-induced AKT phosphorylation in diabetic adipocytes and in nondiabetic adipocytes rendered insulin resistant by TNF treatment (Supplementary Fig. 13). At the doses tested, none of the three NRTI drugs had a deleterious effect on cell viability (Supplementary Fig. 14). Collectively, these data suggest that several NRTI drugs exhibit class effects in ameliorating insulin resistance.

High-fat diet-fed mice are used to model impaired glucose tolerance and type 2 diabetes41. In mice raised on a high-fat diet for 8 weeks, we measured higher RNA levels of B2, a rodent Alu-like mobile genetic element, and lower protein levels of DICER1 in their adipose and muscle tissues compared to regular diet-fed mice (Fig. 5a, b). Glucose tolerance and insulin sensitivity in high-fat diet-fed mice, as monitored by glucose tolerance tests and insulin tolerance tests, respectively, were improved by once-daily intraperitoneal administration of lamivudine (Fig. 5c, d). Insulin stimulation of AKT phosphorylation was impaired in the subcutaneous and visceral adipose tissues and skeletal muscle of high-fat diet-fed mice; lamivudine-treated mice retained the activity of this insulin signaling pathway (Supplementary Fig. 15). Protein levels of IL-1β or IL-18, which are products of inflammasome activation, were elevated in the subcutaneous and visceral adipose tissue and skeletal muscle of high-fat diet-fed mice; lamivudine treatment inhibited the increase in these cytokine levels (Supplementary Fig. 16). Of note, lamivudine did not alter high-fat diet-induced gain in body-weight (Supplementary Fig. 17), indicating that its salutary effects were not due to weight reduction. Collectively, these data suggest that lamivudine increased sensitivity to endogenous insulin and reduced inflammasome activation in the context of a high-fat diet.

## Discussion

We identify an association between exposure to NRTIs and lower rates of development of type 2 diabetes among persons with HIV-1 or hepatitis B infection. We also present biochemical evidence that the NRTI lamivudine restores insulin sensitivity in type 2 diabetic human cells and prevents induction of insulin resistance in non-diabetic human cells. At doses allometrically scaled to those used in humans, lamivudine improves glucose tolerance and insulin sensitivity and reduces inflammasome activation in high-fat diet-fed mice. These investigations of human cell, mouse, and population database systems collectively suggest a potential beneficial effect of NRTIs in forestalling diabetes onset.

In the main analysis of NRTI exposure versus incident type 2 diabetes risk, the pooled summary estimate of the adjusted hazard ratio across the five databases (0.673) and confidence interval (0.638–0.710) provide information on how well we have determined the mean effect. In contrast, the prediction interval (0.618–0.734) illuminates the range of true effects that can be expected in future settings by providing an estimate of an interval in which a future observation, e.g., the result of a future clinical trial, will fall42. From this prediction interval, we infer that the probability that a future study, e.g., a clinical trial, observes a beneficial effect of NRTIs (i.e., hazard ratio <1.0) is 99.99% (calculations in Methods). Similarly, we calculate that there is a 95% probability that such a future study will observe a hazard ratio of less than 0.713, i.e., a reduced risk of at least 29%. Likewise, we estimate a 50% probability that a future study would observe a hazard ratio of less than 0.673 (a reduced risk of at least 33%). However, such inferences are only valid in settings that are exchangeable, i.e., similar, to those on which our meta-analysis is based.
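The relation between the confidence interval and the wider prediction interval can be sketched with a commonly used formula, in which the pooled log hazard ratio is bracketed by a Student's t quantile with k − 2 degrees of freedom times the square root of the between-study variance plus the squared standard error. The sketch below assumes τ² ≈ 0 (in line with the reported τ² < 0.0001), recovers the standard error from the published confidence interval, and hard-codes the t quantile for k = 5 studies; the names are hypothetical and the authors' own calculations are described in their Methods.

```python
import math

Z95 = 1.959964       # two-sided 95% quantile of the standard normal distribution
T975_DF3 = 3.182446  # 97.5th percentile of Student's t with 3 df (k - 2, k = 5)


def prediction_interval(hr, ci_lo, ci_hi, tau2=0.0, t_crit=T975_DF3):
    """Approximate 95% prediction interval for the effect in a new study.

    Assumptions: SE of the pooled log HR is back-calculated from its
    95% CI; tau2 is the between-study variance (taken as 0 here).
    """
    mu = math.log(hr)
    se = (math.log(ci_hi) - math.log(ci_lo)) / (2 * Z95)
    half_width = t_crit * math.sqrt(tau2 + se**2)
    return math.exp(mu - half_width), math.exp(mu + half_width)


pi_lo, pi_hi = prediction_interval(0.673, 0.638, 0.710)
print(f"95% prediction interval: {pi_lo:.3f}-{pi_hi:.3f}")
```

With these inputs the sketch lands close to the reported prediction interval of 0.618–0.734; small differences are expected because the inputs are rounded to three decimals.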

Repurposing of existing drugs is an urgent priority for revitalizing, accelerating, and optimizing drug development43. Not all NRTIs are suitable candidates for repurposing. First-generation NRTIs such as stavudine and didanosine are more toxic than subsequently deployed NRTIs (lamivudine, emtricitabine, tenofovir), induce mitochondrial toxicity and lipodystrophy, and are associated with induction of insulin resistance and increased risk of type 2 diabetes in HIV-infected individuals, particularly when combined with PIs44,45,46,47. In the five insurance databases we analyze, over the time-periods studied (2000–2017), stavudine and didanosine use ranges from 3–9%, thus limiting the likelihood that their use significantly influences rates of diabetes development in these populations. The association of NRTI use with reduced incident diabetes in the cohorts we study might also reflect the much larger populations we study as well as our inclusion of multiple comorbidities and concomitant medication use that are known to affect the development of diabetes.

Limitations to our observational study include limitations intrinsic to all health insurance claims database analyses, particularly proper documentation and coding48,49. However, collectively these five databases encompass 150 million patient lives spanning a multidimensionally diverse array of populations in terms of age, gender, geography, race, and time-period. A notable strength of our study is that our clinical findings are replicated in five independent and geographically dispersed cohorts that collectively represent the majority of adults with health insurance in the United States. In addition, the results of the Bayesian meta-analysis approach, which has advantages in terms of accounting for parameter uncertainty50 and the generation of credible intervals that account for the prior distribution51, parallel those of the frequentist meta-analysis, thereby increasing confidence in the main finding that NRTI exposure is associated with a reduced risk of incident type 2 diabetes.

In addition to the main analysis, which treats exposure to NRTIs as a binary covariate (ever versus never), our analysis of NRTI exposure as a continuous cumulative covariate (per year of exposure) also shows an association with reduced incident type 2 diabetes. By focusing inferences on how the outcome of incident type 2 diabetes depends on cumulative exposure to NRTIs, this approach deals, in part, with fixed between-person confounding that may result from unmeasured confounders that result in ever-users and never-users of NRTIs having differential susceptibility to the outcome of diabetes for reasons independent of NRTI use. However, analyses of cumulative NRTI exposure may be affected by time-varying risk factor confounding; thus, estimates provided by the Cox models for the time-updated variable of years of NRTI exposure could be biased.

We study an extensive number of demographic variables, comorbid conditions, concomitant medication use, and laboratory tests that are known risk factors for development of diabetes by including them as fixed risk factor covariates whose values are considered at baseline (index date). Several of these risk factors could have changed with time; however, we do not consider their time-dependent variance for several reasons. First, the availability of information on many of the variables, e.g., body mass index or CD4+ counts, is not uniform over the entire follow-up duration. Second, it is not altogether clear how the time-dependent variance of several of these variables, e.g., body mass index or concomitant medication use, impacts ongoing risk of incident type 2 diabetes, because of complex, non-linear effects and unknowable biological interactions. Third, certain risk factors have greater short-term effects whereas others have greater long-term effects on chronic disease outcomes52, and often these are ill-defined. Use of time-dependent covariates also runs the risk that the value of a covariate during follow-up could change as a result of risk factors being studied52. Nevertheless, we acknowledge that our use of risk factors as covariates fixed at baseline could understate or overstate their influence on the association of NRTI use with development of diabetes.

Another limitation of insurance claims analyses is the non-availability of information on diet, physical activity, and stress, all of which influence development of diabetes. We do, however, control for numerous comorbidities, medications, and laboratory abnormalities known to influence rates of development of diabetes. Despite covariate adjustment for a large number of relevant confounders and performing robust propensity score matching, we cannot rule out the possibility of selection bias or residual confounding. However, the likelihood that unmeasured confounding accounts for identification of an association of NRTI exposure with incident diagnoses of diabetes is diminished by the observation that exposures to NNRTIs, PIs, or INSTIs, which serve as negative controls for medication usage53, are not associated with reduced incident type 2 diabetes. In addition, the results of propensity score matching and falsification testing (which detects confounding, selection bias and measurement error)53,54,55 both increase the internal validity of the conclusions of our main analyses. We also note that the impact of loss to follow-up due to mortality is assessed only in the Veterans cohort and not in the other databases. As our analyses are restricted to patients with HIV-1 or hepatitis B, our results might not be applicable to other populations. However, the protective effects of lamivudine are evident in human cells and mice not infected with these viruses. Therefore, the protective effects of NRTIs, which, as a class, block inflammasome activation in other models of non-infectious disease35,56, might extend beyond the setting of viral infections.

All statistical modeling approaches, including the many we employ, are subject to inherent assumptions and limitations. Analyzing nonlinear covariate effects as well as more complex interactions could potentially provide better model fits in the individual datasets. Propensity score-based methods such as propensity score matching, which we employ, are widely used to draw causal inferences from observational studies57,58,59,60. Alternative causal reasoning frameworks61,62 and introducing synthetic positive controls63 offer additional approaches to analyzing observational data, and could be worthwhile exploring in future studies. Ultimately, prospective randomized trials can provide the best insights into causality.

In addition to providing clinical evidence supportive of the inflammasome hypothesis of type 2 diabetes, we introduce the concept that perturbation in the homeostasis of the DICER1-Alu RNA regulatory axis could be involved in triggering this aging-associated disease, as dysregulation of the DICER1-Alu/B2 RNA pathway is evident in adipose and muscle cells of type 2 diabetic humans and of high-fat diet-fed mice. Our findings expand the spectrum of pathologies potentially triggered by Alu, the most successful of human genomic parasites. Additional mechanistic and phenotypic studies of NRTI treatment in various animal models of type 2 diabetes would enhance confidence in a therapeutic effect.

More recently developed NRTIs are well tolerated and are associated with lower adverse event rates64. Randomized trials of lamivudine monotherapy in adults and children with hepatitis B65,66,67 and of other current-generation NRTIs in non-HIV-infected individuals68,69 have safety profiles that are similar to placebo treatment. However, lamivudine, although less toxic than its predecessors, is associated with development of rare adverse events such as lactic acidosis, hepatomegaly and steatosis, particularly when used in combination with other more toxic anti-retroviral drugs administered to sicker HIV patients in earlier eras, and in children70. Regulatory agency labels for NRTIs contain warnings of lactic acidosis, although it should be noted that currently-approved anti-diabetic medications such as metformin also carry these warnings. Nevertheless, it is prudent to explore in prospective trials whether modified NRTIs known as Kamuvudines, which retain the ability to inhibit inflammasome activation but lack attendant toxicities35, represent better candidates for treating prediabetes or diabetes.

Finally, we stress that despite our “triangulation”71 by integrating interlocking evidence from multiple approaches72 such as health insurance claims analyses performed on different cohorts by different investigators, cell culture studies, and animal models, we caution against advocating use of NRTIs in prediabetes or diabetes in the absence of prospective randomized clinical trials. Cost-benefit analyses of the future utilization of NRTIs following prospective evaluation should include consideration of their potential for inducing viral resistance as well as how they compare to certain diets and exercise regimens, which can benefit individuals with prediabetes73,74,75.

## Methods

### Data sources

Data were evaluated from five health insurance claims databases: U.S. Veterans Health Administration database (which includes health care claims information extracted from the VA Informatics and Computing Infrastructure) for the years 2000–2017; Truven Marketscan, which includes employer-based health insurance claims for the years 2006–2017; PearlDiver, which includes health care claims for persons in the Humana managed care network for the years 2007–2017; a random 20% sample of Medicare beneficiaries with Parts A, B, and D coverage for the years 2008–2016; and Clinformatics DataMart database (OptumInsight), which captures health care claims for persons in a large nationwide managed care network for the years 2001–2016. Disease-specific diagnoses using codes from the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) were evaluated. For the VA and Truven databases, codes from the 10th Revision, Clinical Modification (ICD-10-CM) (Supplementary Methods) were also evaluated using a cross-walk between ICD-10-CM and ICD-9-CM codes.

### Study population

Patients were included in these analyses if they had at least two medical claims for HIV/AIDS or hepatitis B during study dates, and were excluded if they had pre-existing diabetes, defined as at least one ICD-9-CM/ICD-10-CM diagnosis of diabetes prior to their ICD-9-CM/ICD-10-CM diagnosis of HIV/AIDS or hepatitis B. Baseline participant characteristics, described by means and standard deviations for continuous variables and frequencies and percentages for categorical variables, are presented in Supplementary Table 3.

### Exposure definition

Individuals were classified as receiving NRTI, NNRTI (nonnucleoside reverse-transcriptase inhibitor), PI (protease inhibitor), or INSTI (integrase strand transfer inhibitor) medications if at least one outpatient pharmacy prescription for these medications was filled. American Hospital Formulary Service drug codes and U.S. National Drug Codes (a list of specific medications is in Supplementary Table 1) were evaluated. Patients filling prescriptions for combination anti-viral medications were counted as having received medications from each class. Medication use was summarized as a time-dependent covariate measuring the cumulative days or months supplied.

### Outcomes

The main outcome was incident type 2 diabetes. Time to initial diagnosis of type 2 diabetes during the follow-up period was the dependent variable. Observations were right-censored at the end of plan enrollment, death, or diabetes development. Falsification tests were performed using the two prespecified outcomes of appendicitis and hernia, which were not anticipated to be associated with NRTI exposure, among patients with at least two medical claims for HIV or hepatitis B and without a prior diagnosis of these falsification outcomes. Time to initial diagnosis of appendicitis or of hernia during the follow-up period was the dependent variable for each of these analyses. Observations were right-censored at the end of plan enrollment, death, or development of appendicitis or hernia.

### Statistical analyses for data sources

Key predictors were use of the HIV-1 and hepatitis B drugs NRTI, NNRTI, PI, and INSTI. Cox regression was used to estimate the hazard for developing type 2 diabetes in relation to NRTI, NNRTI, PI, and INSTI exposure, with adjustment for baseline covariates, which included demographic variables, comorbidities, use of other medications, and laboratory test values known to be associated with diabetes including those listed by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)76, the Centers for Disease Control and Prevention (CDC)77, and the International Diabetes Federation (IDF)1, and those identified by supplementary literature research; 95% confidence intervals for hazard ratios were constructed based on standard errors derived from the model. Schoenfeld’s global goodness-of-fit test78,79 was used to test the proportional hazards assumption of the Cox regression. In both the NRTI ever/never and the NRTI per year of exposure analyses in all 5 databases, all these P values were >0.05, supporting the validity of the proportional hazards assumption of the fitted models.

We used SAS software, version 9.4 (SAS Institute) and Excel, version 16.35 (Microsoft) to perform statistical analyses. An inverse-variance weighted analysis of the five databases combined was performed to estimate the combined hazard ratio and to compute 95% confidence intervals using fixed-effect and random-effects models. Meta-analyses were performed with the use of the statistical program R, version 3.6.1 (the R project [http://r-project.org]) and the R packages metafor and bayesmeta. The restricted maximum-likelihood estimator was used to estimate the between-study variance. A forest plot was created to depict the HR and 95% confidence intervals or credible intervals of each study and of the summary results. Statistical tests were two-sided. P values of less than 0.05 were considered to indicate statistical significance.

Frequentist meta-analysis (primary analysis): Variability among the five databases was evaluated using Cochran’s Q-test80. A random-effects model was used in the primary analyses as it assumes that individual databases are samples of different populations with different underlying true effects. In contrast, fixed-effect models assume that individual databases are samples from the same population81,82,83.
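The inverse-variance pooling and Cochran’s Q statistic described above can be illustrated with a minimal Python sketch. This is not the analysis code used in the study, which relied on the R package metafor with the restricted maximum-likelihood estimator; for brevity the sketch substitutes the simpler closed-form DerSimonian-Laird estimator of the between-study variance. The function name `inverse_variance_meta` and the input values are hypothetical placeholders, not study data.

```python
import math

def inverse_variance_meta(log_hrs, ses):
    """Pool log hazard ratios with fixed-effect and random-effects weights.

    Assumption: between-study variance (tau2) is estimated with the
    DerSimonian-Laird moment estimator, not REML as in the source.
    """
    k = len(log_hrs)
    w = [1.0 / se ** 2 for se in ses]                      # fixed-effect weights
    fixed = sum(wi * y for wi, y in zip(w, log_hrs)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_hrs))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]          # random-effects weights
    random = sum(wi * y for wi, y in zip(w_re, log_hrs)) / sum(w_re)
    return {
        "fixed": fixed, "se_fixed": math.sqrt(1.0 / sum(w)),
        "random": random, "se_random": math.sqrt(1.0 / sum(w_re)),
        "q": q, "tau2": tau2,
    }

# Hypothetical hazard ratios and standard errors (of ln HR) for five databases
example_hrs = [0.70, 0.65, 0.80, 0.60, 0.75]
example_ses = [0.05, 0.08, 0.10, 0.06, 0.07]
res = inverse_variance_meta([math.log(h) for h in example_hrs], example_ses)
low = math.exp(res["random"] - 1.96 * res["se_random"])
high = math.exp(res["random"] + 1.96 * res["se_random"])
print(f"random-effects HR {math.exp(res['random']):.3f} (95% CI {low:.3f}-{high:.3f})")
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights and the two pooled estimates coincide.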

Bayesian meta-analysis (secondary analysis): Bayesian meta-analysis was performed using a random-effects normal-normal hierarchical model (the same as the random-effects model above). For the effect parameter μ, we chose a neutral unit information prior given by a normal prior with mean $$\mu_p = 0$$ (centered around a hazard ratio of 1.0) and a variance of $$\sigma_p^2 = 4$$ 84. In a hierarchical model $$\theta \sim N[\mu, \tau^2]$$, where $$\tau^2$$ is the between-study variance of the logarithmic hazard ratio, the “range” of hazard ratios, defined as the ratio of the 97.5% and 2.5% quantiles of the hazard ratio, is equal to $$e^{3.92\tau}$$ 84. We used a weakly informative half-Cauchy prior distribution85,86 for between-study variability with the assumption that it was unlikely for the between-study hazard ratios to vary by more than 3-fold. For this assumption, range = 3 and τ = 0.280 (scale). We performed a sensitivity analysis to the choice of the prior by assuming that it was unlikely for the between-study hazard ratios to vary by more than 10-fold. For this assumption, range = 10 and τ = 0.587 (scale). Computation of posterior predictive P values was implemented in bayesmeta via Monte Carlo sampling.
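The two half-Cauchy scale values follow from inverting the range relation given above, so that τ = ln(range)/3.92. A minimal Python sketch of that arithmetic is shown here; the function name `half_cauchy_scale` is a hypothetical label introduced for illustration and is not part of bayesmeta.

```python
import math

def half_cauchy_scale(max_fold_range):
    """Return tau such that exp(3.92 * tau) equals the assumed fold range.

    3.92 = 2 * 1.96 is the width, in standard deviations, of the central
    95% interval of a normal distribution on the ln(HR) scale.
    """
    return math.log(max_fold_range) / 3.92

print(f"{half_cauchy_scale(3):.3f}")   # primary prior: 3-fold range
print(f"{half_cauchy_scale(10):.3f}")  # sensitivity analysis: 10-fold range
```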

Falsification tests were performed using the two prespecified outcomes of appendicitis and hernia, which were not anticipated to be associated with NRTI exposure, among patients with at least two medical claims for HIV or hepatitis B and without a prior medical claim for these falsification outcomes. Time to initial diagnosis of appendicitis or of hernia during the follow-up period was the dependent variable for each of these analyses. Observations were right-censored at the end of plan enrollment, death, or development of appendicitis or hernia. Additional information about the statistical analyses is provided in the Supplementary Methods.

### Propensity score matching

For the Veterans Health Administration, Truven Marketscan, and PearlDiver databases, we estimated propensity-score models for use versus no use of NRTIs. The individual propensities for starting NRTI treatment were estimated with the use of logistic regression. As predictors, the propensity-score models included the set of variables which displayed P values < 0.1 in logistic regression analyses. In the Veterans Health Administration database, the following variables were used in the propensity score model: labs (CD4 counts, viral load, body mass index), other medications (PI, NNRTI, lipid lowering agents, antihypertensives), comorbidities (Charlson comorbidity index, systemic hypertension, depression, ischemic heart disease, other heart disease, stroke, hepatitis C, osteoarthritis, rheumatoid arthritis), and demographics (age, race, index year, tobacco use). In the Truven Marketscan database, the following variables were used in the propensity score model: comorbidities (Charlson comorbidity index, osteoarthritis, rheumatoid arthritis, systemic hypertension, hyperglyceridemia, stroke, acanthosis nigricans, hepatitis C) and demographics (age, sex, index year). In the PearlDiver database, the following variables were used in the propensity score model: labs (HDL, triglycerides, CD4 counts, body mass index, ALT, AST), other medications (fluoroquinolones, corticosteroids, lipid lowering agents, antihypertensives), comorbidities (Charlson comorbidity index, osteoarthritis, systemic hypertension, pure hypercholesterolemia, hyperglyceridemia, ischemic heart disease, other heart disease, gestational diabetes), and demographics (age, race, sex, index year, tobacco use, family history of diabetes). Matching was performed in a 1:1 ratio using greedy nearest neighbor matching.
In addition, to control for any residual covariate imbalance, we estimated the relative hazard in the propensity score-matched groups using the multivariable Cox model that included the covariates from the multivariable regression analysis employed for the original unmatched group analyses. Statistical tests were two-sided. P values < 0.05 were considered statistically significant.
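As an illustration of 1:1 greedy nearest neighbor matching, the following Python sketch pairs each NRTI user with the closest still-unmatched non-user on the estimated propensity score. It is a simplified sketch, not the procedure run in SAS for the study: the processing order of treated patients (input order), matching without replacement, the absence of a caliper, and the function name `greedy_match` are all assumptions, and the scores are invented.

```python
def greedy_match(treated, controls):
    """Greedy 1:1 nearest neighbor matching without replacement.

    treated, controls: dicts mapping patient id -> propensity score.
    Each treated patient, taken in input order (an assumption), is paired
    with the closest unmatched control; that control is then removed.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_score in treated.items():
        if not available:
            break  # no controls left to match
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        pairs.append((t_id, c_id))
        del available[c_id]
    return pairs

# Hypothetical propensity scores for two NRTI users and three non-users
nrti_users = {"t1": 0.30, "t2": 0.62}
non_users = {"c1": 0.58, "c2": 0.33, "c3": 0.90}
print(greedy_match(nrti_users, non_users))
```

Because the algorithm is greedy, an early match can consume a control that would have been a closer match for a later patient; this is one reason residual covariate imbalance is checked or adjusted for after matching.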

### Prediction interval and threshold probabilities

The prediction interval42 was computed using the metafor package in R. The probability P that the true effect in a new study will be below a desired threshold D was calculated with the left-tail cumulative t-distribution with k–1 degrees of freedom (df) for k studies in the meta-analysis. For HRs, calculations were based on the ln HR, with the summary meta-analysis HR estimate μ = 0.673, $$SD_{PI} = \sqrt {\tau ^2 + SE^2} = 0.0272772$$ (where $$\tau^2$$ is the estimated heterogeneity and SE is the standard error of ln (μ)), and df = 4. For example, to determine the probability of a null or protective effect, we computed the probability that a true HR ≤ 1, which corresponds to a true ln (HR) ≤ 0. In general, for any desired threshold D, we set $$T = (\ln (D) - \ln (\mu ))/SD_{PI}$$ and df = 4, and computed the following P values using this website [https://www.danielsoper.com/statcalc/calculator.aspx?id=8]: P (true HR ≤ 1) = 0.99993; P (true HR ≤ 0.7133) = 0.95001; P (true HR ≤ 0.673) = 0.5.
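The threshold probabilities can be reproduced approximately with a short Python sketch that uses only the standard library. The study used an online calculator for the t-distribution; here the left-tail probability is instead obtained by numerically integrating the t density with Simpson’s rule, which is an implementation choice made for this illustration. The function names `t_cdf` and `prob_hr_below` are hypothetical; the default values for the summary HR, the prediction-interval SD, and df are the ones reported above.

```python
import math

def t_cdf(t, df, steps=4000):
    """Left-tail CDF of Student's t, by Simpson's rule on the density.

    steps must be even; the density is symmetric, so only [0, |t|] is
    integrated and the result is added to or subtracted from 0.5.
    """
    if t == 0:
        return 0.5
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return norm * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area

def prob_hr_below(threshold, summary_hr=0.673, sd_pi=0.0272772, df=4):
    """P(true HR in a new study <= threshold), computed on the ln(HR) scale."""
    t = (math.log(threshold) - math.log(summary_hr)) / sd_pi
    return t_cdf(t, df)

for d in (1.0, 0.7133, 0.673):
    print(f"P(true HR <= {d}) = {prob_hr_below(d):.5f}")
```

With the reported inputs, the three probabilities should agree with the values quoted in the text to within rounding.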

### Cell culture

Human primary pre-adipocytes isolated from subcutaneous adipose tissue from nondiabetic or type 2 diabetic donors were purchased from Lonza. Cells at passage 2–4 were used in this study. Pre-adipocytes were seeded in 96-well or 6-well plates and cultured in Preadipocyte Growth Medium-2 basic medium (Lonza) supplemented with 10% FBS, L-glutamine, and gentamycin (Lonza, PT-9502) and maintained at 37 °C, 5% CO2 for 5–7 days until reaching 70% confluence. Cells were then exposed to differentiation medium (Preadipocyte Growth Medium-2 supplemented as described by Lonza, including recombinant insulin, dexamethasone, isobutylmethylxanthine, and indomethacin) for 5–7 days to induce maturation, which was monitored morphologically. Human primary skeletal myoblasts (HSM) isolated from the upper arm muscle tissue of nondiabetic or type 2 diabetic donors were obtained from Zenbio or Lonza. HSM were utilized at passage 4–5 in this study. HSM were seeded in 96-well or 6-well plates and cultured in SKM-M (Zenbio) basic medium as described by the manufacturer. Cells were maintained at 37 °C, 5% CO2 for 2–3 days until reaching 70% confluence. HSM were then exposed to differentiation medium (#SKM-D, Zenbio) for another 6–8 days to induce maturation, as assessed morphologically.

Mature adipocytes or HSM seeded in 6-well plates were treated with NRTIs (lamivudine (L1295 from Sigma-Aldrich), azidothymidine (A2169 from Sigma-Aldrich), stavudine (D1413 from Sigma-Aldrich); optimal dose of 100 μM and duration of 1 h selected from pilot experiments), 2.5 nM TNF (human recombinant, #T0157-10UG from Sigma-Aldrich), 25 mM glucose (G5500-500MG from Sigma-Aldrich), 100 nM insulin (human recombinant, #12585014 from ThermoFisher Scientific).

### Animals

All animal studies were approved by the University of Virginia Animal Care and Use Committee and performed according to their guidelines. Male 12-week-old C57BL/6J mice (The Jackson Laboratory, JAX stock #000664) were housed in specific pathogen-free conditions and maintained under a 12-h light-dark cycle and fed ad libitum with a standard laboratory diet or a high-fat diet paste containing 60% fat plus 0.2% cholesterol (Bio-Serv) for 8 weeks. Animals were housed in the same room and their care and housing were in accordance with the guidelines and rules of the Institutional Animal Care and Use Committee. High-fat diet-fed mice were administered intraperitoneal injections of lamivudine (70 mg/kg of body weight once daily) or of phosphate-buffered saline (vehicle control). Euthanasia was performed as a two-step process by inhalation of carbon dioxide gas followed by cervical dislocation.

### Western blotting

Human cells and various mouse tissue protein lysates were homogenized in Complete Lysis Buffer (Roche). Protein concentrations were measured using Pierce BCA protein Assay Kit® (ThermoFisher Scientific). Proteins were separated by either 4–20% or 10–20% sodium dodecyl sulfate polyacrylamide gel electrophoresis and transferred to polyvinylidene difluoride membranes, which were then probed with specific primary antibodies. The abundance of AKT (phosphorylated and total), DICER1, and IL-18 proteins was monitored in human adipocytes, human myocytes, mouse subcutaneous adipose tissue, mouse visceral adipose tissues, and mouse skeletal muscle by western blotting using the following primary antibodies: mouse anti-human phospho-specific AKT, Ser473 (#12694, Cell Signaling Technology; 1:1000); rabbit anti-mouse AKT (pan), 11E7 (#4685, Cell Signaling Technology; 1:1000); rabbit anti-human DICER1 A301-936A (Bethyl Laboratories; 1:1000); rat anti-mouse IL-18 (Clone 39-3F, #D046-3, MBL International; 1:1000); anti-mouse β-actin (8H10D10) (#3700, Cell Signaling Technology; 1:1000; for loading control assessment). Following incubation with secondary antibodies, protein abundance was visualized using the Licor Odyssey documentation system and quantitated using ImageJ Fiji software, version 2.1.0/1.53c.

### Northern blotting

Total RNA from primary human adipocytes, primary human skeletal muscle cells, mouse adipose tissue, and mouse skeletal muscle tissue was extracted using Trizol (Thermo Fisher Scientific). RNA samples were separated on 10% or 15% TBE-urea gels (Bio-Rad Laboratories) according to the manufacturer’s instructions. Samples were transferred and cross-linked using ultraviolet light to a HyBond N+ nylon membrane, and blotted for Alu RNA, B2 RNA or 5.8S rRNA using biotinylated oligonucleotide probes. Blots were developed with the Thermo Pierce chemiluminescent nucleic acid detection kit (ThermoFisher Scientific).

### IL-1β ELISA

To measure IL-1β levels in mouse adipose and skeletal muscle tissues, we used a monoclonal antibody-based sandwich ELISA (ThermoFisher Scientific) according to the manufacturer’s instructions.

### In vitro glucose uptake in human adipocytes and myocytes

Pre-adipocytes isolated from subcutaneous adipose tissue and skeletal myoblasts isolated from upper arm muscle tissue of type 2 diabetic patients and nondiabetic donors were obtained from Zenbio and Lonza. The glucose uptake assay was performed according to the manufacturer’s instructions (Cayman Chemical). Briefly, pre-adipocytes or human primary skeletal myoblasts were seeded in 96-well plates and induced to mature into adipocytes or myocytes for 5–8 days. Nondiabetic cells were rendered insulin resistant by treatment with human tumor necrosis factor (TNF) (2.5 nM) or with high glucose (25 mM) and high insulin (100 nM) for 24 h87,88. Cells, pre-treated with lamivudine (100 μM) or vehicle for 1 h, were treated with insulin (20 nM) for 20 min in 100 μl glucose-free culture medium containing 2-NBDG (150–300 μg/ml; Cayman Chemical), a fluorescent derivative of glucose used to monitor glucose uptake. At the end of the treatment, the plate was centrifuged for 5 min at 400 g and washed twice with cell-based assay buffer, and fluorescence was read at 485/535 nm (excitation/emission).

### Cell viability

Cell viability measurements were performed using the CellTiter 96 AQueous One Solution Cell Proliferation Assay (Promega) according to the manufacturer’s instructions. Briefly, human adipocytes were seeded on a 96-well plate and treated with 100 µM NRTI (lamivudine, 3TC; stavudine, D4T; azidothymidine, AZT) or phosphate-buffered saline (PBS; Ctrl) for 24 h. Then, 20 μl of CellTiter 96 AQueous One solution reagent was added to each well, and the 96-well assay plate was incubated at 37 °C for 2 h. Absorbance was read at 490 nm on a Cytation 5 Cell Imaging Multi-Mode Reader (BioTek).

### Glucose tolerance test and insulin tolerance test in mice

Male 12-week-old C57BL/6J mice (The Jackson Laboratory) were fed with a standard laboratory diet or a high-fat diet paste with 60% fat plus 0.2% cholesterol (Bio-Serv) for 8 weeks. High-fat diet-fed mice were administered lamivudine (70 mg/kg of body weight) or phosphate-buffered saline via intraperitoneal injection once daily for 8 weeks. The glucose tolerance test was performed after a fast of 16–18 h followed by an injection of glucose (2.5 g/kg body weight; Sigma), and the insulin tolerance test after a fast of 4 h followed by an injection of insulin (0.75 U/kg body weight; Sigma). Blood glucose was monitored in tail vein blood with a One-Touch Ultra (Life Scan) glucometer at 0, 15, 30, 60, 90 and 120 min after glucose or insulin injection. Areas under the curve (AUCs) were calculated using trapezoidal integration.

### Protein and RNA assays in human cells and mouse tissues

Levels of AKT (phosphorylated and total), DICER1, and IL-18 in human cells or mouse tissues, were monitored using western blotting. To measure IL-1β levels in tissue samples obtained from mice, we used a monoclonal antibody-based sandwich ELISA (ThermoFisher Scientific). To assess the abundance of Alu RNA in human cells and B2 RNA in mouse tissues, we performed northern blotting20.

### Statistical analysis for in vitro and in vivo experiments

Data are expressed as means ± SEM. Statistical significance was determined by Student’s t test, using Prism, version 8.3.0 (GraphPad). P values less than 0.05 were considered significant.

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.