Introduction

Suicide is a preventable public health problem that ranks as the second leading cause of death among young adults residing in the US [1]. Suicide rates remain even higher among American Indians/Alaska Natives (AI/AN), with rates more than 50% greater when compared to the general U.S. population across all age groups [2,3,4,5]. While suicide rates are unquestionably higher in the AI/AN population as a whole as compared to other US populations, rates also vary based on geographic region, tribal affiliation and whether one is living on a reservation [6,7,8,9,10,11,12].

American Indian/Alaska Natives also suffer from a disproportionate burden of the effects of alcohol, tobacco, and drug dependence [13, 14]. Large-scale U.S. epidemiological studies indicate that, compared with other U.S. ethnic groups, AI/AN demonstrate the highest rates of alcohol and other drug dependence [15, 16]. Additionally, lifetime rates of alcohol use disorders (AUD) differ depending on the tribes evaluated. Among individual tribal groups studied, reported rates of AUD have ranged from 20% to 70% [17,18,19,20]—significantly higher than epidemiological rates of AUD (14%) in the U.S. general population [21].

Substantial comorbidity has been demonstrated between SB and AUD. Chronic alcohol use is a clear risk factor for suicide [22], and having a severe use disorder and experiencing substance associated depression is particularly linked to increased risk for suicide [23]. Being acutely intoxicated provides additional risk for lethal suicide over and above the risk of chronic use [24, 25]. This is particularly true among American Indian/Alaska Natives [25, 26]. However, documenting comorbidity between disorders does not necessarily imply a causal connection between them or a common etiological pathway. Both AUD and suicide have been shown to have a significant genetic component to their etiology [27, 28]. Behavioral genetics studies have the advantage of being one of the strongest methods for determining whether the comorbidity among psychopathological conditions may be due to shared etiologies and/or pathologies associated with the disorders. For instance, a new study recently estimated the genetic correlation between suicide attempt and alcohol dependence at 44% in populations of European ancestry [29].

In recent years, numerous large genome-wide association studies (GWAS) have been reported for suicidal behaviors [28, 30,31,32,33,34]. These studies were carried out in populations dominated by individuals of European descent. Genetic epidemiology studies have estimated the heritability of SB ranging from 17% to 55% [35, 36]. Single nucleotide polymorphism (SNP)-heritability of suicide attempt using common genomic variants across the genome has been estimated to range from 3.5% in the UKBiobank [37] to 6.8% in the International Suicide Genetics Consortium (ISGC) [32]. The largest GWAS meta-analysis to date (with ~959 K individuals) reported 12 loci for suicide attempt. Risk loci were mostly intergenic and implicated genes DRD2, SLC6A9, FURIN, NLGN1, SOX5, PDE4B, and CACNG2 [30]. Intriguingly, one of the genes, ROBO2, reported by the ISGC GWAS to be associated with SB [31] is on a rare run-of-homozygosity (ROH) segment that was previously reported to be linked to severe AUD in an American Indian population [38].

While SNP-heritability and GWAS studies primarily estimate the impacts of common genomic variants, recent studies have shown that rare variants, especially those in low linkage disequilibrium (LD) with neighboring variants, are enriched for heritability for complex traits and diseases. Rare variants often represent recent and potentially deleterious mutations that can have biological consequences. Studies have indeed shown that most rare missense alleles in humans are deleterious [39]. In fact, a recent Utah Suicide Genetic Risk Study (USGRS) interrogating rare protein-coding variants included on the Illumina PsychArray chip has identified five rare variants for suicide death [40]. Genotyping chips, however, are usually not designed to capture rare variants in a population. Thus, to facilitate comprehensive studies of the impact of rare variants on a disease, exome sequencing or whole-genome sequencing is usually necessary.

The present report is part of a larger study exploring risk factors for substance dependence and suicide behaviors among American Indians [19, 41,42,43]. This American Indian (AI) population has been sequenced and deep phenotyped. The lifetime prevalence of AUD and suicide in this AI population is high, and evidence for heritability, linkage to specific chromosome locations, and genome-wide findings for AUD have been demonstrated [38, 42, 44,45,46,47]. The present study aimed to evaluate genetic factors associated with SB in this AI population. We hypothesized that certain genetic variants may underlie both SB and AUD, and further, that joint association analysis of related traits may lead to increased power to detect variants that contribute to both phenotypes. To this end, we conducted a bivariate genome association analysis between SB and AUD. We further hypothesized that rare (and low-frequency) variants in a high-risk population maybe a significant component of the complex genetic architecture underlying the disorder. Note that rare variants that occur in the general population might become less rare in population isolates such as American Indians. We thus conducted gene-based and pathway-based rare and low-frequency variant analyses for SB.

Material and Method

Participants

American Indian participants were recruited from eight geographically contiguous reservations. The reservations had a total population of about 3000 individuals. To be included in the study, participants had to be between the ages of 18 and 70 years, and mobile enough to be transported from their home to The Scripps Research Institute (TSRI). More details are given in Supplemental Materials. The demographics of the population is listed in Table S1. The protocol for the study was approved by the Institutional Review Board (IRB) of TSRI, and the board of the Indian Health Council, a tribal review group overseeing health issues for the reservations where the recruitment was undertaken. Written informed consent was obtained from each participant after the study was fully explained.

Phenotypes and genotypes

Potential participants first met individually with research staff, and during a screening period, participants completed a questionnaire that was used to gather information on demographics, personal medical history, and drinking history [48]. Each participant also completed an interview based on the Semi-Structured Assessment for the Genetics of Alcoholism (SSAGA) [49], which was used to collect lifetime history of two types of self-directed violence: suicidal thoughts including ideation (Have you ever thought about killing yourself?) and/or plans (Did you have a plan? Did you actually consider a way to take your life? What were you going to do?), as well as suicidal acts, including suicide attempt history (Have you ever tried to kill yourself? How did you try to kill yourself?) reported by the participants and suicide deaths obtained from community sources (e.g., verified by public records, family/tribal informants) over an 8-year period. From the lifetime history of suicidal thoughts and acts, we defined the suicidal behavior phenotype as a ranking variable: 0-none, 1-ideation, 2-plans, 3-attempts, 4-death. Since the individual risks are difficult to quantify, the scale as defined may not be the most ideal. However, the ranked numerical scale serves as an approximation to index corresponding risk as suicide risk increases in severity in this direction. In addition, it offers potential increase in statistical power for detection.

Diagnoses of lifetime DSM-5 AUD (mild, moderate, or severe) were also generated using the SSAGA. In addition, the interview retrospectively asks about the occurrence of alcohol-related life events, and the age at which the problem first occurred, from which a quantitative phenotype, the severity level of AUD, was derived. We used measures of the clinical course of alcoholism originally described by Schuckit and colleagues [50]. These measures were based on the relative order of the appearance of major “alcohol-related life events”. These life events have been shown to be highly similar and consistent across many different subgroups and populations, although age of onset and endorsement rates of individual events can differ [19, 51,52,53,54,55,56,57,58]. The severity level of AUD is indexed by the 36 alcohol-related life events (Table S2) in the clinical course of the disorder [19, 50], with life events given a severity weight of 1 for events 1-12; 2 for 13-24; and 3 for 25-36. AUD severity was then calculated as the sum of the severity weights of the 36 life events [46]. This metric gives larger weights to the more severe events that occur later in the clinical course and are more associated with severe use disorder [46, 53]. As shown in Supplementary Fig. S1, the AUD severity phenotype is highly aligned and correlated with the DSM-5 mild, moderate, and severe AUD diagnoses. Their Spearman’s rank correlation coefficient is 0.9. The relation between the SB phenotype and the AUD severity in this AI population is illustrated in Supplementary Fig. S2. The rank correlation between SB and AUD severity is 0.3.

Participants had low-coverage whole genome sequencing on blood-derived DNA using a previously described pipeline [59]. Briefly, the pair-end sequencing was performed on Illumina HiSeq2000 sequencers by a collaborating lab at the University of North Carolina at Chapel-Hill. About 80% of the samples had coverages between 3X and 12X. The qualities of the variant calling were confirmed using an Affymetrix Exome1A chip. Variants were removed if they had high missing rate ( > 5%), out of Hardy-Weinberg equilibrium (p < 1E-6), or had high Mendel error rate ( > 5%); individuals were removed if they had more than 2% genotypes missing. Further details can be found in the Supplementary Method.

Bivariate association analysis for suicidal behaviors and alcohol use disorders (SB-AUD)

We conducted genome-wide bivariate association analysis to identify genetic variants that might be associated with both SB and AUD severity (denoted as SB-AUD). By leveraging cross-trait covariance, multivariate tests for association may provide increased power over univariate tests. This occurs when the residual correlation between traits is opposite in direction to the genetic correlation induced by the genetic loci [60]. To control for population admixture and familial relatedness, we used the multivariate linear mixed model implemented as genome-wide efficient mixed-model association (GEMMA) for the study [61]. The bivariate association for each variant was conditioned on a genetic relationship matrix of the cohort derived from the genotypes, thus capturing a wide range of sample structures. Rare variants with minor allele frequency (MAF) lower than 1% were excluded from this analysis.

Gender, age, and age-squared were included as covariates, which we refer to as the main model. To take the social economic status of the AI cohort into consideration, in addition to the main model, we also tested an extended covariate model incorporating additional socioeconomic factors such as gross income, years of education, employment status, marriage status, religion, and frequency of religious service attendance. We first assessed whether the covariates were associated with SB or AUD severity through multiple regression prior to genetic association tests, and dropped the factors with negligible effects for both traits (p > 0.1). As a result, the extended model included years of education, employment, and religion as additional covariates (Supplemental Table S3A). Note that some of the socioeconomic factors in the extended model may also have genetic components.

Gene-based rare and low-frequency variant analysis for suicidal behaviors

Rare variants are usually tested by aggregating them into groups. We included both rare (MAF < 1%) and low-frequency (1% < = MAF < 5%) variants in this set of analyses. The variants were grouped by genes or by pathways and tested against the SB phenotype. 63% of all variants in the AI cohort are rare, while 12% are of low-frequency. We analyzed the rare and low-frequency variants across the genome using SKAT-O that optimally combines a burden test and a non-burden sequence kernel association test (SKAT) [62] within a linear mixed model as implemented in EPACTS [63]. We used recommended beta-density based weights beta(MAF;1,25), which increases the weights of rare variants while yielding decent nonzero weights for variants with MAF 1–5% [64]. In all sets of analyses, we used same main model and the extended model for covariates as described above but dropped the employment factor in the extended model (Supplemental Table S3B).

For each gene, we formed two types of groups. One group considered all variants on exons, 5’ and 3’ untranslated regions (UTRs), 50 base pairs (bp) upstream and 50 bp downstream of the gene (denoted as ExonReg). The other group included only the nonsynonymous variants and the splicing-site variants of the gene (denoted as Nonsyn). Intergenic and intronic variants were excluded in the present study. For each group type, a gene was excluded if fewer than three markers were found, or if less than 0.5% of the samples had any such markers on the gene. This resulted in 28,718 genes in the ExonReg group and 12,588 genes in the Nonsyn group. Association analysis was performed between each gene-based set and SB. False discovery rates (FDR) controlled by the Benjamini–Hochberg procedure (Benjamini and Hochberg, [65]) were used to set significant p values from the test statistics of the association tests. We combined two gene variant groups together for the multiple testing correction and report the FDR-adjusted p values (Yekutieli and Benjamini [66]). Note that the correction is conservative since the Nonsyn variants on a gene are a subset of the ExonReg variants, and thus they are correlated.

Pathway-based rare and low-frequency variant analysis for suicidal behaviors

We utilized a collection of canonical pathways from the Molecular Signatures Database (MSigDB) version 7.5.1 [67, 68]. There are total of 2981 pathway gene sets in this release. Since a pathway may contain a large number of genes, to limit the number of variants included in each set for test, we only considered rare and low-frequency nonsynonymous and splicing-site variants (Nonsyn) on the genes within each pathway. The same filters as in the gene-based tests were applied. We performed SKAT-O tests on the Nonsyn variant group for each pathway to be associated with SB. FDR-adjusted p-values are reported.

Rare and low-frequency variant analysis for SB-AUD

We carried out the same set of analyses as described in the last two sections for AUD severity as well. We then used meta-analysis to obtain the effects of rare and low-frequency variants on SB-AUD [69]. Since the test statistics for the two traits were correlated, we corrected these correlations prior to meta-analysis [70]. Details are given in Supplemental Method.

Functional analysis

For the top variants identified in the bivariate association analysis and for the top genes identified in the rare and low-frequency variant analysis, we obtained the combined annotation dependent depletion (CADD v1.6) scores [71] for the variants included in the test to assess the deleteriousness of these variants. CADD integrates multiple functional annotations and genome-wide variant effect prediction models to produce a scaled C-score. We report the number of variants tested in each gene that have C-scores in the range of 15–20 (1–3% most deleterious mutations), 20–30 (0.1–1%), and over 30 ( < 0.1%). We chose 15 as the lowest C-score to report as it happened to be the median value for all possible canonical splice site changes and nonsynonymous variants in CADD v1.0 [72].

The variants with p-value < 10−6 from the bivariate analysis were annotated with the nearest genes, and the associated set of genes was then subjected to functional enrichment analysis using GENE2FUNC in FUMA version 1.5.1 [73]. An integrated network analysis was completed using GeneMANIA [74]. The same set of functional analysis was applied to the top genes from the rare and low-frequency variants analysis as well.

More details can be found in Supplemental Method.

Results

A total of 743 participants have sequencing data and a non-missing phenotype for SB (none [0], suicidal ideation [1], suicide planning [2], suicide attempt [3], suicide death [4], and 742 participants have data on AUD severity (range 0–69, mean = 22.6, sd = 18.8). 57.3% of these individuals are female. Their average age is 31. The detailed demographics is listed in Table S1. The sample size in the main covariate model is 742 for the SB-AUD bivariate analysis and 743 for the SB rare and low-frequency variant analysis, and reduces to 720 and 740 respectively in the extended covariate model due to missing data (Table S3).

Bivariate genome-wide significant variants associated with SB-AUD in AI

Five variants were identified as being significantly associated with AUD severity and SB (p < 5E-8) using the main covariate model, as illustrated in Fig. 1. Only one of the variants, rs184204326 on gene FBXO11, has MAF over 5% (Table 1). Although the association of this variant is mostly driven by SB (p = 1.48E-07), the bivariate association is statistically more significant (p = 3.63E-08) than the univariate associations, which also holds for three other top variants, including an intergenic variant between genes ZIC2 and PCCA, variant rs76300969 on gene AACSL, and variant rs530542541 on ANK1. Among the top variants shown in Table 1, rs184204326 is the only SNP for which brain expression quantitative trait loci (eQTL) were found in the Braineac database [75]. That variant is associated with the differential gene expression of several genes in brain tissues including: putamen, substantia nigra, thalamus, occipital cortex, and temporal cortex (Table S4). The most significant eQTL is for EPCAM (encoding epithelial cell adhesion molecule) expression in the putamen tissue (p = 0.0019). The variant is also associated with gene expression in several brain tissues for FBXO11, MCFD2 (encoding a soluble luminal protein), and KCNK12 (encoding a potassium channel protein). One variant on LRRC4C was significantly associated with AUD severity and SB in the extended covariate model (Table S5). The gene has been associated with drinking and smoking behaviors [76] as well as educational attainment [77]. Four of the five significant variants in the main model remained suggestive significant in the extended model; one was dropped in the analysis due to low allele frequency. As a comparison, we also conducted an additional set of bivariate analysis using SB and the DSM5 AUD diagnosis phenotype (Table S5). While DSM5 AUD diagnosis and the clinical course-derived AUD severity phenotypes are highly correlated (Fig. S1), the latter offers increased power for detection as shown in Table S5.

Fig. 1
figure 1

Manhattan plot of bivariate genome-wide association analysis for suicidal behaviors and AUD severity in the American Indians.

Table 1 Top variants identified in bivariate genome-wide associations for suicidal behaviors and AUD severity in the American Indians.

Variants associated with SB-AUD at p < 106 (using the main model) were mapped to 31 genes. The majority of these variants are intronic, upstream, or downstream from a gene. Rs1804145 on SERPINF1 is the only nonsynonymous variant and has a C-score of 22.8 (top 0.52% most deleterious). These genes are enriched for 27 transcription factor (TF) targets and three microRNA targets (Table S6), suggesting that the top genes share certain regulatory motifs. These genes are most significantly up-regulated in artery, and down-regulated in areas included in the basal ganglia (nucleus accumbens, caudate, and putamen). In addition, they are significantly differentially expressed in anterior cingulate cortex, hypothalamus, substantia nigra, and kidney (Fig. 2). Many psychiatric or neurological disorders have been linked to basal ganglia targets including addiction and depression [78].

Fig. 2: Enrichment in tissue-specific differentially expressed gene sets (DEG) of top genes associated with SB-AUD.
figure 2

The tissue-specific differential gene expression test was conducted against all genes across genomes that exhibited significantly increased or decreased expression levels in a certain tissue sample compared to all other samples. The analysis was performed using FUMA and utilized tissue-specific transcriptome data across 54 tissue types from GTEx v8. From top to bottom: up-regulated, down-regulated, differentially expressed. Red: Significantly enriched (p < 0.05 with Bonferroni correction).

Genes with rare and low-frequency variants associated with suicidal behaviors in AI

Rare and low-frequency variant analysis identified six genes and one long intergenic non-coding RNA (lincRNA) as being significantly associated with SB (FDR adjusted p < 0.05) using the main model, and an additional four genes strongly associated with SB (FDR < 0.1) (Table 2). SERPINF1 was the top gene linked to SB. Both the nonsyn (nominal p = 1.3E-6) and the exonreg (nominal p = 1.7E-6) variant groups of this gene remained genome wide significant after the multiple testing correction (FDR = 0.024). Four out of the nine nonsyn rare and low-frequency variants in SERPINF1 had C-scores between 23 and 29, representing between 0.1 and 0.5% most deleterious mutations. One of the four probably damaging variants, rs1804145, was also suggestively associated with SB-AUD in the bivariate analysis (p = 1.1E-7, see Table 1). Additional genes significantly associated with SB included: ZNF30, CD34, SLC5A9, OPRD1, and HSD17B3. A lincRNA AC002511.3 is also associated with SB. Of the 15 nonsyn variants on ZNF30, one has a C-score of 33 (0.05% most deleterious), and six have C-scores between 22 and 24 ( ~ 0.5% most deleterious). One of the nine nonsyn variants on CD34 has a C-score of 32. Seven out of the 12 nonsyn variants on SCL5A9 have C-scores ranging from 24 to 30. OPRD1 has three rare exon variants, one of which has a C-score of 18 (1.6% most deleterious), and one of the 10 rare variants on HSD17B3 has a C-score of 22.8 ( ~ 0.5% most deleterious). Using the extended model as shown in Table S7, four genes and one lincRNA had FDR < 0.05 and four genes had FDR < 0.1, including all genes that were significantly associated with SB in the main model.

Table 2 Top genes with rare and low-frequency variants associated with suicidal behaviors in the American Indians.

A gene set enrichment analysis for the 11 genes whose rare and low-frequency variants were associated with SB at FDR < 0.1 using the main model (Table 2) identified axon to be the most significantly enriched gene ontology term (nominal p = 1.63E-5, adjusted p = 0.016). Four of the 11 genes belong to the axon gene set: OPRD1, NRP1, SERPINF1, and DCTN1. Network analysis conducted using GeneMania for the 11 genes in Table 2 recognized 10 genes and automatically selected 10 additional related genes (Fig. 3). The analysis considered pathways and genetic interactions and identified a number of significantly associated functional networks (Table S9). Figure 3 illustrates the top enriched distinct functions, including VEGF signaling (FDR = 2.1E-04), angiogenesis (FDR = 2.1E-04), and response to decreased oxygen levels (FDR = 4.2E-03). VEGF, as an angiogenic cytokine, and hypoxia have been associated with depression and suicide [79, 80].

Fig. 3: Networks of the rare-variant genes associated with suicidal behaviors in AI.
figure 3

Three of the enriched functional networks are highlighted in the gene nodes (see Table S9 for the complete list). Each circle represents a gene and the color on the circle indicates to which pathway the gene belongs to. Yellow: vascular endothelial growth factor (VEGF) receptor signaling pathway (FDR = 2.1E-04); Red: angiogenesis (FDR = 2.1E-04); Blue: response to decreased oxygen levels (FDR = 4.2E-03). The color of the edges linking the genes indicate the type of database used in the GeneMania analysis. Blue: Pathways; Green: Genetic interactions.

From the meta-analysis between SB and AUD severity for the rare and low-frequency variant analysis, none of the nominally significant genes or pathways remained genome-wide significant after multiple comparison corrections (Table S10).

Pathways with rare and low-frequency variants associated with suicidal behaviors in AI

Regulation of gene expression by hypoxia inducible factor (HIF) is the top pathway (nominal p = 1.4E-5, FDR = 0.04) whose rare and low-frequency variants were associated with SB using the main model (Table 3). The pathway is comprised of 10 genes with total of 91 nonsynonymous or splicing site variants (nonsyn), of which 83 have MAF < 5% and were included in the association test. Hypoxia could decrease serotonin synthesis, which has been linked to suicide [81]. The three additional pathways that were associated with SB at FDR < 0.1 using the main model are essentially two distinct pathways: vasopressin regulated water reabsorption with 164 nonsyn rare and low-frequency variants on 44 genes, and cellular hexose transport with 126 nonsyn rare and low-frequency variants on 21 genes. Vasopressin plays a critical role in regulating renal water reabsorption and cardiovascular homeostasis. As an integral part of the hypothalamic-pituitary-adrenal (HPA) axis, it is also an important player in response to stress and contributes to stress-related disorders such as anxiety and depression [82]. The cellular hexose transport pathway includes gene families responsible for glucose transport in humans. Sodium independent glucose transporters (GLUTs) are encoded by the SLC2 family, while sodium dependent glucose transporters (SGLTs) are encoded by the SLC5 family. Using the extended model, as shown in Table S8, all four pathways were significantly associated with SB (FDR = 0.019–0.024).

Table 3 Top pathways or gene sets with rare and low-frequency nonsynonymous variants associated with suicidal behaviors (FDR < 0.1) in the American Indians.

Overall, there was no major discrepancy between the findings from the main covariate model and the extended model although extended model had reduced sample sizes and the significant levels differed. However, some of the social economic factors included in the extended model also have genetic components. For instance, it has been demonstrated that suicide death and suicide behavior as well as AUD showed negative genetic correlation with educational attainment [31, 83]. Thus, it is not clear how educational attainment might impact the genetic associations when included as a covariate: more detailed analysis is beyond the scope of this study. We will base our discussions on the findings from the main model.

Discussion

The present study investigated potential genetic risk factors for suicidal behaviors in an American Indian population living on reservations that demonstrates high rates of suicide as well as AUD [19, 43]. We identified five variants significantly associated with SB-AUD, two of which are intergenic and three are intronic. Six genes, one lincRNA, and one canonical pathway, were found to be significantly associated with SB through rare and low-frequency variant analysis. Four additional genes, and two pathways, also were strongly associated with SB in this AI cohort. Approximately 3–20% (50–79%) individuals carry some of the rare and low-frequency variants on these genes (pathways).

F-box gene FBXO11 significantly associated with SB-AUD in this AI population

The intronic variants significantly associated with SB-AUD are on genes AACSP1, ANK1, and FBXO11. AACSP1 has been associated with an Alzheimer’s marker through interaction [84]. ANK1 has also been linked to Alzheimer’s disease through epigenetic deregulation [85, 86] and appears to play a role in immunomodulation [87]. Rs184204326 on gene FBXO11 was the only common variant significantly associated with SB-AUD. FBXO11 is highly conserved in evolution, and involved in the maintenance of genome stability [88] and the regulation of alternative splicing [89]. While the gene is expressed in various tissues, it is particularly abundant in the brain. FBXO11 has not been previously associated with SB. However, variants on or near FBXO11 have been associated with a number or correlated behavior traits and disorders such as alcohol consumption [76], smoking initiation [76, 90], risk taking behaviors [91, 92], externalizing behaviors [93], educational attainment [77], insomnia [94], schizophrenia and depression [95, 96]. De novo mutations on FBXO11 were found to cause intellectual disability with behavior problems as well as facial dysmorphisms [97] and other neurodevelopmental disorders [98]. Variant rs77969729 on FBXO11 has been linked to Alzheimer’s disease in the UKBiobank [99]. This variant is in LD with rs184204326 on FBXO11 in the AI cohort and associated with SB-AUD at p = 3.4E-7.

Rare and low-frequency mutations in PEDF are likely a risk factor for suicidal behaviors

SERPINF1 was the top gene significantly associated with SB in the rare and low-frequency variant analysis. It encodes serpin F1, and is also known as pigment epithelium-derived factor (PEDF). This gene has nine rare and low-frequency nonsynonymous or splicing site mutations, of which four are among the 0.1–0.5% most deleterious. Nearly 6% of individuals in the AI cohort carry some of these mutations. PEDF is a secreted glycoprotein and a potent inhibitor of angiogenesis [100]. It has been linked to obesity [101]. It also has neuroprotective effects and been implicated in depression [102, 103]. Reduction in PEDF levels have been found in the plasmas of patients with major depressive disorder (MDD), as well as in the prefrontal cortex (PFC) of animal models exhibiting depressive-like behaviors [103]. Conversely, overexpression of PEDF in the PFC has been shown to induce antidepressant “like” behaviors, by exerting effects on the tryptophan and glutamate in the PFC, with tryptophan being an essential amino acid precursor of serotonin, which is known to be associated with both depression and SB [104]. Another study has shown that PEDF in the hippocampus has a similar effect on depressive phenotypes in animal models by contributing to the synaptic formation and Wnt signaling activation in that region [105]. PEDF has thus been suggested as a biomarker and a novel therapeutic target for depression [102].

Additional genes significantly associated with SB included ZNF30, CD34, SLC5A9, OPRD1, and HSD17B3. ZNF30 encodes a zinc finger protein. A microdeletion of five genes including ZNF30 results in chromosome 19q13.11 deletion syndrome that includes features such as: developmental delay, microcephaly and intellectual disabilities [106]. The GWAS meta-analysis by the international suicide genetics consortium (ISGC) reported another zinc finger family gene (ZNF28) associated with SB and suicide death [31]. CD34 is involved in the innate immune system. Its variants have been associated with Alzheimer’s disease [99] and risk-taking behavior [92]. SCL5A9 is involved in sodium ion transport and is also known to be a sodium-dependent glucose transporter [107]. This gene has been associated with amygdala volume [108] and metabolic measurements [109]. Variants on or near HSD17B3 have been linked to smoking behavior [110], brain morphology measurements [111], memory performance [112] and Alzheimer’s markers [84, 113].

Opioid receptor delta 1 (OPRD1) gene encodes a member of the opioid family of the G-protein coupled receptors (GPCR). Delta opioid receptors are generally involved in reward mediation and neuroprotection. OPRD1 is specifically involved in the opioid receptor signaling pathway and cellular response to hypoxia. Numerous candidate gene studies have implicated OPRD1 in addiction, including opioid, cocaine, and alcohol dependence [114, 115]. Variants on OPRD1 have also been associated with schizophrenia [116] and educational attainment [77].

Hypoxia regulation is significantly associated with suicidal behaviors

Nonsyn rare and low-frequency variants in a pathway related to hypoxia inducible factor (HIF) regulation were significantly associated with SB (see Table 3). Nearly half of the individuals in the AI cohort have some of these mutations on their genes in this pathway. The network analysis of the top rare and low-frequency variant genes associated with SB also found that those genes were enriched for response to decreased oxygen levels (see Fig. 3).

Chronic hypoxia is suggested to be a risk factor for suicide [79], and metabolic stress associated with hypoxia is a possible mechanism [117]. Hypoxia is also hypothesized to increase the risk of suicide by reducing the synthesis of brain serotonin [81] or downregulating PEDF [118], which has a protective role in depression and is associated with SB in this AI cohort. HIFs are transcriptional factors that respond to reduced oxygen levels in cell and tissue. HIF-1 protects against hypoxia and reduces oxidative stress [119], while HIF-2α plays an important role in the modulation of inflammatory responses [120]. Recent studies have indicated that oxidative stress and abnormal energy metabolism in the brain play significant roles in the development of depression. Therefore, increasing HIF-1 activity has been suggested as a potential new therapeutic target for depression and suicide ideation [119]. Gene expression analyses have demonstrated that patients with MDD exhibit increased expression of HIF-1 and its target genes, including VEGF and GLUT1 (SLC2A1) [80]. Our network analysis of the top rare and low-frequency variant genes associated with SB in the AI have also found enrichments in VEGF receptors signaling and angiogenesis.

Alcohol exposure also can alter expression of HIF. For instance, alcohol exposure can induce HIF-1α activation, however the dose and timing of alcohol exposure results in differential expression of HIF-1α in the brain and other organs [121]. Both acute and chronic alcohol exposure have been found to increase HIF-1α expression in the brain cortex, whereas chronic binge alcohol exposure decreased HIF-1α expression [121]. Another study found that HIF3A could be epigenetically induced in the amygdala in animal models by acute alcohol exposure, and its epigenetic reprogramming was associated with the anti-anxiety effect of acute alcohol exposure [122]. Nonsyn rare and low-frequency variants in HIF3A were strongly associated with SB (nominal p = 2.9E-05, FDR = 0.09) in the present study.

Vasopressin-regulated water metabolism and hexose transport strongly associated with suicidal behaviors

Nonsyn rare and low-frequency variants in two additional pathways, vasopressin-regulated water reabsorption and cellular hexose transport, were strongly associated with SB in the AI. Vasopressin is an evolutionary ancient neuropeptide involved in regulating physiological processes such as renal water reabsorption and cardiovascular homeostasis. It also plays an important role in the modulation of emotional and social behaviors in the brain [123, 124], with vasopressin containing neurons most abundantly found in the hypothalamus. The vasopressin system is known to interact with the HPA axis [82]. Indeed, cortisol response and stress reactivity within the HPA axis is well-established as an endophenotype for depression and SB, as increased cortisol level is associated with death by suicide [125, 126]. HPA axis dysfunction has also been observed in those with a history of suicide attempts [126]. In addition, changes in water and electrolyte metabolism have been reported in clinical studies of depressed patients [127, 128]. Alcohol also uniquely interacts with the vasopressin system. Alcohol is a diuretic that promotes water loss by inhibiting the production of vasopressin. The HPA axis response to alcohol can be altered by manipulating the vasopressin system [129]. The vasopressin system has thus become an emerging therapeutic target for stress and depression, as well as alcohol-related behaviors [82, 129]. Gene families in the cellular hexose transport pathway mediate glucose absorption in the small intestine, glucose reabsorption in the kidney, glucose uptake by the brain across the blood-brain barrier, and glucose release by all cells in the body [130]. Evidence suggests that disturbances in glucose metabolism may be associated with suicidal ideation and attempts [131]. A recent study has found a significant association between blood glucose and suicide attempts in male patients with MDD [132]. Taken together, our findings suggest that pathways underlying vasopressin-regulated water reabsorption and the cellular hexose transport system warrant additional investigation in association with SB.

Strengths and limitations

The results of this study should be interpreted in light of several limitations. The analyses were not meant to generate a comprehensive model of suicide risk and AUD in this community group, but rather to determine whether specific genetic associations could be identified for suicidal thoughts and acts, and between SB and AUD phenotypes. A larger sample, powered to access and assess additional variables associated with suicide risk, particularly according to the above systems, is recommended. We further emphasize that American Indians are a diverse group both genetically and environmentally, therefore our findings may not generalize to other American Indians in the population from which the sample was drawn or be representative of all American Indians, as rates of AUD and suicide vary among tribes [17, 19, 133]. The population diversity in fact underscores the need to delineate risk and resilience factors in local communities to develop community-specific prevention and intervention efforts [10,11,12, 134]. Since our study utilized both whole genome sequence data and an extensive phenotyping instrument, there is presently no replication AI sample available to our knowledge, especially given that suicide risks in American Indians are overall understudied. While our specific findings may not generalize to other populations, the identified genes and functional pathways for SB may be shared by different populations thus warrant further investigations.

We used retrospective data for lifetime measures of suicide risk, which are subject to recall bias and may include reporting bias for psychiatric symptoms. Lifetime suicidal ideation and behaviors were assessed using a subscale of the SSAGA as well as verified death records. Suicidal behaviors were defined using a ranked numerical scale to index corresponding risk. Although suicide risk increases in severity in this direction, suicidal behaviors range in severity by outcomes of risk (e.g., suicidal ideation to suicide attempts to death by suicide) and within such outcomes (i.e., according to intensity, duration, and pervasiveness of suicidal ideation; the lethality of a suicide attempt, etc.). Future studies using validated measures of suicidal behavior, such as the Columbia-suicide severity rating scale (CSSRS), are warranted to increase specificity and precision within this area.

In conclusion, we conducted the first genome-wide bivariate association analysis for SB and AUD, and identified five genome-wide significant loci. We also conducted the first large-scale rare and low-frequency variant analysis and identified 11 novel genes and three pathways for SB. Of particular importance, this study represents the first investigation of genetic factors for SB in an American Indian population that has high risk for suicide. Although our findings may be population specific, the rare and low-frequency functional mutations relating to PEDF and HIF regulation may suggest an important mechanism underlying suicidal behaviors and a potential therapeutic target for treatment and intervention in the prevention of suicide.