Introduction

Suicide is the cause of over 33 000 deaths per year in the United States (1.3% of all US fatalities)1 and accounts for about 2% of deaths worldwide.2 In particular, Rocky Mountain States have the highest age-adjusted suicide rates in the US, and Utah is consistently in the top ten.3, 4, 5, 6 These statistics, and several unique data resources in Utah, have contributed to the development of the Utah Suicide Genetics Project. The Utah Office of the Medical Examiner (OME) is a state-wide centralized system for the investigation of all sudden and unexpected deaths. The University of Utah has had a long-standing collaboration with the OME, resulting in a large collection of DNA samples from Utah suicide decedents. Utah is also home to the Utah Population Database (UPDB), a rich resource of health data and family history information for over 6.5 million individuals, including descendants of the nineteenth century Utah pioneers (www.hci.utah.edu/groups/ppr/). Previous studies have shown low rates of inbreeding within the UPDB,7, 8 and the population is primarily of Northern European ancestry.9 We have recently combined the OME and UPDB resources to identify extended pedigrees with a higher than expected incidence of suicide.

Suicide is a heterogeneous condition, but genetic risk factors are known to have an important role. A meta-analysis of 21 family studies of suicide showed that first-degree relatives of suicidal probands have a three-fold risk increase for suicidal acts.10 A recent population-based study confirmed that familial risk extends beyond first-degree relatives, suggesting significant familial genetic risk,11 a risk that occurs above and beyond the effects of psychopathology.12, 13, 14, 15 Adoption studies also showed increased risk of suicide for biological relatives of suicidal probands, but not for the non-biological adoptive family.16, 17, 18 Recent estimates from multiple studies show significant heritability (h²) of completed suicide,19, 20, 21, 22, 23, 24 with an aggregate estimate of h²=45%.25 Taken together, these findings support a genetic etiology of suicide, even when accounting for other risk factors, such as psychiatric history, traumatic life events and socio-economic status.

Previous studies have implicated specific genes as risk factors for suicide using multiple methodologies,26, 27 but have also confirmed the heterogeneous nature of genetic risk. In this complex landscape, extended high-risk pedigrees may provide islands of increased genetic homogeneity and greater statistical power to search for genetic risk factors. Our high-risk pedigree resource gives us a unique opportunity to study familial genetic risk factors for completed suicide, a rarer condition (1–2/10 000 per year) than attempted suicide.28 We selected two high-risk pedigrees for this study. The first pedigree was chosen for an unusual excess of completed female suicides, perhaps indicative of a distinct genetic signature. In general, although the majority of suicide attempters are female, the majority of suicide decedents are male.1, 9 Completed female suicides impart greater suicide risk to offspring,13 and female suicidal behavior shows higher heritability.18, 19 We selected the second pedigree because it showed the highest familial risk when compared with matched Utah population data. It also had the most significant profile of psychiatric and substance abuse co-morbidity. Pedigrees with particularly high risk and/or recognizable distinguishing characteristics such as these may improve the ability to detect specific gene effects, or may reveal risk genes specific to particular co-morbid conditions. The intent of the study of these extended pedigrees is to identify genes and gene families that contribute to risk and that might eventually be developed as targets for interventions.

Materials and methods

Decedent cases collected by the Utah State OME

This project is possible in part because of a long-standing collaboration with the Utah OME, which allows DNA to be collected from suicide cases for the purpose of studying suicide risk factors, with approval from the University of Utah and the Utah Department of Health Institutional Review Boards (IRBs). Table 1 summarizes the collection. From 1996 to 1999, collection focused on youth suicide, and collection was limited in 2003 and 2004 because of funding gaps. Otherwise, collection was as inclusive and therefore as representative of suicides in Utah as possible. The collection is mostly White (95.53%), consistent with the homogeneous racial composition of Utah,29 and predominantly male (80.63%), reflecting the well-established gender signature of completed suicides.30 Using death certificate data across the collection period (12/1996–7/2011), suicides with DNA showed no significant difference in gender distribution as compared with other Utah suicides (80.6% male vs 80.5% male). The average age at death was somewhat younger for suicides with DNA than for other Utah suicides (mean 38.53 vs 41.35, P<0.01), which is perhaps not surprising given the focus on youth suicide in the collection from 1996 to 1999.

Table 1 Characteristics of Utah decedents with DNA collection from December 1996 to July 2011

Record linking

We generated a list of the 2215 numeric OME IDs from the DNA samples described in Table 1. OME staff transferred identifying information on these suicides to the UPDB, with additional permission from IRBs of the University of Utah and Utah Health Department, and from the University of Utah Resource for Genetic Epidemiologic Research.31 These regulatory committees and the study investigators respect the sensitivity of the data and the privacy of families; therefore, subsequent use of the data and DNA samples for this study involves no attempt to identify subjects or re-contact family members. Ninety-nine percent of these cases linked to the UPDB. All subjects’ identities are henceforward referenced only by anonymous IDs.

High-risk pedigrees

Although the study focuses on the 2215 decedents with DNA, there were 12 850 additional suicides recorded on Utah death certificates (available from the UPDB) dating back to 1904. We included these individuals in the identification of pedigrees to make the familial risk estimates as accurate as possible. To determine risk, we used the familial standardized incidence ratio (FSIR),32 which is calculated by taking a cohort of each decedent’s relatives and comparing the observed incidence of suicides in that cohort against the expected uniform distribution for suicide stratified by sex and age using the statewide population from the UPDB. The FSIR also weights the contribution of each relative to the familial risk by assessing the probability that the relative shares an allele with the decedent through a common ancestor.
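The weighted observed-to-expected comparison described above can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the published FSIR implementation: each relative is assigned a single sex and age stratum, the kinship weight is taken as precomputed, and the `Relative` class, the `fsir` function and the rate table are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Relative:
    kinship: float        # assumed precomputed probability of sharing an allele with the decedent
    sex: str              # "M" or "F"
    age_band: str         # hypothetical stratum label, e.g. "25-44"
    years_at_risk: float  # person-years contributed within that stratum
    suicide: bool         # True if this relative died by suicide


def fsir(relatives, rates):
    """Kinship-weighted observed suicides divided by kinship-weighted expected suicides.

    `rates` maps (sex, age_band) to a population suicide rate per person-year.
    """
    observed = sum(r.kinship for r in relatives if r.suicide)
    expected = sum(
        r.kinship * rates[(r.sex, r.age_band)] * r.years_at_risk
        for r in relatives
    )
    if expected == 0:
        raise ValueError("expected count is zero; ratio is undefined")
    return observed / expected
```

A ratio above 1 indicates more suicides among weighted relatives than the stratified population rates would predict. A fuller treatment would split each relative's follow-up across several age strata rather than one.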

Other phenotype data were available through the OME for the decedents on whom DNA was collected, including age at death, race, gender and method of death. We also investigated electronic diagnostic information from the UPDB on all pedigree relatives to determine familial risk patterns of co-morbid conditions, including affective disorders, psychosis, alcohol abuse and drug abuse.

Genotyping

Samples were genotyped using the Illumina HumanExome BeadChip. This chip contained variants derived from the results of exome sequencing of over 10 000 samples in previous studies, including studies of schizophrenia, major depression and other psychiatric disorders. Any variant observed at least twice in these samples and passing quality control was included on the chip (genome.sph.umich.edu/wiki/Exome_Chip_Design).

DNA samples were tested for quality and quantity standards and genotyped at the University of Utah Health Sciences Center genomics core facility using Illumina protocols and instruments. Genotyping was also done on a larger sample of 459 other Utah suicide decedents that did not include the 15 individuals in these pedigrees. Within this sample, duplicate blind repeated controls within trays resulted in over 99.8% consistency in genotypes. Genotype data were screened using PLINK,33 resulting in the deletion of 456 poorly performing single-nucleotide polymorphisms (SNPs) (missing for ≥10% of the sample), and 50 764 SNPs that were homozygous and uninformative in this cohort. For the present study, we also excluded 140 insertion/deletion sites (indels). We therefore retained 196 571 SNPs for analysis. The total genotyping rate for these remaining SNPs was 98.83%. Illumina’s TOP/BOT allele coding was converted to dbSNP forward-strand coding for comparability to control data using the GenGen utility.34
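The two SNP-level screens described above (missingness of at least 10%, and SNPs with no allelic variation in the cohort) can be illustrated with a short sketch. This is plain Python standing in for the PLINK screening, not PLINK itself; the function name `qc_filter` and the genotype encoding (two-letter strings, `None` for a missing call) are assumptions made for the example.

```python
def qc_filter(genotypes, max_missing=0.10):
    """Split SNPs into kept, dropped-for-missingness and dropped-as-monomorphic.

    `genotypes` maps a SNP identifier to a list of calls, one per sample.
    Each call is a two-character string such as "AG", or None when missing.
    """
    kept, dropped_missing, dropped_monomorphic = [], [], []
    for snp, calls in genotypes.items():
        called = [c for c in calls if c is not None]
        # A SNP with no samples at all is treated as fully missing.
        missing_rate = (len(calls) - len(called)) / len(calls) if calls else 1.0
        if missing_rate >= max_missing:
            dropped_missing.append(snp)
            continue
        # Only one allele observed across all called genotypes: uninformative.
        alleles = {allele for call in called for allele in call}
        if len(alleles) < 2:
            dropped_monomorphic.append(snp)
            continue
        kept.append(snp)
    return kept, dropped_missing, dropped_monomorphic
```

The missingness check runs first, so a SNP that fails both screens is counted only once, under missingness.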

Genetic analyses

Genetic analyses were done using the newly developed Variant Annotation, Analysis, and Search (VAAST) program package.35 VAAST applies likelihood testing to determine the probability that a given variant may have a role in disease susceptibility. We used an extension of this analysis tool (pVAAST), which incorporates extended pedigree relationships directly into the likelihood calculations (presentation, Huff et al., ASHG Conference, November 2012).

We first annotated the chip variants with the VAAST Variant Analysis Tool, using reference annotations available from www.sequenceontology.org/gff3.shtml. We used the VAAST variant selection tool to select the set of cases within each pedigree, and created files that reflected pedigree structure. For each locus, VAAST then compared the likelihood of a null model, in which there is no difference in minor allele frequency between the ‘target’ samples (pedigree cases) and the primary comparison control samples (‘background’), with that of an alternative model in which the minor allele frequencies differ. VAAST uses a composite likelihood ratio, weighting variants by their potential for damaging effects according to amino acid substitution data. The pVAAST extension also takes familial relationships into account in the composite likelihood ratio, prioritizing variants that tend to segregate with phenotype. Evidence across a gene is then aggregated to implicate particular genes. P-values are empirical, generated through a combination of a genome permutation process to account for the case-control allele frequency distribution, and a gene-drop simulation to account for pedigree structure. The maximum number of permutations was set at 1 000 000. When no permutation resulted in a higher likelihood than the actual composite likelihood ratio, the number of target and background samples, together with the number of permutations, determined the lowest possible empirical P-value (in this case 1 × 10⁻⁶).
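The relationship between the number of permutations and the smallest reportable empirical P-value can be sketched as below. This is an illustrative calculation, not pVAAST's code: the function name `empirical_p`, the handling of ties (counted as exceeding) and the floor of one over the number of permutations are assumptions chosen to match the 1 × 10⁻⁶ floor stated in the text for 1 000 000 permutations.

```python
def empirical_p(observed_stat, null_stats):
    """Share of permuted statistics at least as large as the observed one.

    When no permuted statistic reaches the observed value, the result is
    floored at 1 / len(null_stats), so it never reports exactly zero.
    """
    n = len(null_stats)
    if n == 0:
        raise ValueError("at least one permutation is required")
    exceed = sum(1 for s in null_stats if s >= observed_stat)
    return max(exceed, 1) / n
```

Under this convention, a gene whose observed statistic is never matched in 1 000 000 permutations is reported at the floor rather than at zero, which is why more permutations are needed to resolve smaller P-values.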

Four sources were used for comparison of pedigree variants. The primary comparison control data used to determine significance and rank order the variants (‘background’) was a group of 459 other Utah suicides not related to the selected extended pedigrees. These individuals were taken from the larger set of 2215 suicides, and were therefore from the same population. They were also genotyped on the same platform as the pedigree cases. Variants from our two selected pedigrees were compared with this background data set, and retained if the P-value met the 0.05 threshold. These variants were then screened and deleted if they occurred in only one pedigree member, or if there were evident coding errors (reference allele not seen, two other alleles reported in dbSNP). The primary comparison to other suicides was conservative, because it retains only those variants that are elevated within the pedigree (presumed to contain individuals at higher genetic risk) but not significantly elevated in other Utah suicides (presumed to have genetic and environmental risk factors to varying degrees).

We next removed variants as likely false positives if they had a frequency >5% in either dbSNP or in sequence data from 1358 individuals taken from the 1000 Genomes project and from the Complete Genomics Diversity Panel dbSNP. Finally, remaining variants were compared with genotyping from 398 Utah control individuals to determine the aggregate frequency of the variants in this second Utah resource. These individuals were ascertained as general population controls; the mean age at last contact was 65.4 (standard deviation=10.7). The gender ratio was 56% male. This sample was genotyped using the Illumina 610Q platform. Aggregate frequency comparisons used genotyped SNPs when available. If imputed SNPs were used for comparison, imputations were done with a probability of >95% accuracy. Imputations were performed using Segmented HAPlotype Estimation and Imputation Tool version 2 (ShapeIT2; www.shapeit.fr)36, 37 for phasing and IMPUTE2 38, 39, 40, 41, 42 for calculations of imputed genotypes.
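The successive screening rules described in this section can be collected into one predicate. The sketch below is illustrative only: the `Variant` fields and the `passes_screen` function are hypothetical names, and the P-value threshold is treated as inclusive, which is an assumption because the text gives the threshold value (0.05) without a comparison sign.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str                      # placeholder label, not a real identifier
    p_value: float                 # empirical P-value versus the suicide background set
    carriers_in_pedigree: int      # number of pedigree cases carrying the variant
    coding_error: bool             # e.g. reference allele not seen
    freq_dbsnp: float              # frequency reported in dbSNP
    freq_sequence_controls: float  # frequency in the 1358 sequenced controls


def passes_screen(v, p_max=0.05, freq_max=0.05):
    """Apply the screening rules in the order given in the text."""
    if v.p_value > p_max:           # assumed inclusive threshold
        return False
    if v.carriers_in_pedigree < 2:  # seen in only one pedigree member
        return False
    if v.coding_error:
        return False
    # Too common in either reference set to be a plausible rare risk variant.
    if v.freq_dbsnp > freq_max or v.freq_sequence_controls > freq_max:
        return False
    return True
```

Variants that pass would then be looked up in the 398 Utah controls to report an aggregate frequency; that step is descriptive and removes nothing, so it is left out of the predicate.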

Results

Identification of pedigrees and determination of familial risk

We identified pedigrees with excess suicides (death certificate and OME), then filtered results to contain only pedigrees that had more suicide cases than expected by chance at a P-value threshold of 0.01. We used all the data combined to compute the FSIR for each pedigree. We obtained 773 pedigrees that met the criteria above, then eliminated pedigrees that did not contain five or more suicides with DNA (to maximize genetic informativeness), which left a set of 184 pedigrees. These 184 pedigrees were sorted by their FSIR and the top 30 founders were selected for an additional manual analysis to eliminate any multiple founders claiming the same descendants through marriage, resulting in 22 unique pedigrees. The incidence ratio was also adjusted for genetic (meiotic) distance between suicide decedents assuming a simple single allele inheritance. The FSIR values ranged from 2.26 to 2.91 for the top 22 pedigrees, indicating that the clustering of suicides in these highest risk suicide pedigrees is 2–3 times greater than expected by chance. Table 2 gives basic characteristics of each pedigree found in this initial search. The number of suicides per cluster in the top 22 suicide pedigrees ranged from 10 to 51; expected numbers of suicides in these same families ranged from 2.95 to 17.51. There were a total of 479 suicide cases within these pedigrees, 138 with DNA. Numbers of suicides with DNA in each pedigree ranged from 5 to 11. Figure 1 shows the two pedigrees chosen for this genetic analysis. Other living pedigree members have been omitted, and gender has been disguised to protect the privacy of families and minimize the risk of loss of confidentiality.

Table 2 Characteristics of 22 high-risk pedigrees. Pedigrees 5 and 12 were selected for initial study
Figure 1

High-risk pedigree 12 and high-risk pedigree 5. Suicide cases that are shaded in red are those who have DNA samples collected by the OME; cases shaded in black are suicides known from death certificate data, but do not have DNA. In pedigree 12, for the three siblings on the left side, the suicides were separated by 6–26 years. The parent–child suicides on the right side of the pedigree are separated by 22 years. In pedigree 5, the half-sib suicides on the left side are separated by 6 years. The parent–child suicides in the center of pedigree 5 are separated by 13 years. The half-sib suicides on the right side of pedigree 5 are separated by 3 years.

Demographic characteristics

When we compared the gender distribution of decedents within the 22 pedigrees (OME and death certificate cases) with that of the non-familial decedents, we found no significant difference (P=0.78). However, a Fisher’s exact test demonstrated a significant difference in gender ratio for pedigree 12 (more females than expected, P=0.001) and pedigree 4 (more males than expected, P=0.05). Average age at death of all OME decedents was 38.53 years. A comparison of decedents in the 22 pedigrees with non-familial cases revealed significantly younger age at death in the familial cases (familial average=36.18 vs non-familial average=38.92; t=3.16, P=0.002). When we examined each individual pedigree separately, we found that this age effect was particularly pronounced in pedigree 2 (P=0.0002), pedigree 7 (P=0.01), pedigree 12 (P=0.04) and pedigree 18 (P=0.005). There was no significant difference in average calendar year of death between familial and non-familial cases, either across all pedigrees or within pedigrees, suggesting no substantial time trends.

Other phenotypes

Many pedigrees also showed significant elevation of co-morbid risk (Table 2). Method of death was obtained from the OME and also from the death certificates. For all familial suicides combined, violent death was common (79.33%). For the subset of suicides with DNA, violent death occurred for 80.43% of the deaths. Methods were primarily gunshot wounds and hanging. Non-violent methods were split evenly between asphyxiation and drug-related methods. Of the other familial co-morbid conditions investigated, drug-related disorders were by far the most common, showing significance for 18/22 (82%) of the pedigrees. Significant familial co-morbidity was also seen for affective disorders (8/22=36%), alcohol disorders (4/22=18%) and psychotic disorders (1/22=5%).

Pedigrees 12 and 5

We chose two pedigrees for initial analysis, attempting to choose those that may be most genetically informative (see Figure 1). In our first selected pedigree (pedigree 12), there were 19 observed suicides (7.28 expected suicides based on population data). Ten of these 19 suicides (53%) were female. DNA was available for 6 of these 10 suicides. Pedigrees with an excess of female cases, given the rarity of female suicide, may be an indication of distinct genetic risk factors. Additionally in this pedigree, the six decedents with DNA had relatively young ages at death, ranging from 16 to 36. Affective disorders were present in all subjects. Causes of death were violent in four cases and non-violent in two cases.
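To give a sense of how unusual 10 female suicides out of 19 would be, the sketch below computes a one-sided binomial tail probability. It is a simplified stand-in and not the Fisher's exact test used in the study: it assumes each suicide is independently female with the population proportion implied by the death certificate data (80.5% male among other Utah suicides), and the function name `binomial_upper_tail` is introduced here for illustration.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Probability of k or more successes in n independent trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumed female share of suicides, taken as 1 minus the 80.5% male figure.
p_female = 1 - 0.805

# Chance of seeing at least 10 female suicides among 19 under that share.
tail = binomial_upper_tail(10, 19, p_female)
print(f"P(at least 10 of 19 female) = {tail:.2e}")
```

Because relatives are not independent observations and the study compared against observed counts rather than a fixed proportion, this figure is only a rough check on the direction and scale of the reported result.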

The second pedigree (pedigree 5; Figure 1) is our largest pedigree, and also contained the greatest number of related suicides (51 observed, 17.51 expected based on population data). This pedigree was chosen because it had the highest FSIR (2.91) of all the pedigrees, and a large number of related suicides were available to study for familial sharing. It also had the most significant profile of psychiatric and substance-use disorder co-morbidity. DNA was available for 10 of these 51 cases, though only 9 had sufficient quantity for this preliminary genotyping. The decedents with DNA were aged 17–75 at death, and cause of death was violent in nine cases and non-violent in one case. Depression and/or alcohol abuse was documented for five cases.
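The observed and expected counts quoted for the two pedigrees can be turned into crude observed-to-expected ratios, as in the short calculation below. This is plain arithmetic on the numbers in the text and ignores the kinship weighting that the FSIR applies; the names `pedigrees`, `crude_ratio` and `ratios` are illustrative.

```python
# (observed suicides, expected suicides) as quoted in the text
pedigrees = {
    "pedigree 12": (19, 7.28),
    "pedigree 5": (51, 17.51),
}


def crude_ratio(observed, expected):
    """Unweighted observed-to-expected ratio."""
    return observed / expected


ratios = {
    name: round(crude_ratio(obs, exp), 2)
    for name, (obs, exp) in pedigrees.items()
}
print(ratios)
```

For pedigree 5 the crude ratio rounds to 2.91, which agrees with the FSIR reported for that pedigree.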

Molecular results

Table 3 presents genes with variants from the HumanExome BeadChip for pedigree 12. All frequencies are based on a simple count of heterozygotes unless the presence of one or more homozygotes is noted. Variants were retained for further scrutiny if they met the empirical P-value threshold of 0.05, and if they occurred in more than one pedigree member. Variants were omitted because of likely scoring errors, or because the frequency in either dbSNP or in the unselected control data (N=1358) was >5%. Finally, we report the frequency of each SNP in an additional Utah control sample of 398 individuals. The table presents the remaining results after this screening was done. Several of the genes are of functional interest, as presented in the discussion below.

Table 3 For pedigree 12, candidate genes containing sequence variants meeting thresholds for further investigation as compared against Utah suicide decedents not related to the pedigree, and checked against other control data sets

Table 4 presents results for pedigree 5. Again, variants were eliminated if they only occurred in one pedigree member, or were likely scoring errors. As with pedigree 12, variants were also eliminated if they showed frequency of >5% in either dbSNP or the unselected control sequence data. The table presents the remaining results, with additional frequency data from the Utah control sample of 398 individuals. Relevant phenotypic associations of the findings for this pedigree are presented in the discussion below.

Table 4 For pedigree 5, candidate genes containing functional sequence variants that passed screening tests for false positives, were rare in unselected controls and showed significant familial evidence

Discussion

We have identified 22 unique high-risk pedigrees with DNA from related suicides, ascertained through the UPDB, a unique Utah data resource containing genealogical records, demographic information, and electronic medical data. Through comparisons with computed age- and sex-specific incidence rates of suicide in Utah, we found that clustering of suicides in these pedigrees was 2–3 times greater than expected by chance, consistent with a recent population-based study showing familial risk extending further than first-degree relatives.11

Each pedigree has at least five suicides with DNA, and several show interesting demographic characteristics and/or significant familial co-morbidity for psychiatric or substance abuse conditions. Taken as a group, familial decedents were significantly younger at death than non-familial cases, and four of the pedigrees had notably young average age at death. Although suicide attempts peak at 15 years of age, actual suicide completion peaks after age 65.43 Younger completed suicide is therefore rarer (7–13/100 000), and may reflect greater genetic risk. Eighteen of the 22 pedigrees had significant co-morbidity for drug abuse, alcohol disorders, affective disorders and/or psychotic disorders. Manner of death in the pedigrees was primarily violent, which has been associated with higher lifetime aggression and impulsivity.44

Gender distribution among all familial decedents combined did not differ from the non-familial OME cases; however, pedigree 12, in which over half of the decedents were female, was significantly different from the overall gender ratio, suggesting that this pedigree would possibly be interesting for initial molecular studies. We additionally chose pedigree 5 because it showed the highest familial risk and the most significant profile of psychiatric and drug/alcohol abuse co-morbidities. It was also our largest pedigree with the most relatives who were decedents (N=51), and had nine DNA samples available for genotyping and inspection of familial sharing of the resulting variants.

Results from the HumanExome BeadChip were prioritized using only the analysis methods described above without consideration of gene function, then were investigated for prior disease associations and functional significance. Of interest, no variant was found that was shared across all suicide decedents in either pedigree; rather, variants were shared usually across two or three of the related suicides. This finding may be due to the genotyping platform, which is made up of rare, potentially damaging sequence variants gleaned from other studies. It is possible that when these families are studied using more complete genotyping and/or sequence data, the pedigrees will show more common familial risk factors, shared across more of the related decedents.

Several variants occurred in genes coding for membrane proteins, including two rare variants in the FAM38A gene, which codes for Piezo-type mechanosensitive ion channel component 1,45 and has been associated with senile plaque-associated astrocytes.46 Both variants were found in pedigree 5, and are considered probably damaging, with PolyPhen2 47 scores of 0.997 (rs202103485) and 0.999 (rs200970763). The variants in histidine rich carboxyl terminus 1 (HRCT1), found in pedigree 5, and in transmembrane protein 141 (TMEM141), found in pedigree 12, were rare in all comparison data sets. Although their functions are not clearly known, both have been identified as transmembrane proteins.48, 49 Finally, Anoctamin 5 (ANO5) showed evidence for familial sharing in pedigree 12. This gene encodes a calcium-activated chloride channel transmembrane protein associated with muscular dystrophies.50

Several additional variants had interesting prior associations with neuronal function and/or psychiatric conditions. Nuclear factor kappa B1 (NFKB1) seen in pedigree 12, is a transcription factor involved in learning and memory,51, 52 as well as inflammatory disease 53 and alcohol-use disorders.54 The variant in caspase-9 (CASP9) was shared in pedigree 12. CASP9 is a component of the apoptotic pathway, and has been associated with inflammatory diseases 55 as well as bipolar disorder.56 Plexin B1 (PLXNB1), found in pedigree 12, is a receptor for semaphorin, and is involved in neural circuitry 57 and neuronal development.58 Phosphodiesterase 11A (PDE11A), found in pedigree 12, downregulates cAMP and cGMP signaling and has shown associations with schizophrenia, bipolar disorder and major depression,59 and mixed evidence of association with antidepressant response.60, 61

Variants in several other genes had familial evidence with P-values<0.005 in pedigree 5. High-density-lipoprotein-binding protein (HDLBP) helps regulate HDL cholesterol levels. Low levels of HDL have been previously associated with suicide.62, 63 THO complex 1 (THOC1) is involved in transcriptional regulation,64 and is regulated by NEDD4,65 which is a ubiquitin–protein ligase gene essential for normal neuronal development and function.66 Autism susceptibility candidate 2 (AUTS2) has been associated with autism and neurodevelopmental conditions,67 but also with heroin addiction,68 alcohol consumption 69 and with suicide.70 MUTYH encodes a mitochondrial protein involved in the repair of DNA damage caused by oxidation. Oxidative stress has been previously implicated as a risk factor for depression and suicide.71, 72, 73 Two genes with variants of interest found in pedigree 12 were related to DNA repair. Thymine-DNA glycosylase (TDG) is involved in DNA mismatch repair,74 and DNA-damage inducible transcript 4-like (DDIT4L) has been shown to be upregulated in response to hypoxia, possibly related to oxidative stress.75

In summary, we describe a valuable pedigree resource which will continue to provide information for the study of genetic risk factors leading to suicide. The long-term goal of the study is to identify genes and gene pathways that may elevate risk to facilitate future therapeutic interventions. This preliminary analysis was done using the HumanExome BeadChip, and was therefore limited to the investigation of the rare sequence variants included in this genotyping platform. However, results from two of our high-risk pedigrees suggested several potential candidate risk variants. In addition to explorations of these candidate risk variants in other available data sets, future plans also include obtaining sequence data on these high-risk families, which will allow us to screen more comprehensively for shared functional genetic risk variants, and also to detect familial shared genomic regions that may harbor additional regulatory mutations.