Introduction

Antisocial personality disorder (ASPD) is a life-long condition involving habitual irresponsible and delinquent behavior, with prevalence of 1–3% in the general population, and 40–70% in prison populations.1, 2, 3, 4, 5 Previous twin and adoption studies report heritability estimates for ASPD up to 50%,6, 7 and several studies have attempted to unravel the genetic background of antisocial personality. Although men have consistently been found more often antisocial than women, it has been suggested that antisocial personality emerges from the same familial (including genetic and environmental factors) and non-familial influences in both sexes.8 Conduct disorder prior to age 15 is an essential diagnostic criterion for ASPD, and it markedly increases the risk for ASPD in adulthood.9 In a recent genome-wide association study (GWAS), Dick et al.10 found several markers with genome-wide significance associated with conduct disorder symptomatology, especially in the gene C1QTNF7, although none remained significant when individuals were classified dichotomously as cases and controls. Attention-deficit/hyperactivity disorder (ADHD) increases the risk for ASPD,11 and genes that have been previously found in association with ADHD have also been tested for association with ASPD. 
In a recent study, ADHD-linked SNAP25 polymorphisms (DdeI T/T, MnlI T/T haplotype) were associated with novelty-seeking scores in male ASPD subjects, and were more common in ASPD males than in sex-matched controls.12 Another candidate gene study revealed an association of a COL25A1 variant with comorbid ASPD and substance dependence in a subpopulation of African American and European American samples of substance-dependent patients with ASPD.13 In another recent study, a variant of a gene previously linked to ADHD, CDH13, was associated with extreme violent behavior, and the finding was replicated in another Finnish cohort of violent offenders.4, 14 In the same study, the association of the MAOA low-activity promoter genotype with violent criminal behavior was replicated. MAOA deficiency was first reported in association with impulsive and aggressive behavior in a study of a Dutch family cohort,15 and later, an interaction between the MAOA low-activity allele and childhood maltreatment was linked to antisocial behavior.16 Recently, Tielbeek et al.17 reported the first genome-wide approach in the analysis of population-based samples of cases with antisocial behavior compared with population-based controls with few or no antisocial tendencies. Their study revealed no genome-wide significant associations, and the strongest association was observed in the DYRK1A gene (P=8.70 × 10−5). In conclusion, although a number of studies have investigated antisocial behavior, no genome-wide significant or replicable findings on genes contributing to ASPD have been obtained thus far.

Here, our aim was to conduct the first GWAS in a cohort of prisoners fulfilling the DSM-IV diagnostic criteria for ASPD, as compared to controls from the general population, and to replicate the most significant findings in another set of prisoners and controls. We then examined the impact of the identified genetic risk for ASPD on antisocial features, and its interaction with a childhood risk environment, in the general population. Finally, we investigated the functional implications of the risk variants for ASPD by searching for their correlations with gene expression in human tissue databases.

Materials and methods

Participants

The study population including genotype data consisted of a total of 543 subjects with ASPD and 9616 participants from the general population. The clinical and sociodemographic characteristics of the participants in this study are shown in Table 1, and in the flow chart in Figure 1.

Table 1 The clinical and sociodemographic characteristics of the participants
Figure 1

The flow chart of the study. Box (a) illustrates the GWAS analyses including 370 cases (339 males, 31 females) and 5850 controls (3345 males, 2505 females) from the GenMets (subsample of the Health 2000 study) and Corogene (subsample of the FINRISK study) control samples. (b) On the basis of the GWAS results, eight variants were selected for analyses in the replication sample composed of those CRIME cases (N=173, 141 males, 32 females) and Health 2000 controls (N=3766, 1587 males, 2179 females) that were not included in the GWAS analyses. (c) The eight replication variants were included in the meta-analyses of the GWAS sample and the replication sample, including altogether 543 cases and 9616 controls. (d) The most significant hit of the analyses, rs4714329, was analyzed in the population sample for the association with antisocial features (N=4944, 2211 males, 2733 females), and for the gene × environment interaction of antisocial features and childhood adverse environment (N=1536, 636 males, 900 females). GWAS, genome-wide association study; HLA, human leukocyte antigen; LD, linkage disequilibrium; MAF, minor allele frequency.

CRIME cohort

The Finnish CRIME sample was collected during 2010–2011, and it comprises 794 samples from all of the largest prisons in Finland. The sample collection procedures have been described in more detail previously.4 In short, prisoners were screened for antisocial personality utilizing the Structured Clinical Interview Axis II (SCID-II) for the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV). Sexual crime offenders, as well as individuals with a diagnosis of psychosis, were excluded from the sample. Altogether, 568 (500 men, 68 women) criminal offenders fulfilled the criteria for antisocial personality disorder, whereas 196 were classified as non-antisocial, and 30 criminal offenders had an unknown ASPD status. Of the ASPD cases, 370 randomly selected individuals formed the discovery sample with GWAS data available, and 173 were included in the replication sample. The history of criminal convictions was obtained from the National Crime Register, and 77% of the antisocial prisoners were violent offenders who had committed at least one violent crime. The remaining individuals of the ASPD sample were offenders who had been convicted of non-violent crimes, such as house break-ins or crimes against property. The history of substance abuse (alcohol, heroin, buprenorphine, amphetamine, cannabis and/or other) as well as childhood circumstances (for example, good circumstances, indifferent parents or severe maltreatment such as family violence) were screened with a questionnaire. Subjects provided written informed consent. This study was approved by the Ethics Committee for Pediatrics, Adolescent Medicine and Psychiatry, Hospital District of Helsinki and Uusimaa, and Criminal Sanctions Agency. All the subjects who participated in the study received a voucher of 20 euros for their participation.

Control cohorts

The Health 2000 Study, including the GenMets sub cohort, and The National FINRISK Study, including the Corogene sub cohort, were used as control cohorts in the study (N=5850 with GWAS data available, used in the discovery phase; and N=3766 in the replication analysis; Supplementary Information).

Phenotypes

Phenotype in GWAS, replication and secondary analyses of the ASPD sample

The subjects of the CRIME cohort who fulfilled the DSM-IV criteria for ASPD, as diagnosed by SCID-II, were classified as cases. In addition, 22 individual items of the SCID-II questionnaire for ASPD were used in secondary analyses and analyzed individually. The phenotypes are described in detail in the Supplementary Information.

Phenotype in the secondary analyses of the population-based sample

The phenotype of antisocial features in the Health 2000 cohort was assessed with a scale that is part of the home interview of the Health 2000 survey. The scale is derived from the 50-item ‘Cook-Medley hostility scale’,18, 19 and it included eight questions focusing on the aspects of deceitfulness, distrust of other people, and lack of empathy. Information on economic difficulties or severe conflicts in the childhood family was collected and applied in gene-environment interaction analyses (Supplementary Information).

Genotyping and statistical analyses

Genotyping

Illumina Human670QuadCustom Beadchip and HumanHap610-Quad SNP array (Illumina, San Diego, CA, USA) were used for genome-wide genotyping of common single-nucleotide polymorphisms (SNPs) at the Wellcome Trust Sanger Institute, Cambridge, UK for the CRIME and control cohorts, respectively. Stringent quality control was used and only SNPs originally included in all data sets of controls and cases were analyzed, resulting in 481866 SNPs in the final data set. Targeted genotyping for both controls and cases was performed simultaneously with Sequenom MassARRAY iPLEX technology (San Diego, CA, USA) at the Institute for Molecular Medicine, Finland (for details on genotyping, see Supplementary Information).

HLA-imputation

Imputation of the GWAS genotype data with the classical human leukocyte antigen (HLA) alleles was performed utilizing the HIBAG software20 (for more information, see Supplementary Information).

Statistical analyses and linkage disequilibrium analyses

The power calculations were performed utilizing Genetic Power Calculator (http://pngu.mgh.harvard.edu/~purcell/gpc/).21 All of the association analyses were performed using a generalized linear (logistic) model with PLINK v1.07 (http://pngu.mgh.harvard.edu/~purcell/plink/),22 with age, sex and the first 10 multidimensional scaling clusters as covariates. The meta-analyses with a fixed-effect model were performed with GWAMA software (http://www.geenivaramu.ee/en/tools/gwama).23 SPSS (IBM SPSS Statistics for Windows, Version 22.0, released 2013; IBM, Armonk, NY, USA) was also used. The linkage disequilibrium (LD) and haplotype analyses were performed utilizing PLINK v1.07, Haploview,24 and SNAP (https://www.broadinstitute.org/mpg/snap/).25 A detailed description of the statistical analyses can be found in the Supplementary Information.
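As an illustration of the fixed-effect model mentioned above, the following is a minimal sketch of inverse-variance pooling of per-study log odds ratios. It is not the GWAMA implementation; the function names and the numeric inputs are hypothetical and chosen only for demonstration.

```python
import math

def fixed_effect_meta(betas, ses):
    """Pool per-study log odds ratios using inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled_beta, pooled_se

def two_sided_p(beta, se):
    """Two-sided Wald P-value under a standard normal approximation."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))

# Hypothetical inputs: two studies, each reporting log(OR) and its standard error.
beta, se = fixed_effect_meta([math.log(1.5), math.log(1.7)], [0.10, 0.12])
print(f"pooled OR = {math.exp(beta):.2f}, P = {two_sided_p(beta, se):.2e}")
```

Under this model the study with the smaller standard error receives the larger weight, so the pooled estimate lies closer to it.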

Gene expression analyses

The GTEx Portal (http://www.gtexportal.org/home/)26 and The Braineac—the Brain eQTL Almanac (http://www.braineac.org/) were utilized to investigate correlations between variant genotype and brain- and testis-tissue gene-expression levels. More information can be found in the Supplementary Information.

Results

GWAS for ASPD

We conducted genome-wide association analyses for DSM-IV-based ASPD using the entire GWAS study sample (N cases=370, N controls=5850), as well as the sample including only males (N cases=339, N controls=3345). The number of female cases was too small (N=31) to perform a separate analysis. The Manhattan plots for the genome-wide analyses are presented in Figure 2, illustrating the results of the analyses for the whole sample and for the sample including only males, respectively. Supplementary Tables 1a and 1b display the 50 most significant variants associated with antisocial personality disorder in the combined sample and in males, respectively. None of the associations achieved genome-wide significance (P<5.0 × 10−8). However, in the quantile-quantile (Q–Q) plots of the observed versus expected results (−log10(P-value)) for both analyses, several data points were observed above the diagonal line, indicating more significant associations than expected by chance (Supplementary Figures 1a and b).
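The construction of a Q–Q plot can be sketched as follows. This is a generic illustration, assuming the common convention that the i-th smallest of n P-values is compared with the uniform quantile (i − 0.5)/n; the function name and the example P-values are hypothetical and are not taken from the study data.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs for a Q-Q plot."""
    n = len(p_values)
    ordered = sorted(p_values)  # smallest P-value first
    points = []
    for rank, p in enumerate(ordered, start=1):
        # Expected quantile of the rank-th smallest P-value under the null
        expected = -math.log10((rank - 0.5) / n)
        points.append((expected, -math.log10(p)))
    return points

# Hypothetical P-values; points with observed > expected lie above the diagonal.
for expected, observed in qq_points([0.001, 0.20, 0.45, 0.70, 0.95]):
    print(f"expected={expected:.2f} observed={observed:.2f}")
```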

Figure 2

Manhattan plot of the entire sample of males and females combined in the analysis of the ASPD cases versus population-based controls (a) and for the males’ sub-sample (b). ASPD, antisocial personality disorder.

The strongest signal for association in the combined analysis of males and females was detected in chromosome 7p22.2, in the vicinity of the SDK1 gene, for rs6462756 (odds ratio (OR)=1.84 (1.45–2.33), P=5.5 × 10−7). As no other variant within 500 kb (250 kb down/upstream, including 135 SNPs) of it gave any signal, it was considered spurious, and was not selected for replication in this study. The next strongest associations were observed for rs9268528 (OR=0.58 (0.46–0.72), P=9.9 × 10−7) and rs9268542 (OR=0.58 (0.46–0.72), P=1.1 × 10−6), located on chromosome 6p21.32, intergenic between the BTNL2 and HLA-DRA genes. These two variants were selected for replication, together with an additional nearby intergenic variant, rs2395163 (OR=0.59 (0.46–0.77), P=6.2 × 10−5), and an intronic variant of the HLA-DRA gene, rs2239804 (OR=0.61 (0.49–0.77), P=1.2 × 10−5). All the association signals of these variants in the vicinity of the HLA-DRA gene originated from the major allele. Supplementary Figure 2a shows the regional Manhattan plot of chromosome 6 for the analysis of the whole sample.
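The relationship between a reported odds ratio, its confidence interval and the P-value can be checked with a short sketch. It assumes that the interval is a Wald-type 95% interval that is symmetric on the log-odds scale, which the text does not state explicitly; because the published OR and limits are rounded, the recomputed P-value only approximates the reported one.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def wald_from_or_ci(odds_ratio, ci_low, ci_high):
    """Recover log(OR), its standard error, z and a two-sided P-value."""
    beta = math.log(odds_ratio)
    # Assumption: the CI is beta +/- Z_95 * se on the log scale
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

# Values reported in the text for rs6462756: OR=1.84 (1.45-2.33)
beta, se, z, p = wald_from_or_ci(1.84, 1.45, 2.33)
print(f"z = {z:.2f}, P = {p:.1e}")
```

The result is of the same order of magnitude as the reported P=5.5 × 10−7.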

In the analysis of the males alone, the strongest association signal was observed for variant rs6458146 (OR=1.72 (1.40–2.13), P=3.9 × 10−7), residing in an intergenic region on chromosome 6p21.2. An additional four markers in nearby loci gave a signal with P-values<5 × 10−5 (rs9471290, OR=1.68 (1.37–2.06), P=8.1 × 10−7; rs10498746, OR=1.72 (1.38–2.13), P=9.9 × 10−7; rs7749170, OR=1.67 (1.35–2.08), P=3.8 × 10−6; rs4714329, OR=1.56 (1.27–1.92), P=2.5 × 10−5). Supplementary Figure 2b shows the regional plot for the males’ association analysis.

Overall, our findings suggest altogether eight variants associated with ASPD, originating from distinct regions located at 6p21 (6p21.2 and 6p21.32). These were chosen for regenotyping and for the replication analyses.

Regenotyping, replication and meta-analysis

Re-genotyping and confirmatory analyses

In order to confirm the consistency of the genotypes from the GWAS analyses, we performed simultaneous regenotyping of the selected eight variants both in the CRIME sample and in the GenMets controls from Health 2000 by another technology (Sequenom MassARRAY iPLEX). There was a >99% concordance between the original genotypes from the GWAS discovery analysis and those from the regenotyping (Supplementary Information). Thus, the original signals from the GWAS on 6p21.2 and 6p21.32 were not due to spurious genotyping errors.

We also compared the minor allele frequencies (MAFs) of these variants in our cases and controls with those in the HapMap-CEU population (http://www.ncbi.nlm.nih.gov/SNP/) (Supplementary Table 4). The MAFs for all eight SNPs were relatively consistent between the population controls of the GWAS and of the replication samples. When looking at the genetic variation in these chromosomal regions more widely, the MAFs for the variants on 6p21.2 were well in line with the HapMap-CEU. The MAFs of the variants near the HLA-DRA gene, on 6p21.32, were considerably lower in both the GWAS and the replication population controls as compared with the HapMap-CEU. This is likely to reflect the genetically complicated nature of that particular chromosomal region, with widely extending structures of LD and with high variation between populations.

Replication analysis

Table 2 shows the results among the discovery sub-cohort (N cases=370, N controls=5850) and the replication sub-cohort (N cases=173, N controls=3766), and the entire study population (meta-analysis; N cases=543, N controls=9616). The variants on 6p21.2 showed a consistent signal in the replication analysis (rs7749170 failed in the Sequenom genotyping and is therefore absent), whereas the results for the SNPs on 6p21.32 were diluted or reversed when compared to the GWAS signal. Two out of four of the SNPs on 6p21.2 showed a statistically significant signal in the replication analysis (males and females combined). The most significant signals on 6p21.2 were observed for rs4714329 (OR=1.75 (1.37–2.24), P=9.0 × 10−6) and for rs9471290 (OR=1.40 (1.09–1.81), P=8.3 × 10−3). The replicating variants were included in a meta-analysis in which rs4714329 yielded an OR of 1.59 (1.37–1.85) and a P-value of 1.6 × 10−9, and rs9471290 had an OR of 1.49 (1.28–1.73) and a P-value of 2.9 × 10−7. A post hoc replication analysis of rs4714329 in the female sample (N cases=32, N controls=2179) revealed an OR of 2.55 (1.50–4.33) and a P-value of 5.3 × 10−4. Thus, we detected a replication with two variants at 6p21.2, the most robust finding being rs4714329, which reached genome-wide significance (P=1.6 × 10−9 in the meta-analysis), whereas the evidence for the variants on 6p21.32 was weaker in the replication study.

Table 2 The results for the eight best hits in Discovery Cohort (GWAS) and Replication Cohort (Sequenom Genotyping)

LD and haplotype analyses on 6p21.2 and 6p21.32

LD and haplotype analysis on 6p21.2

To investigate whether the most significant variant from the meta-analysis (rs4714329) belonged to a longer haplotype, we performed LD and haplotype analyses for all of the four SNPs residing in the 6p21.2 genomic region. Two of the variants (rs6458146 and rs10498746) were located in the same haploblock (bp 40218128–40224268, GRCh37; Supplementary Figure 3). Also, rs9471290 (bp 40260515) was in a strong LD with those (D′=0.93 and 0.95, and LOD2.0, respectively). Rs4714329 was, however, only in a moderate LD with the other 6p21.2 variants (D′=0.77, 0.68, and 0.66 to rs9471290, rs6458146 and rs10498746, respectively). A post hoc haplotype analysis of the SNPs with the most significant signal from the replication analysis yielded an association signal from a relatively frequent (f=0.31) haplotype of minor alleles A–G (rs9471290 and rs4714329, respectively) of OR=1.59 (P=5.6 × 10−6) in the GWAS data, and of OR=1.49 (P=0.0022) in the replication cohort (males and females combined). Thus, the signal for association from 6p21.2 was not strengthened by the haplotype and the risk genotype was best captured by the G-allele of rs4714329.
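The LD measures quoted in this section (D′ and r2) follow from standard definitions, which can be sketched as below for two biallelic loci. The function name and the example frequencies are hypothetical; the study itself used PLINK, Haploview and SNAP for these computations.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return D, D' and r^2 from a haplotype frequency and two allele frequencies.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of alleles A and B (both loci assumed polymorphic)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# Hypothetical frequencies, not taken from the study data
d, d_prime, r2 = ld_stats(p_ab=0.25, p_a=0.30, p_b=0.35)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

D′ can be high while r2 stays moderate when the allele frequencies at the two loci differ, which is one reason both measures are usually reported.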

In an effort to further elucidate the possible origin of the signal of our best hit, we investigated the LD between rs4714329 and the variants within the nearby genes (RP11-552E20.1, TDRG1, LINC00951 and LRFN2) utilizing the GWAS variant data of the CRIME cohort as well as the HapMap3 project data and the 1000 Genomes Pilot 1 data (Supplementary Information). The strongest LD was observed between rs4714329 and several variants of the LINC00951 gene, consistently in all three data sets. In addition, the CRIME sample LINC00951 gene variants (rs17619142 and rs17619309) revealed nominal association with ASPD (OR of 1.18 and 1.5, and P value of 0.002 and 0.159, respectively), which supported the role of LINC00951 as the strongest candidate of the nearby genes for the signal origin.

LD analysis on 6p21.2 and 6p21.32

As stated earlier, the HLA region at 6p21.32 is known for its complexity and extended LD ranging across multiple HLA and non-HLA genes in the region, and the HLA alleles may be captured by tag SNPs even outside the region.23 As the two most significant sets of variants, with leading SNPs rs4714329 at 6p21.2 and rs9268528 at 6p21.32, resided only approximately 8 Mb from each other, we investigated whether there would be LD between them. We also analyzed their haploblock structures, and which other SNPs would tag them in the data. Although the variants were strongly linked within each set, especially at the 6p21.32 locus, there was no LD between the two sets. Furthermore, they did not share the same haploblocks, and no tagging was evident across the two sets (data not shown).

Analysis of HLA alleles at 6p21.32

The HLA region is highly polymorphic and individual SNPs at the HLA region are commonly part of more than one HLA allele and haplotype. The GWAS analysis of the associating locus (6p21.32) at the HLA region showed multiple SNPs associating with ASPD, and the LD extended from the BTNL2 gene as far as the HLA-DQB1 gene region (Figure 3b). We observed that there were no directly genotyped SNPs at the HLA-DRB1 locus that would conceivably have depicted whether the highest association was indeed over HLA-DRA or perhaps over the HLA-DRB1 locus. It was also possible that some of the associating SNPs tagged an independent HLA allele, or that specific HLA alleles or their coding polymorphisms, rather than single SNPs, would drive the association with ASPD. The HLA region is also known to be difficult for probe design and for SNP imputation. Therefore, we imputed the GWAS genotype data with the classical HLA alleles (Supplementary Information), and studied whether specific HLA alleles would associate with ASPD. We detected four HLA alleles at HLA-DRB1 and HLA-DQA1 genes with similar significance as individual SNPs but with higher OR in association with ASPD (Supplementary Table 2). The strongest associations were seen with a common HLA-DRB1 allele DRB1*01:01 (OR=2.19 (1.53–3.14), P=1.9 × 10−5, f=0.17) and with DQA1*01:01 (OR=2.09 (1.46–2.99), P=5.6 × 10−5, f=0.17) that are known to be in tight LD (current data r2=0.981). In addition, protective associations with DRB1*04:04 and DQB1*03:02 alleles were detected with similar protective effects as the strongest individual SNPs of the GWAS analysis (DRB1*04:04 OR=0.39 (0.18–0.57), P=9.7 × 10−5, f=0.04 and DQB1*03:02 OR=0.52 (0.37–0.74), P=2.5 × 10−4, f=0.11). To investigate whether the HLA allele effects were directly explained by the individual SNPs, we conditioned the analysis for the strongest variant at the HLA region (rs9268528), which failed to remove the association with DRB1*01:01 (conditioned P=2.3 × 10−3).
Similarly, conditioning for the strongest four HLA alleles failed to remove the association with rs9268528, suggesting independent roles for the HLA alleles and rs9268528 in ASPD.

Figure 3

Association peaks at the chr 6 LINC00951 (a) and HLA (b) regions. Extended LD over the HLA class II region can be observed in b. HLA, human leukocyte antigen; LD, linkage disequilibrium; SNP, single nucleotide polymorphism.

As individual amino acids coded by the HLA genes have been previously indicated to drive disease association,27, 28 we examined whether the associations of the top SNPs could be explained by individual coding polymorphisms at the HLA-DRB1 or DQA1 alleles (Supplementary Table 3). The DRB1 position 11 V was observed with the strongest protection for ASPD (OR=0.49 (0.37–0.66), P=2.4 × 10−6). Interestingly, DRB1 position 11 V is detected here with those DRB1*04 alleles that were associated at the allele level. In addition, DRB1 position 11 V was in relatively high LD with rs2395163 (D′=0.83, r2=0.56). Conditioning for position 11 V removed the association with rs2395163, but not with the leading variant at the HLA region (rs9268528 conditioned P=8.3 × 10−3) or with DRB1*01:01 (conditioned P=5.2 × 10−4), which was the strongest association at the HLA-region after conditioning for DRB1 position 11 V. Finally, we studied whether DRB1 position 11 V, together with HLA-DRB1*01:01, would explain the association with rs9268528. This was not the case. Even though the P-value for rs9268528 association was weaker after conditioning for both HLA-DRB1*01:01 and for DRB1 position 11 V, it was still significant with a point-wise P-value<0.05.

Analyses for sensitivity and the individual SCID-II items

Association sensitivity

Given that the main focus of this study is ASPD, cases were selected from the whole of the CRIME cohort based on the SCID-II diagnostics. However, the non-ASPD criminals group may include cases near the ASPD diagnosis cut-off threshold. Thus, we performed an analysis on non-ASPD status (all males, N=129) to examine whether the signal from the selected eight variants was specific for ASPD or if it covered a broader range of criminal activity. No association signal was detected for any of the selected variants. As substance abuse, as well as violent behavior, is linked to antisocial personality disorder, we performed sensitivity analyses including only those ASPD cases with substance abuse (N=358, males=327) or violent crime (N=285, males=265). Among substance abusers, a slight strengthening of the signal for the variants near the HLA-DRA gene on 6p21.32 was observed, but the signal was diluted for the variants on 6p21.2 (data not shown). Among violent criminals, the signals from all of the variants were diluted (data not shown).

Individual SCID-II items

ASPD diagnosis is based on a relatively broad range of antisocial behavioral characteristics. To elucidate the origin of the strong association signal arising from rs4714329 to ASPD, we analyzed each of the 22 characterizing items of the SCID-II questionnaire individually (Supplementary Information; page 1, Supplementary Tables 7 and 8 and Supplementary Figure 5). The odds ratios for all of the 22 questions were quite similar, ranging between 1.4 and 1.7. The most significant association was achieved for item (20), ‘Reckless disregard for safety of self or others’, with an OR of 1.66 (1.41–1.94) and a P-value of 6.0 × 10−10 (heterogeneity index I2=0.44 (scale 0–1), indicating more variation than expected by chance) in the meta-analysis (N cases=477, N controls=9616). The most significant association with a heterogeneity index value of zero was observed for item (16), ‘Failure to conform to social norms with respect to lawful behaviors, as indicated by repeatedly performing acts that are grounds for arrest’ (OR=1.60 (1.37–1.87), P=1.7 × 10−9, N cases=529, N controls=9616). The most significant items typically included >400 individuals each, whereas some of the items, such as items (5) or (8), included only <100 individuals; hence, the P-values here should preferably be interpreted as a reflection of power. Consequently, rs4714329 was found to be broadly associated with the different aspects of ASPD.
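The heterogeneity index I2 quoted above is conventionally derived from Cochran's Q statistic. The following is a minimal sketch of that standard definition on the 0–1 scale; it is not the GWAMA implementation, and the inputs are hypothetical.

```python
def i_squared(betas, ses):
    """Heterogeneity index I^2 (scale 0-1) from per-study effects and standard errors."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q)

# Hypothetical log odds ratios and standard errors from two cohorts
print(round(i_squared([0.35, 0.60], [0.10, 0.12]), 2))
```

A value of zero means the between-study variation is no larger than expected from sampling error alone.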

Association of rs4714329 to antisocial features in general population

The prevalence of ASPD is low in the population (1–3%),1 with a well-established role of early adverse environment in its etiology.29, 30 We hypothesized that the most significant hit in the ASPD analyses, the risk allele G of rs4714329, would also have an impact on antisocial features in the general population, especially among those individuals who had encountered severe difficulties in their childhood. Thus, we tested the risk variant in the Health 2000 population sample for association with antisocial tendencies, and also performed a gene × environment interaction analysis, where reported severe conflicts or economic difficulties in the childhood family were considered as a risk environment. As a measure of antisocial features, we utilized a scale assessed in the Health 2000 survey and focusing on deceitfulness, distrust of other people, and lack of empathy (see Supplementary Information for phenotype in the secondary analysis of the population-based sample). No signal was detected for the main effect of the variant on antisocial features in the complete sample (males and females: N=4944, β=0.13, P=0.175; males only: N=2211, β=0.203, P=0.143). However, there was a modest signal for interaction between the risk environment and rs4714329 in males, but not in females, in the general population (males: a β of 0.647, and a P-value of 0.045 for the interaction term), and the risk allele G associated significantly with antisocial features among males with the childhood risk environment (N=636, β=0.68, P=0.012).
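The logic of a gene × environment interaction term can be illustrated with a small sketch. It assumes a linear model with additive genotype coding (0, 1 or 2 risk alleles) and a binary environment indicator; the exact coding used in the study is described only in the Supplementary Information, and all coefficients below are hypothetical.

```python
def predicted_score(g, e, b0, b_g, b_e, b_gxe):
    """Linear predictor: intercept + genotype + environment + interaction."""
    return b0 + b_g * g + b_e * e + b_gxe * g * e

def per_allele_effect(e, b_g, b_gxe):
    """Effect of one additional risk allele at a given environment level."""
    return b_g + b_gxe * e

# Hypothetical coefficients: small main effect, larger interaction
b_g, b_gxe = 0.05, 0.60
print(per_allele_effect(0, b_g, b_gxe))  # per-allele effect without the risk environment
print(per_allele_effect(1, b_g, b_gxe))  # per-allele effect with the risk environment
```

With such coefficients, a variant can show little association in the whole sample yet a clear one within the exposed stratum, which is the pattern described in the text.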

Expression analysis of rs4714329 in brain tissue databases

The most significant variant of the replication and meta-analyses for ASPD, rs4714329, resides in the proximity of the genes LINC00951, LRFN2, RP11-552E20.1 and TDRG1. LINC00951 and LRFN2 are expressed mostly in the brain (especially in the cortex) and in testis. The TDRG1 gene is expressed mostly in testis and in the brain. Thus, we investigated in the GTEx Portal and in the Braineac database the possible association of rs4714329 with the expression levels of the nearby genes in all of the available brain tissues and in testis (testis only in the GTEx Portal; Supplementary Information).

In the GTEx Portal, the most significant association was achieved with LINC00951 gene expression in cerebellum (N=103) with a β of 0.51 and a P-value of 2.0 × 10−6. LRFN2 and TDRG1 gene expression revealed associations in the cerebellar hemisphere (N=89; P-value=2.0 × 10−5 and β=0.56, P=4.0 × 10−5 and β=0.51, respectively; Supplementary Table 6a). In Braineac, no data were available for LINC00951, but rs4714329 was associated most significantly (P=7.4 × 10−4) with LRFN2 expression in cerebellum (N=130), although, consistent with the GTEx Portal, LRFN2 is mostly expressed in the cortex. For TDRG1 expression in Braineac, the most significant (P=3.8 × 10−3) association was observed in occipital cortex (OCTX, N=129; Supplementary Table 6b). Neither database had data available for RP11-552E20.1 gene expression. Thus, the results revealed that rs4714329 was associated with the expression of the nearby genes in brain, so that the risk allele G of rs4714329 was found to correlate with reduced levels of LINC00951, LRFN2 and TDRG1 mRNAs in cerebellum and of TDRG1 in occipital cortex.

Discussion

The results from this first GWAS on ASPD reveal genome-wide significant and replicable associations for variants residing on chromosome 6p21.2. According to our linkage investigation, our best hit, rs4714329, is in considerable LD with several polymorphisms of the LINC00951 gene, and not to the same degree with the polymorphisms of the other genes in that genomic region, suggesting that the LINC00951 gene is the strongest candidate for the signal origin. However, analysis of the gene expression data from the GTEx and Braineac databases revealed comparably strong associations between rs4714329 and the expression levels of three nearby genes, namely LINC00951, LRFN2 and TDRG1, in tissues from cerebellum, of which the LINC00951 association was the strongest in the cerebellum and LRFN2 in the cerebellar hemisphere. Although cerebellum has traditionally been linked to motor control, it has also been suggested to be involved in cognition31 and personality,32 as well as autism spectrum disorders33 and psychosis.34 Most interestingly, Leutgeb et al. reported an altered cerebellum–amygdala connectivity among violent offenders, although their study excluded psychopathic or personality-disordered individuals.35 Although there was no correlation of rs4714329 with RNA levels in frontal cortex, it is highly interesting that the general levels of the transcripts for both LINC00951 and LRFN2 are particularly high in frontal cortex,36, 37 as reduced gray matter volume (GMV) in that particular cortical region is one of the most consistently reported biological findings in ASPD.38, 39

Chromosomal regions 6p21.2 and 6p21.32 belong to the chromosomal region of the major histocompatibility complex, and have been previously linked to psychiatric disorders including schizophrenia6, 40 and bipolar affective disorder,41 as well as neurological diseases such as photosensitive epilepsy,42 late-onset Alzheimer disease43 and restless legs syndrome.44 In this study, we discovered novel suggestive associations of HLA-DRB1*01:01 and DRB1*04:04 with ASPD. Previous studies have reported the same alleles in association with schizophrenia,45, 46, 47 whereas other studies have not been able to replicate those findings48 and the link has remained controversial. In this study, additional analysis of coding polymorphisms suggested an independent role for valine at position 11 of DRB1, which was tagged by one of the most significant GWAS hits in this study, rs2395163. The findings suggest that ASPD may share genetic risk factors in this locus with other psychiatric and neurological disorders.

The sample in this study overlapped partially with the sample in a previous study of extreme violent behavior.4 However, the significant variants were not associated with violent offending in general, which indicates that the variants were associated with ASPD per se. This conclusion is also supported by the finding on antisocial attitudes observed in the cohort from the general population, where the risk allele was found to increase the presence of the antisocial features of deceitfulness and lack of empathy among those individuals who reported severe family conflicts or economic difficulties in their childhood families.

The ASPD diagnosis is controversial, since many clinicians and researchers think that the category is too heterogeneous and overlaps with other disorders.30 ASPD and other personality disorders have been reported to be comorbid with psychopathy, a personality trait tightly connected to criminality and forensic psychiatry.5, 49 Although ASPD and psychopathy diagnoses are based on different assessment tools, the distinction between the two is not completely clear.50, 51 We consider our study sample relatively homogeneous, as the participants were all criminal offenders and all had a diagnosis of ASPD. However, to further study the behavioral origin of our genome-wide significant hit, rs4714329, we investigated its association individually with each of the SCID II items used in this study. All of the 22 items gave consistent ORs, indicating that the original signal covers all the different aspects of ASPD according to DSM-IV, which supports the homogeneity of the study sample. We would nevertheless like to point out that the most significant associations were observed for those items that particularly describe social learning. An association with weak learning from social feedback would be consistent with altered genetic regulation in the frontal cortex, but it may also reflect altered regulation in the cerebellum. It has also been reported that 21–45% of prison inmates have comorbid ADHD,52, 53 and a shared etiology has been suggested for such externalizing phenotypes as adult substance use and antisocial personality, and childhood conduct disorder and ADHD.54, 55, 56 ADHD was not diagnosed in our study sample; however, ~96% of the analyzed sample were substance abusers.
The externalizing dimension of behavior is hypothesized to reflect an inherited predisposition for developing one or more of the aforementioned disorders.57 Thus, the previous findings on reduced GMV in frontal brain areas associated with ASPD may also reflect the comorbidity of the externalizing problems, consistent with the findings of ADHD prevalence in prisoners and supported by our findings here.

There were limitations in the study. In the GWAS, the CRIME sample was genotyped with a different array than the control samples (GenMets, Corogene), although in the same facility, the Wellcome Trust Sanger Institute. Therefore, only the SNPs that overlapped in all three genotyping data sets were included in the analyses, and unusually stringent thresholds were used in the quality control. Furthermore, regenotyping of the eight replication variants in the CRIME subjects and GenMets controls that were included in the GWAS yielded genotypes practically identical to those from the GWAS genotyping, with >99% concordance between the two methods. In a post hoc repeat GWAS analysis including the CRIME subjects and GenMets controls, the results were consistent with the original analysis. This also indicates that any discrepancy in the Corogene control sample genotypes of the eight SNPs in the GWAS would have shown up in the repeat analysis results. In spite of intensive QC, spurious associations may occur. We considered the top variant of the analysis of males and females combined (rs6462756, near the SDK1 gene) a plausible false positive, as it was the only signal from that genomic region (7p22.2), even though the SNP density there was not sparse but, on the contrary, relatively dense (35 SNPs per 100 kb, compared with the chr 7p average of 21 SNPs per 100 kb and the nearby 500 kb average of 25 SNPs per 100 kb). Thus, it was selected neither for the regenotyping nor for the replication study.
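The >99% concordance between the two genotyping methods is, in essence, the proportion of matching genotype calls among the samples called by both methods. The following is a minimal sketch of such a comparison, not the procedure used in this study: the function name, the two-letter genotype encoding and the handling of missing calls (skipped) are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def genotype_concordance(
    calls_a: Sequence[Optional[str]],
    calls_b: Sequence[Optional[str]],
) -> float:
    """Proportion of identical genotype calls between two methods.

    Hypothetical helper: genotypes are two-letter strings such as "AG",
    allele order is ignored ("AG" equals "GA"), and pairs in which
    either call is missing (None) are excluded from the comparison.
    """
    if len(calls_a) != len(calls_b):
        raise ValueError("both call lists must cover the same samples")

    compared = 0
    matching = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue  # no call from at least one method: not comparable
        compared += 1
        if sorted(a) == sorted(b):
            matching += 1

    if compared == 0:
        raise ValueError("no sample was called by both methods")
    return matching / compared
```

Applied per variant to the array calls and the regenotyping calls of the same individuals, a value above 0.99 would correspond to the level of agreement reported here.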

Another concern is the reversed replication results for the set of hits on 6p21.32, near the HLA-DRA gene. A possible explanation may lie in the high gene density and polymorphism rate of the HLA region, combined with the lack of power in this study. The MAF investigation of the eight variants, including the HapMap-CEU population sample, revealed considerable MAF variance for the HLA-DRA variants, but not for the variants on 6p21.2, near the LINC00951 gene. The only variant in the HLA-DR region on 6p21.32 showing a similar MAF in discovery and replication was rs2395163, which tagged a relatively rare HLA-DRB1*04:04 association. As the allele frequency of HLA-DRB1*04:04 was only 0.04, it was not surprising that the association was not replicated in the relatively small replication sample. However, it is important to emphasize that the HLA findings emerge only from the imputed GWAS data, and their replication was not addressed in this study.

These findings are limited to Finnish population-based cohorts, and therefore replications in samples from other populations should be pursued. With altogether 543 cases and 9616 controls, our study remains very much underpowered, as indicated by the power calculations. Larger sample sizes are needed, although the collection of large samples is rather demanding; especially for behavioral phenotypes of a relatively special nature, such as ASPD, large samples may come at the expense of phenotype homogeneity. Large GWASs have proven the method a useful tool for identifying genetic loci in complex diseases in which individual factors have small effect sizes, such as schizophrenia.34 However, GWASs in smaller samples may also reveal important associations, particularly when the phenotype is accurate, and the suggestive findings may be of value for future studies with larger sample sizes.
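To give a rough sense of why 543 cases and 9616 controls leave a GWAS underpowered, the sketch below uses a textbook normal approximation for a two-sided allelic test. It is an illustration only and is not the power calculation performed in this study: the function names, the allele frequency, the odds ratios and the conventional genome-wide threshold of 5 × 10−8 are all assumptions, and the combined case and control counts are treated as a single sample.

```python
from statistics import NormalDist


def effective_n(n_cases: int, n_controls: int) -> float:
    """Effective sample size of an unbalanced case-control design.

    Equals the total size of a balanced design with the same
    statistical information: 4 / (1/n_cases + 1/n_controls).
    """
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


def allelic_power(
    n_cases: int,
    n_controls: int,
    control_maf: float,
    odds_ratio: float,
    alpha: float = 5e-8,  # assumed genome-wide significance threshold
) -> float:
    """Approximate power of a two-sided allelic test (normal approximation).

    Assumes an allelic (multiplicative) model, so the risk-allele
    frequency in cases follows from the allelic odds ratio.
    """
    p0 = control_maf
    p1 = odds_ratio * p0 / (1.0 - p0 + odds_ratio * p0)

    # Each individual contributes two alleles.
    variance = p1 * (1.0 - p1) / (2 * n_cases) + p0 * (1.0 - p0) / (2 * n_controls)
    z_expected = abs(p1 - p0) / variance ** 0.5

    z_critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # The contribution of the opposite tail is negligible and ignored.
    return NormalDist().cdf(z_expected - z_critical)
```

Under these assumptions, `effective_n(543, 9616)` shows that the many controls add little once cases are the limiting group, and calling `allelic_power(543, 9616, control_maf, odds_ratio)` with modest odds ratios illustrates how quickly power falls as the effect size shrinks.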

The results of this study need to be interpreted carefully. There are examples of how the genetic associations of ASPD may be or have been misused for prediction purposes or in courthouses.58, 59 The concept of heritability is sometimes misinterpreted. When ASPD is described as 50% heritable, it implies that genetic factors explain 50% of the variance in the population in which the study was conducted, and not that, on the individual level, an ASPD subject had inherited 50% of the disorder from their parents while 50% was due to environmental influences. The findings of this study cannot be used for any prediction purposes, or brought into courthouses to be given any legal weight. Instead, this study serves well as a reservoir of candidate variants for personality traits and disorders, adding to the few candidate variants that have previously existed for ASPD. Important next steps would be to discover the causal factors tagged by the findings here and to elucidate the molecular functions involved in ASPD. In conclusion, our results showed a robust signal for several variants located beside LINC00951 with potential functional effects. To our knowledge, this is the first study showing genome-wide significant and replicable findings on genetic variants associated with any personality disorder.
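The population-level meaning of heritability noted above can be written out explicitly. The expression below is the standard textbook variance decomposition, given here as an illustration under the simplifying assumption that genetic and environmental effects are additive and uncorrelated; the symbols are introduced only for this purpose.

```latex
% h^2: heritability; \sigma_P^2: phenotypic variance in the studied population;
% \sigma_G^2, \sigma_E^2: variance attributable to genetic and environmental factors.
\[
  \sigma_P^2 = \sigma_G^2 + \sigma_E^2,
  \qquad
  h^2 = \frac{\sigma_G^2}{\sigma_P^2}
\]
% A heritability of 50% therefore means
\[
  h^2 = 0.5 \;\Longleftrightarrow\; \sigma_G^2 = \sigma_E^2
\]
```

Because every term is a variance across individuals in one population, the ratio describes that population and cannot be apportioned within any single person.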