Polygenic prediction and GWAS of depression, PTSD, and suicidal ideation/self-harm in a Peruvian cohort

Shen, Hanyang; Gelaye, Bizu; Huang, Hailiang; Rondon, Marta B.; Sanchez, Sixto; Duncan, Laramie E.

doi:10.1038/s41386-020-0603-5

Published: 11 January 2020

Neuropsychopharmacology volume 45, pages 1595–1602 (2020)

Abstract

Genome-wide approaches including polygenic risk scores (PRSs) are now widely used in medical research; however, few studies have been conducted in low- and middle-income countries (LMICs), especially in South America. This study was designed to test the transferability of psychiatric PRSs to individuals with different ancestral and cultural backgrounds and to provide genome-wide association study (GWAS) results for psychiatric outcomes in this sample. The PrOMIS cohort (N = 3308) was recruited from prenatal care clinics at the Instituto Nacional Materno Perinatal (INMP) in Lima, Peru. Three major psychiatric outcomes (depression, PTSD, and suicidal ideation and/or self-harm) were scored by interviewers using validated Spanish-language questionnaires. The Illumina Multi-Ethnic Global chip was used for genotyping. Standard procedures for PRSs and GWAS were used along with extra steps to rule out confounding due to ancestry. Depression PRSs significantly predicted depression, PTSD, and suicidal ideation/self-harm and explained up to 0.6% of phenotypic variation (minimum p = 3.9 × 10⁻⁶). The associations were robust to sensitivity analyses using more homogeneous subgroups of participants and alternative choices of principal components. Successful polygenic prediction of three psychiatric phenotypes in this Peruvian cohort suggests that genetic influences on depression, PTSD, and suicidal ideation/self-harm are at least partially shared across global populations. These PRS and GWAS results from this large Peruvian cohort advance genetic research (and the potential for improved treatments) for diverse global populations.

Introduction

Polygenic risk scores (PRSs) are individual-level metrics of genetic risk for particular phenotypes, and they are also known as risk profile scores (RPS) and genetic risk scores (GRS) [1]. Methods for polygenic scoring were first developed for livestock and agricultural purposes [2] and were applied in humans over a decade ago [3, 4]. PRSs have become increasingly popular as the size (and importantly, the statistical power) of genome-wide association studies (GWAS) has increased, resulting in improved performance of polygenic risk scores for the prediction of psychiatric disorders. For example, as the sample size (and statistical power) of schizophrenia GWAS increased from approximately N = 7000 to approximately N = 36,000, the maximum amount of phenotypic variance explained by schizophrenia PRSs increased from ~3% to ~12% [4, 5].

Over a decade ago, the original developers of PRSs demonstrated the poorer transferability of PRSs across ancestry groups as compared to within-ancestry predictions [4]. Subsequent theoretical work suggested an approximately linear decline in predictive performance of PRSs as a function of genetic distance between training and target samples [6], and empirical observations are generally consistent with this expectation [7, 8]. However, it is still the case that most GWAS (training) and PRS (testing) studies are conducted in high-income countries. GWAS analyses conducted in populations with different environmental, cultural, and ancestral backgrounds are needed to increase the generalizability of PRS prediction, and PRS studies are needed to clarify the extent to which environmental factors also impact generalizability of PRSs.

As has been the case in medical genetics more broadly, there have been few GWAS and PRS studies of psychiatric outcomes in South America. Of the available studies, Hispanic/Latino samples from the US are most similar to admixed populations in South America in terms of ancestry. However, environmental conditions and culture are oftentimes considerably different. Compared to people living in North America and Europe, many individuals living in South America are exposed to high rates of traumatic events and low-socioeconomic status, which lead to increased burden of psychiatric disorders [9].

Despite knowledge of the impact of environmental factors on psychiatric disorder risk, environmental variables have often been omitted from polygenic analyses. The challenge is often a practical one in consortium analyses, in which collecting genetic and minimal phenotypic data is already a tremendous challenge, and consequently environmental variables are not collected from contributing investigators. Thus, there is a need for inclusion of relevant environmental variables in GWAS and polygenic prediction studies.

Here we present a genetics study of 3308 participants from Lima, Peru [10, 11]. This is the largest psychiatric genetics study in South America to date. The goals of the current study were twofold. First, polygenic prediction analyses were conducted on three major psychiatric phenotypes (depression, PTSD, and suicidal ideation/self-harm) while controlling for potentially relevant environmental variables. The results demonstrate successful cross-ancestry, cross-cultural predictions. Second, GWASs of these three phenotypes were conducted in the cohort in order to make summary statistics from this Peruvian cohort available to researchers worldwide.

Materials and methods

Study population

The population for the present study was drawn from participants of the Pregnancy Outcomes, Maternal and Infant Study (PrOMIS) cohort. The PrOMIS cohort is a longitudinal study aimed at understanding the life course and intergenerational effects of interpersonal violence and other forms of trauma among Peruvian women. The study was conducted between February 2012 and November 2015. The methodology and study procedures have been described previously [12]. Briefly, the present sample consists of 3308 women who attended prenatal care clinics at the Instituto Nacional Materno Perinatal (INMP) in Lima, Peru. The INMP is a national reference institution for maternal and perinatal care. Participants were invited to take part in an interview in which trained research personnel used a structured questionnaire to elicit information regarding maternal socio-demographics, lifestyle characteristics, medical and reproductive histories, and mental health symptoms. All participants provided written informed consent. The institutional review boards of the INMP, Lima, Peru, and the Office of Human Research Administration, Harvard T.H. Chan School of Public Health, Boston, MA approved all procedures used in this study.

Depressive symptoms

Depressive symptoms were assessed using the Patient Health Questionnaire (PHQ-9). The PHQ-9 is a nine-item, self-report depression screening scale derived from the Primary Care Evaluation of Mental Disorders [13, 14]. This questionnaire assesses nine depressive symptoms experienced over the 2 weeks prior to the interview: anhedonia, depressed mood, trouble sleeping, feeling tired, change in appetite, guilt or worthlessness, concentration problems, psychomotor agitation/retardation, and suicidal thoughts/self-harm. The total PHQ-9 score was calculated by assigning scores of 0, 1, 2, or 3 to the response categories of “not at all,” “several days,” “more than half the days,” or “nearly every day” for each symptom and summing across the nine items. As a symptom measure, the total score ranged from 0 to 27. The PHQ-9 has been previously validated in Spanish-speaking populations [15].
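The scoring rule described above can be sketched in a few lines. This is a minimal illustration only: the function name `phq9_total` and the use of response strings as dictionary keys are assumptions made for this example, not part of the study's pipeline.

```python
# Item scores for the four PHQ-9 response categories, as described in the text.
PHQ9_ITEM_SCORES = {
    "not at all": 0,
    "several days": 1,
    "more than half the days": 2,
    "nearly every day": 3,
}


def phq9_total(responses):
    """Sum the nine item scores; the total ranges from 0 to 27.

    `responses` is a sequence of nine response strings (hypothetical input format).
    """
    if len(responses) != 9:
        raise ValueError("PHQ-9 requires exactly nine item responses")
    return sum(PHQ9_ITEM_SCORES[r] for r in responses)


# Hypothetical participant: four symptoms on "several days", five "not at all".
example = ["several days"] * 4 + ["not at all"] * 5
print(phq9_total(example))  # 4
```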

PTSD assessment

The Post-traumatic Stress Disorder Checklist-Civilian Version (PCL-C) was used to assess PTSD symptoms. The PCL-C is a 17-item self-report questionnaire designed according to the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) criteria. Each item assesses PTSD symptoms experienced over the past month on a 5-point Likert scale, with a total score ranging from 17 to 85. Higher scores indicate more severe PTSD. The PCL-C has demonstrated adequate levels of internal consistency, inter-rater reliability, test−retest reliability, and convergent validity when applied to different clinical and nonclinical populations [16]. The Spanish-language version has been found to have similar psychometric properties to the English-language version.

Suicidal ideation and self-harm

The phenotype of suicidal ideation/self-harm in our study is based on one single item from the PHQ-9 form [13]. Item 9 asks about “thoughts that you would be better off dead, or of hurting yourself” in the 14 days prior to evaluation. Participants who responded “not at all” were classified as no for suicidal ideation/self-harm and participants with any other responses were coded as yes.
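The dichotomization described above amounts to a single comparison. The sketch below assumes the item 9 response is available as a string; the function name `suicidal_ideation_flag` is hypothetical, and missing responses are not handled.

```python
def suicidal_ideation_flag(item9_response):
    """Return 0 ("no") for "not at all" and 1 ("yes") for any other response."""
    return 0 if item9_response.strip().lower() == "not at all" else 1


for response in ("not at all", "several days", "nearly every day"):
    print(response, "->", suicidal_ideation_flag(response))
```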

Clinical, demographic, and environmental covariates

Self-reported clinical, demographic, and environmental variables were also collected at participants’ prenatal care visits [17]. These environmental variables include: experience of any childhood abuse (yes vs. no); experience of any lifetime Intimate Partner Violence (IPV) (yes vs. no); age (<20, 20−24, 25−29, 30−34, and ≥35 years); educational attainment (≤6, 7−12, and >12 completed years of schooling); marital status (married/living with a partner vs. others); access to basic foods (hard vs. not very hard); employment status (employed vs. not); ethnicity (Mestizo vs. others); early pregnancy body mass index (BMI) (<18.5, 18.5−24.9, 25−29.9, ≥30); parity (nulliparous vs. multiparous); and planned pregnancy (yes vs. no).

GWAS: quality control, imputation, and GWAS methods

GWAS quality control and imputation were performed according to published PGC procedures, and this (PrOMIS) sample is part of the second empirical paper from the PTSD group of the PGC (PGC-PTSD) [18]. Briefly, samples and variants were excluded sequentially according to the following criteria: SNPs with missingness >5%, samples with variant missingness >2%, samples with deviation from expected inbreeding coefficient (f_het < −0.2 or >0.2), samples with sex discrepancy (reported sex discordant with sex estimated from genotypes), SNPs with missingness >2%, SNPs with missingness differences between cases and controls >2%, monomorphic SNPs, and SNPs with Hardy-Weinberg equilibrium deviation p value < 1 × 10⁻⁶ in controls from the largest ancestry group (the resulting list of SNPs was then applied to all samples). For imputation, the 1000Genomes phase 3 data were used [19]. Prephasing was conducted with default settings in SHAPEIT2 v2.r837 [20], followed by phasing in 3 megabase (MB) blocks, where an additional 1 MB of buffer was added to either end of each block. IMPUTE2 v2.2.2 was then used with default settings in order to obtain imputed genotypes. For subsequent analyses, imputed variants with imputation quality (INFO) scores >0.8, missingness <1%, and minor allele frequency >5% were retained.
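To make the block layout and the final variant filter concrete, the following sketch computes 3 MB blocks with a 1 MB buffer on each side and applies the three post-imputation thresholds. It is an illustration under assumptions: the function names, the 1-based inclusive coordinates, and the clipping of buffers at chromosome ends are choices made for this example and are not taken from the PGC pipeline.

```python
MB = 1_000_000


def imputation_chunks(chrom_length_bp, block_mb=3, buffer_mb=1):
    """Yield (core_start, core_end, buffered_start, buffered_end) in base pairs.

    Coordinates are 1-based and inclusive (an assumption for this sketch);
    buffers are clipped at the chromosome ends.
    """
    start = 1
    while start <= chrom_length_bp:
        end = min(start + block_mb * MB - 1, chrom_length_bp)
        buffered_start = max(1, start - buffer_mb * MB)
        buffered_end = min(chrom_length_bp, end + buffer_mb * MB)
        yield (start, end, buffered_start, buffered_end)
        start = end + 1


def keep_imputed_variant(info, missingness, maf):
    """Apply the thresholds from the text: INFO > 0.8, missingness < 1%, MAF > 5%."""
    return info > 0.8 and missingness < 0.01 and maf > 0.05


# Hypothetical 7.5 MB chromosome, split into three blocks.
for chunk in imputation_chunks(7_500_000):
    print(chunk)
```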

We used PC-Relate to exclude related participants, and we used PC-AiR for principal component (PC) analysis in our finalized sample [21]. In order to show how PrOMIS samples are distributed among global populations with various ancestries, we merged the cleaned PrOMIS genotype data with the 1000Genomes dataset [22]. PCs were also calculated for the merged dataset using the PC-AiR method. Scatterplots of individuals’ scores on pairs of PCs were visually inspected.

Construction and analysis of PRSs

This investigation used cleaned individual-level genotype data from the PrOMIS participants and summary statistics for depression [23] and PTSD [18] to calculate PRSs for each participant. Summary statistics files were downloaded from the PGC website (https://www.med.unc.edu/pgc/results-and-downloads). The depression summary statistics were from a meta-analysis of 807,553 people of European ancestry (file name: 2019-pgc-ukb-depression-genome-wide.txt; md5checksum value: ed4597a4e7fa168fb96970e3286a0b31). The PTSD summary statistics were from a study of 206,655 people of diverse ancestries. In order to avoid overlapping samples between the training GWAS and the target PRS analysis, we used an internal PGC GWAS summary statistics file that had PrOMIS samples omitted (file name: all_prom_maf01_info6.results_neff.gz; md5checksum value: dde2a515fb1274a9014a5f4c425437b2). We pruned the publicly available dataset using LD information from the 1000Genomes samples (all participants) with window size 500, p value < 1, and r² = 0.2.
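Readers who obtain these files may wish to confirm that they match the checksums quoted above. A minimal sketch using Python's standard `hashlib` module follows; the helper names `md5_of_file` and `verify_download` are hypothetical, and the local paths are assumed to be wherever the files were saved.

```python
import hashlib

# File names and md5 values exactly as quoted in the text.
EXPECTED_MD5 = {
    "2019-pgc-ukb-depression-genome-wide.txt": "ed4597a4e7fa168fb96970e3286a0b31",
    "all_prom_maf01_info6.results_neff.gz": "dde2a515fb1274a9014a5f4c425437b2",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True when the file's md5 digest matches the expected value."""
    return md5_of_file(path) == expected_md5.lower()
```

For example, `verify_download("2019-pgc-ukb-depression-genome-wide.txt", EXPECTED_MD5["2019-pgc-ukb-depression-genome-wide.txt"])` would return `True` for an intact copy saved in the working directory.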

PRSs were calculated with PLINK. Each person’s score was the sum of weighted risk alleles. Weights were the log of the odds ratios (ORs) from the summary statistics files, for each risk allele. SNPs with missing ORs were excluded from the calculation and the no-mean-imputation option was used. Our original analysis plan was to use 13 p value thresholds for inclusion of SNPs to calculate PRSs based on sets of variants ranging from Psi (meaning genome-wide significant variants only, i.e. p_T < 5 × 10⁻⁸), to all, meaning all (pruned) SNPs available. The 13 possible thresholds were: Psi (p_T < 5 × 10⁻⁸), Pe6 (p_T < 1 × 10⁻⁶), pe4 (p_T < 1 × 10⁻⁴), pe3 (p_T < 1 × 10⁻³), pe2 (p_T < 1 × 10⁻²), P05 (p_T < 0.05), P10 (p_T < 0.1), P20 (p_T < 0.2), P30 (p_T < 0.3), P40 (p_T < 0.4), P50 (p_T < 0.5), P75 (p_T < 0.75), and all (p_T < 1). However, we could only use the latter 12 thresholds for PTSD, given that the PTSD dataset had no genome-wide significant variants.
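The scoring rule in this paragraph (a sum of risk-allele counts weighted by log odds ratios, restricted to SNPs below a discovery p value threshold) can be sketched as follows. This is not PLINK's implementation: the dictionary-based inputs, the function name `polygenic_score`, and the SNP identifiers are assumptions for illustration, and PLINK's own handling of scaling and missing genotypes may differ in detail.

```python
import math

# Threshold labels and p value bounds as listed in the text.
THRESHOLDS = {
    "Psi": 5e-8, "Pe6": 1e-6, "pe4": 1e-4, "pe3": 1e-3, "pe2": 1e-2,
    "P05": 0.05, "P10": 0.1, "P20": 0.2, "P30": 0.3, "P40": 0.4,
    "P50": 0.5, "P75": 0.75, "all": 1.0,
}


def polygenic_score(dosages, weights, threshold):
    """Sum log(OR) times risk-allele count over SNPs with discovery p < threshold.

    dosages: {snp_id: risk-allele count (0, 1, 2) or None if missing}
    weights: {snp_id: (odds_ratio or None, discovery_p_value)}
    """
    score = 0.0
    for snp, (odds_ratio, p_value) in weights.items():
        if odds_ratio is None or p_value >= threshold:
            continue  # SNPs with missing ORs or above the threshold are excluded
        dosage = dosages.get(snp)
        if dosage is None:
            continue  # missing genotypes are skipped rather than mean-imputed
        score += math.log(odds_ratio) * dosage
    return score


# Hypothetical three-SNP example.
weights = {"snpA": (2.0, 1e-9), "snpB": (1.5, 0.03), "snpC": (None, 0.2)}
dosages = {"snpA": 2, "snpB": 1, "snpC": 2}
for label in ("Psi", "P05", "all"):
    print(label, round(polygenic_score(dosages, weights, THRESHOLDS[label]), 4))
```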

First, we standardized the scores by subtracting the mean and dividing by the standard deviation of each score. We then fit logistic regression models using these standardized PRSs to predict the phenotypes. The reported p values for the polygenic prediction terms were from a full model that included the top ten PCs calculated in PrOMIS samples (as noted above) and the additional clinical, demographic, and environmental covariates given in Table 1. The effect size reported for polygenic analyses was the linear r²/Nagelkerke’s pseudo-r² of the full model minus the linear r²/Nagelkerke’s pseudo-r² of a basic model that included no PRS term. R version 3.5 [24] was used for regression models and for visualizing the results.
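The effect size defined above is a difference in r² between a full model and a basic model without the PRS term. The sketch below illustrates that calculation for the linear case only, using a small pure-Python least-squares fit; Nagelkerke’s pseudo-r² for binary outcomes is not implemented here. The study used R, so the function names and data are hypothetical and serve only to show the arithmetic.

```python
def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in values]


def ols_r2(y, columns):
    """R-squared of a least-squares fit of y on an intercept plus `columns`."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def incremental_r2(y, prs, covariates):
    """R-squared of the full model (covariates + PRS) minus the basic model."""
    return ols_r2(y, covariates + [standardize(prs)]) - ols_r2(y, covariates)


# Hypothetical toy data: one covariate column (in practice, PCs and covariates).
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
covariate = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
prs = [0.5, 0.5, 1.5, 1.5, 2.5, 2.5]
print(round(incremental_r2(y, prs, [covariate]), 4))
```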

Table 1 Baseline characteristics of PrOMIS participants.

Multiple testing correction

Multiple testing correction was conducted for polygenic prediction analyses, for the 39 depression-based polygenic predictions (p < 0.05/39 tests = p < 1.3 × 10⁻³), given that these were, a priori, the tests deemed likely to be adequately powered. We also corrected for all 75 statistical tests (p < 0.05/75 = p < 6.7 × 10⁻⁴), in order to denote statistical significance with full Bonferroni correction.
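The two thresholds above follow from simple division. The short sketch below reproduces them; the decomposition of the test counts (13 depression PRS thresholds and 12 PTSD PRS thresholds, each applied to three phenotypes) is an assumption inferred from the Methods, although it does reproduce the stated totals of 39 and 75.

```python
ALPHA = 0.05
N_PHENOTYPES = 3
N_DEPRESSION_THRESHOLDS = 13  # all 13 p value thresholds
N_PTSD_THRESHOLDS = 12        # no genome-wide significant variants, so 12

# Assumed decomposition of the test counts (see the lead-in above).
n_depression_tests = N_DEPRESSION_THRESHOLDS * N_PHENOTYPES
n_all_tests = n_depression_tests + N_PTSD_THRESHOLDS * N_PHENOTYPES


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


print(n_depression_tests, f"{bonferroni_threshold(ALPHA, n_depression_tests):.1e}")
print(n_all_tests, f"{bonferroni_threshold(ALPHA, n_all_tests):.1e}")
```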

Sensitivity analyses

Three types of sensitivity analyses were conducted. First, we repeated our analyses without the clinical, demographic, and environmental covariates. The remaining two analyses addressed population stratification, which we recognized as potentially the biggest threat to the validity of our results. Second, we conducted our analyses in more homogeneous subgroups of participants. Visual inspection of PC scatterplots was used to select two increasingly stringent subgroups of participants (from the original N = 3308 to 2964, and then to 2690). Third, we repeated our analyses with multiple choices of PCs in order to test the robustness of our results to the choice of which PCs to include. The choices tested, in addition to our a priori decision to use ten PCs, were: (1) no PCs, (2) first PC, (3) first two PCs, (4) first three PCs, and (5) 20 PCs.

There is no PGC-PTSD plan to make the PrOMIS summary statistics (alone) publicly available, so we are releasing them with this publication. In addition to the PTSD GWAS for this sample (N = 3414, 1698 cases), we also conducted GWAS on depression (N = 3404, 1076 cases) and suicidal ideation/self-harm (N = 3404, 522 cases) in order to make these summary statistics available. For these analyses, we used binary outcomes so that researchers may meta-analyze these results with other psychiatric GWAS, which nearly always use binary (rather than continuous) outcomes. Ten principal components were used as covariates in all GWAS (logistic regression conducted with PLINK). Manhattan plots of results were created using R [24].

Results

Participant characteristics

The participants’ clinical, demographic, and environmental characteristics are shown in Table 1. All participants in our cohort were female and aged 17−47 years (mean = 28.2, SD = 6.3). BMI was in the normal range for 46.6% of participants. A majority of the participants self-identified as Mestizo (76.1%), were married or living with a partner (81.4%), were unemployed (53.5%), and had less than 12 years of education (53.5%). The mean depression (PHQ-9) and PTSD (PCL-C) scores were 8.2 (SD = 5.3) and 27.4 (SD = 9.3), respectively. The prevalence of suicidal ideation/self-harm in the cohort was 15.4%.

Ancestry assessment, plus matching of cases and controls on ancestry indicators

The PC results from the combined dataset of PrOMIS and 1000Genomes were informative regarding the ancestry of PrOMIS samples. Figure 1a shows how PrOMIS samples are distributed when plotted with 1000Genomes participants, as represented in scatterplots of pairs of the first three PCs. Most of the PrOMIS participants aligned well with the 1000Genomes PEL samples (Peruvians from Lima, Peru). A small number of PrOMIS samples were dispersed among other samples from populations in the Americas, and this was also the case for a minority of the PEL samples. Figure 1b shows the top three PCs from PrOMIS participants only, with PTSD cases and controls denoted. Figure S1 shows the same plot but with depression and suicidal ideation/self-harm status denoted by color. The even distribution of symptom scores reflects good matching on ancestry indicators (PCs).

Fig. 1: Principal components plots of PrOMIS and 1000Genomes participants.

PRS predictions and comparison to effect sizes for covariates

The PRS prediction results are shown in Fig. 2 and Table S1. The depression PRS constructed using SNPs with discovery GWAS p_T < 1 yielded the best prediction of depression score, explaining 0.62% of phenotypic variance (p < 4 × 10⁻⁶). The betas for all variables in the model are shown in Fig. 3a. The covariate with the highest risk (education less than 6 years) had a beta of 1.66 on depression symptom score, whereas the depression PRS with p_T < 1 had a beta of 0.45. PRSs based on more SNPs tended to yield better prediction results than PRSs based on fewer SNPs. Depression PRSs with p_T < 0.2 and p_T < 0.1 explained the highest phenotypic variance in PTSD and suicidal ideation/self-harm, respectively, with both around 0.3% in the PrOMIS samples (p = 0.001 and p = 0.01, respectively). PRSs for PTSD constructed using SNPs with discovery GWAS p_T < 1 × 10⁻² and p_T < 0.05 predicted PTSD at nominal significance, with variance explained of about 0.1% (p = 0.04). The covariate with the highest risk, IPV, had a beta of 4.72 on PTSD symptom score, whereas the PTSD PRS with p_T < 0.05 had a beta of 0.54, as shown in Fig. 3b.

Fig. 2: Polygenic prediction results in the PrOMIS cohort for three phenotypes (depression severity score, PTSD severity score, and suicidal ideation/self-harm) using two psychiatric polygenic risk scores, controlling for covariates.

Fig. 3: Magnitude of effects of polygenic scores and covariates on psychiatric outcome variables.

Sensitivity analyses

Three types of sensitivity analyses were conducted. The results remained similar to our primary analyses. First, Fig. S2 shows PRS results without covariates (note: PC covariates still included). The variance explained by PRSs was slightly higher, but the results are highly similar to the full model. Second, as shown in Figs. S3 and S4, with decreasing sample size the variance explained varied somewhat, but the pattern of results did not change. Third, alternative choices of PCs were used, and the results are shown in Figs. S5−S9. The phenotypic variance explained varied somewhat when different sets of PCs were included, but PRSs remained statistically significant predictors of phenotypes. In sum, the results of this analysis are robust to the inclusion/exclusion of clinical, demographic, and environmental covariates. Further, the results are also relatively similar even when the number of PCs (which correct for ancestry) varied from zero to 20 PCs.

GWAS results

Three sets of GWAS results (for depression, PTSD, and suicidal ideation/self-harm) are available as Supplementary Data (see GWAS results files for depression, PTSD, and suicidal ideation/self-harm). Manhattan plots of these analyses are shown in Fig. S10. Full GWAS results are provided in the service of increasing access to non-European-ancestry summary statistics and to facilitate future meta-analyses. As shown in Fig. S10, the results are null with respect to individual loci. This was the expected outcome for an appropriately cleaned but underpowered GWAS. For completeness, we also provided the polygenic prediction results for our binary phenotypes in Fig. S11; results are similar to those using continuous phenotypic outcomes.

Discussion

To our knowledge, this is the first report to provide psychiatric polygenic scoring and GWAS results from a large South American sample. Our results demonstrate that polygenic scores derived from primarily European-ancestry, non-admixed individuals, from high-income countries (almost exclusively in Europe, the US, and Australia) are valid predictors of psychiatric phenotypes, even for individuals from considerably different environments, cultures, and ancestries. This means that genetic influences on depression, PTSD, and suicidal ideation/self-harm are at least partially shared across these populations [4, 7, 8]. The GWAS results made available here support the broader research goal of increasing genetic resource availability from non-European-ancestry populations, and from LMICs. To the extent that polygenic risk scores and other genetics-based treatments become useful in the clinic, it is critical that such genetic data resources be made available for multiple major global populations, so that new interventions are not disproportionately helpful to European-ancestry individuals.

The present polygenic results are consistent with previous literature; depression and PTSD PRSs are predictive of psychiatric outcomes in populations of other ancestries [4, 8, 25]. However, compared to previous PRS prediction in European-ancestry samples, the variance explained in our study was lower. In Levey et al. [26], the depression-based PRSs explained up to 0.7% of the phenotypic variance for suicide attempt, which is higher than our results (0.3%). In Nievergelt et al. [18], the PTSD-based PRSs explained 0.15% of the variance in the PTSD phenotype in their samples, which is also higher than our results (0.12%). There are several potential explanations for these differences. First, as has been reported previously, polygenic scores tend to work best when the ancestry of the training cohort (typically European, given historical sample collection rates) matches that of the testing cohort. As genetic distance between populations increases, prediction performance is hypothesized to decrease [6], and this has been demonstrated empirically [7, 8].

Notably, PRS performance in Latino and Hispanic samples has been relatively comparable to performance in European-ancestry samples in prior studies conducted within the United States [8]. Thus, it is possible that additional factors contributed to the somewhat lower predictive performance of PRSs in this study. Participants in this study are from a disadvantaged environment compared to their peers in developed countries. These substantial environmental differences could alter the relative importance of genetic and environmental factors in the development of depression, PTSD, and suicidal ideation/self-harm. Second, heterogeneity in the disease phenotype across populations and potential differences in measurement could also decrease the maximum potential prediction. Third, compared to other South American populations, Peruvian individuals have been found to have a higher proportion of Native American ancestry [27, 28]. Given all of these differences, it is noteworthy that our results still show that PRSs are valid predictors in a sample that differs in ancestry, culture, and severity of environmental risks. Finally, it should be noted that there is discussion within the research community regarding how polygenic scoring results should be reported (e.g. as area under the receiver operating characteristics, AUROC) and terminology (e.g. whether or not the term “prediction” should be used) [29].

The depression PRSs explained more phenotypic variance in PTSD than the PTSD PRSs did. This is almost certainly due to two factors. First, the best-available depression GWAS is considerably better powered than the best-available PTSD GWAS [18, 30]. Second, given that depression and PTSD share genetic influences, the better-powered depression GWAS affords better prediction of PTSD than the currently available PTSD GWAS. As PTSD sample sizes increase, it is reasonable to expect improvement in predictive performance of PTSD PRSs.

Regarding the biggest potential threat to external validity of these results—population stratification—extra steps were taken to ensure that these results were not attributable to confounding of ancestry and phenotypic status. Sensitivity analyses used more ancestrally homogeneous subsamples and alternative choices of PCs to test the robustness of the main findings. Even using these extra steps (which are infrequently employed in polygenic scoring studies), the results were nearly identical to the main findings. Regarding future directions, researchers can examine female- and male-specific polygenic risk scores for PTSD when power increases in discovery GWAS for PTSD. This is important given evidence that there may be sex differences in the genetics of PTSD [30, 31].

In sum, this study provides an important extension to global genetics research focused on mental health. Clinically, our participants are females from a low-socioeconomic status community in Peru. This is a population at higher risk of mental disorders compared to their peers from high-income countries, likely due to high exposure to adverse life events. The women in our sample have been highly exposed to violence both during childhood and adulthood, and—as in higher-income countries—these are potent risk factors for psychiatric phenotypes. Consistent with what has been found in high-income countries, these results suggest that vulnerability to depression, PTSD, and suicidal ideation/self-harm is also partially influenced by genetics, and moreover, that polygenic liability to psychiatric phenotypes is at least partially shared across populations around the globe. As research continues in this and other populations from LMICs, the ability to include genetic information will prove valuable as scientists build upon the broader body of literature, which has historically been more focused on European-ancestry populations from high-income countries. Thus, the sharing of GWAS results from the PrOMIS samples is an important contribution because these results can be included in future meta-analyses and other genetic analyses, which are more tailored to South American and other Latino and Hispanic populations.

Funding and disclosure

LED and HS have been funded by startup funds from Stanford and a pilot grant to LED from the Stanford Center for Clinical and Translation Research and Education (UL1 TR001085, PI Greenberg). LED has also been funded by Cohen Veterans Bioscience (CVB), and she is part of the CVB Working Group for PTSD Adaptive Platform Trial. BG has been funded by the NIH (R01-HD-059835, PI Williams) and CVB. HH has been funded by the NIH (NIH K01DK114379 and NIH R21AI139012), the Zhengxu and Ying He Foundation, and the Stanley Center for Psychiatric Research. MBR received funds from WPA Congress Mexico City 2018, Guayaquil CEPAM 2019, Asunción X CONGRESO LATINOAMERICANO DE LA FLAPB 2018, Guayaquil 2019 (Bago), and Lancet Psychiatry, London (commission on Violence against women) 2019. SS declares no potential conflict of interest.

References

Wray NR, Lee SH, Mehta D, Vinkhuyzen AAE, Dudbridge F, Middeldorp CM. Research review: polygenic methods and their application to psychiatric traits. J Child Psychol Psychiatry. 2014;55:1068–87.
Article Google Scholar
Wray NR, Kemper KE, Hayes BJ, Goddard ME, Visscher PM. Complex trait prediction from genome data: contrasting EBV in livestock to PRS in humans: genomic prediction. Genetics. 2019;211:1131–41.
Article Google Scholar
Wray NR, Goddard ME, Visscher PM. Prediction of individual genetic risk to disease from genome-wide association studies. Genome Res. 2007;17:1520–8.
Article CAS Google Scholar
Purcell SM, Wray NR, Stone JL, Visscher PM, O’Donovan MC, Sullivan PF, et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009;460:748–52.
Article CAS Google Scholar
Pardiñas AF, Holmans P, Pocklington AJ, Escott-Price V, Ripke S, Carrera N, et al. Publisher correction: common schizophrenia alleles are enriched in mutation-intolerant genes and in regions under strong background selection. Nat Genet. 2019;51:1193.
Article CAS Google Scholar
Scutari M, Mackay I, Balding D. Using genetic distance to infer the accuracy of genomic prediction. PLoS Genet. 2016;12:e1006288.
Article CAS Google Scholar
Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet. 2019;51:584–91.
Article CAS Google Scholar
Duncan L, Shen H, Gelaye B, Meijsen J, Ressler K, Feldman M, et al. Analysis of polygenic risk score usage and performance in diverse human populations. Nat Commun. 2019;10:3328.
Article CAS Google Scholar
Borba CPC, Gelaye B, Zayas L, Ulloa M, Lavelle J, Mollica RF, et al. Making strides towards better mental health care in Peru: results from a primary care mental health training. Int J Clin Psychiatry Ment Health. 2015;3:9–19.
Article CAS Google Scholar
Sanchez SE, Pineda O, Chaves DZ, Zhong Q-Y, Gelaye B, Simon GE. et al. Childhood physical and sexual abuse experiences associated with post-traumatic stress disorder among pregnant women. Ann Epidemiol. 2017;27:716–.723.e1.
Article Google Scholar
Gelaye B, Zhong Q-Y, Basu A, Levey EJ, Rondon MB, Sanchez S, et al. Trauma and traumatic stress in a sample of pregnant women. Psychiatry Res. 2017;257:506–13.
Article Google Scholar
Barrios YV, Sanchez SE, Nicolaidis C, Garcia PJ, Gelaye B, Zhong Q, et al. Childhood abuse and early menarche among Peruvian women. J Adolesc Health Publ Soc Adolesc Med. 2015;56:197–202.
Article Google Scholar
Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16:606–13.
Article CAS Google Scholar
Spitzer RL, Kroenke K, Williams JB. Validation and utility of a self-report version of PRIME-MD: the PHQ primary care study. Primary Care Evaluation of Mental Disorders. Patient Health Questionnaire. JAMA. 1999;282:1737–44.
Article CAS Google Scholar
Gelaye B, Zheng Y, Medina-Mora ME, Rondon MB, Sánchez SE, Williams MA. Validity of the posttraumatic stress disorders (PTSD) checklist in pregnant women. BMC Psychiatry. 2017;17:179.
Wilkins KC, Lang AJ, Norman SB. Synthesis of the psychometric properties of the PTSD checklist (PCL) military, civilian, and specific versions. Depress Anxiety. 2011;28:596–606.
Zhong Q-Y, Gelaye B, Rondon MB, Sánchez SE, Simon GE, Henderson DC, et al. Using the Patient Health Questionnaire (PHQ-9) and the Edinburgh Postnatal Depression Scale (EPDS) to assess suicidal ideation among pregnant women in Lima, Peru. Arch Women’s Ment Health. 2015;18:783–92.
Nievergelt CM, Maihofer AX, Klengel T, Atkinson EG, Chen C-Y, Choi KW, et al. International meta-analysis of PTSD genome-wide association studies identifies sex- and ancestry-specific genetic risk loci. Nat Commun. 2019;10:4558.
1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, et al. A global reference for human genetic variation. Nature. 2015;526:68–74.
Delaneau O, Zagury J-F, Marchini J. Improved whole-chromosome phasing for disease and population genetic studies. Nat Methods. 2013;10:5–6.
Conomos MP, Reiner AP, Weir BS, Thornton TA. Model-free estimation of recent genetic relatedness. Am J Hum Genet. 2016;98:127–48.
Abecasis GR, Altshuler D, Auton A, Brooks LD, Durbin RM, Gibbs RA, et al. A map of human genome variation from population-scale sequencing. Nature. 2010;467:1061–73.
Howard DM, Adams MJ, Clarke T-K, Hafferty JD, Gibson J, Shirali M, et al. Genome-wide meta-analysis of depression identifies 102 independent variants and highlights the importance of the prefrontal brain regions. Nat Neurosci. 2019;22:343–52.
R Development Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2005. ISBN 3-900051-07-0.
Hoffmann TJ, Ehret GB, Nandakumar P, Ranatunga D, Schaefer C, Kwok P-Y, et al. Genome-wide association analyses using electronic health records identify new loci influencing blood pressure variation. Nat Genet. 2017;49:54–64.
Levey DF, Polimanti R, Cheng Z, Zhou H, Nuñez YZ, Jain S, et al. Genetic associations with suicide attempt severity and genetic overlap with major depression. Transl Psychiatry. 2019;9:22.
Homburger JR, Moreno-Estrada A, Gignoux CR, Nelson D, Sanchez E, Ortiz-Tello P, et al. Genomic insights into the ancestry and demographic history of South America. PLoS Genet. 2015;11:e1005602.
Luo Y, Suliman S, Asgari S, Amariuta T, Baglaenko Y, Martínez-Bonet M, et al. Early progression to active tuberculosis is a highly heritable trait driven by 3q23 in Peruvians. Nat Commun. 2019;10:1–10.
Poldrack RA, Huckins G, Varoquaux G. Establishment of best practices for evidence for prediction: a review. JAMA Psychiatry. 2019. https://doi.org/10.1001/jamapsychiatry.2019.3671.
Duncan LE, Ratanatharathorn A, Aiello AE, Almli LM, Amstadter AB, Ashley-Koch AE, et al. Largest GWAS of PTSD (N=20 070) yields genetic overlap with schizophrenia and sex differences in heritability. Mol Psychiatry. 2018;23:666–73.
Duncan LE, Cooper BN, Shen H. Robust findings from 25 years of PTSD genetics research. Curr Psychiatry Rep. 2018;20:115.

Acknowledgements

The authors are indebted to the participants of the PrOMIS study for their cooperation. They are also grateful to the dedicated staff members of Asociacion Civil Proyectos en Salud (PROESA), Peru and Instituto Especializado Materno Perinatal, Peru, for their expert technical assistance with this research. Some of the computing for this project was performed on the Sherlock cluster. We would like to thank Stanford University and the Stanford Research Computing Center for providing computational resources and support that contributed to these research results.

Author information

These authors contributed equally: Hanyang Shen, Bizu Gelaye
These authors jointly supervised this work: Sixto Sanchez, Laramie E. Duncan

Authors and Affiliations

Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
Hanyang Shen & Laramie E. Duncan
Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
Bizu Gelaye
Massachusetts General Hospital, Boston, MA, USA
Hailiang Huang
Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Hailiang Huang
Universidad Peruana Cayetano Heredia, Lima, Peru
Marta B. Rondon
Universidad Peruana de Ciencias Aplicadas, Lima, Perú
Sixto Sanchez
Asociación Civil Proyectos en Salud, ACPROESA, Lima, Perú
Sixto Sanchez

Authors

Hanyang Shen
Bizu Gelaye
Hailiang Huang
Marta B. Rondon
Sixto Sanchez
Laramie E. Duncan

Contributions

LED, BG, and HS conceived the investigation and developed the analysis plan. MBR, SS, and BG recruited and communicated with participants, and collected and cleaned the clinical data. HS conducted the analyses. HS and LED did the literature review for the paper. HS, LED, BG, and HH drafted the manuscript, and all authors contributed to and edited the final manuscript.

Corresponding authors

Correspondence to Bizu Gelaye or Laramie E. Duncan.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Tables S1 and S2

Supplemental Figures 1 to 6

Supplemental Figures 7 to 11

GWAS results Depression_10PCs_N_of_3404

GWAS results PTSD_10PCs_N_of_3414

GWAS results Suicidal ideation_self harm_10PCs_N_of_3404

About this article

Cite this article

Shen, H., Gelaye, B., Huang, H. et al. Polygenic prediction and GWAS of depression, PTSD, and suicidal ideation/self-harm in a Peruvian cohort. Neuropsychopharmacol. 45, 1595–1602 (2020). https://doi.org/10.1038/s41386-020-0603-5

Received: 01 July 2019
Revised: 30 December 2019
Accepted: 31 December 2019
Published: 11 January 2020
Issue Date: 01 September 2020
DOI: https://doi.org/10.1038/s41386-020-0603-5
