Introduction

Polygenic risk scores (PRSs) are individual-level metrics of genetic risk for particular phenotypes; they are also known as risk profile scores (RPS) and genetic risk scores (GRS) [1]. Methods for polygenic scoring were first developed for livestock and agricultural purposes [2] and were applied in humans over a decade ago [3, 4]. PRSs have become increasingly popular as the size (and, importantly, the statistical power) of genome-wide association studies (GWAS) has increased, resulting in improved performance of PRSs for the prediction of psychiatric disorders. For example, as the sample size of schizophrenia GWAS increased from approximately N = 7000 to approximately N = 36,000, the maximum amount of phenotypic variance explained by schizophrenia PRSs increased from ~3% to ~12% [4, 5].

Over a decade ago, the original developers of PRSs demonstrated that PRSs transfer more poorly across ancestry groups than within them [4]. Subsequent theoretical work suggested an approximately linear decline in predictive performance of PRSs as a function of genetic distance between training and target samples [6], and empirical observations are generally consistent with this expectation [7, 8]. However, most GWAS (training) and PRS (testing) studies are still conducted in high-income countries. GWAS analyses conducted in populations with different environmental, cultural, and ancestral backgrounds are needed to increase the generalizability of PRS prediction, and PRS studies are needed to clarify the extent to which environmental factors also affect the generalizability of PRSs.

As has been the case in medical genetics more broadly, there have been few GWAS and PRS studies of psychiatric outcomes in South America. Of the available studies, Hispanic/Latino samples from the US are most similar to admixed populations in South America in terms of ancestry. However, environmental conditions and culture are oftentimes considerably different. Compared to people living in North America and Europe, many individuals living in South America are exposed to high rates of traumatic events and low-socioeconomic status, which lead to increased burden of psychiatric disorders [9].

Despite knowledge of the impact of environmental factors on psychiatric disorder risk, environmental variables have often been omitted from polygenic analyses. The challenge is often a practical one in consortium analyses, in which collecting genetic and minimal phenotypic data is already a tremendous challenge, and consequently environmental variables are not collected from contributing investigators. Thus, there is a need for inclusion of relevant environmental variables in GWAS and polygenic prediction studies.

Here we present a genetics study of 3308 participants from Lima, Peru [10, 11]. This is the largest psychiatric genetics study in South America to date. The goals of the current study were twofold. First, polygenic prediction analyses were conducted on three major psychiatric phenotypes (depression, PTSD, and suicidal ideation/self-harm) while controlling for potentially relevant environmental variables. The results demonstrate successful cross-ancestry, cross-cultural predictions. Second, GWASs of these three phenotypes were conducted in the cohort in order to make summary statistics from this Peruvian cohort available to researchers worldwide.

Materials and methods

Study population

The population for the present study was drawn from participants of the Pregnancy Outcomes, Maternal and Infant Study (PrOMIS) cohort. The PrOMIS cohort is a longitudinal study aimed at understanding the life course and intergenerational effects of interpersonal violence and other forms of trauma among Peruvian women. The study was conducted between February 2012 and November 2015. The methodology and study procedures have been described previously [12]. Briefly, the present sample consists of 3308 women who attended prenatal care clinics at the Instituto Nacional Materno Perinatal (INMP) in Lima, Peru. The INMP is a reference national institution for maternal and perinatal care. Participants were invited to take part in an interview where trained research personnel used a structured questionnaire to elicit information regarding maternal socio-demographics, lifestyle characteristics, medical and reproductive histories, and mental health symptoms. All participants provided written informed consent. The institutional review boards of the INMP, Lima, Peru, and the Office of Human Research Administration, Harvard T.H. Chan School of Public Health, Boston, MA approved all procedures used in this study.

Depressive symptoms

Depressive symptoms were assessed using the Patient Health Questionnaire (PHQ-9). The PHQ-9 is a nine-item, self-report depression screening scale derived from the Primary Care Evaluation of Mental Disorders [13, 14]. This questionnaire assesses nine depressive symptoms experienced over the 2 weeks prior to the interview: anhedonia, depressed mood, trouble sleeping, feeling tired, change in appetite, guilt or worthlessness, concentration problems, psychomotor agitation/retardation, and suicidal thoughts/self-harm. For each symptom, the response categories “not at all,” “several days,” “more than half the days,” and “nearly every day” were scored as 0, 1, 2, and 3, respectively, and the total PHQ-9 score was calculated by summing the nine item scores. As a symptom measure, the total score ranged from 0 to 27. The PHQ-9 has been previously validated in Spanish-speaking populations [15].
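The scoring rule described above can be illustrated with a short sketch. It is only an illustration of the summation, not the study's own code; the function and constant names are hypothetical.

```python
# Response labels and item scores as described in the text.
PHQ9_RESPONSE_SCORES = {
    "not at all": 0,
    "several days": 1,
    "more than half the days": 2,
    "nearly every day": 3,
}


def phq9_total(responses):
    """Return the PHQ-9 total (0 to 27) from nine response labels.

    `responses` is a hypothetical input format: a list of nine strings,
    one per item, using the response labels above.
    """
    if len(responses) != 9:
        raise ValueError("PHQ-9 requires exactly nine item responses")
    return sum(PHQ9_RESPONSE_SCORES[r] for r in responses)
```

A participant answering "several days" to every item would, under this rule, receive a total score of 9.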

PTSD assessment

The Post-traumatic Stress Disorder Checklist-Civilian Version (PCL-C) was used to assess PTSD symptoms. The PCL-C is a 17-item self-report questionnaire designed according to the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) criteria. Each item assesses PTSD symptoms experienced over the past month on a 5-point Likert scale, with a total score ranging from 17 to 85. Higher scores indicate more severe PTSD. The PCL-C has demonstrated adequate levels of internal consistency, inter-rater reliability, test−retest reliability, and convergent validity when applied to different clinical and nonclinical populations [16]. The Spanish-language version has been found to have similar psychometric properties to the English-language version.
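As a minimal sketch of the PCL-C total described above, assuming each of the 17 items is coded 1 to 5 (which is what a 17 to 85 range implies), the score is a plain sum. The function name is hypothetical.

```python
def pclc_total(item_scores):
    """Return the PCL-C total (17 to 85) from 17 item scores coded 1-5.

    The 1-5 item coding is an assumption inferred from the stated
    total-score range of 17 to 85.
    """
    if len(item_scores) != 17:
        raise ValueError("PCL-C requires exactly 17 item scores")
    if any(score not in (1, 2, 3, 4, 5) for score in item_scores):
        raise ValueError("each PCL-C item must be scored 1 to 5")
    return sum(item_scores)
```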

Suicidal ideation and self-harm

The phenotype of suicidal ideation/self-harm in our study is based on one single item from the PHQ-9 form [13]. Item 9 asks about “thoughts that you would be better off dead, or of hurting yourself” in the 14 days prior to evaluation. Participants who responded “not at all” were classified as no for suicidal ideation/self-harm and participants with any other responses were coded as yes.
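The dichotomization of item 9 can be sketched as follows. The function name and the use of response labels as input are assumptions made for illustration.

```python
PHQ9_RESPONSES = (
    "not at all",
    "several days",
    "more than half the days",
    "nearly every day",
)


def has_suicidal_ideation(item9_response):
    """Code PHQ-9 item 9 as a yes/no phenotype.

    Returns False for "not at all" and True for any other valid response,
    following the rule described in the text.
    """
    if item9_response not in PHQ9_RESPONSES:
        raise ValueError("unrecognized PHQ-9 response: %r" % (item9_response,))
    return item9_response != "not at all"
```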

Clinical, demographic, and environmental covariates

Self-reported clinical, demographic, and environmental variables were also collected at participants’ prenatal care visits [17]. These environmental variables include: experience of any childhood abuse (yes vs. no); experience of any lifetime Intimate Partner Violence (IPV) (yes vs. no); age (<20, 20−24, 25−29, 30−34, and ≥35 years); educational attainment (≤6, 7−12, and >12 completed years of schooling); marital status (married/living with a partner vs. others); access to basic foods (hard vs. not very hard); employment status (employed vs. not); ethnicity (Mestizo vs. others); early pregnancy body mass index (BMI) (<18.5, 18.5−24.9, 25−29.9, ≥30); parity (nulliparous vs. multiparous); and planned pregnancy (yes vs. no).
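For the two binned continuous covariates (age and early pregnancy BMI), the category boundaries listed above can be expressed as a small lookup. This is an illustrative sketch with hypothetical names. It assumes age is recorded in completed years and treats BMI values between 24.9 and 25 (and between 29.9 and 30) as belonging to the lower category, a detail the text does not specify.

```python
import bisect

# Lower bounds of every category after the first one.
AGE_EDGES = [20, 25, 30, 35]
AGE_LABELS = ["<20", "20-24", "25-29", "30-34", ">=35"]

BMI_EDGES = [18.5, 25.0, 30.0]
BMI_LABELS = ["<18.5", "18.5-24.9", "25-29.9", ">=30"]


def categorize(value, edges, labels):
    """Return the label of the bin that contains `value`.

    bisect_right counts how many lower bounds are <= value, which is
    exactly the index of the matching label.
    """
    return labels[bisect.bisect_right(edges, value)]
```

For example, `categorize(27, AGE_EDGES, AGE_LABELS)` falls in the "25-29" group, and `categorize(18.5, BMI_EDGES, BMI_LABELS)` falls in the "18.5-24.9" group.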

GWAS: quality control, imputation, and GWAS methods

GWAS quality control and imputation were performed according to published Psychiatric Genomics Consortium (PGC) procedures, and this (PrOMIS) sample is part of the second empirical paper from the PTSD group of the PGC (PGC-PTSD) [18]. Briefly, samples and variants were excluded sequentially according to the following criteria: SNPs with missingness >5%, samples with variant missingness >2%, samples with deviation from expected inbreeding coefficient (fhet < −0.2 or >0.2), samples with sex discrepancy (reported sex discordant with sex estimated from genotypes), SNPs with missingness >2%, SNPs with missingness differences between cases and controls >2%, monomorphic SNPs, and SNPs with Hardy-Weinberg equilibrium deviation p value < 1 × 10^−6 in controls from the largest ancestry group (the resulting list of SNPs was then applied to all samples). For imputation, the 1000Genomes phase 3 data were used [19]. Prephasing was conducted with default settings in SHAPEIT2 v2.r837 [20], followed by phasing in 3 megabase (MB) blocks, where an additional 1 MB of buffer was added to either end of each block. IMPUTE2 v2.2.2 was then used with default settings in order to obtain imputed genotypes. For subsequent analyses, imputed variants with imputation quality (INFO) scores >0.8, missingness <1%, and minor allele frequency >5% were retained.
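The post-imputation retention rule in the last sentence can be written as a simple predicate. This is a sketch only: the `Variant` record and its field names are hypothetical, and in practice these filters are applied by the imputation and QC tooling rather than by hand.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical per-variant summary used only for this illustration."""
    snp_id: str
    info: float         # imputation quality (INFO) score
    missingness: float  # fraction of samples with a missing genotype
    maf: float          # minor allele frequency


def passes_post_imputation_filter(variant, min_info=0.8,
                                  max_missingness=0.01, min_maf=0.05):
    """Keep variants with INFO > 0.8, missingness < 1%, and MAF > 5%."""
    return (variant.info > min_info
            and variant.missingness < max_missingness
            and variant.maf > min_maf)
```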

We used PC-relate to exclude related participants, and we used PC-Air for principal component (PC) analysis in our finalized sample [21]. In order to show how PrOMIS samples are distributed among global populations of various ancestries, we merged the cleaned PrOMIS genotype data with the 1000Genomes dataset [22]. PCs were also calculated for the merged dataset using the PC-Air method. Scatterplots of individuals’ scores on pairs of PCs were visually inspected.

Construction and analysis of PRSs

This investigation used cleaned individual-level genotype data from the PrOMIS participants and summary statistics for depression [23] and PTSD [18] to calculate PRSs for each participant. Summary statistics files were downloaded from the PGC website (https://www.med.unc.edu/pgc/results-and-downloads). The depression summary statistics were from a meta-analysis of 807,553 people of European ancestry (file name: 2019-pgc-ukb-depression-genome-wide.txt; md5checksum value: ed4597a4e7fa168fb96970e3286a0b31). The PTSD summary statistics were from a study of 206,655 people of diverse ancestries. In order to avoid overlapping samples between the training GWAS and the target PRS analysis, we used an internal PGC GWAS summary statistics file that had PrOMIS samples omitted (file name: all_prom_maf01_info6.results_neff.gz; md5checksum value: dde2a515fb1274a9014a5f4c425437b2). We pruned the publicly available dataset using LD information from the 1000Genomes samples (all participants) with window size 500, p value < 1, and r2 = 0.2.
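The pruning step can be pictured as a greedy, p-value-ordered LD clumping. The sketch below is a simplified illustration, not the procedure actually run by the authors' software. It assumes the window of 500 is in kilobases (the text gives no unit), that SNPs with r2 at or above 0.2 to an already retained SNP are dropped, and that pairwise r2 values are available in a hypothetical lookup table.

```python
def clump(snps, r2, window_kb=500, p_max=1.0, r2_max=0.2):
    """Greedy LD clumping sketch.

    snps: list of (snp_id, chromosome, position_bp, p_value) tuples.
    r2:   dict mapping frozenset({id_a, id_b}) to pairwise r^2
          (pairs that are absent are treated as r^2 = 0).
    Returns the IDs of retained SNPs, most significant first.
    """
    window_bp = window_kb * 1000
    retained = []
    # Visit SNPs from the smallest p value to the largest.
    for snp in sorted(snps, key=lambda s: s[3]):
        snp_id, chrom, pos, p_value = snp
        if p_value > p_max:
            continue
        # Drop the SNP if it is in LD with a more significant retained SNP
        # on the same chromosome and within the window.
        in_ld = any(
            chrom == kept_chrom
            and abs(pos - kept_pos) <= window_bp
            and r2.get(frozenset((snp_id, kept_id)), 0.0) >= r2_max
            for kept_id, kept_chrom, kept_pos, _ in retained
        )
        if not in_ld:
            retained.append(snp)
    return [s[0] for s in retained]
```

The design choice here is that each retained SNP represents its LD neighborhood, so that correlated variants are not counted more than once in the score.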

PRSs were calculated with PLINK. Each person’s score was the sum of weighted risk alleles. The weight for each risk allele was the log of its odds ratio (OR) from the summary statistics file. SNPs with missing ORs were excluded from the calculation and the –no-mean-imputation flag was used. Our original analysis plan was to use 13 p value thresholds for inclusion of SNPs to calculate PRSs based on sets of variants ranging from Psi (meaning genome-wide significant variants only, i.e. pT < 5 × 10^−8), to all, meaning all (pruned) SNPs available. The 13 possible thresholds were: Psi (pT < 5 × 10^−8), Pe6 (pT < 1 × 10^−6), pe4 (pT < 1 × 10^−4), pe3 (pT < 1 × 10^−3), pe2 (pT < 1 × 10^−2), P05 (pT < 0.05), P10 (pT < 0.1), P20 (pT < 0.2), P30 (pT < 0.3), P40 (pT < 0.4), P50 (pT < 0.5), P75 (pT < 0.75), and all (pT < 1). However, we could only use the latter 12 thresholds for PTSD, given that the PTSD dataset had no genome-wide significant variants.
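The scoring rule above can be sketched for a single participant as follows. This is an illustration of the arithmetic only, not PLINK's implementation or output format. The input dictionaries are hypothetical, and a missing genotype is assumed to contribute nothing to the sum, as one reading of scoring without mean imputation.

```python
import math

# The 13 p-value thresholds listed in the text, keyed by their labels.
P_THRESHOLDS = {
    "Psi": 5e-8, "Pe6": 1e-6, "pe4": 1e-4, "pe3": 1e-3, "pe2": 1e-2,
    "P05": 0.05, "P10": 0.1, "P20": 0.2, "P30": 0.3, "P40": 0.4,
    "P50": 0.5, "P75": 0.75, "all": 1.0,
}


def polygenic_score(risk_allele_counts, sumstats, p_threshold):
    """Sum of risk-allele counts weighted by log(OR) for SNPs with p < pT.

    risk_allele_counts: dict snp_id -> 0, 1, or 2 (None if missing).
    sumstats:           dict snp_id -> (odds_ratio or None, p_value).
    """
    score = 0.0
    for snp_id, (odds_ratio, p_value) in sumstats.items():
        if odds_ratio is None or p_value >= p_threshold:
            continue  # SNPs with a missing OR or above the threshold are skipped
        count = risk_allele_counts.get(snp_id)
        if count is None:
            continue  # assumption: a missing genotype adds nothing (no mean imputation)
        score += count * math.log(odds_ratio)
    return score
```

Computing the score once per entry in `P_THRESHOLDS` yields the family of nested scores described in the text, from genome-wide significant variants only to all pruned SNPs.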

First, we standardized the scores by subtracting the mean and dividing by the standard deviation of each score. We then fit logistic regression models using these standardized PRSs to predict the phenotypes. The reported p values for the polygenic prediction terms were from a full model that included the top ten PCs calculated in PrOMIS samples (as noted above) and the additional clinical, demographic, and environmental covariates given in Table 1. The effect size reported for polygenic analyses was the linear r2/Nagelkerke’s pseudo-r2 of the full model minus the linear r2/Nagelkerke’s pseudo-r2 of a basic model that included no PRS term. R [24] 3.5 was used for regression models and visualizing the results.
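The effect size defined above (variance explained by the full model minus that of a model without the PRS term) can be sketched for a continuous outcome. The code below is a self-contained illustration using ordinary least squares in plain Python; it is not the authors' R code, it uses the sample standard deviation for standardization (an assumption), and it does not cover Nagelkerke's pseudo-r2 for binary outcomes. All function names are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Subtract the mean and divide by the (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def _solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(y, predictors):
    """R^2 of an OLS fit of y on an intercept plus the predictor columns."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = _solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean(y)) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def incremental_r2(y, covariates, prs):
    """R^2 of the full model (covariates + PRS) minus R^2 of the base model."""
    full = r_squared(y, covariates + [standardize(prs)])
    base = r_squared(y, covariates)
    return full - base
```

In this framing, `covariates` would hold the ten PCs and the additional covariates as columns, and the returned difference is the share of phenotypic variance attributed to the PRS beyond them.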

Table 1 Baseline characteristics of PrOMIS participants.

Multiple testing correction

Multiple testing correction was conducted for the polygenic prediction analyses. We first corrected for the 39 depression-based polygenic predictions (p < 0.05/39 tests = p < 1.3 × 10^−3), given that these were, a priori, the tests deemed likely to be adequately powered. We also corrected for all 75 statistical tests (p < 0.05/75 = p < 6.7 × 10^−4), in order to denote statistical significance with full Bonferroni correction.
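The two Bonferroni thresholds can be reproduced with a few lines of arithmetic. The breakdown of the test counts (13 depression PRS thresholds and 12 PTSD PRS thresholds, each applied to three phenotypes) is our reading of the analysis plan described earlier rather than something stated explicitly; the variable names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


N_PHENOTYPES = 3                # depression, PTSD, suicidal ideation/self-harm
N_DEPRESSION_PRS_THRESHOLDS = 13
N_PTSD_PRS_THRESHOLDS = 12      # no genome-wide significant PTSD variants

depression_tests = N_DEPRESSION_PRS_THRESHOLDS * N_PHENOTYPES        # 39
all_tests = depression_tests + N_PTSD_PRS_THRESHOLDS * N_PHENOTYPES  # 75

depression_alpha = bonferroni_threshold(0.05, depression_tests)  # about 1.3e-3
overall_alpha = bonferroni_threshold(0.05, all_tests)            # about 6.7e-4
```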

Sensitivity analyses

Three types of sensitivity analyses were conducted. First, we repeated our analyses without the clinical, demographic, and environmental covariates. The remaining two analyses addressed population stratification, which might be the biggest threat to the validity of our results. Second, we conducted our analyses in more homogeneous subgroups of participants: visual inspection of PC scatterplots was used to select two increasingly stringent subgroups (from the original N = 3308 to N = 2964 and then N = 2690). Third, we repeated our analyses with multiple choices of PCs in order to test the robustness of our results to the choice of which PCs to include. The choices tested, in addition to our a priori decision to use ten PCs, were: (1) no PCs, (2) first PC, (3) first two PCs, (4) first three PCs, and (5) 20 PCs.

There is no PGC-PTSD plan to make the PrOMIS summary statistics (alone) publicly available, so we are releasing them with this publication. In addition to the PTSD GWAS for this sample (N = 3414, 1698 cases), we also conducted GWAS on depression (N = 3404, 1076 cases) and suicidal ideation/self-harm (N = 3404, 522 cases) in order to make these summary statistics available. For these analyses, we used binary outcomes so that researchers may meta-analyze these results with other psychiatric GWAS, which nearly always use binary (rather than continuous) outcomes. Ten principal components were used as covariates in all GWAS (logistic regression conducted with PLINK). Manhattan plots of results were created using R [24].

Results

Participant characteristics

The participants’ clinical, demographic, and environmental characteristics are shown in Table 1. All participants in our cohort are female and between 17 and 47 years of age (mean = 28.2, SD = 6.3). BMI was in the normal range for 46.6% of participants. A majority of the participants self-identified as Mestizo (76.1%), were married or living with a partner (81.4%), were unemployed (53.5%), and had less than 12 years of education (53.5%). The mean depression (PHQ-9) and PTSD (PCL-C) scores are 8.2 (SD = 5.3) and 27.4 (SD = 9.3), respectively. The prevalence of suicidal ideation/self-harm in the cohort is 15.4%.

Ancestry assessment, plus matching of cases and controls on ancestry indicators

The PC results from the combined dataset of PrOMIS and 1000Genomes were informative regarding the ancestry of PrOMIS samples. Figure 1a shows how PrOMIS samples are distributed when plotted with 1000Genomes participants, as represented in scatterplots of pairs of the first three PCs. Most of the PrOMIS participants aligned well with the 1000Genomes PEL samples (Peruvians from Lima, Peru). A small number of PrOMIS samples were dispersed among other samples from populations in the Americas, as was a minority of the PEL samples. Figure 1b shows the top three PCs from PrOMIS participants only, with PTSD cases and controls denoted. Figure S1 shows the same plot but with depression and suicidal ideation/self-harm status denoted by color. The even distribution of symptom scores reflects good matching on ancestry indicators (PCs).

Fig. 1: Principal components plots of PrOMIS and 1000Genomes participants.

Each point is one person. a PrOMIS samples and 1000Genomes samples are plotted together, based upon principal components computed on combined samples. PrOMIS samples (magenta) cluster with PEL 1000Genomes samples (gold triangles). b PrOMIS samples only: depression severity scores are evenly distributed across principal components. Population abbreviations are those specified by the 1000 Genomes Consortium. *For major populations: AFR African ancestry, AMR Americas ancestry, EAS East Asian ancestry, EUR European ancestry, SAS South Asian ancestry [22]. **For subpopulations: ACB African Caribbean in Barbados, ASW African Ancestry in Southwest US, BEB Bengali in Bangladesh, CDX Chinese Dai in Xishuangbanna, CEU Utah residents with Northern and Western European ancestry, CHB Han Chinese in Beijing, CHS Southern Han Chinese, CLM Colombian in Medellin, ESN Esan in Nigeria, FIN Finnish in Finland, GBR British in England and Scotland, GIH Gujarati Indian in Houston, GWD Gambian in Western Division, IBS Iberian populations in Spain, ITU Indian Telugu in the UK, JPT Japanese in Tokyo, KHV Kinh in Ho Chi Minh City, LWK Luhya in Webuye, MSL Mende in Sierra Leone, MXL Mexican Ancestry in Los Angeles, PEL Peruvian in Lima, PJL Punjabi in Lahore, PUR Puerto Rican in Puerto Rico, STU Sri Lankan Tamil in the UK, TSI Toscani in Italy, YRI Yoruba in Ibadan.

PRS predictions and comparison to effect sizes for covariates

The PRS prediction results are shown in Fig. 2 and Table S1. The depression PRS constructed using SNPs with discovery GWAS pT < 1 yielded the best prediction result for depression score, explaining 0.62% of phenotypic variance (pT < 4 × 10^−6). The betas for all variables in the model are shown in Fig. 3a. The covariate with the highest risk (education of less than 6 years) has a beta of 1.66 on depression symptom score, whereas the depression PRS with pT < 1 has a beta of 0.45. PRSs based on more SNPs tended to yield better prediction results than PRSs based on fewer SNPs. Depression PRSs with pT < 0.2 and pT < 0.1 explained the highest phenotypic variance in PTSD and suicidal ideation/self-harm, respectively, both at around 0.3% in the PrOMIS samples (p = 0.001 and p = 0.01, respectively). PRSs for PTSD constructed using SNPs with discovery GWAS pT < 1 × 10^−2 and pT < 0.05 predicted PTSD at nominal significance, with variance explained at approximately 0.1% (p = 0.04). The covariate with the highest risk, IPV, has a beta of 4.72 on PTSD symptom score, whereas the PTSD PRS with pT < 0.05 has a beta of 0.54, as shown in Fig. 3b.

Fig. 2: Polygenic prediction results in the PrOMIS cohort for three phenotypes (depression severity score, PTSD severity score, and suicidal ideation/self-harm) using two psychiatric polygenic risk scores, controlling for covariates.

a Depression polygenic risk scores; b PTSD polygenic risk scores.

Fig. 3: Magnitude of effects of polygenic scores and covariates on psychiatric outcome variables.

Betas and 95% confidence intervals are given with reference to the outcome variables of a Depression symptom score, and b PTSD symptom score.

Sensitivity analyses

Three types of sensitivity analyses were conducted. The results remained similar to our primary analyses. First, Fig. S2 shows PRS results without covariates (note: PC covariates still included). The variance explained by PRSs was slightly higher, but the results are highly similar to the full model. Second, as shown in Figs. S3 and S4, with decreasing sample size the variance explained varied somewhat, but the pattern of results did not change. Third, alternative choices of PCs were used, and the results are shown in Figs. S5–S9. The phenotypic variance explained varied somewhat when different sets of PCs were included, but PRSs remained statistically significant predictors of phenotypes. In sum, the results of this analysis are robust to the inclusion/exclusion of clinical, demographic, and environmental covariates. Further, the results are also relatively similar even when the number of PCs (which correct for ancestry) varied from zero to 20.

GWAS results

Three sets of GWAS results (for depression, PTSD, and suicidal ideation/self-harm) are available as Supplementary Data (see GWAS results files for depression, PTSD, and suicidal ideation/self-harm). Manhattan plots of these analyses are shown in Fig. S10. Full GWAS results are provided in the service of increasing access to non-European-ancestry summary statistics and to facilitate future meta-analyses. As shown in Fig. S10, the results are null with respect to individual loci. This was the expected outcome for an appropriately cleaned but underpowered GWAS. For completeness, we also provided the polygenic prediction results for our binary phenotypes in Fig. S11; results are similar to those using continuous phenotypic outcomes.

Discussion

To our knowledge, this is the first report to provide psychiatric polygenic scoring and GWAS results from a large South American sample. Our results demonstrate that polygenic scores derived from primarily European-ancestry, non-admixed individuals from high-income countries (almost exclusively in Europe, the US, and Australia) are valid predictors of psychiatric phenotypes, even for individuals from considerably different environments, cultures, and ancestries. This means that genetic influences on depression, PTSD, and suicidal ideation/self-harm are at least partially shared across these populations [4, 7, 8]. The GWAS results made available here support the broader research goal of increasing genetic resource availability from non-European-ancestry populations and from low- and middle-income countries (LMICs). To the extent that polygenic risk scores and other genetics-based treatments become useful in the clinic, it is critical that such genetic data resources be made available for multiple major global populations, so that new interventions are not disproportionately helpful to European-ancestry individuals.

The present polygenic results are consistent with previous literature: depression and PTSD PRSs are predictive of psychiatric outcomes in populations of other ancestries [4, 8, 25]. However, compared to previous PRS prediction in European-ancestry samples, the variance explained in our study was lower. In Levey et al. [26], the depression-based PRSs explained up to 0.7% of the phenotypic variance for suicide attempt, which is higher than our results (0.3%). In Nievergelt et al. [18], the PTSD-based PRSs explained 0.15% of the variance in the PTSD phenotype in their samples, which is also higher than our results (0.12%). There are several potential explanations for these differences. First, as has been reported previously, polygenic scores tend to work best when the ancestry of the training cohort (typically European, given historical sample collection rates) matches that of the testing cohort. As genetic distance between populations increases, prediction performance is hypothesized to decrease [6], and this has been demonstrated empirically [7, 8].

Notably, PRS performance in Latino and Hispanic samples has been relatively comparable to performance in European-ancestry samples in prior studies conducted within the United States [8]. Thus, it is possible that additional factors contributed to the somewhat lower predictive performance of PRSs in this study. Participants in this study are from a disadvantaged environment compared to their peers in developed countries. These substantial environmental differences could alter the relative importance of genetic and environmental factors in the development of depression, PTSD, and suicidal ideation/self-harm. Second, heterogeneity in the disease phenotype across populations and potential differences in measurement could also decrease the maximum potential prediction. Third, compared to other South American populations, Peruvian individuals have been found to have a higher proportion of Native American ancestry [27, 28]. Given all of these differences, it is noteworthy that our results still show that PRSs are valid predictors in a sample that differs in ancestry, culture, and severity of environmental risks. Finally, it should be noted that there is discussion within the research community regarding how polygenic scoring results should be reported (e.g. as area under the receiver operating characteristic curve, AUROC) and which terminology should be used (e.g. whether or not the term “prediction” is appropriate) [29].

The depression PRSs explained more phenotypic variance in PTSD than the PTSD PRSs did. This is almost certainly due to two factors. First, the best-available depression GWAS is considerably better powered than the best-available PTSD GWAS [18, 30]. Second, given that depression and PTSD share genetic influences, the better-powered depression GWAS affords better prediction of PTSD than the currently available PTSD GWAS. As PTSD sample sizes increase, it is reasonable to expect improvement in predictive performance of PTSD PRSs.

Regarding the biggest potential threat to the validity of these results (population stratification), extra steps were taken to ensure that the results were not attributable to confounding of ancestry and phenotypic status. Sensitivity analyses used more ancestrally homogeneous subsamples and alternative choices of PCs to test the robustness of the main findings. Even with these extra steps (which are infrequently employed in polygenic scoring studies), the results were nearly identical to the main findings. Regarding future directions, researchers can examine female- and male-specific polygenic risk scores for PTSD when power increases in discovery GWAS for PTSD. This is important given evidence that there may be sex differences in the genetics of PTSD [30, 31].

In sum, this study provides an important extension to global genetics research focused on mental health. Clinically, our participants are females from a low-socioeconomic status community in Peru. This is a population at higher risk of mental disorders compared to their peers from high-income countries, likely due to high exposure to adverse life events. The women in our sample have been highly exposed to violence both during childhood and adulthood, and—as in higher-income countries—these are potent risk factors for psychiatric phenotypes. Consistent with what has been found in high-income countries, these results suggest that vulnerability to depression, PTSD, and suicidal ideation/self-harm is also partially influenced by genetics, and moreover, that polygenic liability to psychiatric phenotypes is at least partially shared across populations around the globe. As research continues in this and other populations from LMICs, the ability to include genetic information will prove valuable as scientists build upon the broader body of literature, which has historically been more focused on European-ancestry populations from high-income countries. Thus, the sharing of GWAS results from the PrOMIS samples is an important contribution because these results can be included in future meta-analyses and other genetic analyses, which are more tailored to South American and other Latino and Hispanic populations.

Funding and disclosure

LED and HS have been funded by startup funds from Stanford and a pilot grant to LED from the Stanford Center for Clinical and Translation Research and Education (UL1 TR001085, PI Greenberg). LED has also been funded by Cohen Veterans Bioscience (CVB), and she is part of the CVB Working Group for PTSD Adaptive Platform Trial. BG has been funded by the NIH (R01-HD-059835, PI Williams) and CVB. HH has been funded by the NIH (NIH K01DK114379 and NIH R21AI139012), the Zhengxu and Ying He Foundation, and the Stanley Center for Psychiatric Research. MBR received funds from WPA Congress Mexico City 2018, Guayaquil CEPAM 2019, Asunción X CONGRESO LATINOAMERICANO DE LA FLAPB 2018, Guayaquil 2019 (Bago), and Lancet Psychiatry, London (commission on Violence against women) 2019. SS declares no potential conflict of interest.