Abstract
Age at menopause (AOM) has a substantial impact on fertility and disease risk. While many loci with variants that associate with AOM have been identified through genome-wide association studies (GWAS) under an additive model, other genetic models are rarely considered1. Here, through GWAS meta-analysis under the recessive model of 174,329 postmenopausal women from Iceland, Denmark, the United Kingdom (UK; UK Biobank) and Norway, we study low-frequency variants with a large effect on AOM. We discovered that women homozygous for the stop-gain variant rs117316434(A) in CCDC201 (p.(Arg162Ter), minor allele frequency ~1%) reached menopause 9 years earlier than other women (P = 1.3 × 10−15). The genotype is present in one in 10,000 northern European women and leads to primary ovarian insufficiency in close to half of them. Consequently, homozygotes have fewer children, and the age at last childbirth is 5 years earlier (P = 3.8 × 10−5). The CCDC201 gene was only annotated as protein-coding in humans in 2022 and is highly expressed in oocytes. Homozygosity for CCDC201 loss-of-function has a substantial impact on female reproductive health, and homozygotes would benefit from reproductive counseling and treatment for symptoms of early menopause.
Main
Menopause is caused by the depletion of the primordial follicle pool. There is broad variation in the age at menopause (AOM), and early menopause (EM) impacts health, quality of life (https://www.menopausemandate.com/) and fertility potential2,3,4. It is estimated that natural fertility ends on average 10 years before menopause3,5. At the extreme end of the AOM distribution is primary ovarian insufficiency (POI) with cessation of menses before the age of 40 years, which occurs in 1–4% of women4. EM and POI are well-known causes of infertility, which is increasingly relevant as women in many populations are choosing to have children later in life3.
Through genome-wide association studies (GWAS), we and others have reported associations of rare and low-frequency variants with variation in AOM, mostly under an additive model6,7,8. Rare variants in several genes have also been reported to cause Mendelian forms of POI9 although many are only reported in a small number of cases or in single families4,10. Despite advances in understanding the genetic causes of EM and POI, genetic screening has mainly been focused on Turner syndrome, which has a prevalence of 1 in 2,000, and the FMR1 premutation, found in 1 in 8,000 women4,10.
We performed a GWAS meta-analysis for AOM (natural menopause not affected by surgical procedures, such as hysterectomy and/or oophorectomy) under the recessive model as well as the additive one on 174,329 postmenopausal women from Iceland, the United Kingdom (UK), Denmark and Norway (nIceland = 27,281, nUK = 137,906, nDenmark = 5,978 and nNorway = 3,161; Supplementary Tables 1 and 2). We tested 39.3 million sequence variants for associations with AOM (Fig. 1 and Supplementary Figs. 1 and 2).
Homozygosity (n = 27 women) for the low-frequency stop-gain variant p.(Arg162Ter) (rs117316434(A), chr7: 45863165; minor allele frequency (MAF) ~1%) in CCDC201 is associated with an AOM 9 years earlier than in heterozygotes and noncarriers (recessive effect = −1.59 s.d. (95% confidence interval (CI): −1.98, −1.20), recessive P = 1.3 × 10−15; Figs. 1 and 2 and Table 1). The effect of the variant did not differ between the four groups (Phet = 0.28; Table 1). The association was genome-wide significant in the UK, the largest of the four groups (P = 3.6 × 10−13), and was also significant in the remaining three sample sets combined (P = 2.4 × 10−4; Table 1). The effect of p.(Arg162Ter) in CCDC201 on AOM deviates from the additive model and is limited to homozygotes (Supplementary Figs. 3 and 4). We did not detect an association with AOM under the additive model (additive effect = 0.029 s.d. (95% CI: −0.0094, 0.066), P = 0.16). We did not find a significant association of the p.(Arg162Ter) variant with any case–control or quantitative traits under the additive model.
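As a consistency check on the reported statistics, the standard error and P value can be recovered from the effect estimate and its 95% CI. The sketch below assumes a normal (Wald) approximation and a symmetric CI; the function names are illustrative and are not part of the study's analysis pipeline.

```python
import math

def se_from_ci(lower, upper, z_crit=1.959964):
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2.0 * z_crit)

def two_sided_p(effect, se):
    """Two-sided P value for a Wald z statistic under a normal approximation."""
    z = abs(effect / se)
    return math.erfc(z / math.sqrt(2.0))

# Recessive effect of p.(Arg162Ter) on AOM, in s.d. units, as reported in the text.
effect, lower, upper = -1.59, -1.98, -1.20
se = se_from_ci(lower, upper)
p = two_sided_p(effect, se)
print(f"s.e. = {se:.3f}, z = {effect / se:.2f}, P = {p:.1e}")
```

Under these assumptions, the implied P value is on the order of 10−15, in line with the reported recessive P value.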
AOM data can be used to define EM (AOM before age 45 years) and POI (AOM before age 40 years) as case–control traits11. As expected, homozygotes are at high risk of EM (odds ratio (OR) = 35.5 (95% CI: 17.5, 71.6), P = 5.4 × 10−23), with 93% of homozygotes entering menopause before the age of 45 compared to 11% of heterozygotes and noncarriers. Homozygotes are also at high risk of POI (OR = 27.3 (95% CI: 9.38, 82.6), P = 2.2 × 10−9; Table 2), with 33% of homozygotes (9 of 27 women) entering menopause before the age of 40 compared with 3% of heterozygotes and noncarriers. It has been observed that there is a tendency to report values ending in 0 or 5 when women are asked to recall their AOM12, and we observed a similar trend (Supplementary Fig. 5). Assuming that half of women who report their AOM at exactly 40 are truly below that age, we estimated the probability of developing POI among homozygotes to be 46% ((9 + 3.5)/27) compared to 4% among heterozygotes and noncarriers ((4,678 + 1,728.5)/174,302). Based on this estimate of POI derived from AOM, 0.19% (or 1 of 513) of all POI cases are caused by p.(Arg162Ter) homozygosity. This is in line with POI based on the International Classification of Diseases, Tenth Revision (ICD-10) diagnostic code E283 in the UK Biobank 500k whole-genome sequenced (WGS) set, where one homozygote is observed among the 571 cases (0.17%).
We tested the effect of p.(Arg162Ter) homozygosity on 34 reproductive, anthropometric and hormonal traits in women (requiring P < 0.05/34 = 0.0015 to account for multiple testing; Table 3, Supplementary Tables 2–7 and Supplementary Data 1). p.(Arg162Ter) homozygous women had one fewer child on average than other women (P = 0.00011). Also, the 17 childbearing p.(Arg162Ter) homozygous women had their last childbirth 5 years earlier than other women (recessive effect = −1.17 s.d. (95% CI: −1.72, −0.61), P = 3.8 × 10−5).
A substantial fraction (51%) of noncarrier and heterozygous mothers give birth after the age of 30 years, while very few homozygous mothers have children after the age of 30 years (16%; Fig. 3a,b). Additionally, homozygous women are more likely to have no children or only one child than noncarrier and heterozygous women (Fig. 3c–f). Consistently, homozygous women are at greater risk of being diagnosed with infertility in electronic health records (ICD-10 code N97; OR = 7.3, P = 0.00019). While homozygotes exhibit a trend toward earlier childbearing, the proportion of women with children before 25 is not significantly different between homozygotes and noncarriers (OR = 1.23, P = 0.066; Fig. 3c and Supplementary Table 8). This suggests that earlier childbearing in homozygotes is not due to a higher frequency of very early births. Instead, it may be linked to infertility, potentially requiring conception attempts at a younger age for homozygotes.
Notably, p.(Arg162Ter) homozygosity did not associate significantly with age at menarche, anthropometric traits, sex hormones, twinning or recombination phenotypes13 (Supplementary Tables 5–7), nor did it associate with the reproductive profile or infertility of males, indicating a female-specific effect (Supplementary Table 3). Under the additive model, the effect of p.(Arg162Ter) on twinning is nominally significant (OR = 1.46, P = 0.017), but does not meet our threshold for statistical significance after accounting for multiple testing (Supplementary Table 5).
In total, 290 variants have previously been reported in ref. 14 to associate with the age of menopause under the additive model and 44 under the recessive model, for which we provide robust replication (reported under the additive model excluding UK Biobank data: 281/290 = 97%; reported under the recessive model: 43/44 = 98%; Supplementary Data 2 and 3). At the CCDC201 locus, we note that a study discussed in ref. 14 reported a common variant (rs1826838; MAF = 38%) with a small effect on AOM under the additive model (effect in current meta-analysis = 0.025 s.d., P = 7.8 × 10−11). The variant rs1826838 is a 3′-UTR variant in the CCDC201 gene located 2,814 bp downstream of p.(Arg162Ter), and the variants represent two independent signals at the locus (r2 = 0.0065; Supplementary Table 9 and Supplementary Fig. 6). We note that the effect of the rare p.(Arg162Ter) homozygous genotype is 60-fold greater than that of heterozygotes for the common variant rs1826838.
For the current study, we provide summary statistics for the GWAS meta-analysis of the age of menopause for all tested variants under the recessive and additive models (Data availability).
The p.(Arg162Ter) variant is well imputed (imputation info >0.96) in the three population sets (Iceland, Denmark and Norway) based on the imputation of sequence variants detected through whole-genome sequencing. The UK Biobank analysis is based on a set of 500k WGS individuals. There was no discordance between homozygous genotypes of p.(Arg162Ter) based on the set of 500k individuals with whole-genome sequence and the 500k set based on imputation from 150k sequenced individuals15.
The allele frequency (AF) of the p.(Arg162Ter) stop-gain variant in CCDC201 ranges from 0.74% to 1.15% in the four European sample sets, and 1 in 10,000 north Europeans are homozygous (Table 1). Within the UK Biobank set of 500k WGS individuals, 1000 Genomes and gnomAD data, p.(Arg162Ter) is very rare among those of Asian and African ancestry (Supplementary Fig. 7 and Supplementary Table 10). There was a north-to-south gradient among individuals of European descent in the UK Biobank, with Scandinavians showing the highest AF (AF = 1.25%, which means 1 in 6,400 individuals is expected to be homozygous) and the lowest frequency among southeast Europeans (AF = 0.11%, which means that 1 in 826,000 individuals is expected to be homozygous; Supplementary Fig. 7). To our knowledge, there are no reports of selection at this locus in European populations16,17,18,19, and in our data, no significant association with pigmentation traits is observed that might be indicative of selection (Supplementary Table 11).
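The expected homozygote frequencies quoted above follow from squaring the allele frequency. A minimal sketch, assuming Hardy-Weinberg equilibrium (random mating, no selection against the genotype); the function names are illustrative:

```python
def homozygote_rate(af):
    """Expected frequency of homozygotes under Hardy-Weinberg equilibrium."""
    return af ** 2

def one_in(af):
    """Express the expected homozygote frequency as '1 in N' individuals."""
    return 1.0 / homozygote_rate(af)

# Allele frequencies taken from the text.
groups = [
    ("northern European (~1%)", 0.01),
    ("Scandinavian", 0.0125),
    ("southeast European", 0.0011),
]
for label, af in groups:
    print(f"{label}: AF = {af:.2%}, about 1 in {one_in(af):,.0f} expected homozygous")
```

The quadratic dependence explains why a roughly 11-fold difference in AF between Scandinavians and southeast Europeans translates into a more than 100-fold difference in expected homozygote frequency.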
CCDC201 encodes the coiled-coil domain-containing protein 201, a 187-amino acid protein. The stop-gain variant p.(Arg162Ter) is located in the third and final exon of CCDC201 and is predicted to result in a protein shortened by 25 amino acids (Supplementary Fig. 8). We note that p.(Arg162Ter) is located in the last exon of the CCDC201 gene and is therefore flagged as a low-confidence pLOF by the LOFTEE algorithm20. Possible loss-of-function mechanisms are truncation or instability of the truncated protein, but this awaits functional validation. The CCDC201 protein sequence is conserved across various mammals, suggesting that its function can be studied in other species. Interestingly, the 25 amino acids that could be lost by the p.(Arg162Ter) variant appear to be more conserved than the rest of the protein (Supplementary Figs. 9 and 10).
Interestingly, CCDC201 was one of 33 new protein-coding genes added to the National Center for Biotechnology Information (NCBI) Homo sapiens Annotation Release in November 2022, and it was previously not even annotated as a lncRNA or pseudogene (update 109.20211119, RefSeq release 210). The coding sequence for CCDC201 was previously missed due to a lack of spliced cDNA or expressed sequence tag evidence because its expression is restricted to female tissues21. Based on RNA sequencing (RNA-seq) data from Genotype-Tissue Expression (GTEx)22 and the human protein atlas23, CCDC201 shows the strongest expression in female tissues (ovary, breast and placenta) but is also present in other tissues such as testis. Based on single-cell RNA-seq data from the human protein atlas23, CCDC201 is one of 98 genes that show greater expression in oocytes than in other tissues.
Based on data from the GTEx project, among women 20–79 years of age, CCDC201 was most highly expressed in ovarian tissue among women aged 20–49 (premenopausal age), and CCDC201 gene expression is almost nonexistent in women over 50 years old (P = 0.00055, Wilcoxon rank-sum test; Supplementary Fig. 11). We did not observe a difference in the expression of CCDC201 in testes between age groups, which showed much lower expression than the ovaries (Supplementary Fig. 12).
Consistent with human tissue expression data, in mouse ENCODE RNA-seq data, expression of Ccdc201, the mouse homolog of CCDC201, is specific to the ovary and placenta21. Furthermore, in mice, Ccdc201 has been identified as a target of the oocyte-specific transcription factor FIGLA, which is known to control early folliculogenesis without affecting male germ cell differentiation24,25,26,27. Also, there is conflicting evidence that variants in FIGLA may cause autosomal dominant POI and female infertility9,28 (OMIM: 608697). Given that CCDC201 is a downstream target of FIGLA and shows oocyte-specific expression, we speculate that it may have a role in primordial follicle development and/or oocyte survival.
Performing a GWAS of AOM under the recessive model in 174,329 postmenopausal women from four European countries, we discovered that homozygosity for the low-frequency p.(Arg162Ter) variant in CCDC201 causes menopause to occur 9 years earlier than in noncarriers or heterozygotes and leads to POI in close to half of homozygotes. Despite the large effect of homozygosity for p.(Arg162Ter) on AOM, the association was not detected in large GWASs of that trait using an additive model29,30,31. In addition, annotating the variant as a coding loss-of-function variant has only been possible since 2022 when CCDC201 was annotated as a protein-coding gene in humans. In contrast with WGS data used in the current study, the CCDC201 region was not covered by capture libraries used for exome sequencing of UKB participants because this gene was uncharacterized when libraries were designed32. Around 1 in 10,000 northern European women are p.(Arg162Ter) homozygotes. This genotypic frequency is comparable to the frequency of the premutation of the FMR1 gene, which is the most common known genetic cause of POI (1 in 8,000 women)4. However, the penetrance of POI among carriers of the FMR1 gene premutation is approximately 20%, which is less than that of p.(Arg162Ter) homozygotes.
We observe a nominal association of p.(Arg162Ter) with earlier childbirth (before age 25) and an increased likelihood of twinning. Further work is needed to determine if these associations are real or chance observations due to multiple testing. If real, these observations suggest that p.(Arg162Ter) may increase oocyte activation, leading to earlier childbirth and twinning, but also hastening oocyte depletion and leading to EM.
Identification of p.(Arg162Ter) homozygosity in women presents an opportunity to take action in line with a shortened reproductive lifespan. This would involve referring homozygotes to a fertility specialist to plan their reproductive life and treat symptoms of EM, as is done for other genetic causes of POI4.
Methods
Study population
AOM was derived for individuals who were considered to have undergone natural menopause not affected by surgical procedures, such as hysterectomy and/or oophorectomy.
In Iceland, we used data on AOM obtained from the Icelandic Cancer Society’s Cancer Registry (n = 9,794) and from questionnaires from various genetic programs at deCODE genetics (n = 21,390), of which the majority was gathered through deCODE’s osteoporosis project and the deCODE Health study, which had also been genotyped. The Cancer Society’s data were collected from a questionnaire in the years 1964–1994, and deCODE genetics data from 1999 to 2022. All Icelandic data were collected through studies approved by the National Bioethics Committee (approvals VSN-15-198 and VSN-15-214) following review by the Icelandic Data Protection Authority. Participants donated blood or buccal samples after signing a broad informed consent allowing the use of their samples and data in all projects at deCODE genetics approved by the NBC. All personal identifiers of the participants’ data were encrypted by a third-party system, approved and monitored by the Icelandic Data Protection Authority.
The UK Biobank study33 is a large prospective cohort study of ~500,000 individuals in the age range of 40–69 years from across the UK. AOM (Data Field 3581) was collected from a touchscreen questionnaire at the UK Biobank assessment centers from 140,688 genotyped females who indicated that their periods had stopped (Data Field 2724). Only British individuals of European ancestry were included in the study. The UK Biobank data were obtained under application 56270. All phenotype and genotype data were collected following informed consent obtained from all participants. The North West Research Ethics Committee reviewed and approved the UK Biobank’s scientific protocol and operational procedures (REC reference: 06/MRE08/65).
Data on menopause status from Denmark were provided by the Danish Blood Donor Study (DBDS)34. Around 51% of participants were females with an age span at inclusion 18–70 years. The data were obtained from a paper questionnaire (v1) on self-reported health status and lifestyle sent to all participants in the DBDS (n = 110,000) from 2010 to mid-year 2015. Around 85,000 participants responded to it. In the end, AOM from 8,037 chip-typed females was used in the analysis. All participants signed an informed consent statement, and the DBDS genetic study was approved by the Danish National Committee on Health Research Ethics (NVK-1700407) and by the Danish Capital Region Data Protection Office (P-2019-99).
Data on female infertility from Denmark were provided by the Copenhagen Hospital Biobank (CHB) Reproduction Study, which involves a targeted selection of patients with reproductive phenotypes from the CHB, a biobank based on patient blood samples drawn in Danish hospitals35.
The AOM data from Norway were provided by the Hordaland Health Studies (HUSK). The HUSK surveys are a collaborative project between the University of Bergen, the Norwegian Health Screening Service (SHUS) and the Municipal Health Service in Hordaland aimed at gathering information so that disease ultimately can be prevented36. In the first phase of the studies (HUSK1), in 1992–1993, around 18,000 residents of Hordaland County born in 1925–1952 participated in the study. In 1997–1999 (HUSK2), previous participants born in 1950–1951 and 1925–1927 were re-invited, in addition to all residents in Hordaland County born in 1953–1957. In total, approximately 36,000 individuals participated in the study (18,000 in 1992–1993 and 26,000 in 1997–1999), with some participating at both times. Age at last menstruation (proxy for menopause) was collected from questionnaires sent to participants both in HUSK1 and HUSK2. All participants signed an informed consent statement, and the HUSK study was approved by the Regional Committee for Medical Research Ethics Western Norway (REK Vest 10279 (2018/915)). In the end, AOM from 3,161 genotyped females was used in the analysis.
For all strata, in the case of repeat measurements, the mean age of menopause or the mean age at the last period was used to represent each individual’s AOM.
Rounding tendency in reported age of menopause
It has been observed that when women are asked to recall their AOM, they tend to report values ending in 0 or 5 (ref. 12). Thus, we need to take into account the possibility that some women who reported menopause at the age of 40 years were not counted as POI cases because of this tendency, which could lead to an underestimation of the risk of POI in our study. Of the 27 homozygotes for p.(Arg162Ter) with AOM information, nine reported AOM before the age of 40, while seven reported experiencing menopause exactly at the age of 40. Assuming an equal probability of rounding reported AOM up or down to 40, we estimated the penetrance of POI among homozygotes as 46% ((9 + 3.5)/27). Likewise, for noncarriers and heterozygotes, the estimated penetrance of POI is 3.7% ((4,678 + 1,728.5)/174,302).
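The rounding adjustment can be written out as a short calculation. In this sketch, the count of noncarriers and heterozygotes reporting AOM at exactly 40 (3,457) is inferred from the 1,728.5 term in the formula above and is therefore an assumption; the function name is illustrative.

```python
def adjusted_penetrance(n_before_40, n_exactly_40, n_total, share_below=0.5):
    """POI penetrance, counting a share of women who reported AOM = 40 as cases."""
    return (n_before_40 + share_below * n_exactly_40) / n_total

# Homozygotes: 9 reported AOM < 40 and 7 reported AOM = 40, out of 27.
homozygotes = adjusted_penetrance(9, 7, 27)

# Noncarriers and heterozygotes: 4,678 reported AOM < 40 out of 174,302.
# 3,457 is inferred as 2 x 1,728.5 (assumption, not stated directly in the text).
others = adjusted_penetrance(4_678, 3_457, 174_302)

print(f"homozygotes: {homozygotes:.1%}")
print(f"noncarriers and heterozygotes: {others:.1%}")
```

Setting `share_below` to 0 reproduces the unadjusted estimate (9 of 27, or 33%, for homozygotes), which gives a sense of how sensitive the penetrance estimate is to the rounding assumption.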
Estimating the proportion of POI explained by p.(Arg162Ter) homozygosity
Using AOM data to define POI as AOM before the age of 40 years, we observe nine homozygotes among the 4,687 females with AOM before the age of 40 years. Thus, we estimate that the proportion of all POI cases caused by p.(Arg162Ter) homozygosity is around 0.19% (that is, 1 of 521). Similarly, taking into account rounding bias, the proportion of all POI cases estimated to be caused by homozygosity is also 0.19% (that is, 1 of 513, or (9 + 3.5)/(4,687 + 1,728.5)).
In the UK Biobank 500k WGS set, one homozygote was observed among the 571 females with the ICD-10 diagnostic code E283, indicative of POI. Thus, the frequency of p.(Arg162Ter) homozygosity among POI cases is 1 of 571.
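The three estimates of the share of POI cases attributable to homozygosity can be compared side by side. This is a plain arithmetic sketch using only the counts given in this section; the helper name is illustrative.

```python
def share_of_cases(homozygous_cases, all_cases):
    """Fraction of POI cases that are p.(Arg162Ter) homozygotes."""
    return homozygous_cases / all_cases

estimates = {
    "AOM < 40, unadjusted": share_of_cases(9, 4_687),
    "AOM < 40, rounding-adjusted": share_of_cases(9 + 3.5, 4_687 + 1_728.5),
    "ICD-10 E283 in UK Biobank WGS": share_of_cases(1, 571),
}
for label, share in estimates.items():
    print(f"{label}: {share:.2%} (about 1 of {1 / share:.0f})")
```

The AOM-derived and diagnosis-based estimates rest on different case definitions, so their agreement is a consistency check rather than an independent replication.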
Genotyping
In Iceland, 34,453,001 sequence variants identified in WGS data from 63,460 Icelanders participating in various disease projects at deCODE genetics were tested. The samples were sequenced using standard TruSeq (Illumina) methodology to an average genome-wide coverage of 40×. SNPs and insertions and deletions (InDels) were identified, and their genotypes were called using joint calling with Graphtyper37. Variant Effect Predictor from RefSeq was used to annotate the effects of sequence variants on protein-coding genes. We chip-typed 173,025 Icelanders (around 50% of the population) using Illumina SNP arrays, and the chip-typed individuals were long-range phased38. The variants identified in the whole-genome sequencing of Icelanders were imputed into the chip-typed individuals. In addition, based on Icelandic genealogy, the genotype probabilities for 292,636 untyped close relatives of chip-typed individuals were calculated39,40.
From the UK Biobank, we used data from around 428k WGS individuals who were of British/Irish ancestry. The WGS was performed using Illumina standard TruSeq methodology (mean depth of 32×) in a collaborative work between deCODE genetics in Iceland and The Wellcome Sanger Institute in the UK. Sequence variants from the WGS were identified and called jointly using Graphtyper37. Phasing from previous chip-typing of the same sample was used as the basis to assign haplotypes15.
From Denmark and Norway, we chip-typed 464,016 and 254,304 individuals, respectively. The samples were chip-typed by deCODE genetics using both Omni microarrays (Illumina) and Global Screening Array (Illumina). Graphtyper was used to identify SNPs and InDels and jointly call their genotypes37. Using the identified variants, the samples were then phased (using SHAPEIT4 (ref. 41)) along with an international set of 1,041,174 genotyped individuals from 49 countries (including Denmark and Norway), chip-typed at deCODE genetics. For variant imputation, we compiled an international reference panel from 50,839 WGS individuals from 14 countries, including 10,985 from Denmark and 3,467 from Norway. The identified variants from WGS were subsequently imputed into the chip-typed individuals.
Association analysis
We performed a meta-analysis of GWAS on 180,564 females from Iceland, the UK, Denmark and Norway with self-reported AOM or age at last menstruation. We tested a total of 39,281,741 sequence variants (imputation info >0.80 and MAFIce > 0.02%, MAFUK > 0.01%, MAFDen > 0.1%, MAFNor > 0.2%), identified in the WGS, for association with AOM. The quantitative traits were transformed to a standard normal distribution. For the quantitative traits, the year of birth was included as a covariate in the analysis, with additional adjustment for the first 20 principal components in the UK to account for population stratification. For each population, the quantitative traits were tested using a linear mixed model implemented in BOLT-LMM42. For the meta-analysis, we used a fixed-effects inverse variance method based on effect estimates and s.e. from each population43. For each study, we used linkage disequilibrium (LD) score regression to account for distribution inflation in the dataset due to cryptic relatedness and population stratification44. Using a set of about 1.1 million sequence variants with available LD scores, we regressed the χ2 statistics from our GWAS scan against the LD score and used the intercept as a correction factor. The estimated correction factor for AOM, based on LD score regression, was 0.97 for the recessive model in the Icelandic sample, 1.01 in the UK, 1.01 in Denmark and 1.02 in Norway.
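The fixed-effects inverse variance method weights each population's effect estimate by the inverse of its squared s.e. The sketch below is a generic illustration of that method, not the pipeline used in the study; the per-cohort numbers are hypothetical placeholders and are not values from Table 1.

```python
import math

def fixed_effects_meta(estimates):
    """Inverse-variance fixed-effects meta-analysis.

    estimates: list of (beta, se) pairs, one per population.
    Returns the pooled effect, its s.e. and Cochran's Q statistic.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q measures heterogeneity between populations.
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    return pooled, pooled_se, q

# Hypothetical per-cohort estimates (beta in s.d. units, s.e.), for illustration only.
cohorts = [(-1.7, 0.25), (-1.3, 0.45), (-1.5, 0.90), (-1.2, 1.10)]
pooled, pooled_se, q = fixed_effects_meta(cohorts)
print(f"pooled effect = {pooled:.2f} s.d., s.e. = {pooled_se:.2f}, Q = {q:.2f}")
```

Under the null hypothesis of no heterogeneity, Q is compared with a χ2 distribution with k − 1 degrees of freedom for k populations, which is one common way to obtain a heterogeneity P value such as Phet.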
We report the effect estimates for POI and EM phenotypes against population controls and as a categorical trait among women who reported AOM (AOM < 40 versus AOM ≥ 40; AOM < 45 versus AOM ≥ 45; Table 2). The effect estimates from the two methods do not differ significantly, and we have reached the same conclusion (Phet > 0.25).
Significance thresholds
We applied genome-wide significance thresholds corrected for multiple testing using an adjusted Bonferroni procedure weighted for variant classes and predicted functional impact. With 39,281,741 sequence variants being tested in the meta-analysis, the weights given in ref. 45 were rescaled to control the family-wise error rate. The adjusted significance thresholds are 2.0 × 10−7 for variants with high impact (n = 9,910), 4.0 × 10−8 for variants with moderate impact (n = 202,465), 3.7 × 10−9 for low-impact variants (n = 3,244,032), 1.8 × 10−9 for other variants in DNase I hypersensitivity sites (n = 5,001,568) and 6.1 × 10−10 for all other variants (n = 30,823,766).
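A weighted Bonferroni procedure controls the family-wise error rate when the per-class thresholds, multiplied by the number of variants in each class, sum to at most the target level. The sketch below checks this for the thresholds listed above, assuming a target of 0.05 (the target level is an assumption, as it is not stated explicitly here).

```python
# (number of variants, significance threshold) per annotation class, from the text.
classes = {
    "high impact": (9_910, 2.0e-7),
    "moderate impact": (202_465, 4.0e-8),
    "low impact": (3_244_032, 3.7e-9),
    "other, DNase I hypersensitivity site": (5_001_568, 1.8e-9),
    "all other": (30_823_766, 6.1e-10),
}

total_variants = sum(n for n, _ in classes.values())
# Union bound on the family-wise error rate across all classes.
fwer_bound = sum(n * threshold for n, threshold in classes.values())

print(f"variants tested: {total_variants:,}")
print(f"family-wise error bound: {fwer_bound:.4f}")
for label, (n, threshold) in classes.items():
    print(f"{label}: share of error budget = {n * threshold / fwer_bound:.1%}")
```

The class counts add up to the 39,281,741 variants tested, and the error budget is spread unevenly so that high-impact variants face a far less stringent threshold than unannotated ones.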
Variant frequency map
UK Biobank participants were first grouped by birth country. We then defined regional ancestry groupings with the aim that the groups be representative of the region’s current population, be homogeneous by genetic ancestry and have at least 200 individuals (for accurate estimation of variant frequencies).
We assessed the current genetic ancestry profiles of regions by comparing our ancestry analyses15 to in-house and published results of human genome diversity datasets like Human Origins46 and HGDP47, comparing genetic ancestry results across neighboring countries, surveying country demographics through resources like The World Factbook48 and examining participants’ self-reported ethnicity information and UK census data49 to determine the extent to which individuals who migrated to the UK were representative of the source countries’ current demographics.
In some cases, we split off ancestry-based groupings representing distinct populations or unrepresentative migrant communities (for example, South Asian ancestry born in Africa and West Asia) to achieve homogeneous birthplace-based groupings. Groups depicted on the maps in Supplementary Fig. 7 are those best representing the current demographic majority. If countries had fewer than 200 participant birthplaces, we merged them with neighboring countries with similar assessed ancestry profiles. Map geometries were obtained via R package maps and manipulated with sf50. The maps in Supplementary Fig. 7 are sourced from Natural Earth (https://www.naturalearthdata.com/about/terms-of-use/).
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Sequence variants tested for association have been deposited in the European Variation Archive under accession PRJEB15197 (https://www.ebi.ac.uk/ena/browser/view/PRJEB15197). This research has been conducted using the UK Biobank Resource under application 56270. Data from the UK Biobank are available by application to all bona fide researchers in the public interest at https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access. Additional information about registration for access to the data is available at www.ukbiobank.ac.uk/register-apply/. Data access for approved applications requires a data transfer agreement between the researcher’s institution and UK Biobank, the terms of which are available on the UK Biobank website (www.ukbiobank.ac.uk/media/ezrderzw/applicant-mta.pdf). The genome-wide association meta-analysis summary data for the age of menopause will be made available at http://www.decode.com/summarydata.
References
Snaebjarnarson, A. S. et al. Complex effects of sequence variants on lipid levels and coronary artery disease. Cell 186, 4085–4099 (2023).
Faubion, S. S., Kuhle, C. L., Shuster, L. T. & Rocca, W. A. Long-term health consequences of premature or early menopause and considerations for management. Climacteric 18, 483–491 (2015).
Te Velde, E. R. & Pearson, P. L. The variability of female reproductive ageing. Hum. Reprod. Update 8, 141–154 (2002).
Stuenkel, C. A. & Gompel, A. Primary ovarian insufficiency. N. Engl. J. Med. 388, 154–163 (2023).
Lambalk, C. B., van Disseldorp, J., de Koning, C. H. & Broekmans, F. J. Testing ovarian reserve to predict age at menopause. Maturitas 63, 280–291 (2009).
Stolk, L. et al. Meta-analyses identify 13 loci associated with age at menopause and highlight DNA repair and immune pathways. Nat. Genet. 44, 260–268 (2012).
Perry, J. R. B. et al. A genome-wide association study of early menopause and the combined impact of identified variants. Hum. Mol. Genet. 22, 1465–1472 (2013).
Zhang, L. et al. Joint genome-wide association analyses identified 49 novel loci for age at natural menopause. J. Clin. Endocrinol. Metab. 106, 2574–2591 (2021).
Zhao, H. et al. Transcription factor FIGLA is mutated in patients with premature ovarian failure. Am. J. Hum. Genet. 82, 1342–1348 (2008).
Huhtaniemi, I. et al. Advances in the molecular pathophysiology, genetics, and treatment of primary ovarian insufficiency. Trends Endocrinol. Metab. 29, 400–419 (2018).
Xu, C., Ruan, X. & Mueck, A. O. Progress in genome-wide association studies of age at natural menopause. Reprod. Biomed. Online 46, 607–622 (2023).
Hahn, R. A., Eaker, E. & Rolka, H. Reliability of reported age at menopause. Am. J. Epidemiol. 146, 771–775 (1997).
Halldorsson, B. V. et al. Characterizing mutagenic effects of recombination through a sequence-level genetic map. Science 363, eaau1043 (2019).
Ruth, K. S. et al. Genetic insights into biological mechanisms governing human ovarian ageing. Nature 596, 393–397 (2021).
Halldorsson, B. V. et al. The sequences of 150,119 genomes in the UK Biobank. Nature 607, 732–740 (2022).
Temple, S. D., Waples, R. K. & Browning, S. R. Modeling recent positive selection in Americans of European ancestry. Preprint at bioRxiv https://doi.org/10.1101/2023.11.13.566947 (2023).
Irving-Pease, E. K. et al. The selection landscape and genetic legacy of ancient Eurasians. Nature 625, 312–320 (2024).
Le, M. K. et al. 1,000 ancient genomes uncover 10,000 years of natural selection in Europe. Preprint at bioRxiv https://doi.org/10.1101/2022.08.24.505188 (2022).
Murga-Moreno, J., Coronado-Zamora, M., Bodelón, A., Barbadilla, A. & Casillas, S. PopHumanScan: the online catalog of human genome adaptation. Nucleic Acids Res. 47, D1080–D1089 (2019).
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
Mudge, J. M. et al. Discovery of high-confidence human protein-coding genes and exons by whole-genome PhyloCSF helps elucidate 118 GWAS loci. Genome Res. 29, 2073–2087 (2019).
GTEx Consortium The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
Uhlén, M. et al. Proteomics. Tissue-based map of the human proteome. Science 347, 1260419 (2015).
Joshi, S., Davies, H., Sims, L. P., Levy, S. E. & Dean, J. Ovarian gene expression in the absence of FIGLA, an oocyte-specific transcription factor. BMC Dev. Biol. 7, 67 (2007).
Liang, L., Soyal, S. M. & Dean, J. FIGα, a germ cell specific transcription factor involved in the coordinate expression of the zona pellucida genes. Development 124, 4939–4947 (1997).
Yatsenko, S. A. & Rajkovic, A. Genetics of human female infertility. Biol. Reprod. 101, 549–566 (2019).
Bayne, R. A. L., Martins da Silva, S. J. & Anderson, R. A. Increased expression of the FIGLA transcription factor is associated with primordial follicle formation in the human fetal ovary. Mol. Hum. Reprod. 10, 373–381 (2004).
Tosh, D., Rani, H. S., Murty, U. S., Deenadayal, A. & Grover, P. Mutational analysis of the FIGLA gene in women with idiopathic premature ovarian failure. Menopause 22, 520–526 (2015).
Ward, L. D. et al. Rare coding variants in DNA damage repair genes associated with timing of natural menopause. HGG Adv. 3, 100079 (2022).
Shekari, S. et al. Penetrance of pathogenic genetic variants associated with premature ovarian insufficiency. Nat. Med. 29, 1692–1699 (2023).
Karczewski, K. J. et al. Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes. Cell Genom. 2, 100168 (2022).
Backman, J. D. et al. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature 599, 628–634 (2021).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Hansen, T. F. et al. DBDS Genomic Cohort, a prospective and comprehensive resource for integrative and temporal analysis of genetic, environmental and lifestyle factors affecting health of blood donors. BMJ Open 9, e028401 (2019).
Sørensen, E. et al. Data resource profile: the Copenhagen Hospital Biobank (CHB). Int. J. Epidemiol. 50, 719–720e (2021).
Refsum, H. et al. The Hordaland Homocysteine Study: a community-based study of homocysteine, its determinants, and associations with disease. J. Nutr. 136, 1731S–1740S (2006).
Eggertsson, H. P. et al. Graphtyper enables population-scale genotyping using pangenome graphs. Nat. Genet. 49, 1654–1660 (2017).
Kong, A. et al. Detection of sharing by descent, long-range phasing and haplotype imputation. Nat. Genet. 40, 1068–1075 (2008).
Gudbjartsson, D. F. et al. Large-scale whole-genome sequencing of the Icelandic population. Nat. Genet. 47, 435–444 (2015).
Jónsson, H. et al. Whole genome characterization of sequence diversity of 15,220 Icelanders. Sci. Data 4, 170115 (2017).
Delaneau, O., Zagury, J.-F., Robinson, M. R., Marchini, J. L. & Dermitzakis, E. T. Accurate, scalable and integrative haplotype estimation. Nat. Commun. 10, 5436 (2019).
Loh, P.-R. et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat. Genet. 47, 284–290 (2015).
Mantel, N. & Haenszel, W. Statistical aspects of the analysis of data from retrospective studies of disease. J. Natl Cancer Inst. 22, 719–748 (1959).
Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
Sveinbjornsson, G. et al. Weighting sequence variants based on their annotation increases power of whole-genome association studies. Nat. Genet. 48, 314–317 (2016).
Lazaridis, I. et al. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513, 409–413 (2014).
Bergström, A. et al. Insights into human genetic variation and population history from 929 diverse genomes. Science 367, eaay5012 (2020).
The World Factbook. Travel the globe with CIA’s world factbook. www.cia.gov/the-world-factbook/about/archives/2021/ (2021).
Office for National Statistics. Census. www.ons.gov.uk/census (2021).
Pebesma, E. Simple features for R: standardized support for spatial vector data. R J. 10, 439 (2018).
Acknowledgements
We thank the individuals who participated in this study and whose contributions made this work possible. We also thank our valued colleagues at the Icelandic Patient Recruitment Center and the deCODE genetics core facilities who contributed to the data collection and phenotypic characterization of clinical samples as well as to the genotyping and analysis of the whole-genome association data. The researchers are indebted to the participants for their willingness to participate in the study. This research has been conducted using the UK Biobank resource, a major biomedical database (application 56270; https://www.ukbiobank.ac.uk/). We acknowledge the Novo Nordisk Foundation (grants NNF22OC0077221 (to D.W., K.B. and H.S.N.) and NNF17OC0027594 and NNF14CC0001 (to D.W. and K.B.)) and the A.P. Moller Foundation (to D.W., K.B. and H.S.N.). We also acknowledge the participants and investigators of the HUSK study, supported in part by grant REC2018/915.
Author information
Contributions
A.O., P.S., K.S. and D.F.G. designed the study and interpreted the results. A.O., P.S., D.F.G. and K.S. drafted the manuscript. A.O. implemented the analysis pipelines with input from V.S., G.S., G.R.O., K.H.S.M., R.F., B.O.J., G.A.A., H.J., A.S., A.S.S., G.H.H., S.I., G.T. and D.F.G. A.O., G.H.H. and P.S. performed expression analyses. A.O., P.S., G.S., G.H.H. and D.F.G. performed the statistical and bioinformatics analyses. Participant recruitment, phenotype data acquisition and biological material collection were organized and carried out by V.S., G.S., D.W., U.S., H.S.N., J.H., G.B.W., M.N., C.E., T.S., P.M., I.J., B.V.H., K.B., E.S., L.T., T.R., O.B.P., G.M., O.A.A., R.T.L., S.R.O., H.H., H.S. and K.S. Sequencing and genotyping were supervised by O.T.M. and J.S. All authors contributed to the final version of the manuscript.
Ethics declarations
Competing interests
Authors affiliated with deCODE genetics/Amgen declare competing interests as employees. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Elena Tucker and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs. 1–12, Supplementary Tables 1–11 and Supplementary Note (members of the DBDS Genomic Consortium).
Supplementary Data
Supplementary Data 1–3.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Oddsson, A., Steinthorsdottir, V., Oskarsson, G.R. et al. Homozygosity for a stop-gain variant in CCDC201 causes primary ovarian insufficiency. Nat Genet 56, 1804–1810 (2024). https://doi.org/10.1038/s41588-024-01885-6
DOI: https://doi.org/10.1038/s41588-024-01885-6