Colorectal cancer (CRC) is a leading cause of cancer-related mortality worldwide. Identifying people at high risk of developing CRC and optimizing their screening can reduce the incidence of, and mortality from, CRC. Up to 35% of CRC cases are estimated to be attributable to genetic factors;1 however, known hereditary CRC syndromes caused by high-penetrance germ-line mutations account for only 3–5% of all CRCs.2 In recent years, gene discovery efforts have identified several novel CRC susceptibility genes, including rare germ-line variants within the polymerase proofreading domains of the POLE and POLD1 genes, which are reported to be associated with CRC and polyposis (referred to as polymerase proofreading–associated polyposis).3,4,5 These novel CRC susceptibility genes are now included on multigene sequencing panels for clinical testing, but the age-specific cumulative risks (penetrance) of CRC for people who carry mutations in these genes have not yet been quantified, which is an impediment to optimizing personalized clinical management. The aim of this study was to estimate the age-specific cumulative risks of CRC separately for male and female carriers of POLE and POLD1 gene mutations.

Materials and methods

We searched PubMed for relevant studies published before October 2016 that reported pedigree and cancer data for families with germ-line POLE or POLD1 variants that were either novel or previously observed at a population frequency ≤0.002 in the non-Finnish European population of the Exome Aggregation Consortium (ExAC) database,6 given recent evidence that low variant allele frequency is an important guide in determining disease-causing variants.7 Variants identified from the literature were re-annotated, for consistency, via in silico methods using Annovar8 with default settings. We applied a criterion for predicting the pathogenicity of missense variants in both genes, as recommended by the American College of Medical Genetics and Genomics,9 namely (i) the use of multiple commonly used in silico tools (SIFT, PolyPhen2, MutationTaster, CADD, GERP, REVEL, and M-CAP; Supplementary Table S1 online) and (ii) a high level of consensus among those tools in predicting a deleterious effect. For this study, we applied the recommended or default threshold for prediction of deleteriousness for each of the seven in silico tools (see Supplementary Table S1 for thresholds). For each variant, we counted the number of in silico tools that reported the variant to be deleterious (maximum score of 7). A variant was considered likely pathogenic for this study if at least four of the seven in silico tools predicted it to have a deleterious effect (Supplementary Table S1). Families were excluded from the penetrance analysis if they were (i) families of probands with variants not predicted to be pathogenic, (ii) discovery families in which the POLE c.1270C>G p.Leu424Val and POLD1 c.1433G>A p.Ser478Asn variants were originally described,3 (iii) families of carriers of de novo mutations, or (iv) uninformative owing to missing information on the sex or age at cancer diagnosis of probands.
For the families included in the analysis, we extracted from the identified studies, where possible, information on mutation carrier status, sex, cancer- or polyp-affected status, age at cancer or polyp/polyposis diagnosis, last known age or age at death, and country of study.
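The consensus rule described above (a variant counts as likely pathogenic when at least four of the seven tools call it deleterious, and is eligible only if it is novel or has an allele frequency ≤0.002) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Variant` class, the function names, and the example variants are hypothetical, and each tool's call is assumed to have already been reduced to a yes/no value using the thresholds in Supplementary Table S1. Treating a missing prediction as "not deleterious" is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# The seven in silico tools named in the text.
TOOLS = ("SIFT", "PolyPhen2", "MutationTaster", "CADD", "GERP", "REVEL", "M-CAP")
MAX_ALLELE_FREQ = 0.002      # rarity filter stated in the text
MIN_DELETERIOUS_CALLS = 4    # consensus threshold stated in the text


@dataclass
class Variant:
    """Hypothetical container for one annotated variant."""
    name: str
    allele_freq: Optional[float]   # None means novel (absent from the database)
    calls: Dict[str, bool]         # tool name -> True if predicted deleterious


def deleterious_score(variant: Variant) -> int:
    """Count the tools that call the variant deleterious (0 to 7).

    Assumption: a tool with no prediction is counted as 'not deleterious'.
    """
    return sum(1 for tool in TOOLS if variant.calls.get(tool, False))


def is_rare(variant: Variant) -> bool:
    """Novel variants, or those at or below the frequency cutoff, are rare."""
    return variant.allele_freq is None or variant.allele_freq <= MAX_ALLELE_FREQ


def is_likely_pathogenic(variant: Variant) -> bool:
    """Apply the at-least-four-of-seven consensus rule."""
    return deleterious_score(variant) >= MIN_DELETERIOUS_CALLS


def is_eligible(variant: Variant) -> bool:
    """Eligible for the penetrance analysis: rare and likely pathogenic."""
    return is_rare(variant) and is_likely_pathogenic(variant)


# Made-up example variants, for illustration only.
example_a = Variant("example_A", None, {t: True for t in TOOLS[:5]})
example_b = Variant("example_B", 0.0005, {t: True for t in TOOLS[:3]})
example_c = Variant("example_C", 0.01, {t: True for t in TOOLS})

for v in (example_a, example_b, example_c):
    print(v.name, deleterious_score(v), is_eligible(v))
```

Keeping the rarity filter and the consensus rule in separate functions mirrors the text, where allele frequency governs which variants are considered at all and the in silico score governs which of those are treated as likely pathogenic.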

We searched for POLE or POLD1 mutation carrier families by genotyping 669 population-based probands diagnosed with CRC before 60 years of age from the Australasian Colorectal Cancer Family Registry (ACCFR)10,11 (Supplementary Table S2) for 17 rare germ-line variants within the exonuclease domains of the POLE and POLD1 genes (Supplementary Table S3 and Supplementary Materials and Methods). These 17 variants were selected based on multiple sources, namely:

1. Rare germ-line variants in the exonuclease domains of POLE and POLD1 reported in the ExAC database (≤0.002 allele frequency)

2. Variants identified from our in-house whole-genome and whole-exome sequencing studies of 100 multiple-case CRC-affected families from the clinic-based recruitment arm of the ACCFR that were either rare or novel according to the ExAC database (≤0.002 allele frequency)

3. Variants reported in the discovery paper by Palles et al.3

4. Variants predicted to be deleterious by at least four of the seven in silico tools

Using data from both published studies and the ACCFR, we estimated the hazard ratio (HR) and corresponding 95% confidence interval (CI) of CRC for mutation carriers compared with the general population (based on age-, sex-, and country-specific incidences) and the age-specific cumulative risks (penetrance). We used a modified segregation analysis that incorporated data for all family members, whether genotyped or not and whether affected or not. To produce unbiased estimates, we adjusted for ascertainment by conditioning each pedigree’s data on the proband’s genotype, cancer status, and age at diagnosis (for population-based families) or on the proband’s genotype and the cancer statuses and ages at diagnosis of all family members (for clinic-based families); see details in Supplementary Materials and Methods. All statistical tests were two-sided, and P values lower than 0.05 were considered statistically significant.
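The link between a hazard ratio and an age-specific cumulative risk can be illustrated with a short sketch. It assumes a piecewise-constant population incidence and a constant HR, so that the cumulative risk to age t is 1 − exp(−HR × cumulative population hazard to t). The incidence rates below are invented for illustration and are not the age-, sex-, and country-specific incidences used in the study. The function and variable names are hypothetical, and competing causes of death are ignored.

```python
import math
from typing import List, Tuple

# (start_age, end_age, annual incidence rate) for the general population.
# ILLUSTRATIVE values only; not taken from the study or any registry.
AgeBand = Tuple[float, float, float]
EXAMPLE_INCIDENCE: List[AgeBand] = [
    (0, 30, 0.00001),
    (30, 50, 0.0002),
    (50, 70, 0.001),
    (70, 90, 0.003),
]


def cumulative_risk(incidence: List[AgeBand], hazard_ratio: float, to_age: float) -> float:
    """Cumulative risk to `to_age` for carriers with a constant hazard ratio.

    Sums HR * rate * years over each age band up to `to_age`, then converts
    the cumulative hazard H to a probability via 1 - exp(-H).
    """
    cumulative_hazard = 0.0
    for start, end, rate in incidence:
        years_in_band = max(0.0, min(end, to_age) - start)
        cumulative_hazard += hazard_ratio * rate * years_in_band
    return 1.0 - math.exp(-cumulative_hazard)


# Compare the general population (HR = 1) with a hypothetical carrier group.
population_risk = cumulative_risk(EXAMPLE_INCIDENCE, 1.0, 70)
carrier_risk = cumulative_risk(EXAMPLE_INCIDENCE, 12.0, 70)
print(f"population: {population_risk:.3f}, carriers: {carrier_risk:.3f}")
```

Because the conversion is nonlinear, a 12-fold hazard ratio does not translate into a 12-fold cumulative risk once the cumulative hazard becomes large; this is why very high HRs correspond to cumulative risks that approach, but never exceed, 100%.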


Results

From the literature, we identified 15 studies reporting 37 rare variants (minor allele frequency ≤0.002) within the exonuclease domains of POLE and POLD1. Of these 37 variants, 32 were predicted to be pathogenic by in silico criteria (Supplementary Table S1). Of the 89 families with POLE (n = 70) or POLD1 (n = 19) mutations, 42 were excluded (7 families with variants not predicted to be pathogenic, 3 families from the study by Palles et al.3 that initially reported the association of POLE and POLD1 with CRC, 3 families of de novo mutation carriers, and 29 families of probands without age or sex information). From the ACCFR, two unrelated carriers of the POLE c.861T>A p.Asp287Glu variant and one carrier of the POLE c.1336C>T p.Arg446Trp variant were identified (pedigrees shown in Supplementary Figure S1).

We included 47 families (38 POLE and 9 POLD1) from published studies, and 3 families with POLE mutations from the ACCFR (Supplementary Figure S1), in the penetrance analysis. Of these 50 families, 28 (21 POLE and 7 POLD1) were ascertained because they had a family history of cancer, and 22 (20 POLE and 2 POLD1) were ascertained via population-based cancer registries, regardless of family history. We observed 67 CRCs with a mean age at diagnosis of 50.2 (SD = 13.8) years among 364 first- and second-degree relatives (53% female) from 41 families with POLE mutations, and 6 CRCs with a mean age at diagnosis of 39.7 (SD = 6.83) years among 69 first- and second-degree relatives (45% female) from 9 families with POLD1 mutations (Table 1).
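The family counts reported above can be cross-checked with a few lines of arithmetic. Every number in this sketch is taken from the text; the variable names are illustrative only.

```python
# Families identified in the literature, by gene.
identified = {"POLE": 70, "POLD1": 19}

# Families excluded, by reason (counts as reported in the text).
excluded = {
    "variant not predicted pathogenic": 7,
    "discovery families (Palles et al.)": 3,
    "de novo mutation carriers": 3,
    "proband age or sex missing": 29,
}

# Families included in the penetrance analysis, by source and gene.
included_published = {"POLE": 38, "POLD1": 9}
included_accfr = {"POLE": 3, "POLD1": 0}

# Families by mode of ascertainment and gene.
clinic_based = {"POLE": 21, "POLD1": 7}
population_based = {"POLE": 20, "POLD1": 2}

total_identified = sum(identified.values())
total_excluded = sum(excluded.values())
total_included = sum(included_published.values()) + sum(included_accfr.values())

# Identified minus excluded should equal the published families included.
assert total_identified - total_excluded == sum(included_published.values())

# Ascertainment groups should partition the included families, gene by gene.
for gene in ("POLE", "POLD1"):
    by_source = included_published[gene] + included_accfr[gene]
    by_ascertainment = clinic_based[gene] + population_based[gene]
    assert by_source == by_ascertainment

print(total_identified, total_excluded, total_included)
```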

Table 1 Numbers and mean ages at diagnosis of cancers in the first- and second-degree male and female relatives of probands from POLE and POLD1 mutation families

We estimated cumulative risks of CRC to age 70 years for males and females to be, respectively, 28% (95% CI, 10%–42%) and 21% (95% CI, 7%–33%) for POLE mutation carriers and 90% (95% CI, 33%–99%) and 82% (95% CI, 26%–99%) for POLD1 mutation carriers (Figure 1). The CRC HR was estimated to be 12.2 (95% CI, 7.35–20.2) and 87.2 (95% CI, 15.3–495) for POLE and POLD1 mutation carriers, respectively (Table 2). The HRs decreased with age, being 38.7 (95% CI, 17.5–85.4) for <50 years compared with 8.21 (95% CI, 4.24–15.9) for ≥50 years for POLE mutation carriers (P = 0.003), and 201 (95% CI, 62.0–651) for <50 years compared with 3.34 (95% CI, 0.22–50.1) for ≥50 years for POLD1 mutation carriers (P = 0.007; Table 2). There was no evidence for a difference in HRs by sex (all P > 0.1).
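The age-group comparison above can be approximated from the published intervals alone. The sketch below recovers the standard error of each log HR from its 95% CI and applies a two-sided Wald test to the difference between the two age groups. This is an assumption-laden approximation, not the authors' method: the text does not state how the P values were computed, and the sketch assumes that the CIs are symmetric on the log scale and that the two estimates are independent. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_hr_and_se(hr: float, lower: float, upper: float):
    """Return (log HR, standard error of log HR) recovered from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_95)
    return math.log(hr), se


def wald_p_for_difference(group_a, group_b) -> float:
    """Two-sided Wald P value for a difference between two log HRs.

    Each group is a (hr, lower, upper) tuple. Assumes independent estimates.
    """
    log_a, se_a = log_hr_and_se(*group_a)
    log_b, se_b = log_hr_and_se(*group_b)
    z = (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # For a standard normal, 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# HRs and 95% CIs as reported in the text: <50 years versus >=50 years.
p_pole = wald_p_for_difference((38.7, 17.5, 85.4), (8.21, 4.24, 15.9))
p_pold1 = wald_p_for_difference((201, 62.0, 651), (3.34, 0.22, 50.1))
print(f"POLE: P = {p_pole:.4f}; POLD1: P = {p_pold1:.4f}")
```

Under these assumptions the approximation returns values close to the reported P = 0.003 and P = 0.007, which suggests the published intervals and P values are mutually consistent; it should not be read as a reconstruction of the segregation-analysis likelihood.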

Figure 1

Cumulative risks (unbroken lines) and corresponding 95% confidence intervals (dotted lines) of colorectal cancer. Risks are shown for (a) POLE mutation carriers, (b) POLD1 mutation carriers, and the general population (dashed lines). Blue and red represent males and females, respectively. Information for the general population is based on SEER (Surveillance, Epidemiology, and End Results) data (nine registries) for race/ethnicity cancer incidence between 2003 and 2007. A full color version of this figure is available at the GENETICS in MEDICINE journal online.

Table 2 Hazard ratio (95% confidence interval) of colorectal cancer for carriers of a germ-line mutation in POLE or POLD1

The POLE c.1270C>G, p.Leu424Val mutation has been reported in 19 families. For carriers of this specific mutation, the estimated cumulative risks of CRC to age 70 years were 97% (95% CI, 85%–99%) for males and 92% (95% CI, 75%–99%) for females. The corresponding HR was 131 (95% CI, 71.3–242) when both sexes were combined, 75.0 (95% CI, 37.9–149) for males, and 269 (95% CI, 111–650) for females. Penetrance and HR estimates were not materially different between analyses with and without imputation of missing ages.


Discussion

To our knowledge, this is the first report of both relative and cumulative risks of CRC for people who carry germ-line mutations within the exonuclease domains of the POLE or POLD1 genes. Our analysis included all of the reported rare germ-line variants within the exonuclease domains of POLE and POLD1 predicted to be pathogenic by multiple commonly used in silico tools. However, it is possible that not all variants confer the same level of risk, as evidenced by the higher risk estimates for the recurrent POLE c.1270C>G, p.Leu424Val mutation.

CRC has been the predominant cancer identified in POLE and POLD1 mutation carriers to date (perhaps because of the way the families had been selected for genetic testing). However, a broader extracolonic spectrum of cancers is being revealed as additional carrier families are identified. In addition to CRC, cancers of the endometrium, ovaries, pancreas, brain, and small intestine have been reported for carriers, similar to what has been observed for DNA mismatch repair (MMR) gene mutation carriers (Lynch syndrome).12 The presence of 10 to <100 adenomas and/or the presence of duodenal polyps could be features that distinguish POLE or POLD1 mutation carriers from MMR gene mutation carriers.5 Interestingly, CRCs in carriers of the POLE c.1270C>G, p.Leu424Val mutation showed no predilection for site and presented with different histological types, including mucinous adenocarcinoma. This suggests variability in phenotype even for the same mutation and implicates potential environmental or genetic modifiers.

Additional phenotypic variability was observed with regard to tumor DNA MMR status: the majority of CRCs from POLE and POLD1 carriers were MMR-proficient or microsatellite-stable. However, a small subset of CRCs in POLE mutation carriers showed tumor MMR deficiency without evidence of a germ-line MMR gene mutation.13,14 Therefore, germ-line POLE exonuclease domain variants may account for a proportion of people with a tumor MMR-deficient phenotype that is not explained by germ-line MMR gene mutations or acquired MLH1 promoter hypermethylation (suspected Lynch syndrome).15 Furthermore, somatic POLE and POLD1 mutations have been reported in both colorectal and endometrial cancers,16,17 supporting the hypothesis that loss of polymerase proofreading and the resultant hypermutation tumor phenotype can underlie inactivation of the MMR genes through somatic mutations.

This study has several limitations, including the lack of detailed information regarding how cancer histories of family members in published studies were verified. A large proportion of families were excluded from the analysis owing to missing information for age and sex of probands, thereby reducing the precision of our risk estimates. We used in silico predictions to assign pathogenicity for variant inclusion in the penetrance analysis; however, it has been shown that in silico tools and their algorithms for missense variant effect prediction are only 65–80% accurate when examining known disease-causing missense variants.18 The POLE and POLD1 variants included in the analysis were predicted to be deleterious by multiple in silico variant effect prediction tools (by at least four out of the seven tools used in this study), as recommended by the American College of Medical Genetics and Genomics,9 and further selected based on a variant allele frequency filter of ≤0.002, adding confidence that our CRC risk estimates were based on only those variants likely to be pathogenic. We genotyped 17 POLE and POLD1 rare, likely pathogenic variants to identify additional carriers, but we cannot exclude the possibility that other POLE and POLD1 variants besides those genotyped may exist within the ACCFR CRC-affected individuals. Apart from CRC, we were unable to estimate risks of other cancers because there were too few cancer diagnoses. Finally, our estimates might not necessarily be applicable to non-Caucasians.

In summary, the increased CRC risks for all carriers of a POLE pathogenic or likely pathogenic exonuclease domain variant, particularly for the recurrent POLE c.1270C>G, p.Leu424Val mutation, warrant consideration of annual colonoscopy screening and clinical management guidelines comparable to those currently recommended for people with Lynch syndrome or familial adenomatous polyposis. As yet the risk of metachronous CRC is not known, but is likely to be similarly increased, raising consideration of subtotal colectomy rather than segmental resection for POLE mutation carriers. Functional studies to support variant classification may help to further refine the CRC risk estimates, as will additional carrier families. For POLD1 mutation carriers, refinement of penetrance estimates for CRC is needed; however, clinical management recommendations could follow those suggested for POLE mutation carriers.