Novel mutations and phenotypic associations identified through APC, MUTYH, NTHL1, POLD1, POLE gene analysis in Indian Familial Adenomatous Polyposis cohort

Colo-Rectal Cancer is a common cancer worldwide with 5–10% cases being hereditary. Familial Adenomatous Polyposis (FAP) syndrome is due to germline mutations in the APC or rarely MUTYH gene. NTHL1, POLD1, POLE have been recently reported in previously unexplained FAP cases. Unlike the Caucasian population, FAP phenotype and its genotypic associations have not been widely studied in several geoethnic groups. We report the first FAP cohort from South Asia and the only non-Caucasian cohort with comprehensive analysis of APC, MUTYH, NTHL1, POLD1, POLE genes. In this cohort of 112 individuals from 53 FAP families, we detected germline APC mutations in 60 individuals (45 families) and biallelic MUTYH mutations in 4 individuals (2 families). No NTHL1, POLD1, POLE mutations were identified. Fifteen novel APC mutations and a new Indian APC mutational hotspot at codon 935 were identified. Eight very rare FAP phenotype or phenotypes rarely associated with mutations outside specific APC regions were observed. APC genotype-phenotype association studies in different geo-ethnic groups can enrich the existing knowledge about phenotypic consequences of distinct APC mutations and guide counseling and risk management in different populations. A stepwise cost-effective mutation screening approach is proposed for genetic testing of south Asian FAP patients.

hepatoblastoma 7 . Benign manifestations like congenital hypertrophy of the retinal pigment epithelium (CHRPE), desmoid tumors, osteomas and dental anomalies are also common 7 . Correlation between the location of mutations in APC gene (genotype) and the clinical phenotype in terms of the number of polyps, age of onset of polyps and CRC and distinct extracolonic manifestations is well described 8 . An attenuated variant of FAP (AFAP) due to mutations in 5′ or 3′ end of the APC gene, is characterized by polyps not exceeding 100 and late age of onset 1,5 . Up to 10% of FAP cases in whom APC mutation is not identified, there is bi-allelic germline mutation in the MUTYH gene 5 . Unlike FAP, the MUTYH associated polyposis has a lower polyp burden which rarely exceeds 100 5,9 . Comprehensive genetic analysis of APC and MUTYH fails to identify underlying gene mutation in 10-20% of FAP cases 2,4,10,11 and only a small proportion of these are explained by the recently described NAP and PPAP syndromes 2,4 .
Current knowledge regarding the spectrum of APC gene mutation, mutational hotspots and the genotype phenotype correlations is derived mainly from studies in Caucasian cohorts 5,8,12 . In recent years, studies from other geo-ethnic groups have identified several novel APC genotypes, phenotypes and genotype-phenotype associations 10,[13][14][15][16] . The underlying reason for differences in phenotypic associations has not been investigated but may be due to difference in the underlying genetic background or dietary habits 17,18 . APC genotype-phenotype association studies in different geo-ethnic groups can enrich the existing knowledge about phenotypic consequences of distinct APC mutations and guide counseling and risk management in different populations. This is the first FAP cohort being reported from South Asia and the only non-Caucasian cohort with comprehensive molecular genetic analysis of all the five adenomatous polyposis associated genes (APC, MUTYH, NTHL1, POLD1 and POLE).

Results
The 53 unrelated Indian FAP families reported here represent the diverse regions and religions of the Indian subcontinent with 15 hailing from northern, 15 from eastern, 14 from western and 9 from southern states of India and belonging to Hindu (46), Muslim (2), Christian (3) and Jain (2) religions. Of the 53 probands, 25 had no family history of polyposis or cancer suggesting a de novo mutation. The remaining 28 probands reported a family history of polyposis with or without CRC or other extracolonic manifestations. All the probands had classical polyposis except three AFAP cases with <100 adenomatous polyps. Through Sanger sequencing and MLPA of APC and MUTYH genes, 45 families were found to harbor deleterious germline mutation in the APC gene (35 distinct mutations) and 2 families with bi-allelic MUTYH gene mutation. With extended testing of family members, a total of 60 carriers of APC mutation and 4 carriers of bi-allelic MUTYH mutations were identified. In a combined analysis in 60 APC mutation carriers and their 58 untested relatives with FAP associated cancer or benign manifestation, the phenotypic features observed were 79 CRC, 5 upper GI cancers, 3 thyroid cancer, 2 brain tumors, 13 desmoid tumors/fibromatosis. CHRPE was noted in 14/34 APC mutation carriers for whom fundus examination details were available.
Mutation spectrum. Of the 35 distinct APC mutations described in Table 1  LGRs identified were a duplication of the Promoter1B identified in two families and deletion of exons 9-13 in one family. In 2 of the 3 AFAP cases, biallelic MUTYH mutations were identified. A homozygous MUTYH mutation E466X (now E480X) was identified in a South Indian Tamil AFAP patient with 40 polyps and CRC. Compound heterozygous MUTYH mutations R241W and G286E were identified in a case with less than 100 polyps. In the 6 APC and MUTYH mutation negative cases with classical FAP phenotype, sequencing of the entire coding region of NTHL1 gene and the exonuclease domain of POLD1 gene (exons [6][7][8][9][10][11][12][13] and POLE gene (exons 9-14) did not identify any mutation.
Phenotypic features and rare genotype-phenotype associations. Of the 60 APC mutation carriers, 31 had developed CRC at a mean age of 38.3 years (range18-53 years) in a background of classical polyposis with hundreds to thousands of polyps in all but one case of AFAP with only 50 polyps. In 23 APC carriers, polyposis was diagnosed at a mean age of 32 years (range: 9-60 years) without CRC on endoscopic evaluation or histopathological examination of prophylactic procto-colectomy. In the remaining 6 carriers, colonoscopy was yet to be performed or its details were not available. Six APC carriers developed extracolonic cancers with or without CRC. These included 2 cases with papillary thyroid cancer, 1 case with duodenal cancer, 1 case with intracranial germinoma, 1 case with papillary thyroid carcinoma and duodenal cancer, and 1 case with duodenal cancer and small intestine cancer. One or more benign extracolonic manifestations were identified in 27/60 APC mutation carriers. These included CHRPE (n = 14), desmoid tumor or fibromatosis (n = 13), upper GI polyps (n = 8) and osteomas (n = 3). Eight very rare FAP phenotypes or phenotypes rarely associated with mutations outside specific regions of the APC gene were observed. These include the second reported case of intracranial germ cell tumor in an APC carrier 19 , absence of profuse polyposis and early onset CRC in 3 of the 7 codon 1309 mutation carriers as is classically described 20 , attenuated phenotype with only 50 polyps at age 33 years in a codon 593 mutation carrier, desmoid tumor with codon 1228 mutation, papillary thyroid cancer with codon 1346 mutation and most interestingly CHRPE with codon 1483 mutation 7, 8 .

Discussion
In FAP, the mutation spectrum of APC gene and genotype-phenotype correlations is well characterized for the Caucasian population 12, 21-25 , and to some extent for the East Asian population 15,[26][27][28][29][30] . Also, comprehensive molecular characterization of all the 5 known genes has been performed in very limited number of cases, that too only in the Caucasian population. Our study is the first report of a South Asian cohort of 53 FAP families and the only non-Caucasian FAP cohort analysed for all the 5 adenomatous polyposis associated genes.
The wide variation in the reported frequency of germline APC or MUTYH mutations in FAP cohorts from as low as 40-60% 23, 31, 32 to as high as 75-94% 10, 24, 33 is due to the stringency in making a syndromic diagnosis or lack of comprehensive genetic analysis. The high mutation detection rate of 89% in our cohort reflects the appropriateness of our clinical characterization for making the syndromic diagnosis and the comprehensive genetic analysis for APC and MUTYH including MLPA.
This study has identified a new Indian mutational hotspot at codon 935 seen in 4 (9%) FAP families. In addition, the two other known hotspot mutations at codons 1309 and 1061 were seen in 18% and 9% families respectively. High frequency of codon 1309 and 1061 mutations worldwide 32 is a result of repetitive nucleotides in DNA sequence making it a mutational hotspot. Identification of APC LGR in 3 of the 11 families negative for APC point mutation or small indels and biallelic MUTYH mutation in 2 of the 8 families without APC mutation or LGR mandates its inclusion in comprehensive genetic analysis for south Asian FAP/AFAP cases. The MUTYH mutation E466X (now E480X), previously described in 3 unrelated Indian families living in the UK 34 was identified as a homozygous mutation in one of our AFAP case from Tamil Nadu in south India. E466X may thus be a founder MUTYH mutation in Indians, possibly of Tamil ancestry. The founder effect of E466X needs to be confirmed with haplotyping studies and its population frequency can be established in a larger cohort. NTHL1, POLD1 or POLE mutations were not identified in any of the 6 FAP probands negative for APC or MUTYH mutations. This is not surprising as none of these families fulfilled the salient features of PPAP or NAP as described in the literature 4,35 .
Of the 35 distinct mutation identified in our cohort, 15 (43%) are novel and not previously reported in Caucasian or other geo-ethnic groups. Moreover eight very rare FAP phenotype or phenotypes rarely associated with mutations outside specific regions of the APC gene were identified. APC genotypes and genotype-phenotype associations rarely or never observed in Caucasian cohorts are being increasingly reported from other geo-ethnic groups 10,[14][15][16]29 . This highlights the need to study different geo-ethnic groups to enrich the global APC mutational spectrum and expand our knowledge of phenotypic associations of distinct APC mutations.
Based on the mutational spectrum and hotspots identified, a pragmatic stepwise genetic testing algorithm is proposed for FAP cases in south Asian countries where genetic testing is not routinely performed due to resource constraints (Fig. 2). Initial screening of three amplicons (15D-15F) harboring the mutational hotspot codons 1309, 1061 and 935 could identify 40% of all APC mutations and sequencing of additional 3 amplicons of exon 15 (15 C, 15 G, 15 H) could identify two thirds of all APC mutations. If no mutation is identified rest of the APC should be screened followed by LGR analysis and MUTYH gene sequencing. Extended testing of other adenomatous polyposis associated genes (NTHL1, POLD1 and POLE) may be considered but the yield is likely to be very low. The present study and few recent reports 36 highlight that a significant proportion of FAP cases do not harbor  A pragmatic stepwise screening strategy to improve mutation detection rates in FAP patients. Cumulative mutation detection rates with step wise screening of exons/genes most likely to be mutated in south Asian FAP cases. Arrows on left side shows the cumulative mutation detection rates in our cohort achieved after each step. In our cohort, the cumulative mutation detection rate did not change with NTHL1, POLD1 and POLE gene analysis it may increase the detection rate slightly in larger cohorts of APC and MUTYH negative adenomatous polyposis cases from different geo-ethnic background. pathogenic mutations in the genes known to be associated with FAP, MAP, NAP, PPAP syndrome. Germline exome sequencing in an adenomatous polyposis cohort has recently reported loss-of-function germline mutations in a few promising candidate genes (DSC2, PIEZO1, ZSWIM7) 36 and biallelic mutations in MSH3 gene 37 . However these recently identified adenomatous polyposis genes are likely to remain under-reported, unless they are tested as single genes or included in multi-gene next generation sequencing (NGS) panels. The currently used multi-gene panels may not be informative as they do not include NTHL1, POLD1 and POLE genes. Therefore there is a need to conduct comprehensive genetic analysis of all the known adenomatous polyposis genes or exome sequencing studies in large pooled cohorts of APC and MUTYH negative adenomatous polyposis cases with detailed phenotypic and geo-ethnicity correlation.
In conclusion, the comprehensive investigation of all the five adenomatous polyposis genes in a well characterized Indian FAP cohort confirms the high frequency of APC mutations in classical FAP, MUTYH in AFAP cases and absence of NTHL1, POLD1 and POLE mutations in cases not showing syndromic features of PPAP or NAP. The pragmatic stepwise approach proposed can improve uptake of genetic testing for FAP in south Asian countries. Identification of a large number of novel APC mutations and genotype phenotype associations that are rare in the Caucasian population highlights the need for comprehensive phenotypic characterization and genetic analysis in large FAP cohorts from diverse geo-ethnic backgrounds.

Methods
Patients and Phenotype characterization. The study was conducted on 53 FAP families recruited through Cancer Genetics Clinic at Tata Memorial Centre, Mumbai and Christian Medical College, Vellore; India. The study was approved by the Hospital Ethics Committee of the Tata Memorial Hospital and all participating subjects provided written informed consent. All experiments were carried out in accordance with the approved guidelines and regulations. Syndromic diagnosis of FAP or AFAP was based on the number of adenomatous polyps in the colorectum with or without colorectal cancer. Further phenotypic characterization was done based on colonoscopy, esophago-gastro-duodenoscopy (EGD), computerized tomography of abdomen, thyroid ultrasound and ophthalmic examination. Detailed family history and medical records were taken from all the families reported in this study. Genetic testing was extended on first and second degree relatives if a deleterious germline mutation was identified in the proband. Blood sample was collected from 112 members from these 53 families.

PCR and Sequencing. For germline mutation analysis the complete coding sequence of the APC, MUTYH
and NTHL1 genes and the exonuclease domain of POLD1 gene (exons 6-13) and POLE gene (exons 9-14) were amplified by Polymerase Chain Reaction (PCR). Primer sequences and annealing temperatures for PCR used are given in the supplementary Tables S1-S4. PCR products were purified with ExoSAP-IT [USB products, Affymetrix] and sequenced using an ABI 310 Avant, 3500 and 3730 DNA sequencer (Applied Biosystems). All mutations were confirmed by bidirectional sequencing. For most of the cases, the mutations were further reconfirmed on a second independent sample collected after the identification of mutation. InSiGHT database (LOVD) and available literature was used to check if the mutations identified was reported or novel. The mutations identified in our cohort are submitted in the InSiGHT database (www.insight-group.org).

MLPA analysis.
If no APC mutation was identified on sequencing, large genomic rearrangement (LGR) in APC and MUTYH gene were evaluated with Multiplex ligation-dependent probe amplification (MLPA) using the SALSA MLPA APC P043 kit [MRC-Holland] as per the instructions provided by the company. The data was analyzed with Coffalyser software. All deletions or duplications identified and all uncertain results were confirmed in at least two independent MLPA reactions.