Introduction

Colorectal cancer incidence in Latin America has steadily increased during the last decades;1 however, these rates are less than half of those observed in African Americans and Caucasian Americans in North America.2 Colorectal cancer incidence rates vary widely across continents, with the highest rates in countries with mainly Caucasian populations.1 Such differences, and even the recent rising trends observed in developing countries, have been attributed to the high and increasing prevalence of risk factors associated with a ‘westernized’ lifestyle, such as obesity and physical inactivity.3 The reasons behind the higher colorectal cancer risk among African Americans compared with non-Hispanic Whites (Caucasians) in the United States are still not clear. Several studies propose health disparities as the main reason behind these differences.4, 5 However, prospective evidence showed that such factors explain roughly 27% of the excess risk in African Americans relative to Whites,6 suggesting that variation in genetic susceptibility across populations may have an important role.7, 8

The observed increase in colorectal cancer incidence has possibly been accompanied, if not led, by an increase in the incidence of colorectal adenomas.9, 10 Adenomas are the main precursor lesions to most sporadic colorectal cancers and develop through the complex interactions of environmental and genetic risk factors.11 Although recent reports have suggested that the risk of colorectal adenoma may be influenced by racial differences12, 13 in admixed populations, these findings are based on reported ethnicity rather than measured genetic ancestry. Self-reported measures of ethnicity in admixed populations are notoriously inaccurate with respect to individual ancestry.14 This is especially important in Latin American populations, where admixture between people of at least three continents (Africa, Europe and America) has been widespread since the seventeenth century.

In this study we used a sparse set of single-nucleotide polymorphisms (SNPs) to evaluate the association of genetic ancestry with the risk of colorectal adenomas and adenocarcinomas in the Colombian population controlling for well-known colorectal cancer and adenoma risk factors.

Material and methods

Study population and enrollment

Cases and controls were randomly extracted from a larger multicenter case–control study aimed at identifying the environmental and genetic risk factors of colorectal cancer in Colombia. After ethical approval from the Ethics Board of The National Cancer Institute of Colombia, we recruited incident cases (diagnosed at enrollment) of colorectal adenoma and adenocarcinoma at major colonoscopy medical centers in six of the largest Colombian cities (Barranquilla, Bogotá, Bucaramanga, Cali, Cartagena and Santa Marta) from January 2008 to February 2011. Colombia has not yet established a colorectal cancer screening program; therefore, most of the colonoscopy examinations were medically indicated. Cases were originally diagnosed after a complete and satisfactory colonoscopy examination, but only pathologically confirmed cases were finally enrolled in the study. Eligible cases were Colombians who resided in the city of enrollment, were aged between 30 and 75 years at the time of colonoscopy, were willing and mentally able to participate, and had no personal history of colorectal cancer, ulcerative colitis or Crohn’s disease. Controls were approached in the waiting rooms of primary care units, near or in the same hospital where the cases were recruited, among individuals attending for medical conditions other than gastrointestinal discomfort and willing to participate; they were unrelated to cases and had no personal history of cancer or colorectal adenomas. Controls were matched by sex and age group (±5 years) to the cases.

Participants gave written informed consent, donated a blood sample and answered a full epidemiological survey, a food frequency questionnaire designed for the study15 and the short version of the IPAQ16 (International Physical Activity Questionnaire), covering the best-known risk factors for colorectal adenoma and adenocarcinoma.17 Buffy coats were kept in portable liquid nitrogen containers until transferred on dry ice to the National Cancer Institute facilities in Bogotá for final storage at −80 °C. Questionnaires were processed centrally. We used the Teleform software package, version 5.2 (Cardiff Software, Inc., Highland Park, IL, USA) to increase the efficiency of data management and reduce typing errors. By the end of the recruitment phase, we had enrolled 506 controls, 322 adenocarcinomas and 239 colorectal adenomas. Because of funding constraints, we restricted our genetic analyses to a random subsample of 264 controls, 206 adenocarcinomas and 126 adenomas. Adenomas included in the analysis were large (≥1 cm), without severe dysplasia, and with a histopathological diagnosis of tubular, tubulovillous or villous adenoma (<20%, 20–80% and ≥80% of villous component, respectively).

SNP genotyping

DNA was extracted from buffy coat samples using the QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany), as recommended by the manufacturer, and eluted in 100 μl of Nuclease-free Water (Ambion, Carlsbad, CA, USA). Two hundred and fifty nanograms (250 ng) of DNA were resuspended in 5 μl of TE Buffer, denatured and bound to paramagnetic beads for high-throughput genotyping using the protocols described for the highly multiplexed GoldenGate assay18 (Illumina Inc., San Diego, CA, USA). Briefly, two allele-specific oligonucleotide (ASO) probes, linked to universal primer sequences (labeled with either Cy3 or Cy5 for each allele), along with one locus-specific oligonucleotide (LSO) probe, also linked to a universal primer and an address sequence, were hybridized to the DNA. Extension of the ASO and ligation to the LSO were carried out, and the product was amplified by PCR. The amplified product was hybridized to the chips containing sequences complementary to each unique address sequence, and the alleles were determined by the scanner according to the fluorescence emitted (Cy3, Cy5 or both). Microarray analysis was done at the LSUHSC/LCRC Genomic Facility in the Stanley S Scott Cancer Center at Louisiana State University Health Sciences Center, New Orleans, LA, USA. The SNP panel used for this study (Illumina Cancer Panel) consists of 1421 thoroughly screened and validated SNP loci, covering all chromosomes and tagging 408 genes chosen from the National Cancer Institute’s Cancer Genome Anatomy Project SNP500 Cancer Database.19 According to the manufacturer, the mean minor allele frequency (MAF) across all the SNPs in the genotyping panel was 0.25, 0.22 and 0.21 for Caucasians, Han Chinese/Japanese and Yoruba Africans, respectively.

Quality control

We followed a standard quality control (QC) protocol for case–control genetic association studies20 using the PLINK software.21 SNPs were excluded from the analysis if they departed from Hardy–Weinberg equilibrium (P<0.01), if there was a significant difference between missing genotype rates among cases and controls (P<0.01), if the SNP overall call rate was <0.95 or if the MAF was <0.04. Participants with call rates ≤0.95, or with heterozygosity rates >3 SD from the sample mean, were also excluded. In addition, we excluded from the analysis one individual of each pair featuring an identity-by-descent value >0.375, to avoid duplicated, related or contaminated samples. Gender could not be reliably estimated from the limited number of SNPs available on the X chromosome (N=13), and we relied on our recorded gender. Eighteen percent of the controls (n=21), 10% of the adenomas (n=13), 7.7% of adenocarcinomas (n=16) and 15.8% of the SNPs (n=225) did not pass the QC, leaving 238 controls, 113 adenomas and 190 adenocarcinomas for the analysis.
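The per-SNP filters described above can be illustrated with a minimal sketch. It is not the PLINK implementation: the genotype encoding (0/1/2 allele copies, `None` for a missing call) and the function name `snp_passes_qc` are assumptions made for illustration, the Hardy–Weinberg check uses a simple 1-df chi-square test computed over all supplied genotypes, and the case–control missingness comparison is omitted.

```python
import math


def snp_passes_qc(genotypes, min_call_rate=0.95, min_maf=0.04, hwe_alpha=0.01):
    """Return True if one SNP passes the call-rate, MAF and HWE filters.

    genotypes: one entry per participant, coded as 0/1/2 copies of an
    allele, or None for a missing call (hypothetical encoding).
    """
    if not genotypes:
        return False
    called = [g for g in genotypes if g is not None]
    n = len(called)
    # Call rate: proportion of participants with a non-missing genotype.
    if n == 0 or n / len(genotypes) < min_call_rate:
        return False

    n_het = called.count(1)
    n_hom_alt = called.count(2)
    q = (2 * n_hom_alt + n_het) / (2 * n)  # frequency of the counted allele
    p = 1.0 - q
    # Minor allele frequency is the smaller of the two allele frequencies.
    if min(p, q) < min_maf:
        return False

    # Chi-square goodness of fit against Hardy-Weinberg proportions (1 df).
    observed = (n - n_het - n_hom_alt, n_het, n_hom_alt)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    hwe_p = math.erfc(math.sqrt(chi2 / 2))
    return hwe_p >= hwe_alpha
```

For example, a SNP genotyped in 100 participants with 25, 50 and 25 individuals in the three genotype classes passes all three filters, whereas a SNP for which every participant is heterozygous fails the Hardy–Weinberg check.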

Inference of ancestry proportions

We used the Bayesian clustering algorithm STRUCTURE version 2.2 (Pritchard et al22) under an admixture model to estimate the proportions of European, African and Amerindian ancestry in each of our samples. We used a flat prior before running a burn-in period of 5000 iterations and kept 1 in 5000 iterations. Under the admixture model, the genotype information of each individual is modeled assuming that they inherited a fraction of their genome from ancestors originating from each of the k populations of origin. We included a set of overlapping SNPs (N=804) genotyped in three reference ancestral populations (k=3) from the HapMap3 project:23 Utah residents with Northern and Western European ancestry, Luhya in Webuye, Kenya and Han Chinese in Beijing, China. The well-established similarities of the allele frequencies between the latter population and Amerindians24 made it a useful alternative to discriminate the Amerindian from African and European ancestry in the study sample. Six hundred and seventy-eight autosomal SNPs remained for analysis after pruning for linkage disequilibrium, excluding one of each pair with r²>0.5, with a window size of 50 SNPs and a window shift of 5 SNPs.
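The sliding-window pruning step can be sketched as follows. This is a simplified illustration rather than the procedure actually run: it assumes complete genotypes coded as 0/1/2 allele counts, SNPs supplied in chromosomal order, and it arbitrarily drops the later SNP of each correlated pair. The function names are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # monomorphic SNP: correlation undefined
        return 0.0
    return sxy * sxy / (sxx * syy)


def ld_prune(genos, window=50, step=5, r2_max=0.5):
    """Return indices of SNPs kept after sliding-window LD pruning.

    genos: list of per-SNP genotype vectors, in chromosomal order.
    Within each window, the later SNP of any pair with r^2 > r2_max
    is dropped (an arbitrary choice made for this sketch).
    """
    keep = [True] * len(genos)
    start = 0
    while start < len(genos):
        in_window = [i for i in range(start, min(start + window, len(genos)))
                     if keep[i]]
        for pos, i in enumerate(in_window):
            if not keep[i]:
                continue
            for j in in_window[pos + 1:]:
                if keep[j] and r_squared(genos[i], genos[j]) > r2_max:
                    keep[j] = False
        start += step  # shift the window by `step` SNPs
    return [i for i, kept in enumerate(keep) if kept]
```

With the default arguments the sketch mirrors the parameters reported above (window of 50 SNPs, shift of 5 SNPs, r² threshold of 0.5).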

To verify the admixture estimations using the selected set of SNPs, we also estimated the ancestry fractions of individuals from two admixed populations included in the HapMap3 database: Mexican ancestry from Los Angeles (MEX) and African ancestry in Southwest USA (ASW). Finally, we calculated the informativeness for assignment measure (In) proposed by Rosenberg et al25 to estimate the ancestral information that each included SNP provides.
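As an illustration of the informativeness for assignment measure, the sketch below computes In for a single biallelic SNP from its allele frequencies in K reference populations, using the standard form In = Σj [−pj log pj + Σi (pij/K) log pij], where pij is the frequency of allele j in population i and pj is its average over populations. The function name and input layout are assumptions for illustration; natural logarithms are used.

```python
import math


def informativeness(freqs):
    """Informativeness for assignment (In) of one biallelic SNP.

    freqs: frequency of one allele in each of K reference populations;
    the other allele has frequency 1 - f in each population.
    """
    k = len(freqs)
    total = 0.0
    for allele_freqs in (freqs, [1.0 - f for f in freqs]):
        p_bar = sum(allele_freqs) / k  # average frequency across populations
        if p_bar > 0:
            total -= p_bar * math.log(p_bar)
        # Terms with p = 0 contribute nothing (limit of p*log(p) is 0).
        total += sum(p * math.log(p) for p in allele_freqs if p > 0) / k
    return total
```

A SNP with identical allele frequencies in all populations has In = 0, and a SNP fixed for alternate alleles in two populations has In = log 2; values near zero therefore indicate markers that carry little ancestry information individually.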

Statistical analysis

We compared the mean ancestry fractions between cases (adenomas and carcinomas) and controls using one-way ANOVA tests. To evaluate the association of genetic ancestry with adenoma and cancer separately, we used binary conditional logistic regression models controlling for potential confounding factors. As the three ancestry fractions are dependent on each other, they cannot be handled as independent variables at once. To overcome this limitation without leaving any of the ancestry fractions out of the analysis, we included two parameters in the model: first, the arithmetic difference between European and Amerindian ancestry fractions (the main genetic substitution in Latin American populations) and, second, the estimated African ancestry fraction. The latter was log transformed, as it was positively skewed in the study population (Figure 1). These two parameters were fitted alternately as raw continuous variables (to measure the linear trend) and as categorical variables to measure the variation in risk per 10% increase of African ancestry (from 0.01 to ≥0.30) and of European ancestry substitution (from ≤−0.30 to ≥0.30). We avoided pairwise comparison of the resulting ancestry fraction categories in the logistic regression analysis, but we report their distribution for descriptive purposes.26 The multivariate analysis, which had city of enrollment as the conditional variable, included the following: gender, age, attained education level (none, elementary school, secondary school, technical studies (ie, college), university or higher), family history of colorectal cancer in first-degree relatives, history of alcohol intake (no intake, <12.50 and ≥12.50 g/day), cigarette smoking (<0.5, 0.5–0.9 and ≥1 packs/year), red meat consumption (<2, 2–4 and ≥5 servings per week), physical activity (<10, 10–19, ≥20 h/week), nonsteroidal anti-inflammatory drugs (at least one per week during the last 6 months, yes or no), dietary fiber and total energy intake (quartiles regarding the distribution among controls).
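Because the three ancestry fractions sum to one, entering all of them in a model would make them perfectly collinear; the two derived parameters described above avoid this. A minimal sketch of how they could be computed per participant is given below. The function name, the lower bound used before taking the logarithm, the rounding tolerance and the 10% category cut points are assumptions for illustration, not the exact coding used in the analysis.

```python
import math
from bisect import bisect_right

# Assumed cut points for 10%-wide African ancestry categories:
# <0.10, 0.10-0.19, 0.20-0.29 and >=0.30.
AFRICAN_CUTS = [0.10, 0.20, 0.30]


def ancestry_covariates(european, amerindian, african, floor=0.01, tol=0.01):
    """Derive the two model parameters from three ancestry fractions.

    floor: assumed lower bound applied before the log transform, so that
    an estimated African fraction of zero does not produce log(0).
    tol: allowed rounding error in the sum of the three fractions.
    """
    if abs(european + amerindian + african - 1.0) > tol:
        raise ValueError("ancestry fractions must sum to 1")
    return {
        # European ancestry substitution: European minus Amerindian.
        "substitution": european - amerindian,
        # African fraction is positively skewed, hence the log transform.
        "log_african": math.log(max(african, floor)),
        # Category index 0..3 according to AFRICAN_CUTS.
        "african_category": bisect_right(AFRICAN_CUTS, african),
    }
```

For a participant with estimated fractions of 0.45 European, 0.40 Amerindian and 0.15 African, the substitution parameter is 0.05 and the African fraction falls in the second category.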

Figure 1 Quantile-normal plot per ancestry fraction in the study population.

Results

Cases were slightly older than controls. Although adenomas were positively associated with higher attained education (P=0.02), adenocarcinomas showed the opposite, being positively associated with lower educational level instead (P<0.01). Cancer cases also showed an inverse association with BMI at diagnosis, probably due to reverse causation (Table 1). No other differences were observed among the risk factors included in the analysis.

Table 1 Main demographic characteristics and risk factors in the study population

Among adenoma cases, European (mean=0.44, 95% confidence interval (95% CI)=0.42–0.46) and African (mean=0.13, 95% CI=0.11–0.15) ancestry proportions were higher compared with controls (European: mean=0.39, 95% CI=0.38–0.41; African: mean=0.11, 95% CI=0.10–0.12), whereas the mean Amerindian proportion varied inversely with the European proportion. In contrast, cancer cases showed only a higher mean proportion of African ancestry compared with controls (0.14 vs 0.11; P<0.01; Table 2). When comparing the categorical distribution of African ancestry and European ancestry substitution (European minus Amerindian ancestry fractions) in the study population, we observed similar results: an association of African ancestry with both adenomas (P=0.07) and cancer (P=0.02), whereas the European genetic substitution was associated only with adenoma (P=0.001) but not with cancer cases (P=0.95; Table 3). Ancestry fractions estimated for the MEX and ASW populations were very similar to those reported in the literature27 despite the low In values featured by the SNPs included in our analysis (max=0.34, mean=0.03, SD=0.04), supporting the reliability of the ancestry estimations.

Table 2 Mean ancestry fraction and 95% CIs in controls, adenomas, adenocarcinomas and admixed populations (MEX and ASW) included in the analysis
Table 3 Categorical distribution of African and European ancestry substitution in controls, adenomas and colorectal cancer cases included in the analysis

After controlling for confounding, the conditional logistic regression results were consistent with the crude ones, showing a marginal positive association of increasing African ancestry with colorectal adenomas (risk variation per 10% increase (odds ratio (OR))=1.122; 95% CI=0.97–1.30; P for linear trend=0.08) and a statistically significant association with adenocarcinomas (risk variation per 10% increase (OR)=1.19; 95% CI=1.05–1.35; P for trend=0.003; Table 4). In contrast, increasing European ancestry was positively associated only with adenoma (risk variation per 10% increase (OR)=1.25; 95% CI=1.08–1.46; P for trend=0.001) but not with cancer (risk variation per 10% increase (OR)=1.02; 95% CI=0.90–1.16; P for trend=0.75). In addition, adenoma was associated with university or higher education compared with primary school (the most prevalent category; OR=3.81, 95% CI=1.59–9.17), whereas colorectal cancer risk was marginally associated with no education attained (OR=2.5, 95% CI=0.89–7.36). Adjusting for age and sex did not modify the results significantly (Table 4). There was no evidence of heterogeneity in the mean differences of African ancestry when comparing adenomas or adenocarcinomas with controls across education strata (Figure 2). When exploring the ancestry association stratified by distal and proximal colorectal neoplasms, we did not observe any differences from the overall results (results not shown).

Table 4 OR and 95% CIs of age- and sex-adjusted and fully adjusted conditional regression models for genetic ancestry fractions and known risk factors of colorectal adenoma and adenocarcinoma

Figure 2 Forest plot for the standardized mean differences (SMD) of African ancestry fraction (log scale) in adenomas and adenocarcinomas stratified by education level attained.

Discussion

To the best of our knowledge, this is the first report on the association of genetic ancestry with sporadic colorectal adenomas and adenocarcinomas in an admixed population. Our findings add evidence to the hypothesis that genetic ancestry influences cancer risk in Latino populations. A similar positive association has been reported previously between European ancestry and breast cancer in the Mexican population.28 Estimates of genetic ancestry fractions have recently drawn attention in clinical practice, as they have been proposed as a genome-wide biomarker useful to evaluate relapse in children undergoing therapy for acute lymphoblastic leukemia.29

The association of African ancestry not only with adenocarcinoma but also marginally with adenoma supports our hypothesis of a role of genetic ancestry in the early stages of colorectal carcinogenesis and may rely on differences in allele frequencies of polymorphisms related to colorectal cancer risk.30 We found that this association was not confounded by well-known risk factors, including education attained. The education level included in the analysis was chosen as a proxy of socioeconomic status (SES) because it is attained early in life and does not change greatly after the third decade of life. SES has previously been associated independently with colorectal cancer worldwide and with genetic ancestry in Latin America. The nature of the association between SES and colorectal cancer risk is discrepant across continents.31 Whereas studies in Europe, East Asia and Australia have, in general, found a positive association, in the United States and Canada the association observed is inverse.31, 32 This discrepancy is not fully understood, but it is partially explained by differences in screening coverage31 and the way environmental factors (mediators) are interrelated with SES (determinants).32, 33 Our results are contrary to a previous study reporting a positive association of colorectal cancer and SES in Colombia.34 Here we found an inverse association of education level with adenocarcinoma, but also a positive association of higher education level with adenoma (Table 4). A previous report has also shown a positive association of European ancestry with higher education among Latinos.35 Thus, the interplay of wealth, European ancestry and education among the Latino population could partially explain the positive association of European ancestry with adenoma risk found in this analysis.

In contrast, the positive association of African ancestry with both adenoma and adenocarcinoma reported here is hard to explain by differences in SES, given that in our study, first, African ancestry was not associated with education level (P for trend=0.76; Figure 3) and, second, adenomas and adenocarcinomas showed opposite associations with education level. It is worth mentioning that Afro-Colombians are an ethnic minority with large disparities compared with the overall population. In our study population, African ancestry was not as high (interquartile range=0.6–0.24) as in African Americans (80%), and it is likely that within this range most individuals did not identify themselves as being of African descent.

Figure 3 European and African ancestry fraction box plot per education level attained in the study population.

Our study features several strengths. It includes both preneoplastic and neoplastic lesions confirmed by histopathology, allowing us to evaluate the association of ancestry with the early events of colorectal carcinogenesis; cases and controls were sampled from the same population (all controls were recruited at general practice consultations and cases were mostly referred by general physicians to colonoscopy), assuring a better control for selection bias. Our ancestry estimations are reliable, as those estimated for the ASW and MEX individuals were similar to those published in the literature.27 The differences in the anatomic distribution of adenomatous polyps and cancer, showing a shift to the left for cancer cases, were observed as previously described,36 and the similar age range in adenomas and adenocarcinomas does not suggest a selection bias.37

Our study has some limitations that should be considered when interpreting the results. It is likely that educational level does not reflect, nor control for, the entire variability of socioeconomic status; thus, residual confounding may exist. Nevertheless, we assessed and included in the analysis the most relevant nutritional and lifestyle factors that may mediate the association of SES with colorectal adenoma and adenocarcinoma. There could be differential access to colonoscopy, where wealthy people may have better access to these procedures. However, underrepresentation of individuals with a higher level of education in the control group is not likely, as recent census data38 in Colombia showed that only 9% of the population have a university or higher education degree, a value similar to that observed in our control sample (11%). Our sample size provides limited statistical power to detect small effects, especially regarding the observed effect in the association between African ancestry and adenomas. Nevertheless, we reduced multiple hypothesis testing to the minimum. In addition, the point estimates observed showed narrow CIs and the crude and adjusted results were consistent. If African ancestry is actually associated with colorectal cancer, the association with adenoma (its main precursor lesion) could also be expected. However, the use of a common set of controls could also explain such an association, and therefore the results should be interpreted with caution. As this is an observational study, residual confounding cannot be ruled out. Replication is warranted to validate these results.

The positive association of African genetic ancestry with adenoma and colorectal cancer is consistent with a recent publication reporting that colorectal cancer risk is likely to be mediated through genetic susceptibility to adenomas.39 Early detection of adenomas is a key issue in colorectal cancer control. Newly published evidence supports that detecting and removing adenomas decreases not only the incidence but also the mortality of colorectal cancer.40 There is now promising evidence showing that genetic markers could discriminate populations at increased risk of colorectal cancer. It has been shown, for example, that adding information on SNPs associated with colorectal cancer to family history increases the absolute risk estimation of having the disease at the population level.41

Further research should address how these SNPs, discovered mainly in European Caucasian populations, influence the genetic association reported here. Admixture mapping42 could be the next step to further explore the mechanism behind this association. No admixture mapping analysis has been published so far on colorectal cancer despite the large number of GWAS on this cancer. A variant in chromosome 8q24 initially described by admixture mapping for prostate cancer in African Americans also showed a positive association with colorectal cancer risk30 and, recently, a case–control study reported an association between one 8q24 locus variant (rs380284) and adenoma risk in Caucasians.39

In conclusion, we report for the first time that African ancestry (or variants linked to it) contributes to the susceptibility to colorectal cancer in an admixed Latin American population. Our results are promising, as they may provide insights into colorectal carcinogenesis and, moreover, help to find biomarkers useful to stratify colorectal cancer risk in Latino populations, where colorectal mortality rates are increasing, although they are not yet high enough to recommend mass screening programs.43