Large-scale efforts to identify breast cancer (BC) risk alleles have historically taken place among women of European ancestry. Recently, there have been new efforts to verify whether these alleles also increase risk in African American (AA) women. We investigated the effect of previously reported AA breast cancer and triple-negative breast cancer (TNBC) risk alleles in our African-enriched International Center for the Study of Breast Cancer Subtypes (ICSBCS) cohort. Using case–control, case-series and race-nested approaches, we report that the Duffy-null allele (rs2814778) is associated with TNBC risk (OR = 3.814, p = 0.001), specifically among AA individuals, after adjusting for self-indicated race and west African ancestry (OR = 3.368, p = 0.007). We have also validated the protective effect of the minor allele of the ANKLE1 missense variant rs2363956 among AA for TNBC (OR = 0.420, p = 0.005). Our results suggest that an ancestry-specific Duffy-null allele and differential prevalence of a polymorphic gene variant of ANKLE1 may play a role in TNBC outcomes. These findings present opportunities for therapeutic development and for future studies to address race-specific differences in TNBC risk and disease outcome.
Breast cancer (BC) is caused by a combination of dynamic influences, which are typically unique for each individual, but frequently may include underlying heritable genetic risks. Particularly, breast cancer patients who have early onset, or pre-menopausal incidence, typically are carriers of germline mutations in key cancer genes1,2. However, population studies have shown disparities in BC incidence and mortality among ethnic and racial groups persistently over the past five decades. In the US, White/European Americans (EA) have historically demonstrated the highest incidence of breast cancer, while Black or African Americans (AA) have the highest mortality rates reported in any race/ethnic group3,4. Interestingly, this mortality gap only emerged in the late 1970s, coinciding with implementation of targeted hormone therapies. The consequential decrease of mortality in EA was not matched in AA, which aside from unequal access to these new therapies, unmasked a race-group bias in breast tumor biology and incidence rates of tumor subtypes. Population studies of hormone receptor (HR) status in breast cancer diagnoses indicates a two-fold increased risk of Triple Negative Breast Cancer (TNBC) in AA compared to EA patients, which persists after adjusting for stage and age at diagnosis5,6,7,8. This trend also extends beyond certain social determinants, with AA having the highest rate of TNBC at every poverty level as well9. This finding translates to disproportionate survival benefits in EA patients from the standard-of-care targeted therapies that are primarily designed to target HRs10, which AA diagnosed with TNBC are not eligible to receive. Clinically, TNBC is a confirmed adverse prognostic feature in patients overall11, and in AA patients specifically12, and it underscores a need to identify any unique risk of certain breast cancer subtypes. 
An investigation of genetic risk across self-identified AA groups becomes more informative with the inclusion of an individual’s genetic ancestry composition, as the proportions of African versus European or other ancestry vary among this admixed population. For example, genetic risk in particular ancestral groups could be unmasked by investigating risk alleles within the predominant ancestral group, as opposed to the traditional risk studies that were devoid of ancestry data13. However, there is a severe shortage of genetic and GWAS data in non-white populations13,14, where less than 10–15% of individuals in population studies are Black, Indigenous, and People of Color (BIPOC), if race or ethnicity groups are reported at all13. This tragic limitation stifles our efforts to identify population-specific risk alleles outside of European descendant groups. However, recent studies of race-specific risk, including the Multi-Ethnic Cohort (MEC)15, the African American Breast Cancer Epidemiology and Risk (AMBER) Consortium16,17,18,19 (which includes the MEC), and our International Center for the Study of Breast Cancer Subtypes (ICSBCS), along with others14,20,21,22, are paving the way to more inclusion of AA and African participants in genomic research.
Previous studies inferred that AA-specific risk alleles held race-group specificity due to shared African genetic ancestry among AAs15,23. Through our Oncologic Anthropology epidemiological studies of breast cancer incidence and prevalence across the African Diaspora24,25, we have revealed a common trend of lower incidence but higher mortality among women of African descent26. Globally, there is also higher frequency of TNBC among women of western sub-Saharan African descent within every country that has a substantial population of individuals of African descent, and where we could investigate HR status, coupled with higher distribution of poor prognosis in these groups as well7,27,28,29,30. This strikingly correlates with the social history and unparalleled numbers of Africans dispersed during forced migrations of the Trans-Atlantic Slave trade, where over hundreds of years and a dozen generations, enslaved Africans were scattered across Europe, the Americas and the Caribbean.
We previously reported our independent analysis of AA race-group specific risk and our previous findings were able to replicate some, but not all, BC and TNBC-specific risk alleles in our African-enriched ICSBCS cohort31. Distinctions in risk associations from hazard models between cohorts could be confounded by bias in shared ancestry, due to differences in composition of genetic admixture among AAs. In this report, we reconsidered our previous risk findings to determine their relevance from a more global perspective, by (i) including additional ancestral populations from contemporary African women, and (ii) adjusting risk models for bias in ancestry background within admixed AAs. These efforts will provide further evidence and methodological insight in the role of shared African ancestry in the shared racial disparity of TNBC incidence across the African diaspora.
Multi-ethnic cohort analysis of population-specific BC risk alleles reaffirms race group specific effects
Our overall BC risk assessment model was an all-inclusive analysis, including all breast cancer subtypes and self-indicated race (SIR)/ancestral groups, where we have expanded the number of BC cases from Eastern and Western African nations, investigating previously published BC risk alleles that have been validated among African American women in the AMBER consortium32 (Tables 1, 2, Fig. 1A (left)). No strong linkage disequilibrium was observed among these alleles (maximum r² of 0.44). Three alleles replicated previous associations of increased overall BC risk in our unadjusted models. These include rs2981578 (FGFR2), rs4849887 (GLI2), and rs3745185 (BABAM1). Interestingly, we found that the T allele of rs2981578 in the FGFR2 gene was associated with increased risk (OR = 1.508, p = 0.008491), which contrasts with previous reports of the C allele as the risk allele. The C allele of rs4849887 in the GLI2 gene was associated with increased risk (OR = 1.654, p = 0.006122), replicating previous findings. We also replicated the protective A allele of rs3745185 in the BABAM1 gene (OR = 0.67, p = 0.008402).
To determine whether these all-inclusive association models may be confounded by race-specific bias in age or allele frequency, we adjusted the risk model to correct for race and age. Interestingly, each unadjusted risk association loses significance in the combined race group model after adjusting for race and age, indicating that the risk alleles may have higher frequency in one of the SIR groups (See Table 1). Specifically, in the case of the risk (C) allele of rs4849887, we find it is 10–15% lower in populations of West African descent (AA = 34.9%, Ghanaians = 32.9%), compared to European Americans (49.5%) and East Africans (44.0%) in our cohort. Two additional alleles gained significance in overall BC risk associations after race and age adjustments in our all-inclusive model, rs2981579 in the FGFR2 gene (OR = 1.899, p = 0.03038) and rs3112572 in the LOC643714 gene (OR = 2.410, p = 0.03055).
Next, we tested whether the associated BC risk of our candidate alleles was different among SIR groups by performing a nested BC risk assessment within each of the SIR groups (Table 2 and Supplemental Table 1). While we observed rs4849887 was associated with overall BC risk prior to adjusting for age and race, this allele is associated with higher overall BC risk only in Ghanaians prior to adjusting for age (OR = 2.472, p = 0.001032) (Fig. 1B, Supplemental Table 1). While we did not observe a significant association between rs609275 and overall BC risk for the whole cohort assessment, a very high overall BC risk was observed specifically for AA prior to adjusting for age (OR = 5.383, p = 0.048). There were no significant associations found between the previously identified variants and breast cancer risk among SIR EA in both unadjusted and age-adjusted models (Supplemental Table 1).
TNBC-specific case-series analysis of population-specific BC risk alleles shows associations within ancestral groups
The higher rate of TNBC among women of African descent worldwide raises the question of whether there is a shared genetic risk among the African diaspora, and we have previously shown that quantified West African ancestry was strongly associated with TNBC disease31. Using a case-series analysis in our African-enriched cohort, we tested whether previously reported AA-specific risk alleles were associated specifically with TNBC disease risk (Table 3, Supplemental Table 2, Fig. 1A (right)). Prior to adjusted covariate modeling, five of the nine AA-risk variants showed significant association with TNBC disease risk. Four of these variants were not previously reported as having ER-negative disease specific risk, and four were predicted to have a protective effect, including rs2981578 in FGFR2 (OR = 0.667, p = 0.0627), rs3745185 in BABAM1 (OR = 0.503, p = 0.009), rs4849887 in GLI2 (OR = 0.414, p = 0.003), and rs2363956 in ANKLE1 (OR = 0.593, p = 0.0149). Only the SNV rs609275 in MYEOV/CCND1 showed higher hazard/risk for TNBC in the unadjusted model (OR = 2.479, p = 5.68E-05). The ANKLE1 variant rs2363956 replicated the previously reported TNBC/ER-negative-specific protective effect and was the only variant to retain significance after adjusting for race and age (OR = 0.542, p = 0.014).
Similar to our BC case–control analysis, we used a nested risk analysis within SIR groups to test for SIR-specific risk. For the admixed AA population, we included quantified West African ancestry (WAa) in the adjusted covariate modeling. The rs2363956 variant in the ANKLE1 gene retained a protective effect for TNBC in AAs, even after covariate adjustments, (age and WAa adjusted OR = 0.4204, p = 0.005), indicating this is not a mere artifact of disequilibrium, or biased distribution of the allele in African populations (Fig. 1C and Table 3).
DARC/ACKR1 alleles in BC and TNBC risk
In addition to the previously implicated AA-risk alleles, we have also included DARC/ACKR1 alleles, including the TNBC risk associated Duffy-null allele31, to investigate whether alternative variants may capture risk due to unique biological contributions of either isoforms or distinct gene regulation (Table 1). Our new analysis found that four DARC/ACKR1 SNVs also had significant potential to confer overall BC risk in our all-inclusive analysis models (rs2814778 OR = 1.512, p < 0.001, rs17838198, OR = 4.798, p < 0.001, rs3027016 OR = 4.586, p = 0.005 and rs12075 OR = 2.534, p < 0.001, respectively), however, after adjusting for age and race, this is mostly lost (Table 4, Fig. 2A (left)). In our SIR nested analysis model, the DARC/ACKR1 variant rs3027013 showed a significant protective effect in EA patients, even after age-adjusted modeling (age-adjusted OR = 0.131, p = 0.03897) (Fig. 2B and Supplemental Table 3).
For DARC/ACKR1 variant associations in TNBC-specific risk, we similarly observed that seven out of eight variants were associated with TNBC disease, in which five of the minor alleles presented a protective effect and two showed increased risk, prior to race/age adjustments (rs6676002, OR = 0.191, p = 0.007; rs3027008, OR = 0.134, p = 0.006; rs17838198, OR = 0.367, p = 0.015; rs3027016, OR = 0.390, p = 0.065; rs12075, OR = 0.380, p = 0.003, rs71782098, OR = 3.403, p = 0.018; and rs2814778, OR = 3.062, p < 0.001) (Table 5, Fig. 2A (right)). Interestingly, as we previously reported with only AA and EA, the Duffy-Null allele, rs2814778, retained significant TNBC-risk association with the addition of West African samples, even after age and SIR adjustments (OR = 3.814, p = 0.001). The Duffy-Null (rs2814778) TNBC-risk association was also retained in our nested SIR analysis among AA, following both age and quantified West African ancestry adjustment (OR = 3.368, p = 0.007) (Fig. 2C and Table 5). This indicates that the TNBC-specific risk conferred by the Duffy-null allele in the DARC/ACKR1 gene is not an artifact of shared ancestry bias, but rather an ancestry-specific risk allele.
Functional consequences of the TNBC-protective rs2363956 variant in ANKLE1
In our TNBC risk analysis, we found that the minor G allele of the rs2363956 ANKLE1 variant was protective against TNBC disease, which has previously been shown for ER-negative disease among AA32. Given its SIR-specific effect, we investigated the frequency of the allele across global 1000 genomes (1 KG) populations33. Population minor allele frequency (MAF) of the protective G allele is relatively equal among European and African groups (57% vs 50%, respectively, Table 1). However, among TNBC cases in our ICSBCS cohort, the frequency of the GG genotype is much lower in AA patients, compared to EA patients (14% and 43%, respectively) (Fig. 3B). The minor allele frequency among TNBC cases is roughly 20 percentage points lower in AA than in EA (MAF in EA = 57.1%, MAF in AA = 37.2%). This difference explains the apparent protective effect of the minor allele and implies that the major allele may somehow drive TNBC frequency higher in AAs.
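The paragraph above reports both a genotype frequency (the share of individuals who are GG) and an allele frequency (the share of chromosomes carrying G). The following minimal Python sketch shows how the two quantities differ. The genotype counts are hypothetical and chosen only for illustration; they are not ICSBCS data. Only the two MAF values in the last step (57.1% and 37.2%) are taken from the text.

```python
def allele_frequency(n_hom, n_het, n_other_hom):
    """Frequency of an allele given counts of its homozygotes,
    heterozygotes, and homozygotes for the other allele."""
    n_individuals = n_hom + n_het + n_other_hom
    if n_individuals == 0:
        raise ValueError("no genotyped individuals")
    # Each homozygote carries two copies of the allele, each heterozygote one.
    return (2 * n_hom + n_het) / (2 * n_individuals)


# Hypothetical genotype counts (illustration only, not study data).
counts = {"GG": 10, "GT": 20, "TT": 20}

g_allele_freq = allele_frequency(counts["GG"], counts["GT"], counts["TT"])
gg_genotype_freq = counts["GG"] / sum(counts.values())

# Difference between the MAF values quoted in the text, in percentage points.
maf_gap_pp = round(57.1 - 37.2, 1)

print(g_allele_freq, gg_genotype_freq, maf_gap_pp)
```

With these hypothetical counts, 20% of individuals are GG while the G allele frequency is 40%, which is why a genotype frequency and an allele frequency for the same variant should not be compared directly.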
To date, although this variant has repeatedly been reported as a risk allele in both breast and ovarian cancer32,34,35, no investigation has linked its functional impact to risk or survival in this population. Given that the variant causes a dramatic amino acid change of leucine to tryptophan (L184W, Fig. 3A), there is a high probability that the protein structure is impacted and, consequently, that its function is altered. We generated 3D structural models of the variant, comparing the structure of the protein with leucine at position 184 (Fig. 3C) to the minor allele change to tryptophan, and found a predicted destabilization of the gene product (Fig. 3D).
The allele’s protective effect through destabilization of ANKLE1 structure, together with its significant loss in AAs who suffer from higher rates of TNBC, suggests the major allele ANKLE1 protein could be a genetic driver of TNBC. We hypothesize that wildtype ANKLE1 expression suppresses TNBC progression, which is most frequently found in EA patients when caused by the rs2363956 variant. To further investigate this theory, we determined whether the expression of ANKLE1 had any impact on survival36. We found that survival trends in TCGA breast cancer cases are significantly impacted by ANKLE1 expression, but that the advantage of ANKLE1 expression only benefits EA patients (Fig. 3E–G). Specifically, we found that when comparing high vs low/medium ANKLE1 expression within SIR groups, EA have a significant survival improvement associated with higher expression (p = 0.035), but AA did not (p = 0.83) (Fig. 3E–F). In fact, when only including patients who had high expression of ANKLE1, EA had a longer survival advantage associated with ANKLE1, compared to AA (Fig. 3G, p = 0.052). This suggests that the benefit of ANKLE1, only found in EA, could be due to the 41–53% chance that EA are expressing the polymorphic version of ANKLE1, which harbors the rs2363956 allele.
While recent findings have delineated breast cancer risk alleles that pose increased or even decreased risk in African Americans specifically, many of these findings do not always replicate in other independent multi-ethnic cohorts. This is likely because of unmeasured individual admixture among the non-white individuals, who through social history are of mixed ancestry (i.e. Caribbean, Latin American and AAs) resulting from recent genetic admixture originating from multiple ancestor lineages37,38,39. This complexity of AA ancestry includes heterogeneity of African origins, spanning multiple African parental lineages through dozens of generations. This undoubtedly creates confounding genetic backgrounds that still pose a significant obstacle in identifying causal risk alleles among “African” Americans. However, measuring this genetic and ancestral diversity, and accounting for ancestry substructure would be a key first step toward clarifying the alleles that may be shared among individuals of common ancestry within SIR groups who display common disease/tumor types. Our latest race and West African ancestry adjustments in risk models demonstrate the power of combining diverse ancestral groups and utilizing ancestry estimates to clarify either false-positive or false-negative results if models do not properly consider the underlying ancestry/genetic background of the cohorts.
Our work draws on a uniquely powered cohort, enriched with a diverse set of patients and controls of African ancestry, to directly investigate the impact of shared African ancestry in genetic risk for TNBC. We anticipate that our observations account for increased prevalence in women of African descent, at least in part. However, our analysis is still limited by the paucity of hormone receptor status in African cases and, therefore, by the limited number of patients we can include in this analysis thus far. Despite this limitation, we have robust findings that are compelling to expound upon in follow-up molecular and clinical studies.
First, our intention to replicate and verify the findings of AA-specific risk alleles is somewhat tenuous, with associations fluctuating after adjustments for age and/or race. These covariate adjustments altering significance reflect the varying frequency of these alleles across these strata in our cohort and possibly more broadly in the population. Specifically, rs2981578, rs3745185 and rs4849887 were found to be significant prior to and after race adjustment, and lost significance with age adjustment, while rs2981579 and rs3112572 were found to be significant after race and age adjustment. For alleles that are in significantly different frequency across age categories, their distribution may reflect a difference in early vs. late onset cancers. For alleles that have significantly different frequency across race categories, their distribution may reflect ancestry-specific risk or population-private variants. Either scenario warrants a larger and more inclusive dataset to robustly uncover genetic risk. This is an unmet need that could be essential to cancer prevention and much needed improvement for cancer risk prediction models.
We have validated our previous finding31 of the Duffy-null allele (rs2814778) as a TNBC-risk allele in our SIR all-inclusive analysis (OR = 3.814, p = 0.001). The Duffy-null allele is an ancestry-specific allele restricted to descendants of Sub-Saharan Africans. The allele arose among Sub-Saharan Africans and removed expression of DARC from erythrocytes, lending immunity from Plasmodium vivax malaria, as this malaria parasite utilized DARC as a portal of entry into erythrocytes40,41. The allele quickly swept to fixation across this population and is found at nearly 100% among West Africans, and at about 80% among AAs42,43. With the associations between WAa and TNBC that we and others have reported31,44, the potential association of the Duffy-null allele and TNBC is of great interest. With our expanded cohort analysis, we were able to perform the TNBC case-series risk assessment among SIR AAs only, and found that the risk was significantly retained among AA women after adjusting for both age and WAa (OR = 3.368, p = 0.007). This highlights that the Duffy-null allele represents an ancestry-specific TNBC risk allele, and that the findings in our SIR all-inclusive analysis were not driven by ancestry-bias in our cohort. This is an important finding among our cohort, as the Duffy-null allele would not have been identified among previous GWAS that included too few individuals of African ancestry.
Second, we have investigated the consequences of the protective rs2363956 variant on the ANKLE1 gene coding region and uncovered a potential functional reason for race-group risk distinction. The allele has repeatedly been associated with breast and ovarian cancer risk and survival34,35, and this association has been replicated among AA women32. In the present analysis, we are the first to report that the ‘protective’ polymorphic ANKLE1 would be the more likely version expressed in EA patients, compared to AA or Ghanaian patients (GG genotype, 43%, 14% and 25%, respectively) (Fig. 3B). This suggests that the major T allele corresponds to a TNBC-specific oncogenic version of the ANKLE1 gene. The potential mechanism of action for increased survival would appear to be DNA damage response, as ANKLE1 has repeatedly been shown to be involved in DNA repair pathways in pre-clinical and ex vivo screening, including endonuclease activity45,46, proliferation, and drug response hits in CRISPR screens in cancer cell lines47,48,49,50. Most intriguingly, one study in non-small-cell lung cancer indicated the combination of ANKLE1 RNAi with paclitaxel increased the efficacy of the drug response51. Altogether, this is a very promising avenue for further investigation of targeted/combinatorial therapy, with potential to be transformative in treatment of TNBC, and with specific impact in AA who have higher expression of ANKLE1.
If validated through additional clinical studies, finding a novel oncogene specific to TNBC could be transformative in two ways: (i) to improve genetic risk models or create AA-tailored risk models, and (ii) to develop prognostic tests to inform survival prediction models, which currently do not include information about ANKLE1. Specifically, if we find that the patients who have longer survival carry the minor protective allele, correlated with higher expression of this polymorphic ANKLE1, we can quickly investigate if this is ultimately related to treatment response. Our preliminary data on survival trends certainly suggests this could be true.
The reported, albeit controversial, findings of TNBC mortality differences between women of African descent compared to women of European descent may be an important indicator of unknown differences in tumor biology. Here, we show that ANKLE1 expression is linked to distinct survival outcomes, and this could potentially be linked to this polymorphic version of the ANKLE1 gene. Intriguingly, this corresponds with differential impact of the gene’s expression on survival when comparing race groups among patients with high expression of the gene. While the functional consequence on mechanistic change is yet unknown, it is a clear indicator of survival and therefore a prognostic indicator. Excitingly, this also reveals a potential opportunity to develop immune-based inhibition of the oncogenic (major allele) version that is more likely expressed in AA. As the frequency of the oncogenic ANKLE1 allele is higher in AA populations, this could present an opportunity for additional research to address its potential in precision therapies to bridge the survival gap in TNBC among race groups. Inclusion of diverse cohorts has powered this discovery and will drive clinical applications in the future.
International center for the study of breast cancer subtypes
The mission of the International Center for the Study of Breast Cancer Subtypes (ICSBCS) is to reduce the global breast cancer burden through advances in research and delivery of care to diverse populations worldwide. The ICSBCS brings together an international consortium of breast cancer clinicians and researchers, all of whom share the goal of addressing genetic and phenotypic variation in breast cancer risk and survival outcomes. We accrued prospective breast cancer patients from 2013 to 2017 as previously described31, extracting germline DNA from saliva samples collected at the time of consent at Komfo Anokye Teaching Hospital (KATH) in Kumasi, Ghana (N = 120), and St. Paul’s Millennium Hospital Medical College in Addis Ababa, Ethiopia. Additional cancer patient samples were collected at the Henry Ford Health System Hospital in Detroit, Michigan, and the University Cancer and Blood Center in Athens, GA (AA, N = 192; EA, N = 184). The mean age is 47 ± 15.4 (mean ± sd) for Ghanaian patients, 59 ± 12.8 for AA and 60 ± 12.1 for EA. Healthy controls (N = 271) were recruited to the ICSBCS biospecimen registry through various sources of community engagement efforts throughout the US52 and the breast cancer screening clinic at KATH22. Informed consent was obtained from all individuals participating in the study, which was approved and under the regulation of the Weill Cornell Medical College (WCM) Institutional Review Board (IRB; protocol number 1807019405). All experiments were performed in accordance with the approved IRB protocol.
Immunohistochemistry for BC tumor subtyping
For our TNBC case-series risk analysis, we determined hormone receptor status in our ICSBCS biospecimen registry via immunohistochemistry (IHC) methods that were described in detail in our previous study31. Expression of biomarkers was interpreted in accordance with the American Society of Clinical Oncology/College of American Pathologists guidelines53,54. Briefly, for estrogen and progesterone receptor IHC, staining of at least 1% was determined as positive. HER2/neu staining score of 0 or 1 + was determined as negative, and 3 + was determined as positive. HER2/neu staining score of 2 + was deemed equivocal and was further evaluated by fluorescent in situ hybridization. ICSBCS cases accrued in the USA were reviewed by the treating facility. IHC and pathology review of Ghanaian and Ethiopian cases was completed in Michigan (University of Michigan and Henry Ford Health System Hospital) and New York (Weill Cornell Medicine).
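The receptor-status rules stated above can be written as a small decision function. This is a minimal Python sketch of those rules only, not the pathology workflow used in the study. The function names, the return values, and the choice to return `None` when a HER2 2+ case has no in situ hybridization result are assumptions made for illustration.

```python
def hormone_receptor_positive(percent_stained):
    """ER or PR is called positive when at least 1% of cells stain."""
    return percent_stained >= 1.0


def her2_status(ihc_score, fish_amplified=None):
    """Map a HER2/neu IHC score (0, 1, 2, 3) to a status string.

    A score of 2+ is equivocal and is resolved by the in situ
    hybridization result when one is available (hypothetical argument).
    """
    if ihc_score in (0, 1):
        return "negative"
    if ihc_score == 3:
        return "positive"
    if ihc_score == 2:
        if fish_amplified is None:
            return "equivocal"
        return "positive" if fish_amplified else "negative"
    raise ValueError(f"unexpected HER2 IHC score: {ihc_score!r}")


def is_tnbc(er_percent, pr_percent, her2_score, fish_amplified=None):
    """Return True/False for triple-negative status, or None if undetermined."""
    if hormone_receptor_positive(er_percent) or hormone_receptor_positive(pr_percent):
        return False  # any hormone receptor positivity rules out TNBC
    status = her2_status(her2_score, fish_amplified)
    if status == "equivocal":
        return None  # needs in situ hybridization before a call can be made
    return status == "negative"


print(is_tnbc(0.0, 0.0, 1))                        # ER-, PR-, HER2 1+
print(is_tnbc(0.0, 0.0, 2))                        # awaiting hybridization
print(is_tnbc(0.0, 0.0, 2, fish_amplified=False))  # resolved as negative
```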
Allele selection for BC case–control and TNBC case-series analyses
In our previous publication, we investigated nine reported AA BC risk variants in our African-enriched ICSBCS cohort, to determine BC or TNBC-specific risk within self-identified race (SIR) groups in our cohort. We additionally included the Duffy-Null allele (rs2814778), a promoter region variant of the DARC/ACKR1 gene in our panel and demonstrated this allele to be a TNBC-specific risk allele among AA. Building upon our previous findings, we have both increased our number of samples across our SIR groups with genotypes available, and included an additional eight DARC/ACKR1 gene variants in our panel that are implicated as ancestry-specific alleles, or sit in regions that are potentially involved in DARC/ACKR1 gene regulation. These eight DARC/ACKR1 gene variants represent upstream variants, 5′ UTR variants, and variants in the coding region of the gene. All alleles that were assessed in subsequent analyses are described in Table 1. Additionally, our African-enriched ICSBCS cohort allows us to also incorporate African ancestry measurements into the association model (below). PLINK (version 2.0)55 was used to assess linkage disequilibrium among these alleles, and no strong linkage disequilibrium was observed (maximum r² of 0.44).
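To make the linkage disequilibrium statistic concrete, the sketch below computes r² between two variants as the squared Pearson correlation of their minor-allele dosages (0, 1, 2) across individuals. This is a common approximation for unphased genotypes and is offered as an assumption for illustration; it is not PLINK's implementation, and PLINK's estimator may differ. The dosage vectors are hypothetical.

```python
def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two dosage vectors.

    Individuals with a missing genotype (None) at either variant are skipped.
    """
    pairs = [(a, b) for a, b in zip(dosages_a, dosages_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two individuals genotyped at both variants")
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic variant")
    return (cov * cov) / (var_a * var_b)


# Hypothetical dosages for six individuals at two variants.
snp_1 = [0, 1, 2, 0, 1, 2]
snp_2 = [0, 0, 1, 1, 2, 2]
print(ld_r2(snp_1, snp_2))
```

Values near 1 indicate that two variants carry largely redundant information, while values near 0 indicate that they can be treated as approximately independent signals.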
Global ancestry estimation and genotyping of candidate alleles
Methods to determine global genetic ancestry have been previously reported in detail31,56. Briefly, DNA extracted from saliva samples was genotyped on the Sequenom MassARRAY iPLEX platform using an AIMs panel containing 100 markers specifically selected and validated for estimating continental ancestry among admixed populations57,58. The Sequenom TYPER software (version 4.0) was used for genotype calls, and STRUCTURE (version 2.3) was used to calculate admixture estimates for each individual59.
Similar to our global ancestry estimations, to obtain genotypes for our candidate variants for risk analyses (Table 1), DNA from saliva samples was genotyped for each of the variants using the Sequenom platform. For the Duffy-Null allele (rs2814778), we have obtained additional genotypes using single-target allele amplification reactions, as previously described31.
From our genotyping data, we used PLINK (version 2.0)55 to determine associations between the candidate variants and breast cancer risk in a case–control analysis model, and TNBC-specific risk in a case-series analysis model, as previously described31. In both our BC and TNBC-specific risk analyses, we performed associations without covariates (non-adjusted), with SIR adjustment, and with SIR and age adjustments. We additionally investigated variant and risk associations within each SIR race group, where we performed non-adjusted and age-adjusted analyses. For our analysis within SIR AA, using the genetic ancestry estimates, we were additionally able to adjust for West African ancestry in our models. For the candidate variants, we conducted the risk association using both a dominant and a dosage statistical model31. In the dominant model, where the genotypes are AA, Aa, aa (where a is the minor allele), the genotypes would be coded as 0, 1, 1 in the analysis model, so that risk is weighted based on carrying at least one copy of the minor allele. In the dosage model using the same genotypes, the genotypes would be coded as 0, 1, 2, so that risk is weighted by the number of minor alleles present. In the main figures and tables, we show and discuss risk assessment output from the dosage models, where the full range of genotypes is considered in the analysis. In addition, the Benjamini–Hochberg method was used to adjust for multiple comparisons while controlling false discovery rate (FDR) at 0.05. FDR adjusted p values for Tables 2, 3, 4, and 5 are shown in Supplemental Tables 5–8, respectively.
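The two genotype codings and the multiple-testing adjustment described above can be sketched in a few lines of Python. This is an illustrative sketch, not the PLINK analysis itself: the function names are hypothetical, genotypes are assumed to be two-character strings such as "TG", and the p values in the example are made up.

```python
def code_genotype(genotype, minor_allele, model):
    """Code a biallelic genotype for association testing.

    dosage:   number of minor alleles carried (0, 1, 2)
    dominant: 1 if at least one minor allele is carried, else 0
    """
    n_minor = sum(1 for allele in genotype if allele == minor_allele)
    if model == "dosage":
        return n_minor
    if model == "dominant":
        return 1 if n_minor >= 1 else 0
    raise ValueError(f"unknown model: {model!r}")


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


genotypes = ["TT", "TG", "GG"]  # major homozygote, heterozygote, minor homozygote
dosage = [code_genotype(g, "G", "dosage") for g in genotypes]
dominant = [code_genotype(g, "G", "dominant") for g in genotypes]
print(dosage, dominant)

raw_p = [0.01, 0.04, 0.03, 0.005]  # hypothetical p values
fdr_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in fdr_p]
print(fdr_p, significant)
```

The dosage coding lets a model distinguish one copy of the minor allele from two, whereas the dominant coding only asks whether the minor allele is present at all.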
For both the BC and TNBC-specific analyses, odds ratios from the dosage risk assessment analyses were log transformed and plotted using the Forest Plot add-in (v8) within JMP Pro 15.0.0 statistical software (SAS Institute Inc., Cary, NC, 1989–2019).
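Log transformation places risk and protective effects on a symmetric scale, so an odds ratio of 2 and an odds ratio of 0.5 sit at equal distances from the null. The Python sketch below illustrates this. The use of the natural logarithm and the Wald-style interval are assumptions (the text does not state the base or the interval method), and the standard error in the example is hypothetical. The two odds ratios are those reported for rs2814778 and rs2363956.

```python
import math


def log_odds_ratio(odds_ratio):
    """Natural log of an odds ratio; 0 corresponds to no association."""
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return math.log(odds_ratio)


def wald_interval(log_or, standard_error, z=1.96):
    """Back-transform a Wald interval on the log scale to the odds ratio scale."""
    return (math.exp(log_or - z * standard_error),
            math.exp(log_or + z * standard_error))


# Odds ratios reported in the text for the two headline variants.
reported = {"rs2814778": 3.814, "rs2363956": 0.420}
log_scale = {snp: log_odds_ratio(value) for snp, value in reported.items()}
print(log_scale)

# Hypothetical standard error, used only to show the back-transformation.
low, high = wald_interval(log_scale["rs2814778"], standard_error=0.4)
print(low, high)
```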
3D modeling of ANKLE1 protein
We used the cBioPortal MutationMapper online program to visualize the ANKLE1 protective variant rs2363956 in the context of the protein domain structure60,61. For 3D modeling of the wild type and rs2363956 missense variant, the ANKLE1 amino acid sequence in FASTA format was obtained from NCBI using the GRCh37.p13 reference and was submitted to I-TASSER62,63,64. The amino acid sequence is 615 residues long, and we performed 3D modeling to obtain the structure with and without the ANKLE1 missense mutation included in our candidate variant analysis (rs2363956, L184W). The estimate of the accuracy of the predictions using I-TASSER is provided based on the confidence score (C-score) of the modeling. The C-score ranges from −5 to 2, where a higher C-score suggests a model with higher confidence and vice versa. Furthermore, the Chimera program65 (version 1.14) was used for visualization and analysis of the predicted 3D ANKLE1 protein structure from I-TASSER.
ANKLE1 survival analysis
The UALCAN online database was accessed to determine potential associations between gene expression and patient survival outcomes in the TCGA BC cohort36. ANKLE1 gene expression was assessed across the patient cohort, and the upper quartile of expression was used to dichotomize patients into high and low/medium ANKLE1-expressing groups. The log-rank p value for the comparison between groups is reported on the plots.
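The upper-quartile dichotomization can be sketched as follows. This is an illustration under stated assumptions, not UALCAN's own code: the quartile method ("inclusive") and the strict `>` threshold are assumptions, and the expression values are hypothetical.

```python
import statistics


def dichotomize_by_upper_quartile(expression):
    """Label each value 'high' if above the third quartile, else 'low/medium'."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    _, _, upper_quartile = statistics.quantiles(expression, n=4, method="inclusive")
    labels = [
        "high" if value > upper_quartile else "low/medium" for value in expression
    ]
    return upper_quartile, labels


# Hypothetical expression values for eight patients.
expression_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
cutoff, groups = dichotomize_by_upper_quartile(expression_values)
```

The two resulting groups are the ones that a log-rank test would then compare.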
Rummel, S. K., Lovejoy, L., Shriver, C. D. & Ellsworth, R. E. Contribution of germline mutations in cancer predisposition genes to tumor etiology in young women diagnosed with invasive breast cancer. Breast Cancer Res. Treat. 164, 593–601. https://doi.org/10.1007/s10549-017-4291-8 (2017).
Kudela, E. et al. Breast cancer in young women: Status quo and advanced disease management by a predictive, preventive, and personalized approach. Cancers 11, 1791. https://doi.org/10.3390/cancers11111791 (2019).
DeSantis, C. E., Miller, K. D., Goding Sauer, A., Jemal, A. & Siegel, R. L. Cancer statistics for African Americans, 2019. CA Cancer J. Clin. 69, 211–233. https://doi.org/10.3322/caac.21555 (2019).
Hunt, B. R., Silva, A., Lock, D. & Hurlbert, M. Predictors of breast cancer mortality among white and black women in large United States cities: An ecologic study. Cancer Causes Control 30, 149–164. https://doi.org/10.1007/s10552-018-1125-x (2019).
Amirikia, K. C., Mills, P., Bush, J. & Newman, L. A. Higher population-based incidence rates of triple-negative breast cancer among young African-American women: Implications for breast cancer screening recommendations. Cancer 117, 2747–2753. https://doi.org/10.1002/cncr.25862 (2011).
Chen, L. & Li, C. I. Racial disparities in breast cancer diagnosis and treatment by hormone receptor and HER2 status. Cancer Epidemiol. Biomarkers Prev. 24, 1666–1672. https://doi.org/10.1158/1055-9965.EPI-15-0293 (2015).
Kohler, B. A. et al. Annual Report to the Nation on the Status of Cancer, 1975–2011, featuring incidence of breast cancer subtypes by race/ethnicity, poverty, and state. J. Natl. Cancer Inst. 107, djv048. https://doi.org/10.1093/jnci/djv048 (2015).
Garlapati, C., Joshi, S., Sahoo, B., Kapoor, S. & Aneja, R. The persisting puzzle of racial disparity in triple negative breast cancer: Looking through a new lens. Front. Biosci. 11, 75–88 (2019).
Siegel, R. L., Miller, K. D. & Jemal, A. Cancer statistics, 2019. CA Cancer J. Clin. 69, 7–34. https://doi.org/10.3322/caac.21551 (2019).
Newman, L. A. Parsing the etiology of breast cancer disparities. J. Clin. Oncol. 34, 1013–1014. https://doi.org/10.1200/JCO.2015.65.1877 (2016).
Li, X. et al. Triple-negative breast cancer has worse overall survival and cause-specific survival than non-triple-negative breast cancer. Breast Cancer Res. Treat. 161, 279–287 (2017).
Akinyemiju, T., Moore, J. X. & Altekruse, S. F. Breast cancer survival in African-American women by hormone receptor subtypes. Breast Cancer Res. Treat. 153, 211–218. https://doi.org/10.1007/s10549-015-3528-7 (2015).
Popejoy, A. B. & Fullerton, S. M. Genomics is failing on diversity. Nature 538, 161–164. https://doi.org/10.1038/538161a (2016).
Need, A. C. & Goldstein, D. B. Next generation disparities in human genomics: concerns and remedies. Trends Genet. 25, 489–494. https://doi.org/10.1016/j.tig.2009.09.012 (2009).
Palmer, J. R. et al. Genetic susceptibility loci for subtypes of breast cancer in an African American population. Cancer Epidemiol. Biomarkers Prev. 22, 127–134. https://doi.org/10.1158/1055-9965.EPI-12-0769 (2013).
Nanda, R. et al. Genetic testing in an ethnically diverse cohort of high-risk women: A comparative analysis of BRCA1 and BRCA2 mutations in American families of European and African ancestry. JAMA 294, 1925–1933. https://doi.org/10.1001/jama.294.15.1925 (2005).
Ruiz-Narvaez, E. A. et al. Gene-based analysis of the fibroblast growth factor receptor signaling pathway in relation to breast cancer in African American women: The AMBER consortium. Breast Cancer Res. Treat. 155, 355–363. https://doi.org/10.1007/s10549-015-3672-0 (2016).
Ruiz-Narvaez, E. A. et al. Genetic variation in the insulin, insulin-like growth factor, growth hormone, and leptin pathways in relation to breast cancer in African-American women: the AMBER consortium. NPJ Breast Cancer https://doi.org/10.1038/npjbcancer.2016.34 (2016).
Ruiz-Narvaez, E. A. et al. Admixture mapping of African-American women in the AMBER consortium identifies new loci for breast cancer and estrogen-receptor subtypes. Front. Genet. 7, 170. https://doi.org/10.3389/fgene.2016.00170 (2016).
Biunno, I. et al. BRCA1 point mutations in premenopausal breast cancer patients from Central Sudan. Fam. Cancer 13, 437–444. https://doi.org/10.1007/s10689-014-9717-4 (2014).
Campbell, M. C. & Tishkoff, S. A. African genetic diversity: Implications for human demographic history, modern human origins, and complex disease mapping. Annu. Rev. Genom. Hum. Genet. 9, 403–433. https://doi.org/10.1146/annurev.genom.9.081307.164258 (2008).
Jiagge, E. et al. Breast cancer and African ancestry: Lessons learned at the 10-year anniversary of the Ghana-Michigan research partnership and international breast registry. J. Glob. Oncol. 2, 302–310. https://doi.org/10.1200/JGO.2015.002881 (2016).
Chen, F. et al. A genome-wide association study of breast cancer in women of African ancestry. Hum. Genet. 132, 39–48. https://doi.org/10.1007/s00439-012-1214-y (2013).
Newman, L. A. & Kaljee, L. M. Health disparities and triple-negative breast cancer in African American women: A review. JAMA Surg. 152, 485–493. https://doi.org/10.1001/jamasurg.2017.0005 (2017).
Rotimi, C. N., Tekola-Ayele, F., Baker, J. L. & Shriner, D. The African diaspora: History, adaptation and health. Curr. Opin. Genet. Dev. 41, 77–84. https://doi.org/10.1016/j.gde.2016.08.005 (2016).
Lindquist, K. J. et al. Mutational landscape of aggressive prostate tumors in African American men. Cancer Res. 76, 1860–1868. https://doi.org/10.1158/0008-5472.CAN-15-1787 (2016).
Newman, L. A., Reis-Filho, J. S., Morrow, M., Carey, L. A. & King, T. A. The 2014 Society of Surgical Oncology Susan G. Komen for the cure symposium: triple-negative breast cancer. Ann. Surg. Oncol. 22, 874–882. https://doi.org/10.1245/s10434-014-4279-0 (2015).
Jiagge, E. et al. Comparative analysis of breast cancer phenotypes in African American, White American, and West Versus East African patients: correlation between African ancestry and triple-negative breast cancer. Ann. Surg. Oncol. 23, 3843–3849. https://doi.org/10.1245/s10434-016-5420-z (2016).
Brewster, A. M., Chavez-MacGregor, M. & Brown, P. Epidemiology, biology, and treatment of triple-negative breast cancer in women of African ancestry. Lancet Oncol. 15, e625-634. https://doi.org/10.1016/S1470-2045(14)70364-X (2014).
Davis, M. B. & Newman, L. A. Oncologic anthropology: An interdisciplinary approach to understanding the association between genetically-defined African ancestry and susceptibility for triple negative breast cancer. Curr. Breast Cancer Rep. In Press (2020).
Newman, L. A. et al. Hereditary susceptibility for triple negative breast cancer associated with Western Sub-Saharan African Ancestry: Results from an international surgical breast cancer collaborative. Ann. Surg. 270, 484–492. https://doi.org/10.1097/SLA.0000000000003459 (2019).
Zhu, Q. et al. Trans-ethnic follow-up of breast cancer GWAS hits using the preferential linkage disequilibrium approach. Oncotarget 7, 83160–83176. https://doi.org/10.18632/oncotarget.13075 (2016).
1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74. https://doi.org/10.1038/nature15393 (2015).
Antoniou, A. C. et al. A locus on 19p13 modifies risk of breast cancer in BRCA1 mutation carriers and is associated with hormone receptor-negative breast cancer in the general population. Nat. Genet. 42, 885–892. https://doi.org/10.1038/ng.669 (2010).
Bolton, K. L. et al. Common variants at 19p13 are associated with susceptibility to ovarian cancer. Nat. Genet. 42, 880–884. https://doi.org/10.1038/ng.666 (2010).
Chandrashekar, D. S. et al. UALCAN: A portal for facilitating tumor subgroup gene expression and survival analyses. Neoplasia 19, 649–658. https://doi.org/10.1016/j.neo.2017.05.002 (2017).
Smith, M. W. et al. Markers for mapping by admixture linkage disequilibrium in African American and Hispanic populations. Am. J. Hum. Genet. 69, 1080–1094. https://doi.org/10.1086/323922 (2001).
Cruz-Correa, M. et al. Clinical Cancer Genetics Disparities among Latinos. J. Genet. Couns. 26, 379–386. https://doi.org/10.1007/s10897-016-0051-x (2017).
Hines, L. M. et al. The interaction between genetic ancestry and breast cancer risk factors among Hispanic women: The breast cancer health disparities study. Cancer Epidemiol. Biomarkers Prev. 26, 692–701. https://doi.org/10.1158/1055-9965.EPI-16-0721 (2017).
Livingstone, F. B. The Duffy blood groups, vivax malaria, and malaria selection in human populations: A review. Hum. Biol. 56, 413–425 (1984).
Tournamille, C., Colin, Y., Cartron, J. P. & Le Van Kim, C. Disruption of a GATA motif in the Duffy gene promoter abolishes erythroid gene expression in Duffy-negative individuals. Nat. Genet. 10, 224–228. https://doi.org/10.1038/ng0695-224 (1995).
Davis, M. B. et al. Distinct transcript isoforms of the atypical chemokine receptor 1 (ACKR1)/Duffy antigen receptor for chemokines (DARC) gene are expressed in lymphoblasts and altered isoform levels are associated with genetic ancestry and the Duffy-Null Allele. PLoS ONE 10, e0140098. https://doi.org/10.1371/journal.pone.0140098 (2015).
Howes, R. E. et al. The global distribution of the Duffy blood group. Nat. Commun. 2, 266. https://doi.org/10.1038/ncomms1265 (2011).
Jiagge, E., Chitale, D. & Newman, L. A. Triple-negative breast cancer, stem cells, and African Ancestry. Am. J. Pathol. 188, 271–279. https://doi.org/10.1016/j.ajpath.2017.06.020 (2018).
Brachner, A. et al. The endonuclease Ankle1 requires its LEM and GIY-YIG motifs for DNA cleavage in vivo. J. Cell. Sci. 125, 1048–1057. https://doi.org/10.1242/jcs.098392 (2012).
Zlopasa, L., Brachner, A. & Foisner, R. Nucleo-cytoplasmic shuttling of the endonuclease ankyrin repeats and LEM domain-containing protein 1 (Ankle1) is mediated by canonical nuclear export- and nuclear import signals. BMC Cell. Biol. 17, 23. https://doi.org/10.1186/s12860-016-0102-z (2016).
Toledo, C. M. et al. Genome-wide CRISPR-cas9 screens reveal loss of redundancy between PKMYT1 and WEE1 in glioblastoma stem-like cells. Cell. Rep. 13, 2425–2439. https://doi.org/10.1016/j.celrep.2015.11.021 (2015).
Wang, T. et al. Gene essentiality profiling reveals gene networks and synthetic lethal interactions with oncogenic ras. Cell 168, 890–903. https://doi.org/10.1016/j.cell.2017.01.013 (2017).
MacLeod, G. et al. Genome-wide CRISPR-Cas9 screens expose genetic vulnerabilities and mechanisms of temozolomide sensitivity in glioblastoma stem cells. Cell. Rep. 27, 971–986. https://doi.org/10.1016/j.celrep.2019.03.047 (2019).
Kabir, S. et al. The CUL5 ubiquitin ligase complex mediates resistance to CDK9 and MCL1 inhibitors in lung cancer cells. Elife https://doi.org/10.7554/eLife.44288 (2019).
Whitehurst, A. W. et al. Synthetic lethal screen identification of chemosensitizer loci in cancer cells. Nature 446, 815–819. https://doi.org/10.1038/nature05697 (2007).
Newman, L. A. & Jackson, K. E. The sisters network: A National African American breast cancer survivor advocacy organization. J. Oncol. Pract. 5, 313–314. https://doi.org/10.1200/JOP.091037 (2009).
Wolff, A. C. et al. Recommendations for human epidermal growth factor receptor 2 testing in breast cancer: American Society of Clinical Oncology/College of American Pathologists clinical practice guideline update. Arch. Pathol. Lab. Med. 138, 241–256. https://doi.org/10.5858/arpa.2013-0953-SA (2014).
Hammond, M. E. et al. American Society of Clinical Oncology/College of American Pathologists guideline recommendations for immunohistochemical testing of estrogen and progesterone receptors in breast cancer (unabridged version). Arch. Pathol. Lab. Med. 134, e48-72. https://doi.org/10.1043/1543-2165-134.7.e48 (2010).
Purcell, S. et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575. https://doi.org/10.1086/519795 (2007).
Al-Alem, U. et al. Association of genetic ancestry with breast cancer in ethnically diverse women from Chicago. PLoS ONE 9, e112916. https://doi.org/10.1371/journal.pone.0112916 (2014).
Kosoy, R. et al. Ancestry informative marker sets for determining continental origin and admixture proportions in common populations in America. Hum. Mutat. 30, 69–78. https://doi.org/10.1002/humu.20822 (2009).
Nassir, R. et al. An ancestry informative marker set for determining continental origin: Validation and extension using human genome diversity panels. BMC Genet. 10, 39. https://doi.org/10.1186/1471-2156-10-39 (2009).
Falush, D., Stephens, M. & Pritchard, J. K. Inference of population structure using multilocus genotype data: Linked loci and correlated allele frequencies. Genetics 164, 1567–1587 (2003).
Gao, J. et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci. Signal 6, 1. https://doi.org/10.1126/scisignal.2004088 (2013).
Cerami, E. et al. The cBio cancer genomics portal: An open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2, 401–404. https://doi.org/10.1158/2159-8290.CD-12-0095 (2012).
Zhang, Y. I-TASSER: fully automated protein structure prediction in CASP8. Proteins 77(Suppl 9), 100–113. https://doi.org/10.1002/prot.22588 (2009).
Yang, J. & Zhang, Y. I-TASSER server: New development for protein structure and function predictions. Nucleic Acids Res. 43, W174-181. https://doi.org/10.1093/nar/gkv342 (2015).
Roy, A., Yang, J. & Zhang, Y. COFACTOR: An accurate comparative algorithm for structure-based protein function annotation. Nucleic Acids Res. 40, W471-477. https://doi.org/10.1093/nar/gks372 (2012).
Pettersen, E. F. et al. UCSF Chimera: A visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612. https://doi.org/10.1002/jcc.20084 (2004).
Funding—R21 5R21CA210237-03 NIH/NCI (to MD), Susan G Komen (to LN), Fashion Footwear Association of New York Shoes on Sale (to LN), U54-MD007585-26 NIH/NIMHD (to CY) and U54 CA118623 (NIH/NCI) (to CY). We would like to acknowledge all of our ICSBCS team members who assisted with consent, biospecimen aggregation and logistics, and of course all breast cancer patients and healthy control volunteers for consenting and contributing to this important work.
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Martini, R., Chen, Y., Jenkins, B.D. et al. Investigation of triple-negative breast cancer risk alleles in an International African-enriched cohort. Sci Rep 11, 9247 (2021). https://doi.org/10.1038/s41598-021-88613-w